Statistics is the science of acquiring and managing data, which includes collecting, organizing, analyzing, and interpreting it. In medicine and the other life sciences, the term “biostatistics” is often used instead to emphasize its application to medicine and health. Statistics not only provides information on a given health situation but also guides healthcare professionals in decision-making, whether as part of a research study or as part of clinical work.

The application of statistics follows a series of steps that form a cycle of scientific activities. It usually begins with the acquisition of health data, that is, the gathering of health-related information with data collection tools (e.g., survey questionnaires) to accurately capture details pertinent to a given study. Data collected directly from the respondents are said to come from primary sources of data. Data that were collected outside the scope of the study (e.g., vital statistics and health statistics) are said to come from secondary sources of data. Before moving on to the next step, the accuracy and reliability of the data collection must be confirmed, since any alteration or misinformation at this stage would inevitably affect the analysis and interpretation of the data on hand.

Data management, in turn, involves the organization and analysis of health data. Data can be organized in numerous ways, so researchers should choose the method that fits the specific goal of the study. For example, if the data must be interpreted as individual units, they can be organized as raw data or as a data series (e.g., arranged in arrays or in alphabetical order). This is usually done in studies with a small population (e.g., case studies, case series). Otherwise, if the data need to be described using a frequency distribution, they can be organized as a discrete or continuous data series using frequency tables. This approach is frequently used in studies with a larger study population. The best method of organizing statistical data depends primarily on the type of variable (e.g., qualitative or quantitative) and its level of measurement (e.g., nominal, ordinal, interval, ratio). It is not necessary to produce every possible organization of the data; a method is worth using only if it gives the researchers the best information about the objectives of the study.
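As a rough illustration of the two kinds of frequency table mentioned above, the sketch below tabulates a discrete variable by its distinct values and a continuous variable by class intervals. The function names, the data (`clinic_visits`, `systolic_bp`), and the class width are all hypothetical and chosen only for illustration; real studies would choose class limits to suit their own data.

```python
from collections import Counter


def discrete_frequency_table(values):
    """Return (value, count, relative frequency) rows, sorted by value."""
    counts = Counter(values)
    n = len(values)
    return [(value, count, count / n) for value, count in sorted(counts.items())]


def grouped_frequency_table(values, start, width, n_classes):
    """Return (lower, upper, count) rows for classes of the form [lower, upper)."""
    rows = []
    for i in range(n_classes):
        lower = start + i * width
        upper = lower + width
        count = sum(1 for v in values if lower <= v < upper)
        rows.append((lower, upper, count))
    return rows


# Hypothetical data, for illustration only.
clinic_visits = [0, 1, 1, 2, 2, 2, 3, 1, 0, 2]                       # discrete
systolic_bp = [112, 118, 121, 125, 130, 134, 138, 141, 147, 152]     # continuous

visit_table = discrete_frequency_table(clinic_visits)
bp_table = grouped_frequency_table(systolic_bp, start=110, width=10, n_classes=5)

for row in visit_table:
    print(row)
for row in bp_table:
    print(row)
```

The discrete table keeps every distinct value, which suits counts with few possible values. The grouped table trades some detail for readability, which is why it is more common with continuous measurements and larger study populations.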

Organizing data with appropriate methods supports accurate analysis. In descriptive analysis, narratives, tables, graphs, and charts can be sufficient to describe the study variables. In inferential analysis, the researcher either estimates specific clinical or health parameters or performs hypothesis testing. Several data analysis software packages are available, and the choice depends on the type of research work.[1]
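The two inferential tasks named above, estimation and hypothesis testing, can be sketched with the Python standard library alone. The example below is a minimal sketch that assumes large samples, so it uses the normal approximation (a Wald interval and a pooled two-proportion z-test). The event counts are hypothetical, and other methods may be more appropriate for small samples or rare events.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(events, n, confidence=0.95):
    """Wald (normal-approximation) confidence interval for one proportion."""
    p = events / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided z-test of H0: the two population proportions are equal."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 30 of 200 patients in group A and 50 of 200 in
# group B experienced the outcome of interest.
ci_low, ci_high = proportion_ci(30, 200)                 # estimation
z, p_value = two_proportion_z_test(30, 200, 50, 200)     # hypothesis testing

print(f"Group A proportion, 95% CI: {ci_low:.3f} to {ci_high:.3f}")
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

Estimation answers "how large is the quantity, and how precisely is it known?", while the test answers "are the data compatible with no difference?". Reporting both usually gives readers more than either alone.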

Accurate and reliable interpretation then follows from a properly conducted data analysis. This step focuses on generating correct information from the findings and relating it to the context of the topic under study. The resulting discoveries, conclusions, and hypotheses enable future researchers to study the underlying issues and restart the statistical process, creating a continuous cycle of collecting, organizing, analyzing, and interpreting data.

Issues of Concern

Several issues arise when statistics is applied in the healthcare setting. Most are encountered in research studies, both community-based and clinical. They include, but are not limited to, the points listed below:

Data collection

  • The integrity of the data collection[2] 
  • Data collection about “dying patients”[3]
  • Advantages and disadvantages of data collection approaches[4][5] 
  • Researcher-participant partnership[6]

Data organization (and presentation)

  • Use of relational database[7]
  • Creation of frequency distribution: from tabulation to graphical representation[8]
  • Type of charts based on the data analysis method[9]

Data analysis

  • Statistical analysis of small area health studies[10]
  • Misconceptions about data analysis and statistics[11]
  • Limitation of data and its measurement in studying health disparities[12]
  • Ethical issues on the use of secondary data analysis[13]

Data interpretation

  • The interpretation of p-values[14][15] 
  • Steps to data summarization[16]
  • Differences in the application of clinical and statistical significance[17][18] 
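The distinction between statistical and clinical significance in the last item can be made concrete with a small sketch. It assumes a large-sample z-test on a difference in means with a common standard deviation, and it treats an effect as clinically important only when the whole confidence interval lies at or above a minimal clinically important difference (MCID). That decision rule is one conservative convention among several, and every number below, including the MCID of 5 mmHg, is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def compare_means(diff, sd, n_per_group, mcid, alpha=0.05):
    """Judge a difference in means statistically and against an MCID."""
    se = sd * sqrt(2 / n_per_group)          # standard error of the difference
    z = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    return {
        "p_value": p_value,
        "ci": ci,
        "statistically_significant": p_value < alpha,
        # Assumed rule: the entire interval must reach the MCID.
        "clinically_important": ci[0] >= mcid,
    }


# Hypothetical trial: a 1 mmHg greater reduction in systolic blood pressure,
# SD of 10 mmHg, 5000 patients per group, and an assumed MCID of 5 mmHg.
result = compare_means(diff=1.0, sd=10.0, n_per_group=5000, mcid=5.0)

print(f"p = {result['p_value']:.2e}")
print(f"95% CI: {result['ci'][0]:.2f} to {result['ci'][1]:.2f} mmHg")
print("Statistically significant:", result["statistically_significant"])
print("Clinically important:", result["clinically_important"])
```

With a sample this large, a very small difference yields a very small p-value, yet the entire interval sits well below the assumed MCID. A p-value measures compatibility with the null hypothesis, not the size or practical value of an effect.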

Clinical Significance

Although previous and current references on the topic do not treat it as a formal part of the statistical process, the "utilization of data in the healthcare setting" can be considered an additional part of the overall process. The use of relevant statistical findings and conclusions is vital to the decision-making of both internal and external stakeholders in health.

Just as a data analysis plan is made before any study is implemented, a careful plan is required for how statistical results will be shared with the appropriate audience. Graphical presentation, as used in descriptive studies, can aid the understanding of both technical and non-technical staff. Presenting statistical output directly to stakeholders is not always beneficial. Instead, the meaning and relevance of the statistical test and the practical application of its conclusion should be given more emphasis. Identifying the recipients of the statistical findings helps define the method of disseminating specific statistical information.[19]

The importance of statistics as a tool for conducting health research and developing new knowledge and understanding in healthcare practice has been demonstrated both in the past and in contemporary practice. Although statistics has its own specialized language, as medicine, science, and other technical fields do, healthcare professionals in clinics, hospitals, laboratories, and health industries should not be deterred from learning its basic concepts. It must always be emphasized that the application of statistics in health and medicine is meant to give the healthcare team a deeper understanding of health-related variables and events, not to confuse them, for instance through the misuse and abuse of statistics[20], which could indirectly affect patient outcomes in the future.

Article Details

Article Author

Marlon Bayot

Article Editor

Ibrahim Abdelgawad


7/10/2020 9:43:47 AM


[1] Ali Z, Bhaskar SB. Basic statistical tools in research and data analysis. Indian journal of anaesthesia. 2016 Sep. [PubMed PMID: 27729694]

[2] Moody LE, McMillan S. Maintaining data integrity in randomized clinical trials. Nursing research. 2002 Mar-Apr. [PubMed PMID: 11984384]

[3] Fowler FJ Jr, Coppola KM, Teno JM. Methodological challenges for measuring quality of care at the end of life. Journal of pain and symptom management. 1999 Feb. [PubMed PMID: 10069151]

[4] Saczynski JS, McManus DD, Goldberg RJ. Commonly used data-collection approaches in clinical research. The American journal of medicine. 2013 Nov. [PubMed PMID: 24050485]

[5] Sarkies MN, Bowles KA, Skinner EH, Mitchell D, Haas R, Ho M, Salter K, May K, Markham D, O'Brien L, Plumb S, Haines TP. Data collection methods in health services research: hospital length of stay and discharge destination. Applied clinical informatics. 2015. [PubMed PMID: 25848416]

[6] Holden RJ, McDougald Scott AM, Hoonakker PL, Hundt AS, Carayon P. Data collection challenges in community settings: insights from two field studies of patients with chronic disease. Quality of life research : an international journal of quality of life aspects of treatment, care and rehabilitation. 2015 May. [PubMed PMID: 25154464]

[7] Sestoft P. Organizing research data. Acta veterinaria Scandinavica. 2011. [PubMed PMID: 21999359]

[8] Manikandan S. Frequency distribution. Journal of pharmacology [PubMed PMID: 21701652]

[9] In J, Lee S. Statistical data presentation. Korean journal of anesthesiology. 2017 Jun. [PubMed PMID: 28580077]

[10] Wakefield J, Elliott P. Issues in the statistical analysis of small area health data. Statistics in medicine. 1999 Sep 15-30. [PubMed PMID: 10474147]

[11] Motulsky HJ. Common misconceptions about data analysis and statistics. British journal of pharmacology. 2015 Apr. [PubMed PMID: 25134425]

[12] Bilheimer LT, Klein RJ. Data and measurement issues in the analysis of health disparities. Health services research. 2010 Oct. [PubMed PMID: 21054368]

[13] Tripathy JP. Secondary Data Analysis: Ethical Issues and Challenges. Iranian journal of public health. 2013 Dec. [PubMed PMID: 26060652]

[14] Nahm FS. What the P values really tell us. The Korean journal of pain. 2017 Oct. [PubMed PMID: 29123617]

[15] Tanha K, Mohammadi N, Janani L. P-value: What is and what is not. Medical journal of the Islamic Republic of Iran. 2017. [PubMed PMID: 29445694]

[16] Yan F, Robert M, Li Y. Statistical methods and common problems in medical or biomedical science research. International journal of physiology, pathophysiology and pharmacology. 2017. [PubMed PMID: 29209453]

[17] Ferrill MJ, Brown DA, Kyle JA. Clinical versus statistical significance: interpreting P values and confidence intervals related to measures of association to guide decision making. Journal of pharmacy practice. 2010 Aug. [PubMed PMID: 21507834]

[18] West CP, Dupras DM. 5 ways statistics can fool you--tips for practicing clinicians. Vaccine. 2013 Mar 15. [PubMed PMID: 23246309]

[19] Kass RE, Caffo BS, Davidian M, Meng XL, Yu B, Reid N. Ten Simple Rules for Effective Statistical Practice. PLoS computational biology. 2016 Jun. [PubMed PMID: 27281180]

[20] Thiese MS, Arnold ZC, Walker SD. The misuse and abuse of statistics in biomedical research. Biochemia medica. 2015. [PubMed PMID: 25672462]