Case Control Studies

Article Author:
Steven Tenny
Article Editor:
Mary Hoffman
1/19/2019 3:31:00 PM


A case control study is a type of observational study commonly used to look at factors associated with rare diseases or outcomes.[1] The case control study starts with a group of cases, the individuals who have the outcome of interest. The researcher then constructs a second group of people, the controls, who must not have the outcome of interest. The researcher then looks at historical factors to identify whether some exposure is found more commonly in the cases than in the controls. If it is, the researcher can hypothesize that the exposure may be linked to the outcome of interest.

For example, a researcher may want to look at the rare cancer Kaposi's sarcoma. They would find a group of individuals with Kaposi's sarcoma (the cases) and compare them to a group of patients who are similar to the cases but without Kaposi's sarcoma (the controls). The researcher could then ask about various exposures to see if any exposure is more common in those with Kaposi's sarcoma (the cases) than in those without Kaposi's sarcoma (the controls). If the researcher finds that those with Kaposi's sarcoma are more likely to have HIV, they may conclude that HIV may be a risk factor for the development of Kaposi's sarcoma.

A case control study does not involve any intervention; thus it is an observational study. No attempt is made in a case control study to alter the disease course or the risk factors for the disease. Rather, the case control study aims to identify associations between causative or protective factors and the outcome of interest.

The case control study design is commonly used to look at rare diseases. If a disease happens very infrequently, then you would have to follow a large group of people for a long period of time to accrue enough cases. Such use of resources may be impractical, so a case control study can help by picking out the current cases and looking at historical factors.


There are many advantages to the case control study. First, it allows us to study rare diseases. If a disease occurs in 1 in 1000 people per year (0.001/year), then following 1000 people for ten years would yield about 10 cases. If the disease is much rarer, say 1 in 1,000,000 per year (0.000001/year), you would have to follow either 1,000,000 people for ten years or 1000 people for 10,000 years to get ten total cases. As it is impractical to follow 1,000,000 people for ten years or to wait 10,000 years for recruitment, the case control study allows for faster studies.
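
The follow-up arithmetic above can be checked with a short calculation. This is a minimal sketch that assumes a constant incidence rate and no loss to follow-up; the function names are illustrative and do not come from the article.

```python
def expected_cases(incidence_per_person_year: float, people: int, years: float) -> float:
    """Expected number of new cases, assuming a constant incidence rate
    and that everyone is followed for the full period."""
    return incidence_per_person_year * people * years


def person_years_needed(target_cases: float, incidence_per_person_year: float) -> float:
    """Total person-years of follow-up needed to expect `target_cases` cases."""
    return target_cases / incidence_per_person_year


# Disease with incidence 1 in 1000 per year: 1000 people followed for ten years.
print(expected_cases(0.001, 1000, 10))

# Much rarer disease, 1 in 1,000,000 per year: 1,000,000 people followed for ten years.
print(expected_cases(0.000001, 1_000_000, 10))

# Person-years of follow-up required to expect ten cases of the rarer disease.
print(person_years_needed(10, 0.000001))
```

Any combination of people and years whose product reaches the required person-years gives the same expected case count, which is why a prospective design becomes impractical as the disease gets rarer.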

Second, the case control study design makes it possible to look at multiple different risk factors at a time. In our example above about Kaposi's sarcoma, we could ask both our cases and controls about exposures to HIV, asbestos, smoking, lead, sunburns, aniline dye, alcohol, herpes, human papilloma virus, as well as other issues to help identify the most likely causative agent or agents. 

Because of the above two advantages, case control studies are commonly used as one of the first studies to build evidence of an association between an exposure and an event or disease. In a case control study, the investigator can include more controls than cases, such as a 2:1 or 4:1 ratio of controls to cases, to increase the power of the study, as controls are usually more easily identified than cases.
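
One way to see why extra controls help is to look at the precision of the estimated odds ratio. The sketch below uses the standard large-sample approximation for the standard error of the log odds ratio, sqrt(1/a + 1/b + 1/c + 1/d), where a, b, c and d are the four cells of the 2 by 2 table. The number of cases and the exposure proportions are hypothetical values chosen only for illustration, and the function name is not from the article.

```python
import math


def log_or_standard_error(n_cases: int, controls_per_case: int,
                          p_exposed_cases: float, p_exposed_controls: float) -> float:
    """Approximate standard error of the log odds ratio for a study with
    `n_cases` cases and `controls_per_case` controls per case, using the
    expected cell counts implied by the given exposure proportions."""
    n_controls = n_cases * controls_per_case
    a = n_cases * p_exposed_cases              # exposed cases
    c = n_cases * (1 - p_exposed_cases)        # unexposed cases
    b = n_controls * p_exposed_controls        # exposed controls
    d = n_controls * (1 - p_exposed_controls)  # unexposed controls
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


# Hypothetical study: 100 cases, 30% of cases and 15% of controls exposed.
for ratio in (1, 2, 4, 10):
    se = log_or_standard_error(100, ratio, 0.30, 0.15)
    print(f"{ratio}:1 controls to cases -> SE(log OR) = {se:.3f}")
```

Under these assumptions the standard error falls as controls are added, but each additional control contributes less than the one before, because the terms that depend on the fixed number of cases do not shrink.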


The case control study also has many disadvantages. The most commonly cited disadvantage is the potential for recall bias, which arises from the retrospective nature of the study.[2] If we ask people with Kaposi's sarcoma about exposures and history (e.g., HIV, asbestos, smoking, lead, sunburn, aniline dye, alcohol, herpes, human papilloma virus), the individuals with the disease are more likely to think hard about these exposures and recall having had some of them. If we ask the healthy controls about past exposures, they are less likely to recall the more esoteric ones. Recall bias in a case control study is thus the increased likelihood that those with the disease will recall and report exposures, while those without the disease are less likely to report exposures even if they did happen. It stems from subjects' imperfect memories of past exposures, and it may lead to finding associations between an exposure and the disease which do not exist.
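
The effect of recall bias can be illustrated with a small deterministic calculation. In this sketch the true exposure prevalence is the same in cases and controls, so there is no real association, but cases are assumed to report a larger share of their true exposures than controls do. All of the numbers and the function name are hypothetical.

```python
def observed_odds_ratio(n_cases: int, n_controls: int, true_exposure: float,
                        recall_cases: float, recall_controls: float) -> float:
    """Odds ratio computed from reported exposures when the true exposure
    prevalence is identical in both groups but only a fraction of truly
    exposed subjects recall and report the exposure."""
    a = n_cases * true_exposure * recall_cases        # cases reporting exposure
    c = n_cases - a                                   # cases reporting no exposure
    b = n_controls * true_exposure * recall_controls  # controls reporting exposure
    d = n_controls - b                                # controls reporting no exposure
    return (a * d) / (b * c)


# Equal recall in both groups: no spurious association appears.
print(observed_odds_ratio(200, 200, 0.20, 0.90, 0.90))

# Cases recall 90% of true exposures, controls only 60%.
print(observed_odds_ratio(200, 200, 0.20, 0.90, 0.60))
```

With differential recall the reported data produce an odds ratio above one even though the exposure is equally common in both groups, which is the spurious association described above.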

A second disadvantage of case control studies is their retrospective nature itself. The study looks at what happened in the past to try to find a correlation with the current state; that is, it looks at previous exposures to explain the current disease state. Because the exposures are reconstructed after the outcome has occurred rather than observed as they happen, a case control study is a good start for creating hypotheses about correlation but not for testing for causation.

A third disadvantage of a case control study is the difficulty of finding an appropriate control group. Ideally, the case group (those with the disease) and the control group (those without the disease) will have almost the same characteristics and will differ only in the exposures the investigator is studying. The control group should have similar age, gender, and health as the case group. The two groups should have similar histories and live in similar environments. If, for example, our cases of Kaposi's sarcoma came from across the country but our controls were chosen only from a small community in northern latitudes where people rarely go outside or get sunburns, asking about sunburn may not be a valid question to investigate. Similarly, if all of our cases of Kaposi's sarcoma were found to come from a small community outside a battery factory with high levels of lead in the environment, then controls from across the country with minimal lead exposure would not provide an appropriate control group. Thus the investigator should put much effort into creating a proper control group, both to bolster the strength of the case control study and to enhance their ability to find true and valid potential correlations between exposures and disease states.


The major method for analyzing case control studies is the odds ratio. The odds ratio is the odds of having the disease with exposure divided by the odds of having the disease without the exposure. The most straightforward way to calculate the odds ratio is with a 2 by 2 table divided by exposure and disease status. Mathematically, we can write the odds ratio as follows:

Odds ratio = [ (Number exposed with disease) / (Number exposed without disease) ] / [ (Number not exposed with disease) / (Number not exposed without disease) ]

This can be rewritten as:

Odds ratio = [ (Number exposed with disease) x (Number not exposed without disease) ] / [ (Number exposed without disease ) x (Number not exposed with disease) ] 
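
The formula above translates directly into code. The following is a minimal sketch; the function name, argument names, and the counts in the example are hypothetical and chosen only for illustration.

```python
def odds_ratio(exposed_with_disease: int, exposed_without_disease: int,
               not_exposed_with_disease: int, not_exposed_without_disease: int) -> float:
    """Odds ratio from the four cells of a 2 by 2 table, using the
    cross-product form of the formula."""
    denominator = exposed_without_disease * not_exposed_with_disease
    if denominator == 0:
        # The odds ratio is undefined when either of these cells is empty.
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (exposed_with_disease * not_exposed_without_disease) / denominator


# Hypothetical 2 by 2 table:
#                 disease   no disease
# exposed            30         15
# not exposed        70         85
print(odds_ratio(30, 15, 70, 85))
```

The cross-product form avoids computing the two intermediate odds separately, but both forms of the formula give the same value for the same table.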

The odds ratio tells us how strongly the exposure is related to the disease state. An odds ratio greater than one implies the disease is more likely with exposure. An odds ratio less than one implies the disease is less likely with exposure, and thus the exposure may be protective. For example, a patient taking a daily aspirin has decreased odds of having a heart attack (odds ratio less than one). An odds ratio of one implies there is no relation between the exposure and the disease process.

Issues of Concern

The main issues of concern with a case control study are recall bias, its retrospective nature, and selection of an appropriate control group.[3] These are discussed above in the disadvantages section.

Clinical Significance

The case control study is a good tool for exploring risk factors for rare diseases. Many times an investigator will hypothesize a list of possible risk factors for a disease process. They will then use a case control study to see if there are any possible associations between the risk factors and the disease process. The investigator can then use the data from the case control study to home in on a few of the most likely causative factors and use basic science research and other study types (such as cohort studies and randomized clinical studies) to further support the evidence of a possible association between the risk factor and the disease process.

[Image not available. Contributed by Steven Tenny MD, MPH, MBA]


[1] Setia MS. Methodology Series Module 2: Case-control Studies. Indian journal of dermatology. 2016 Mar-Apr. [PubMed PMID: 27057012]
[2] Sedgwick P. Bias in observational study designs: case-control studies. BMJ (Clinical research ed.). 2015 Jan 30. [PubMed PMID: 25636996]
[3] Groenwold RHH, van Smeden M. Efficient sampling in unmatched case-control studies when the total number of cases and controls is fixed. Epidemiology (Cambridge, Mass.). 2017 Jul 4. [PubMed PMID: 28682849]
[4] Lewallen S, Courtright P. Epidemiology in practice: case-control studies. Community eye health. 1998. [PubMed PMID: 17492047]