Richard H. Morrow

International Health, Division of Community Health and Health Systems, Johns Hopkins University School of Hygiene and Public Health, Baltimore, Maryland, USA

Keywords: Epidemiology, health and disease, epidemiological methods, epidemiological applications, risk factors and concepts of cause, descriptive epidemiology, cohort and case-control studies, randomized trials, health policy, planning and management, assessment of health programs, composite indicators of healthy life, effectiveness and equity


1. What Is Epidemiology?

2. Purposes of Epidemiology

3. Defining and Measuring Health and Disease

4. Descriptive Epidemiology

5. Epidemiological Approaches to Understanding Causal Relations

6. Experimental Epidemiology: The Randomized Trial

7. Epidemiology for Health Systems: Use in Policy, Planning, and Assessment

8. The Future of Epidemiology

Related Chapters



Biographical Sketch

3. Defining and Measuring Health and Disease

The first step towards understanding the basis of epidemiology is to agree on definitions of health and disease. In its charter in 1948, the World Health Organization (WHO) defined health in positive terms as "a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity." Although this is an important ideological conceptualization, it has not been operationally useful. For most epidemiological purposes, objectives of health programs are more readily defined in terms of prevention or treatment of disease.

Disease has been defined in many ways. Generally illness, sickness, and disease are used interchangeably, but some make distinctions. For example, Susser[3] proposes that disease denotes a physiological or psychological dysfunction, while illness is the subjective state of a person who does not feel well, and sickness is a state of social dysfunction that the individual assumes when not feeling well. But for purposes of defining and measuring disease in general, a broad definition may be most useful. Disease is anything that an individual (or population) experiences that causes, literally, "dis-ease": anything that leads to discomfort, pain, distress, disability, or death constitutes disease, whatever its cause, including injuries and psychiatric disabilities.

Epidemiological notions and tools are used to elucidate the natural history of disease. Most specific diseases have a characteristic pattern from onset through progression of the disease process to termination of the process through either recovery or death. There is wide variation in the patterns of disease evolution. The onset of disease is usually dated from the start of symptoms or signs as determined by the individual afflicted, a family member, or a medical practitioner, or from the result of a laboratory test. Figure 1 illustrates healthy life lost from disability and from premature death due to typical cases of cirrhosis, polio, and multiple sclerosis in terms of onset, extent, and duration of disability and termination. The conclusion of the disease process depends on a host of factors, from correct diagnosis to appropriate treatment. Possible outcomes include clinical recovery with complete disappearance of clinical signs and symptoms; recovery from the acute phase of disease but with residual effects, such as paralysis following polio; or death primarily as a result of the disease. The last includes death directly caused by the disease and death indirectly brought about by a complication, such as heart disease as a complication of diabetes. Termination of a disease state may also be marked by recovery followed by progression to another disease, such as cirrhosis following hepatitis infection.

Figure 1. Patterns of healthy life lost

3.1. Disease Nomenclature and Classification

Diagnosis and classification of specific diseases are central to determining which health intervention programs would be most useful. Understanding the pathogenesis of the disease process and defining (classifying, categorizing, and/or diagnosing) disease are critical for understanding and classifying "causes," so that the most effective prevention and treatment strategies for reducing the effects of a disease or risk factor can be selected.

The nomenclature and classification of diseases undergo continuing change as our understanding of disease processes and their causes advances. In general, diagnostic classification is based upon two general, quite different, characterizations of disease. The first is a description of the pathological features of the disease process. Examples include pneumonitis (inflammation of the lung) or pancreatic carcinoma. The second depends upon the underlying causal factor. Examples include influenza and cholera. Often a diagnosis will include descriptive terms combined with a causal factor such as meningococcal meningitis or rheumatic heart disease.

The International Statistical Classification of Diseases and Related Health Problems (ICD) is the most widely used classification system. It is published by WHO in a book that is revised every 10 years or so by an internationally representative committee of experts that meets regularly and reviews advances in the understanding of diseases. It is now in its tenth edition (ICD-10), which came into effect in 1993 exactly 100 years after the original. Every disease entity is assigned an alphanumeric code number in a hierarchical arrangement; there are now 21 major disease divisions and each has a series of subdivisions. Some of the divisions are based on etiology, such as Chapter I (A00-B99): Certain Infectious and Parasitic Diseases; some on body organ systems, such as Chapter VI (G00-G99): Diseases of the Nervous System; and others on classes of conditions, such as Chapter XIX (S00-T98): Injury, Poisoning and Certain Other Consequences of External Causes. The classification is now increasingly used for death certification and for hospital inpatient discharge coding. When properly used, it is a very valuable tool for epidemiological studies.
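The hierarchical arrangement can be illustrated with a small lookup. The following Python sketch maps a code to one of the three chapters quoted above; the function name `icd10_chapter` and the table `CHAPTERS` are illustrative assumptions, and the table deliberately covers only those three chapters rather than the full classification.

```python
# Illustrative only: the three ICD-10 chapter ranges quoted in the text.
CHAPTERS = [
    ("A00", "B99", "I", "Certain Infectious and Parasitic Diseases"),
    ("G00", "G99", "VI", "Diseases of the Nervous System"),
    ("S00", "T98", "XIX",
     "Injury, Poisoning and Certain Other Consequences of External Causes"),
]


def icd10_chapter(code):
    """Return (chapter numeral, title), or None if outside this partial table."""
    category = code.strip().upper()[:3]  # three-character category, e.g. 'A09'
    if len(category) < 3:
        return None
    for start, end, numeral, title in CHAPTERS:
        # Categories are a letter plus two digits, so plain string
        # comparison orders them correctly within and across letters.
        if start <= category <= end:
            return numeral, title
    return None


print(icd10_chapter("A09.0"))
print(icd10_chapter("C25"))  # not covered by this partial table
```

A production system would use the complete published chapter list; the point here is only that the leading characters of a code determine its place in the hierarchy.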

3.2. Counting Disease

Counting the frequency of disease can be done in several ways; it is important to understand what these different methods of counting actually mean. The most useful way may depend upon the nature of the disease and upon the purpose for which it is being counted. The starting point for any counting must be a clear definition of the disease or event that is to be counted as discussed above. Further, it is important to be clear whether the number of cases refers to individuals or episodes or attendances. For a particular disease a person may have several separate attacks in one year and may attend a clinic two or three times for each attack. Only one person has been ill but has had several episodes and attended a health service several times for each episode.
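The distinction between individuals, episodes, and attendances can be made concrete with a short Python sketch. The clinic log below is hypothetical, and the function name `count_units` is an assumption introduced only for illustration.

```python
# Hypothetical clinic log: one entry per attendance, recorded as
# (person identifier, episode number for that person).
attendances = [
    ("p1", 1), ("p1", 1),             # person 1, first attack, two visits
    ("p1", 2), ("p1", 2), ("p1", 2),  # person 1, second attack, three visits
    ("p2", 1),                        # person 2, one attack, one visit
]


def count_units(log):
    """Return the three different 'numbers of cases' the same log can yield."""
    persons = {person for person, _ in log}
    episodes = set(log)  # distinct (person, episode) pairs
    return {
        "persons": len(persons),
        "episodes": len(episodes),
        "attendances": len(log),
    }


counts = count_units(attendances)
print(counts)
```

The same records give 2 persons, 3 episodes, and 6 attendances, which is why a report must state which unit is being counted.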

The three basic measures of disease occurrence are the incidence density, often simply termed the incidence rate; the cumulative incidence; and the prevalence.

Incidence is the fundamental building block for epidemiological inference. Incidence is a measure of events (i.e. transition from a non-diseased to a diseased state) and can be considered a measure of risk. This risk can be examined in any population group, whether defined by age, sex, place, time, socio-demographic characteristics, or occupation, or indeed by exposure to a toxin or any other suspected causal factor.

The incidence rate (incidence density) is defined as the number of new cases of disease onset per person-time. There are three critical components: a definition of the onset of the event, a defined population, and a particular period of time. The essential point is new cases of disease—the disease develops in a person who did not have the disease previously. The numerator is the number of new cases of disease (the event) and the denominator is the number of person-time units at risk for developing the disease. Everyone included in the denominator must have the potential to become part of the group that is counted in the numerator. To calculate incidence of prostate cancer, the denominator must include only men, because women are not at risk of developing prostate cancer. The third component is the period of time or time-unit. Any time-unit can be used so long as all those counted in the denominator are followed for the same period as those who are counted as new cases in the numerator. Incidence density directly incorporates time into the denominator and it is generally the most useful measure of disease frequency, often expressed as new events per person-year or per 1000 person-years. For a variety of reasons, including loss to follow-up, individuals in the denominator may not be followed for the full time period specified, so that different individuals are observed for different lengths of time. In such situations, an incidence rate uses a denominator consisting of the sum of the different times each individual was at risk expressed as person-years.
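The person-time calculation described above can be sketched in Python as follows. The cohort data are hypothetical and the function name `incidence_rate` is an assumption; each person contributes only the time during which he or she was observed and still at risk.

```python
def incidence_rate(new_cases, person_years_at_risk, per=1000):
    """Incidence density: new cases divided by summed person-time at risk.

    person_years_at_risk holds one entry per person: the years that person
    was observed while still at risk (up to disease onset, loss to
    follow-up, or end of study, whichever came first).
    """
    total_person_years = sum(person_years_at_risk)
    if total_person_years <= 0:
        raise ValueError("person-time at risk must be positive")
    return new_cases / total_person_years * per


# Hypothetical cohort of six people followed for up to two years.
# Two developed the disease (after 1.5 and 0.5 years); one was lost
# to follow-up after 1.0 year; three completed the full two years.
follow_up_years = [2.0, 2.0, 1.5, 0.5, 2.0, 1.0]  # sums to 9.0 person-years
rate = incidence_rate(new_cases=2, person_years_at_risk=follow_up_years)
print(f"{rate:.1f} new cases per 1000 person-years")
```

With 2 new cases over 9.0 person-years, the sketch reports about 222.2 new cases per 1000 person-years.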

Cumulative incidence or incidence proportion is the number of new cases of a disease that occur in a population at risk for developing the disease during a specified period of time (during which all of the individuals are at risk). It is the proportion of people who develop new disease in a specific period of time.
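As a minimal sketch, cumulative incidence is a simple proportion. The function name `cumulative_incidence` and the figures below are hypothetical; the sketch assumes everyone counted in the denominator was free of the disease at the start and was followed for the whole period.

```python
def cumulative_incidence(new_cases, at_risk_at_start):
    """Proportion of an initially disease-free group that develops the disease."""
    if at_risk_at_start <= 0:
        raise ValueError("population at risk must be positive")
    if not 0 <= new_cases <= at_risk_at_start:
        raise ValueError("new cases must lie between 0 and the population at risk")
    return new_cases / at_risk_at_start


# Hypothetical: 1000 disease-free people followed for 5 years; 30 develop disease.
risk_5y = cumulative_incidence(new_cases=30, at_risk_at_start=1000)
print(f"5-year cumulative incidence: {risk_5y:.1%}")
```

Because the result is a proportion, it has no meaning unless the period (here five years) is stated alongside it.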

Note that the expression rate in biology or physics generally refers to time in terms of how fast a process occurs and usually is expressed per instant of time. In epidemiology, rate is often loosely used to express a proportion and to provide a denominator of the population in which new cases are occurring during a period of time, as in cumulative incidence rates above. Use of rate as in incidence density is closer to its more general meaning of how fast new cases are arising and can indeed be expressed as per instant of time, often referred to as the force of mortality or morbidity.
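The two usages can be connected formally. As a sketch under stated assumptions (a closed population, no competing causes of death, and, in the second expression, a rate $\lambda$ that is constant over the period), with $T$ denoting the time of disease onset:

```latex
\lambda(t) = \lim_{\Delta t \to 0}
  \frac{\Pr(t \le T < t + \Delta t \mid T \ge t)}{\Delta t},
\qquad
\mathrm{CI}(t) = 1 - e^{-\lambda t} \approx \lambda t
  \quad \text{when } \lambda t \ll 1 .
```

For example, a constant rate of 0.01 per person-year over 5 years gives a cumulative incidence of $1 - e^{-0.05} \approx 0.0488$, close to the simple product 0.05. The symbols $\lambda$, $T$, and $\mathrm{CI}$ are notation introduced here for illustration.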

Occasionally, time may be implicitly rather than explicitly specified. In acute epidemics of very limited duration such as most food poisoning outbreaks, most cases occur within a few hours or days after exposure. Cases that may develop months later are not considered part of the same outbreak. But in most situations in which current knowledge of the biology and natural history of the disease does not clearly define a time frame, time must be stated explicitly.

Prevalence is a measure of present status (the number of people who currently have the disease) rather than of newly occurring disease. It measures the proportion of people who have a defined disease at a specific time. It is thus a composite measure made up of two factors: the incidence of the disease in the past, and the duration for which those cases persist to the present or to some specified point in time. That is, in a stable population prevalence approximately equals the incidence rate of the disease times the average duration of the disease. For most chronic diseases prevalence rates are more commonly available than are incidence rates.
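The relation between prevalence, incidence, and duration can be sketched numerically. The Python example below assumes a steady state (constant incidence and duration, stable population); under that assumption the prevalence odds equal incidence times duration, and the simple product is a close approximation when prevalence is low. The function name and the figures are hypothetical.

```python
def steady_state_prevalence(incidence_per_person_year, mean_duration_years):
    """Prevalence proportion implied by incidence and duration at steady state.

    Uses the relation: prevalence odds = incidence rate x mean duration.
    """
    odds = incidence_per_person_year * mean_duration_years
    return odds / (1 + odds)


# Hypothetical chronic disease: 2 new cases per 1000 person-years,
# lasting 10 years on average.
incidence = 0.002
duration = 10
exact = steady_state_prevalence(incidence, duration)
approximate = incidence * duration  # the simple product
print(f"steady-state prevalence: {exact:.4f}; simple product: {approximate:.4f}")
```

Here the two figures differ only in the third decimal place; the gap widens for common diseases of long duration, where the simple product overstates prevalence.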

3.3. Severity of Disease

Measures of disease frequency are central for epidemiological investigations concerning etiology. To understand the importance of a disease in a population, however, it is necessary to consider not only the frequency of the disease but also its severity as indicated by the extent of disability and premature mortality that it causes. Premature mortality is defined as death before the expectation of life at the age of death, had the disease not occurred.

3.3.1. Mortality

Traditionally, mortality has been the most important indicator of the health status of the population. John Graunt produced the first known systematic analysis of mortality data, drawing on the London Bills of Mortality, in the mid-1600s. He described the age pattern of deaths, categorized them by "cause" as understood at the time, and demonstrated variation from place to place and from year to year. Mortality rates according to age, sex, place, and cause continue to provide central information about a population’s health status and a crucial input for understanding and measuring the burden of disease; infant and maternal mortality rates are leading indicators for assessing the health situation in any region or country. There is a considerable literature on the use of mortality to indicate health status and on its application at national and sub-national levels, and paradigms such as the demographic transition are based largely on the decline of mortality in the under-fives[4].

Both the fact of death by age, sex, and place, which is required by law in most countries through death certification, and cause of death, as required by law in many countries through death registration, provide essential information. Although death is a cardinal event and generally the most widely available kind of health information, in many poor countries the fact of death, let alone the cause, frequently is still not reliably available.

In technically advanced countries, vital statistics (that is, the recording of births and deaths, usually by age, sex, and place) are routinely collected and highly reliable. In most middle-level countries, their reliability and completeness have been steadily improving and are often fairly satisfactory. In the least developed countries, as in most of Africa, however, collection of vital statistics, though improving in many, remains highly incomplete. Even so, in some of these countries the increasing use of household survey methods provides estimates of, at least, under-five mortality. However, information about cause of death remains poor even in most middle-level countries; most of it depends upon special surveys or studies of select populations under specific circumstances. Verbal autopsies have been used increasingly for judging the likely cause of death in under-five children. Diagnostic accuracy is satisfactory for selected causes of death with distinctive symptoms, such as neonatal tetanus and severe diarrhea, but sensitivity and specificity are limited for diseases such as malaria, whose symptoms are variable and non-specific.
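Sensitivity and specificity of a verbal autopsy are judged against a reference diagnosis. The Python sketch below shows the arithmetic on a hypothetical validation table; the counts are invented for illustration and do not come from any study.

```python
def sensitivity_specificity(true_pos, false_pos, false_neg, true_neg):
    """Sensitivity and specificity from a 2x2 validation table.

    Sensitivity: share of deaths truly due to the cause that the verbal
    autopsy assigns to it. Specificity: share of deaths from other causes
    that the verbal autopsy correctly does not assign to it.
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical validation of verbal autopsy for one cause of death
# against a hospital reference diagnosis (200 deaths in total).
sens, spec = sensitivity_specificity(true_pos=45, false_pos=10,
                                     false_neg=5, true_neg=140)
print(f"sensitivity {sens:.2f}, specificity {spec:.2f}")
```

For a cause with variable, non-specific symptoms, both figures would be expected to be lower than in this invented example.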

Age-specific mortality profiles are a prerequisite for a burden of disease analysis. Although considerable documentation and analysis have been done on child mortality in developing countries, little information is available for adult mortality[5]. Developing countries have higher rates of adult mortality than do the technically advanced nations; mortality rates are higher for both women and men at every age and from almost every cause. (In Africa the enormous increase in AIDS deaths in young and middle-aged women and men is not yet reflected in these analyses.)

Mortality can be expressed in two important quantitative measures. The mortality rate is a form of incidence and is expressed as the number of deaths per person-time in a defined population in a defined time period. The numerator can be total deaths, age- or sex-specific deaths, or cause-specific deaths, and the denominator is the person-time of those in the stated category, as defined earlier for incidence. The case fatality ratio (CFR) is the proportion of those with a given disease who die of that disease (at any time unless specified). The mortality rate is equal to the CFR times the incidence rate of the disease in the population.
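The link between the two measures can be sketched in a few lines of Python. The function names and figures are hypothetical, and the product holds only when incidence and case fatality refer to the same disease, population, and period.

```python
def case_fatality_ratio(deaths_from_disease, cases_of_disease):
    """Proportion of those with the disease who die of it."""
    if cases_of_disease <= 0:
        raise ValueError("number of cases must be positive")
    return deaths_from_disease / cases_of_disease


def cause_specific_mortality_rate(incidence_rate, cfr):
    """Mortality rate from a disease, in the same units as the incidence rate."""
    return incidence_rate * cfr


# Hypothetical disease: 50 new cases per 1000 person-years,
# of whom 2 in every 100 die of the disease.
cfr = case_fatality_ratio(deaths_from_disease=2, cases_of_disease=100)
mortality = cause_specific_mortality_rate(incidence_rate=50, cfr=cfr)
print(f"CFR {cfr:.2f}; mortality {mortality:.1f} deaths per 1000 person-years")
```

A common disease with a low CFR and a rare disease with a high CFR can therefore produce the same mortality rate.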

3.3.2. Morbidity

Measures of mortality have been the principal indicators of health status of populations for a long time. Death is clearly defined and recognized; it has been recorded in most literate populations for generations and has a long history of use. The problem with mortality-based indicators, however, is that they "note the dead and ignore the living"[6].

Measurements of the extent of morbidity are much more problematic than either frequency or mortality since there is no clearly defined endpoint, such as death provides. There are many forms of morbidity—both non-specific such as fatigue, malaise, nausea, discomfort, and fear and specific symptoms such as vomiting and diarrhea, cough and shortness of breath, pain of many levels, and loss of function. How to measure such diverse consequences of disease to achieve some level of comparability has been a challenge (see Composite Measures Combining Morbidity and Mortality).

Hospital inpatient discharge records when based upon good clinical evidence and coded by well-trained staff can provide high quality data on the major causes of morbidity serious enough to require hospitalization. Further, they provide good cause-of-death data for those hospitalized, and some sense of outcome status of those with serious conditions. Hospital data are generally improving in quality, especially in mid-level income countries and in selected sentinel, usually tertiary care, teaching hospitals in some poor countries. Such information is highly biased because of the highly skewed distribution of those using such hospitals, but it is possible to have a good understanding of those biases and make appropriate adjustments in order to draw useful conclusions.

Generally, outpatient records in most of the world are highly deficient in terms of diagnosis and often provide only patients’ chief complaint and treatment dispensed. The main value of most such records is limited to establishing the fact of using a facility. There are usually very strong biases in the use of outpatient facilities, including access factors (distance and cost of use), nature and severity of the disease problem, and opportunity for alternate services.

Visits to health care facilities, functional disability (measures of activity less than usual) and time spent away from work (absenteeism, work days lost) are used to assess the magnitude of morbidity from various conditions. A common approach to evaluating morbidity in a population has been the assessment of the impact on social roles or functional performance[6] such as days missed from work or spent in bed.

Data about morbidity presented in the literature are often based on self-perceived or observed assessments, and frequently from survey-based, interview information. The perception of morbidity, its reporting, the observation of morbidity, its impact, and other factors are responsible for the very wide variation between reported and measured prevalence of conditions[7]. The variation in morbidity data often has been interpreted to indicate that wealthy individuals and low mortality populations tend to report higher rates of morbidity[8].

The International Classification of Impairments, Disabilities, and Handicaps (ICIDH) was developed by WHO (1975) to classify non-fatal health outcomes. This assessment was based on a progression from disease to handicap and is analogous to the ICD series. ICIDH categories include impairment (loss or abnormality of psychological, physiological, or anatomical structure or function); disability (restriction or lack of ability to perform an activity considered normal); and handicap (disadvantage from a disability or impairment, for a given individual, based on the inability to fulfill a normal role as defined by age, sex, or sociocultural factors). These distinctions clarify more than just processes, and help define the contribution of medical services, rehabilitation facilities, and social welfare to the reduction of disease.

Monitoring consequences of diseases, evaluation of utilization of services, and standardization of a classification and indexing system were originally conceived as the main objectives of the ICIDH. Since its creation the ICIDH classification has also been used to generate indicators of disability such as impairment-free, disability-free, and handicap-free life expectancies, as mentioned in the next section[9]. These in turn have been used to estimate "health expectancies," analogous to life expectancy, using severity and preference weights for time spent in states of less than perfect health. General measures of disability without regard to cause, such as the health expectancy measure (often obtained through special surveys), are useful for determining the proportion of the population that is disabled and unable to carry out normal activities. However, these are of little use as a common denominator for resource allocation and intervention planning.

3.3.3. Composite Measures Combining Morbidity and Mortality

Composite measures that combine mortality and morbidity into one number for measuring the burden of disease have come into increasing use since the early 1990s primarily to provide a common denominator for making comparisons of disease burdens among different diseases and in different populations. They have the potential for becoming a particularly useful tool for assisting in resource allocation and policy decisions.

In most sectors, decisions on resource allocation are based on perceived value for money, but the health sector has had no coherent basis for determining the comparative value of different health outcomes. If decisions are to be made about whether to put money into programs that reduce mortality in under-fives as compared to those that reduce disabling conditions in adults, a common denominator is needed.

Since 1980, and especially in the 1990s, work has been carried out to develop composite indicators combining morbidity and mortality into a single measure that may serve as a common denominator for such purposes. Such an indicator represents the amount of healthy lifetime lost due to a disease from both disability and premature death. The common unit of measure is time lost from healthy life. For a given health budget one would want to see maximum healthy life gain per unit expenditure. An important tool for achieving improved efficiency in health spending would be one that provided a measure of cost effectiveness of health interventions or the ratio of costs to health benefits. Money can then be allocated to the interventions that produce the largest gain in healthy life for the dollar amount spent for that intervention.
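The allocation logic described above can be sketched as a ranking by cost per healthy life-year gained. The interventions, costs, and gains in the Python example below are entirely hypothetical, and the function name `rank_by_cost_effectiveness` is an assumption introduced for illustration.

```python
# Hypothetical interventions: total cost in dollars and healthy
# life-years gained in the population served.
interventions = [
    {"name": "Intervention A", "cost": 100_000, "healthy_years_gained": 500},
    {"name": "Intervention B", "cost": 60_000, "healthy_years_gained": 120},
    {"name": "Intervention C", "cost": 40_000, "healthy_years_gained": 400},
]


def cost_per_healthy_year(option):
    """Cost-effectiveness ratio: dollars spent per healthy life-year gained."""
    return option["cost"] / option["healthy_years_gained"]


def rank_by_cost_effectiveness(options):
    """Order options from lowest to highest cost per healthy life-year."""
    return sorted(options, key=cost_per_healthy_year)


ranked = rank_by_cost_effectiveness(interventions)
for option in ranked:
    print(f'{option["name"]}: ${cost_per_healthy_year(option):.0f} per healthy life-year')
```

In this invented example Intervention C ranks first ($100 per healthy life-year), followed by A ($200) and B ($500). A real allocation would also have to weigh equity, feasibility, and the uncertainty of the underlying estimates.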

In recent years, a number of approaches to measuring health status using composite indicators have been developed[10–13]. Such burden of disease methods and indicators can assist in:

The most important reason for attempting to capture the complex mix of incommensurate consequences resulting from disease into a single number, however, is the need to weigh the benefits of health interventions against their costs. Costs of health programs are expressed in a uni-dimensional measure, namely dollars; therefore the benefits to be achieved from their expenditure must also be so expressed. Healthy lifetime is a uni-dimensional measure that can be used to compress health benefits and losses into the single time dimension. An explicit, objective, quantitative approach should enable better budgetary decisions and permit resource allocation in the health sector to be undertaken in a more effective and equitable fashion.

A composite indicator is simply a tool for decision makers. Like any tool, it can be misused. Conclusions that are reached on the basis of the use of these indicators must be carefully examined and looked at from all viewpoints. Not only are there problems of trying to put so many dimensions together, which inevitably leads to distortion, there are also serious issues concerning reliability and validity of information upon which these are based. All the problems in determining cause of death, counting the number of diseases, and assessing the extent of disability can lead to great uncertainties when they are added and multiplied together. The development of a single indicator with a specific number provides very deceptive substantiality to what may be made up of very fragile data. Thus continuing vigilance in data obtained, compiled, and used is critical, and those responsible for using the tool must have a very clear technical understanding of what is behind the numbers, what are the underlying assumptions, and what are the limitations of these approaches. But with all these caveats, alternative approaches to improved decision making leave even more to be desired.

If all the various forms of disability are to be compared with mortality, they must be measured in an equivalent manner for use in health assessments. To do so, measurement of disability must quantify the duration and severity of this complex phenomenon. A defined process is needed that rates the severity of disability as compared to mortality, measures the duration of time spent in a disabled state, and converts disability from various causes into a common scale.

Three components of morbidity need to be assessed. The first is the case disability ratio (CDR), that is, the proportion of those diagnosed with the disease who have disability; it is analogous to the CFR. For most diseases that are diagnosed clinically, the CDR will be one (1.00). However, when the diagnosis is based upon infection rather than disease, as with tuberculosis, or upon a genetic marker rather than the physical manifestation, as with sickle trait, the CDR is likely to be less than one.

The second component is the extent of disability (i.e. how incapacitated the person is as a result of the disease). The extent of disability is expressed from zero, which means no disability, to one (1.00), which is equivalent to death. The assessment of extent is the one component that is subjective, particularly since there are so many different types and dimensions of disability. A number of methods have been tried in efforts to achieve comparability and obtain consensus concerning the measurement of the severity component. The many methods have generated considerable controversy and there is a large and growing literature[6, 8, 14]. For most conditions, a reasonable degree of consensus can be reached within broad categories (e.g. 25% disabled as compared to 50%), but efforts to go to much finer distinctions have been equivocal. Except for high prevalence, chronic conditions, there may be little need to become more refined for purposes of health program decisions.

The third component is the duration of the disease. The duration is generally counted from onset until cure and recovery or death. Sometimes there is continuing permanent disability after the acute phase is completed and thus the duration would be life expectation from the time of onset.
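One way these components might be combined with mortality into time lost from healthy life is sketched below in Python. The additive rule used here (loss from death plus loss from disability, per case) is an assumption made for illustration, not a formula stated in this section; it ignores discounting, age weighting, and disability experienced before death by those who die. All figures are hypothetical.

```python
def healthy_years_lost_per_case(cfr, years_lost_per_death,
                                cdr, extent, duration_years):
    """Illustrative healthy life lost per case of disease (assumed additive rule).

    cfr: case fatality ratio (0 to 1)
    years_lost_per_death: expected years of life lost by those who die
    cdr: case disability ratio (0 to 1)
    extent: extent of disability, 0 (none) to 1 (equivalent to death)
    duration_years: duration of the disabled state
    """
    for name, value in (("cfr", cfr), ("cdr", cdr), ("extent", extent)):
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must lie between 0 and 1")
    lost_to_death = cfr * years_lost_per_death
    lost_to_disability = cdr * extent * duration_years
    return lost_to_death + lost_to_disability


# Hypothetical disease: 10% of cases die, losing 30 years each; every case
# is disabled (CDR = 1.00) at 25% extent for 2 years.
per_case = healthy_years_lost_per_case(cfr=0.10, years_lost_per_death=30,
                                       cdr=1.00, extent=0.25, duration_years=2)

# Scale by a hypothetical incidence of 2 new cases per 1000 people per year.
per_1000_population = 2 * per_case
print(f"{per_case:.1f} healthy years lost per case; "
      f"{per_1000_population:.1f} per 1000 population per year")
```

In this invented example each case loses 3.5 healthy years (3.0 from premature death and 0.5 from disability), or 7.0 healthy years per 1000 population per year.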

Measures of health status that combine mortality and morbidity (composite indicators) are useful for comparisons within and across populations. They can estimate the quantitative health benefits from interventions and serve as tools to assist in the allocation of resources. The development of such measures entails two major processes[15]: the measurement of life including losses of time from premature mortality and disability; and the valuing of life, which incorporates issues of duration, age, extent of future life, productivity, dependency, and equity. The purpose of developing such measures and the need for refining them becomes clear if the following objectives are to be achieved:


4. Descriptive Epidemiology

©Copyright 2004 Eolss Publishers. All rights reserved. SAMPLE CHAPTERS