3.2 ATTRIBUTES TO CONSIDER IN SELECTING SECONDARY DATA SOURCES

When secondary sources supply the data required to develop the indicators to be monitored, the characteristics of those data must be considered before the sources are selected. The relevant attributes for selecting secondary data sources to generate population-based health indicators are described below. These attributes can and should be evaluated in the context of the purpose for which the data will be used. The selection of secondary data sources should also take into account the advantages and disadvantages outlined in section 3.1. The attributes are:

  • POPULATION REPRESENTATIVENESS: Representativeness is an attribute that involves the absence of selection bias with respect to the population that the indicator is intended to represent. Non-representative samples (such as convenience samples or samples based on sentinel units), samples with high rates of non-response, and data affected by underreporting in information systems are examples of factors that can compromise the representativeness of a data source. For example, a country's live birth information system is a universal system, because it is supposed to include all children born alive in all types of facilities or birthing sites. However, it is known that births in conditions of greater vulnerability (poorer regions, rural areas, areas with lack of housing, indigenous ethnicity, among other factors) might not be reported to the system. In such a case, those population groups are underrepresented and the source is biased with respect to them. Similarly, research on victims of violence based on sentinel unit samples (reference health services serving such victims) might not be representative of the population. One reason is that this type of sampling systematically excludes victims with less severe injuries, as well as victims with fatal injuries who were not treated in a health facility.
  • PERIODICITY: Data can be compiled continuously in systems such as civil registries, cancer registries, and surveillance systems for reportable diseases. Data can also be compiled periodically, that is, at regular intervals (for example, a 10-year population census or a triennial survey of schoolchildren), or without predefined periodicity, at a particular point in time (for example, health surveys on specific subjects or academic research projects). Although one-off health-related studies are recognized as useful sources of information for developing specific indicators, their usefulness for monitoring long-term indicators is limited. Nevertheless, a combination of several such studies can serve to indicate trends, even if the studies are not ideally comparable in their methods. One example is research on the prevalence of smoking conducted with different methodologies and target populations. Such research can still show the general direction of a trend, provided that the relevant limitations are taken into account.
  • VALIDITY: This refers to the ability of the source to measure what is intended to be measured (absence of distortions, bias, or systematic errors). The most relevant biases are those related to selection of the study population and the quality of the information compiled. The data source should include the variables needed to develop the indicator. An example is a live birth information system that includes data on congenital malformations (including microcephaly). In general, observations at birth without supplementary examination and monitoring of the children tend to underestimate the prevalence of congenital malformations. Although the system may be quite valid as a database for a series of other indicators, it is not a valid source for estimating the prevalence of congenital malformations in children.
  • TIMELINESS: The timeliness of the source involves the availability and reliability of the data at the time they are needed to construct the indicators. Indicators produced in a timely manner therefore provide better opportunities for making health-related decisions.
  • STRATIFICATION: Many health-related problems require indicators that are stratified according to population subgroups or by areas of particular interest. Multiple analytical interpretations can be derived from the level of disaggregation available in the selected data source. These considerations can significantly expand or limit the use of the indicator for decision-making.
  • SUSTAINABILITY: This attribute represents the source's potential to remain relevant and be of the quality needed to generate information over time. This depends not only on the periodicity of the data collection, but also on the availability of the financial resources needed to sustain that particular source of data, the presence of a legal framework, and political will, among other factors. Surveys conducted by telephone tend to be more sustainable because they require fewer resources. However, telephone surveys have limitations not found in surveys based on personal interviews and biometrical measurements.
  • PRECISION: Even well-designed probabilistic samples that ensure representativeness carry some degree of imprecision, a factor to be considered in any sample-based indicator. This imprecision is usually expressed through confidence intervals, which inform the user (usually with 95% confidence) of the range of plausible values of the indicator in the population from which the sample was taken. Indicators developed from census-type sources, such as population censuses, universal data sources, and vital statistics information systems, are free of sampling imprecision, although they remain subject to other errors such as underreporting.
  • ACCESS TO DATA: This refers to ensuring the availability of data to the public through national data repositories and other means.
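
To make the PRECISION attribute concrete, the following is a minimal sketch of how a 95% confidence interval might be computed for a prevalence estimated from a simple random sample. It uses the normal (Wald) approximation, which is only one of several possible methods and ignores complex survey designs (weights, clustering, stratification). The function name `prevalence_ci` and the survey figures are hypothetical and chosen only for illustration.

```python
import math


def prevalence_ci(cases, sample_size, z=1.96):
    """Return (prevalence, lower, upper) using the normal (Wald) approximation.

    Assumes a simple random sample; z=1.96 corresponds to about 95% confidence.
    """
    if sample_size <= 0 or not 0 <= cases <= sample_size:
        raise ValueError("cases must be between 0 and a positive sample_size")
    p = cases / sample_size
    # Standard error of a proportion under simple random sampling.
    se = math.sqrt(p * (1 - p) / sample_size)
    # Clamp the interval to the valid range of a proportion.
    lower = max(0.0, p - z * se)
    upper = min(1.0, p + z * se)
    return p, lower, upper


# Hypothetical survey: 180 smokers among 1,000 respondents.
p, lower, upper = prevalence_ci(180, 1000)
print(f"Prevalence: {p:.1%} (95% CI: {lower:.1%} to {upper:.1%})")

# Same prevalence observed in a sample ten times larger: the interval narrows.
p_big, lower_big, upper_big = prevalence_ci(1800, 10000)
print(f"Prevalence: {p_big:.1%} (95% CI: {lower_big:.1%} to {upper_big:.1%})")
```

The second call illustrates why sample size matters when a sample-based source is being considered: the point estimate is unchanged, but the interval around it is narrower. A census-type source has no sampling interval of this kind, although it can still be affected by errors such as underreporting.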