7.2 Integration of measles molecular and epidemiological data

Mick Mulders

Virologic surveillance for measles virus has expanded through the efforts of the GMRLN laboratories, and the sequence data have been made available in the MeaNS database (http://www.who-measles.org). The aggregate data can be accessed to produce a global snapshot of the molecular epidemiology of measles. However, gaps in virologic surveillance, particularly in the African Region and the South-East Asia Region, leave open the possibility that a novel genotype, or a genotype that had been considered inactive or even extinct, will be detected.

Accurate interpretation of data obtained from molecular surveillance of measles depends on the availability of epidemiologic data for individual cases and outbreaks. Integrating the measles molecular surveillance data with the corresponding epidemiologic information has made it possible to track global patterns of circulating genotypes and to document the progress of programmes to eliminate transmission of endemic virus [1-12].

However, the progress made in decreasing global transmission of measles virus has reduced not only the number and diversity of circulating genotypes but also the diversity within genotypes. Phylogenetic analysis of the N-450 region may be insufficient to distinguish between very similar, but epidemiologically unrelated, viruses within a genotype. In situations that would benefit from the ability to discriminate between closely related viruses within a genotype, particularly in elimination settings, sequencing of additional regions of the viral genome may be required.

If, for example, a prolonged outbreak results from an imported case of measles, it may be difficult to demonstrate that a newly detected cluster of epidemiologically unrelated cases with the same genotype was the result of a new introduction of virus (repeated importation) from the same endemic source. The alternative hypothesis, that missed cases associated with the original outbreak were the source for the newly detected cases, would be difficult to rule out. Sequences that reveal genetic variation that supports the existence of separate transmission pathways among the similar viruses within a genotype would be required to distinguish between the two possible scenarios.

There is evidence to suggest that heterogeneity among virus strains within a genotype is greater in settings with long-term endemic circulation than among viruses from an outbreak that resulted from a single importation [23-27]. It is anticipated that the presence or absence of significant nucleotide variation observed by sequencing additional regions of the measles genome can help elucidate whether separate transmission pathways exist among contemporaneous outbreaks. Investigations are underway to optimize the methodologies for extended sequencing to provide increased resolution in genetic analyses. The methods and prospects for using extended window sequencing and whole genome sequencing to discriminate among genetically similar strains within a genotype for epidemiologic applications are described in section 7.11.
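As an illustration of the comparison described above, the following minimal Python sketch counts nucleotide differences between aligned sequences. It is not a GMRLN or MeaNS method. The sequences are short invented strings, not real measles data, and the function names are hypothetical. The text does not define what amounts to "significant" variation, so the sketch reports raw counts only and applies no threshold.

```python
from itertools import combinations


def nucleotide_differences(seq_a, seq_b):
    """Count positions where two aligned sequences differ.

    Positions holding a gap or an ambiguity code in either sequence
    are skipped, so only unambiguous A/C/G/T mismatches are counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid and a != b
    )


def pairwise_differences(sequences):
    """Return {(name_a, name_b): difference count} for every pair of sequences."""
    return {
        (a, b): nucleotide_differences(sequences[a], sequences[b])
        for a, b in combinations(sorted(sequences), 2)
    }


# Invented toy data: the short window is identical for both outbreaks,
# while the longer window exposes two differences.
short_window = {
    "outbreak_1": "ACGTACGTAC",
    "outbreak_2": "ACGTACGTAC",
}
extended_window = {
    "outbreak_1": "ACGTACGTACGGTTAACC",
    "outbreak_2": "ACGTACGTACGGTCAACT",
}

print(pairwise_differences(short_window))
print(pairwise_differences(extended_window))
```

In this toy case the two outbreaks are indistinguishable in the short window but differ at two positions in the extended window, which is the kind of added resolution that extended sequencing is expected to provide. Any interpretation of such counts still depends on the accompanying epidemiologic data.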

7.2.1 The named strain designation for circulating variants within a genotype

The 2015 WER update formalized the designation of “named strains” that describe phylogenetically similar strains observed within a genotype. The concept of a named strain provides a convenient means to identify sequences within a genotype that represent an epidemiologically significant viral lineage. A named strain offers an additional frame of reference within the recognized genotypes to describe and track lineages. To be eligible for designation as a named strain, a virus strain must have been identified in numerous outbreaks in several countries over a 2-year period. A named strain can be proposed by any of the authorized MeaNS submitters, but the sequences for the proposed named strain must be in the public database in MeaNS and be available through GenBank.
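The eligibility criterion above can be sketched as a simple check. This is a hedged illustration only: the text says "numerous outbreaks" and "several countries" without giving numbers, so the thresholds `min_outbreaks` and `min_countries` are assumptions chosen for the example. Reading "over a 2-year period" as a 730-day window is likewise an assumption. The `Detection` record and the function name are hypothetical and do not correspond to any MeaNS interface.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Detection:
    """One detection of the candidate strain (hypothetical record layout)."""
    outbreak_id: str
    country: str
    onset: date


def meets_named_strain_criteria(detections, min_outbreaks=3, min_countries=3,
                                window_days=730):
    """Check whether some window of `window_days` contains enough distinct
    outbreaks and countries. All three thresholds are assumed values."""
    ordered = sorted(detections, key=lambda d: d.onset)
    for i, first in enumerate(ordered):
        window_end = first.onset + timedelta(days=window_days)
        in_window = [d for d in ordered[i:] if d.onset <= window_end]
        outbreaks = {d.outbreak_id for d in in_window}
        countries = {d.country for d in in_window}
        if len(outbreaks) >= min_outbreaks and len(countries) >= min_countries:
            return True
    return False
```

A check of this kind covers only the outbreak, country and time criteria. The requirement that the sequences be in the public MeaNS database and available through GenBank would have to be verified separately.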

The name of the genotype/virus sequence that is used to represent the named strain is derived from the earliest strain within the lineage that is available through GenBank. However, no epidemiologic significance is implied for the representative strain. In addition, the source or location associated with the WHO name for the earliest strain is incidental and does not define the source of the lineage. As of July 2017, there were 7 named strains in genotype B3, 11 in genotype D4, 11 in genotype D8, 2 in genotype D9, and 6 in genotype H1. For a current listing of named strains, refer to the information available in MeaNS.