7.11 Methods and prospects for enhancing resolution of sequence data for molecular epidemiology

Mick Mulders


Sequence analysis of larger regions of the measles or rubella genome (extended sequencing), including whole genome sequencing (WGS), provides higher resolution for characterizing viral transmission pathways [23-27]. In elimination settings, this increased resolution is needed to distinguish local, ongoing transmission from repeated introductions of the same genotype.

Interpreting the data generated by extended sequencing requires an understanding of the expected rate of nucleotide substitution among epidemiologically linked viruses. The estimated rate would ideally be based on studies of viruses collected from a range of settings, from household transmission to outbreaks that follow an importation and continue for several generations. These studies are necessary for estimating a threshold of sequence variability that would predict independent chains of transmission between similar strains or, conversely, would support the existence of epidemiologic links between outbreaks or clusters of cases where the investigation failed to identify links [23-25]. It is important to note that this approach provides probabilistic, not definitive, evidence.
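The probabilistic nature of this reasoning can be illustrated with a minimal sketch. It assumes, purely for illustration, that substitutions accumulate as a Poisson process at a constant rate per site per year; the function names and every parameter value below are hypothetical placeholders, not estimates taken from the cited studies.

```python
import math


def expected_substitutions(rate_per_site_per_year, region_length_nt, years_elapsed):
    """Mean number of substitutions expected over the sequenced region.

    Assumes a constant substitution rate (a simplifying assumption).
    """
    return rate_per_site_per_year * region_length_nt * years_elapsed


def prob_at_most(k, mean):
    """Poisson probability of observing k or fewer substitutions."""
    return sum(math.exp(-mean) * mean ** i / math.factorial(i) for i in range(k + 1))


if __name__ == "__main__":
    # Hypothetical placeholder values chosen only to show the arithmetic.
    mean = expected_substitutions(
        rate_per_site_per_year=1e-3,  # placeholder, not a measured rate
        region_length_nt=1000,        # placeholder region length
        years_elapsed=0.5,            # placeholder time separating two samples
    )
    print(f"expected substitutions: {mean:.2f}")
    print(f"P(no differences): {prob_at_most(0, mean):.3f}")
    print(f"P(2 or fewer differences): {prob_at_most(2, mean):.3f}")
```

Under these assumptions, a given number of observed differences only makes a direct link more or less likely; it cannot confirm or exclude one, which is why thresholds need to be derived from well-characterized transmission chains.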

Both WGS and extended sequencing of hypervariable regions, such as the intergenic, or noncoding, region between the M and F genes (MF-NCR) of measles virus, have been examined [27]. Sequencing of the MF-NCR can be achieved using standard Sanger sequencing methods, while practical application of WGS will require the use of next-generation sequencing methods. It is possible to perform WGS on viral RNA extracted directly from patient specimens, but greater success would be achieved by sequencing virus isolates.

The variability observed in the MF-NCR is roughly equivalent to the diversity present in the corresponding whole measles genome sequence [27]. Therefore, separate transmission pathways may be apparent from analysis of MF-NCR sequences among measles viruses that have identical N-450 sequences. However, more studies are needed to determine the utility of extended sequencing and to establish the minimum number of nucleotide differences that can differentiate between separate chains of transmission.
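Counting nucleotide differences between aligned sequences can be sketched as follows. This is a minimal illustration that assumes the sequences have already been aligned to equal length; the function names are hypothetical, the example strings are toy data rather than real MF-NCR sequences, and no difference threshold is implied because none has yet been established.

```python
from itertools import combinations

UNAMBIGUOUS = set("ACGT")


def nucleotide_differences(seq_a, seq_b):
    """Count positions where two aligned sequences carry different bases.

    Positions with gaps or ambiguity codes in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    a_norm = seq_a.upper().replace("U", "T")
    b_norm = seq_b.upper().replace("U", "T")
    return sum(
        1
        for a, b in zip(a_norm, b_norm)
        if a in UNAMBIGUOUS and b in UNAMBIGUOUS and a != b
    )


def pairwise_differences(named_seqs):
    """Return {(name1, name2): difference count} for every pair of sequences."""
    return {
        (n1, n2): nucleotide_differences(s1, s2)
        for (n1, s1), (n2, s2) in combinations(named_seqs.items(), 2)
    }


if __name__ == "__main__":
    # Toy aligned strings used only to demonstrate the calculation.
    toy = {
        "case_A": "ACGTACGTAC",
        "case_B": "ACGTACGTAT",
        "case_C": "ACNTACGAAT",
    }
    for pair, n_diff in pairwise_differences(toy).items():
        print(pair, n_diff)
```

Skipping ambiguous positions avoids counting sequencing uncertainty as genetic change, at the cost of possibly undercounting true differences in low-quality reads.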

The goals for the Next generation, Extended window and Whole genome sequencing working group (N.E.W.) of the GMRLN include standardizing the methods for obtaining full genome sequences of measles and rubella viruses and MF-NCR sequences of measles virus for routine application in the GMRLN [27]. The MeaNS and RubeNS databases have been modified to accept deposits of extended or whole genome sequences and to allow searches for these sequences. The WGS database will be continually updated as additional whole genome sequences become available for the genotypes and named strains of measles and rubella.