7.6 Phylogenetic diversity and nomenclature for rubella genotypes

Mick Mulders


In 2005, a meeting of experts established a standardized nomenclature for wild-type rubella viruses [20]. The meeting and the subsequent publication also established the sequencing window within the envelope glycoprotein E1 coding region (739 nucleotides, E-739) used to determine rubella genotypes. Updates to the nomenclature were reported in 2007 and 2013 [21, 22]. The original two clades remain the major phylogenetic groups of rubella viruses. The clades differ by 8-10% in the E-739 window and are represented by a numeral (1 or 2) that precedes the letter assigned to each genotype within the clade. Uppercase letters are used for recognized genotypes that meet the specified criteria (section 7.9). Genotypes that have been identified but lack well-characterized progenitor wild-type viruses are designated with a lowercase letter.
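The labelling convention above can be illustrated with a short Python sketch. It is not part of the WHO nomenclature; the function name describe_genotype and the returned field names are hypothetical, and the pattern simply assumes a clade numeral (1 or 2) followed by a single letter, as described in the text.

```python
import re

# Assumed pattern, taken from the description above:
# a clade numeral (1 or 2) followed by exactly one letter.
_GENOTYPE_LABEL = re.compile(r"([12])([A-Za-z])")


def describe_genotype(label):
    """Split a genotype label such as '1E' or '1a' into its parts.

    Uppercase letter  -> recognized genotype.
    Lowercase letter  -> genotype lacking well-characterized
                         progenitor wild-type viruses.
    """
    match = _GENOTYPE_LABEL.fullmatch(label)
    if match is None:
        raise ValueError(f"not a rubella genotype label: {label!r}")
    clade, letter = match.groups()
    return {
        "clade": int(clade),
        "letter": letter,
        "provisional": letter.islower(),
    }
```

For example, describe_genotype("1a") reports clade 1 with the provisional flag set, while describe_genotype("2B") reports clade 2 with the flag cleared.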

Less genetic information is available for rubella genotypes than for measles genotypes, and many countries have not made significant progress in describing circulating rubella viruses. As of July 2017, there are 13 rubella genotypes: 10 in Clade 1 (1a, 1B-1J) and three in Clade 2 (2A, 2B, 2C). Clade 1 includes the provisional genotype 1a, which contains vaccine strains from the 1960s. However, not all of the strains assigned to genotype 1a cluster together, so the phylogeny of this group is unclear. Whole genome analysis of 30 rubella sequences from 8 of the 13 known genotypes generated a maximum likelihood phylogenetic tree with high bootstrap values for all genotypes except the provisional genotype 1a.

Provisional genotype 1a is highly diverse and was designated as provisional in part because of the historical importance of this group of viruses (for example, the RA27/3 vaccine strain is included among the 1a viruses). Vaccine viruses are not wild-type viruses but are included as reference sequences so that vaccine viruses can be identified rapidly. Few viruses in this genotype have been found since 2004, and genotype 1a remains provisional. The four reference viruses in this genotype often cluster in multiple groups, but these groups are distinct from the other genotypes.

Clade 2 includes genotypes 2A, 2B, and 2C. Genotype 2A includes vaccine strains manufactured in China. All of the rubella genotypes have at least two reference strains. Genotypes 1B, 1C, 1G, and 2B have three reference strains and the provisional genotype 1a has four reference strains. Only four of the genotypes (1E, 1G, 1J, 2B) are commonly detected and reported. Of these four active genotypes, 1E and 2B are the most frequently detected and have a wide geographic distribution.

Four of the rubella genotypes, 1D, 1F, 1I, and 2A, have not been reported in circulation for over 10 years: the last report of 1D was in 1996, 1F was reported in 2002, 1I was reported in 1994 and wild-type 2A was reported in 1980 (although 2A vaccine-derived viruses have been reported more recently). Thus, these four genotypes are considered inactive and are probably extinct. However, considerable gaps in molecular surveillance still exist globally.
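As a compact summary of the genotype status described above, the following Python sketch tabulates the 13 genotypes as of July 2017. The names GENOTYPES, COMMONLY_DETECTED, LAST_REPORTED and genotype_status are hypothetical, and the "other" category is an assumption for genotypes that the text lists as neither commonly detected nor inactive.

```python
# Genotypes as of July 2017: 1a and 1B-1J in Clade 1; 2A-2C in Clade 2.
GENOTYPES = ["1a"] + ["1" + letter for letter in "BCDEFGHIJ"] + ["2A", "2B", "2C"]

# Genotypes described in the text as commonly detected and reported.
COMMONLY_DETECTED = {"1E", "1G", "1J", "2B"}

# Year of the last report in circulation for the inactive genotypes
# (for 2A, the year refers to wild-type virus only).
LAST_REPORTED = {"1D": 1996, "1F": 2002, "1I": 1994, "2A": 1980}


def genotype_status(genotype):
    """Return 'inactive', 'commonly detected' or 'other' for a genotype."""
    if genotype not in GENOTYPES:
        raise ValueError(f"unknown rubella genotype: {genotype!r}")
    if genotype in LAST_REPORTED:
        return "inactive"
    if genotype in COMMONLY_DETECTED:
        return "commonly detected"
    # Assumption: the text does not classify the remaining genotypes.
    return "other"
```

Because gaps in molecular surveillance remain, a table like this reflects what has been reported rather than what is necessarily circulating.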

The nomenclature for rubella genotypes is similar to that for measles genotypes [20-22]. Rubella virus RNA sequences can be generated from either viral isolates or RNA extracted directly from clinical material. Sequences are designated as either:

  • RVi: sequence derived from RNA extracted from a rubella virus isolate
  • RVs: sequence derived from RNA extracted directly from clinical material

The geographic association, dates, and numbering of additional isolates from the same location, year and epidemiologic week follow the measles nomenclature. For example, RVs/Hong Kong.CHN/20.12/2[1E] designates a genotype 1E rubella virus sequence derived directly from a clinical sample collected in Hong Kong, China, in the 20th week of 2012; it is the second sequence reported from that week and location. In WHO names for rubella sequences, a slash follows the year. When there is more than one sequence from the same location and week, the sequence number follows that slash, and no slash is placed between the sequence number and the genotype.

There are special designations for sequences derived from cases with congenital rubella syndrome (CRS) and from newborns with congenital rubella infection (CRI). In both instances the geographic location specified in the name is the place of birth and the onset date of disease is the date of birth. For example, RVi/Ho Chi Minh.VNM/41.11/[2B] (CRS) designates a genotype 2B rubella virus isolate from a CRS patient in Ho Chi Minh, Vietnam, in the 41st week of 2011.

Although reporting of rubella vaccine strains to RubeNS is not encouraged, reference strains from vaccines are designated by adding “VAC” to their names. For example, the BRD2 vaccine strain is written as RVi/Beijing.CHN/80[2A]VAC.
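
The naming rules in this section can be sketched as a small Python parser. This is an illustration built only from the three example names given above, not an official WHO or RubeNS validator. The function name parse_rubella_name and the field names are hypothetical, and the "(CRI)" suffix is an assumption made by analogy with "(CRS)", since the text shows only the CRS form. The two-digit year is returned as written because the names do not state the century.

```python
import re

# Pattern inferred from the examples in this section (an assumption,
# not an official specification).
_NAME = re.compile(
    r"^(?P<source>RVi|RVs)/"                    # isolate or clinical material
    r"(?P<place>[^/]+)\.(?P<country>[A-Z]{3})/"  # e.g. Hong Kong.CHN
    r"(?:(?P<week>\d{1,2})\.)?(?P<year>\d{2})"   # week.year, or year only
    r"(?:/(?P<seq>\d+)?)?"                       # slash, optional sequence number
    r"\[(?P<genotype>[12][A-Za-z])\]"            # genotype in square brackets
    r"\s*(?P<tag>\(CRS\)|\(CRI\)|VAC)?$"         # optional designation
)


def parse_rubella_name(name):
    """Split a rubella sequence name into its components."""
    match = _NAME.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized rubella sequence name: {name!r}")
    tag = match.group("tag")
    return {
        "source": "isolate" if match.group("source") == "RVi" else "clinical material",
        "place": match.group("place"),
        "country": match.group("country"),
        "week": int(match.group("week")) if match.group("week") else None,
        "year": match.group("year"),  # two digits, as written in the name
        "sequence_number": int(match.group("seq")) if match.group("seq") else None,
        "genotype": match.group("genotype"),
        "tag": tag.strip("()") if tag else None,
    }
```

Applied to RVs/Hong Kong.CHN/20.12/2[1E], the sketch yields week 20, year "12", sequence number 2 and genotype 1E; applied to RVi/Beijing.CHN/80[2A]VAC, it yields no week, year "80", genotype 2A and the tag VAC.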