7.10 The rubella nucleotide surveillance (RubeNS) database

Mick Mulders


The rubella sequence database is operated, as with MeaNS, as a cooperative activity under the technical leadership of Public Health England (PHE), with a steering committee composed of representatives from the Global Specialized and Regional Reference Laboratories. The database and web application are maintained by PHE, and sequence data are contributed by GMRLN member laboratories or downloaded from GenBank.

The primary objective of the RubeNS database is to collect sequences from rubella genotypes consisting of the minimum fragment required for genotyping (E-739). However, the complete E1 coding region, the complete structural polyprotein, or the complete genome are increasingly useful for comparison with other sequences collected globally. Bioinformatics tools in RubeNS allow users to find identical or similar sequences and identify a preliminary genotype designation.

7.10.1 Submission of sequences to the RubeNS database

Rubella virus sequences are submitted to RubeNS, as to MeaNS, through a web interface. Although the website looks very different from MeaNS, it has the same basic functionality. The website is available only to GMRLN members, who need to register to use it. As with MeaNS, users are encouraged to submit all rubella sequences rather than only those they consider a representative subset. Only wild-type sequences, not vaccine sequences, should be submitted.

Rubella sequences of any length will be accepted into the RubeNS database, but analysis and genotype designation can only be carried out on the E-739 sequence or the complete E1 sequence. As is true for MeaNS, information can be submitted to RubeNS to generate a standard WHO name, or the user can supply the WHO name directly. Additional information, including more precise geographic information, can also be added. Guidelines and details of the steps for entering data into the appropriate fields for submission of sequences (new record) to RubeNS are available in Annex 7.6.

7.10.2 Sequence data and analysis tools available in RubeNS

Upon login to RubeNS, the HOME page is displayed with information about the database, and a series of tabs is available. USER EDIT allows users to change their details, including their password. The WHO Map tab displays the most recent WHO global map of rubella genotypes. SAMPLES and TOOLS are the main functional tabs in the database, and TANDC shows the terms and conditions for using the site.

The SAMPLES tab displays a list of all the samples within the database below a series of fields, many with drop-down boxes. These fields can be filled in as needed to create a set of filters that narrow the search. All the fields that allow free text accept wild-card characters, i.e. the asterisk (*) or question mark (?), which is useful if the exact spelling is not known.
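The effect of wild-card characters in a free-text filter can be illustrated offline with a minimal sketch. It assumes that the asterisk matches any run of characters and the question mark matches exactly one character, as in common shell-style patterns; the precise matching rules used by RubeNS are not described here. The function name `wildcard_filter` and the place names are hypothetical.

```python
from fnmatch import fnmatchcase


def wildcard_filter(values, pattern):
    """Return the values that match a shell-style wild-card pattern.

    Assumption: '*' matches any run of characters and '?' matches exactly
    one character. Matching is made case-insensitive by lower-casing both
    sides. Note that fnmatch also treats '[' and ']' as special characters.
    """
    lowered = pattern.lower()
    return [v for v in values if fnmatchcase(v.lower(), lowered)]


# Hypothetical free-text entries, e.g. alternative spellings of a city.
cities = ["Kyiv", "Kiev", "Kharkiv", "Kyoto"]

print(wildcard_filter(cities, "K??v"))  # ['Kyiv', 'Kiev']
print(wildcard_filter(cities, "ky*"))   # ['Kyiv', 'Kyoto']
```

The question mark is the better choice when only one or two letters are uncertain, whereas the asterisk is more useful when only the beginning or end of a name is known.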

For each sequence that satisfies the search criteria, the genotype, country, date of submission and region of the gene are displayed, along with other basic information. Clicking on the WHO name displays the sample record, including the sequence submitted. The tools for analysis of the sequence are provided along the foot of the page: EXACT MATCHING, BLAST MATCHING, GENOTYPING, PHYLOGENY and SUBMIT TO GENBANK. Similar tools can be found by clicking on the TOOLS tab, but that approach requires the sequence for analysis to be copied into the PASTE SEQUENCE box.

1. EXACT MATCHING tool. By selecting this option, a search is initiated for sequences identical to the input sequence. The search results are displayed in a table similar to the Samples listing. The results of the search can be downloaded for further analysis in Excel.

2. BLAST MATCHING tool. As the RubeNS database is much smaller than MeaNS, and rubella virus sequences are more variable, the EXACT MATCHING tool will often not find identical sequences and the BLAST MATCHING tool may be more informative. The BLAST MATCHING tool searches the RubeNS database and identifies identical and similar sequences, reporting the percent identity and length of overlap for each.

3. GENOTYPING tool. The genotyping tool uses position-specific scoring matrices (PSSM) to provide both the genotype corresponding to the input rubella sequence and the confidence of the analysis that generated the genotype. Almost all currently circulating rubella sequences will be assigned a genotype with high confidence (Z score >3). This is not true of some of the historic rubella virus sequences, which may require analysis with all the reference genotypes using a programme that uses Bayesian inference (e.g., MrBayes).

4. PHYLOGENY tool. This option plots a phylogeny of the displayed sequence (labelled input) and the sequences of the WHO reference strains. This tool can also be used to confirm the genotype predicted by the GENOTYPING tool.
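The idea behind PSSM-based genotyping with a Z score, as mentioned for the GENOTYPING tool, can be sketched as follows. This is an illustration only and not the RubeNS implementation: the log-odds scoring against a uniform background, the pseudocount, and the Z score computed against shuffled versions of the query are all assumptions, and the short reference sequences are invented toy data (only the genotype labels 1E and 2B are real rubella genotypes).

```python
import math
import random
from statistics import mean, pstdev

BASES = "ACGT"


def build_pssm(seqs, pseudocount=1.0):
    """Build a log-odds matrix from equal-length, aligned sequences.

    Assumption: uniform background frequency of 0.25 per base.
    """
    total = len(seqs) + pseudocount * len(BASES)
    pssm = []
    for i in range(len(seqs[0])):
        column = [s[i] for s in seqs]
        pssm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / 0.25)
            for b in BASES
        })
    return pssm


def score(pssm, seq):
    """Sum the per-position scores; ambiguous bases contribute zero."""
    return sum(col.get(base, 0.0) for col, base in zip(pssm, seq))


def genotype_with_z(query, references, n_shuffles=200, seed=0):
    """Return the best-scoring genotype and an illustrative Z score.

    Assumption: the Z score compares the query's score with the scores of
    shuffled copies of the query against the same matrix.
    """
    pssms = {g: build_pssm(seqs) for g, seqs in references.items()}
    scores = {g: score(p, query) for g, p in pssms.items()}
    best = max(scores, key=scores.get)

    rng = random.Random(seed)
    null_scores = []
    for _ in range(n_shuffles):
        shuffled = list(query)
        rng.shuffle(shuffled)
        null_scores.append(score(pssms[best], "".join(shuffled)))

    spread = pstdev(null_scores)
    z = (scores[best] - mean(null_scores)) / spread if spread > 0 else float("inf")
    return best, z


# Invented toy reference sequences, far shorter than a real E-739 fragment.
references = {
    "1E": ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGAAC"],
    "2B": ["TTGCACCTGA", "TTGCACCTGG", "TTGCTCCTGA"],
}

genotype, z = genotype_with_z("ACGTACGTAC", references)
print(genotype, round(z, 2))
```

A query that scores only marginally better than its shuffled copies would receive a low Z score, which corresponds to the situation described above in which a sequence cannot be assigned a genotype with confidence and further analysis is needed.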

7.10.3 Interpretation of information from RubeNS analyses

As noted above, the RubeNS database contains far fewer sequences than MeaNS. For many countries, including some with endemic rubella, there are no representative rubella sequences in RubeNS. Using the database to understand the geographic or epidemiologic context of a particular sequence is therefore challenging. Sequences with the same E-739 or complete E1 gene are rarely identified unless they are part of the same outbreak. However, identifying similar sequences may provide some insights into potential sources of an importation.

For rubella, the main value of molecular analysis lies less in identifying a particular genotype than in providing sequences that can be compared with other rubella sequences of the same genotype in the database. Demonstrating that sequences differ between or among cases may also be useful in settings where separate introductions of virus produce contemporaneous, but unrelated, outbreaks.
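A comparison of sequences from different cases can be sketched as a simple count of nucleotide differences between aligned sequences of equal length. The sketch below assumes the sequences have already been aligned and trimmed to the same region; the function names, case labels and short sequences are hypothetical, and no threshold for interpreting the counts is implied.

```python
from itertools import combinations


def nucleotide_differences(seq_a, seq_b):
    """Count positions at which two aligned sequences differ.

    Positions where either sequence has an ambiguous base (N) or a gap (-)
    are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"N", "-"}
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in skip and b not in skip
    )


def pairwise_differences(samples):
    """Return a dict mapping each pair of sample names to its difference count."""
    return {
        (name_a, name_b): nucleotide_differences(seq_a, seq_b)
        for (name_a, seq_a), (name_b, seq_b) in combinations(samples.items(), 2)
    }


# Invented toy sequences, far shorter than a real E-739 fragment.
samples = {
    "case_A": "ACGTACGTAC",
    "case_B": "ACGTACGTAC",
    "case_C": "ACGAACGTTC",
}

for pair, count in pairwise_differences(samples).items():
    print(pair, count)
# ('case_A', 'case_B') 0
# ('case_A', 'case_C') 2
# ('case_B', 'case_C') 2
```

In this toy example, cases A and B carry identical sequences, while case C differs from both, the kind of pattern that would prompt a closer look at whether the cases belong to the same chain of transmission.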