Yum, beautiful regulatory variants to spot...

Regul@tionSpotter

Tutorial


In this tutorial, we walk you through an example RegulationSpotter analysis of sequencing results. As you can see on the homepage, you can query RegulationSpotter in two ways. If you are interested in a single regulatory variant, click on the query single variant square and enter its genomic location (GRCh37). If you have a vcf file containing a large number of variants, click on the upload vcf file square, which leads you to RegulationSpotter's upload interface.

By the way, if you ever feel lost while working through RegulationSpotter's tutorial, don't panic! You can always find links to the relevant sections of the tutorial and documentation on the home page and in the navigation bar :)

Example case: hepatic vein thrombosis and seizure patient

In our example, a patient presenting with portal and hepatic vein thrombosis and seizures is suspected to suffer from a glycosylphosphatidylinositol deficiency. Panel sequencing covering all genes known to cause human hereditary diseases, including their up- and downstream extragenic regions, was carried out. To improve the chances of finding the causative mutation, candidate genes were determined based on the patient's condition, and the sequencing results were filtered to exclude variants located in other genes.
The candidate genes for our case were: PIGV, PIGN, PIGA, PIGL, PIGO, PIGT, PGAP2, PGAP1, PGAP3, PIGW, PIGY, PIGG and PIGM.



Let's start with querying a single variant!

Analysing a single variant

Unfortunately, after analysing the filtered sequencing results described above, you have not been able to identify the disease-causing mutation. However, a variant located on chromosome 1 catches your eye. Its genomic location is chr1:160001799, and it is a G>C SNP. You have found it in ClinVar as a known disease-causing mutation and suspect that it might be a regulatory mutation for one of your candidate genes. Maybe RegulationSpotter can tell you more about the alteration? Starting from RegulationSpotter's home page, click on the query single variant box. It leads you to our single query interface, where you can enter the genomic location (GRCh37) of the variant:

Simply enter the chromosome (1), the position (160001799), the reference allele (G) and the alternative allele (C), then hit continue. Easy, huh?
Alternatively, you can also find the analysis here.
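
If you would rather keep the same variant in a file, the four values entered above map directly onto the fixed columns of a VCF data line. The following is a minimal sketch in Python, not part of RegulationSpotter itself; the helper name to_vcf_line is hypothetical, and whether the upload interface expects "1" or "chr1" as chromosome name is an assumption you should check against the documentation.

```python
def to_vcf_line(chrom: str, pos: int, ref: str, alt: str) -> str:
    """Format one variant as a minimal tab-separated VCF data line.

    Hypothetical helper for illustration; it only checks the basics.
    """
    valid = set("ACGT")
    if pos < 1:
        raise ValueError("VCF positions are 1-based")
    ref, alt = ref.upper(), alt.upper()
    if not ref or not alt or not (set(ref) <= valid and set(alt) <= valid):
        raise ValueError("alleles must consist of A, C, G or T")
    # Fixed VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO
    # ('.' marks a missing value).
    return "\t".join([chrom, str(pos), ".", ref, alt, ".", ".", "."])


# The tutorial variant: chromosome 1, position 160001799 (GRCh37), G>C.
line = to_vcf_line("1", 160001799, "G", "C")
print(line)
```

A complete vcf file would additionally need the usual header lines (##fileformat and the #CHROM column header), which are omitted here.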

RegulationSpotter will now run for a couple of seconds. Once it is done, you will find all sorts of information on the variant in a table format. Here, you can find RegulationSpotter's decision about the variant - in this case, it is disease causing, as the mutation is a known disease mutation in ClinVar (which you can also see in the known variant section).
Moreover, RegulationSpotter shows you everything it knows about the variant at the given location. To read more about the information provided by RegulationSpotter, feel free to refer to our documentation.
In this case, RegulationSpotter is able to list a whole lot of data:
It knows that the alteration is located in a promoter according to Ensembl multicell regulatory features, in a DNase1 hypersensitive site in a promoter, in an H3K4me3 site in a promoter and so on. In addition, the conservation at the site is relatively high. Thus - independently of the fact that the mutation is known to be disease causing - the information, taken together, points to a regulatory function!

Analysing a vcf file

Now, let's switch to RegulationSpotter's second analysis mode - analysing an entire vcf file.

As you remember, in our hypothetical case, we have already analysed intragenic variants in the list of candidate genes but did not find a convincing disease-causing mutation. Hence, we assume that the disease is caused by a mutation in a regulatory region and decide to examine our vcf with RegulationSpotter.

vcf file

Let's go back to RegulationSpotter's home page. This time, please click on the upload vcf file box, which brings you to our upload page. Here, you can find the vcf file (the small blue link labelled sample file in the upper right area of the page). Please save it on your computer and then upload it to RegulationSpotter.

This file contains all the variants that were found in the panel genes in our assumed sequencing project. It also contains some regulatory variants which we believe could be involved in the development of the disease. Of course, most of this genetic variation is most likely harmless, but we expect one of the alterations to be causative for the disease.

Analysis settings

When uploading the vcf file, you can specify a number of settings for the analysis. For our purposes, please stick with the default settings. Once you are more familiar with RegulationSpotter, feel free to test and compare the different options we offer.

Analyse the following regions or genes

In a typical case, you might end up with a rather large vcf file, especially if you are looking at entire exomes or genomes. Therefore, you would usually start with rather stringent filtering options.
In our tutorial, the vcf file is rather small to decrease waiting times for you. However, it still makes sense to restrict the analysis to the list of candidate genes.
Thus, please select analyse custom genes (select to enter) at the bottom of the page and enter our candidate genes (copy/paste is fine):
PIGV, PIGN, PIGA, PIGL, PIGO, PIGT, PGAP2, PGAP1, PGAP3, PIGW, PIGY, PIGG, PIGM
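
If you keep such a gene list in your own scripts, it helps to normalise it before pasting it anywhere. The following is a small Python sketch under our own assumptions (the function name parse_gene_list is hypothetical and not part of RegulationSpotter); it splits the comma-separated list, trims whitespace and drops duplicates while keeping the original order.

```python
CANDIDATE_GENES_TEXT = (
    "PIGV, PIGN, PIGA, PIGL, PIGO, PIGT, PGAP2, PGAP1, "
    "PGAP3, PIGW, PIGY, PIGG, PIGM"
)


def parse_gene_list(text: str) -> list[str]:
    """Split a comma-separated gene list, trimming blanks and duplicates."""
    seen = set()
    genes = []
    for token in text.split(","):
        symbol = token.strip().upper()
        # Skip empty tokens (e.g. from a trailing comma) and repeats.
        if symbol and symbol not in seen:
            seen.add(symbol)
            genes.append(symbol)
    return genes


candidate_genes = parse_gene_list(CANDIDATE_GENES_TEXT)
print(len(candidate_genes), "candidate genes:", ", ".join(candidate_genes))
```

Joining the cleaned list with ", " gives a string you can paste into the custom genes field.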


If you want to know more about the settings options, please refer to our documentation.

Once you are done, just hit "submit" and wait for RegulationSpotter to work on your file.

Hint: Alternatively, you can find the pre-analysed file here without having to undergo all the previous steps.

Synopsis and display settings

RegulationSpotter will now lead you to its first landing page. On the left side, you will find a synopsis of your vcf file. The center of the page allows you to filter and sort your results for display. For detailed information on this page, please also refer to our documentation.

If you are planning to access your project later on, please record your project ID. This enables you to just enter it whenever you want to have a look at your results again.

Note: Please DO NOT change or delete this ID as RegulationSpotter requires it!

Display settings

Here, you can filter and sort your results for display. In a real-life case, you would most likely start with a strict filter to avoid being swamped by your data. Even though our tutorial set is rather small, we should still hide extragenic variants without annotation, so please select this option.

As we suspect that our causative alteration is located in a regulatory region, we are interested in variants with a high likelihood of being located in a regulatory region.
Therefore, please select 'sort by effect, chromosome, position'.
When you are done, just hit display to get your results.
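
To make the idea of sorting by effect, chromosome and position concrete, here is a minimal Python sketch of such a three-level sort. It is an illustration only, not RegulationSpotter's implementation: the effect labels and their priority order are placeholders we chose for this example, and apart from the two chromosome 1 variants discussed in this tutorial, the records are invented.

```python
# Placeholder effect labels and their display priority (lower sorts first).
# These are assumptions for this sketch, not RegulationSpotter's vocabulary.
EFFECT_RANK = {"disease causing": 0, "functional": 1, "unknown": 2}

# Non-numeric chromosomes are placed after the autosomes.
SPECIAL_CHROMS = {"X": 23, "Y": 24, "M": 25, "MT": 25}


def chrom_key(chrom: str) -> int:
    """Order chromosomes numerically (1, 2, ..., 10, ...), then X, Y, MT."""
    name = chrom[3:] if chrom.lower().startswith("chr") else chrom
    if name.isdigit():
        return int(name)
    return SPECIAL_CHROMS.get(name.upper(), 99)


def sort_key(variant: dict) -> tuple:
    """Sort by effect first, then chromosome, then position."""
    effect = EFFECT_RANK.get(variant["effect"], len(EFFECT_RANK))
    return (effect, chrom_key(variant["chrom"]), variant["pos"])


variants = [
    {"chrom": "X", "pos": 500, "effect": "unknown"},       # invented
    {"chrom": "1", "pos": 160001799, "effect": "disease causing"},
    {"chrom": "10", "pos": 100, "effect": "functional"},   # invented
    {"chrom": "2", "pos": 200, "effect": "functional"},    # invented
    {"chrom": "1", "pos": 27113734, "effect": "functional"},
]

ordered = sorted(variants, key=sort_key)
for v in ordered:
    print(v["effect"], v["chrom"], v["pos"])
```

Comparing chromosomes as numbers rather than as text is the design choice that matters here: a plain string sort would place chromosome 10 before chromosome 2.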

Hint: In case you prefer to just skip ahead to the sorted results, you can just click here.

Results

RegulationSpotter first gives you an overview of its findings. In the text-based part on the left, you can find all sorts of useful information about the variant, such as chromosomal position, reference and alternative allele, and connected gene. The rest of the summary is a colour-coded table indicating different types of regulatory features which might be affected. Further information on the summary table can be found in our documentation.

As we suspect that the causative mutation is located in a regulatory region, we first focus on the variants which are most likely located in a regulatory region. This is indicated in the likely effect column.
RegulationSpotter calculates a so-called XScore, which measures the amount of evidence that a variant is located in a regulatory region. To calculate this score, RegulationSpotter compiles and integrates all the information and annotations it can find about the location of a variant. The higher the score, the more evidence exists that the altered location is a regulatory region.
We will have a closer look at the top variant: a G to C SNP at position 160001799 on chromosome 1. It might seem familiar to you, as it is the alteration we queried in the single variant section of the tutorial. As you can see in the list, it is an extragenic, disease-causing variant located in the promoter of one of the candidate genes (PIGM). RegulationSpotter recognises this variant as disease-causing because it is annotated as such in ClinVar. Regardless of this ClinVar annotation, the RegulationSpotter XScore of 38 reflects a high amount of evidence that the variant is located in a functionally relevant regulatory region. No other strictly extragenic variant gets a higher score. Taken together, we would assume that this mutation is our most likely candidate.
You can now click on any of the variants to get more information on the regulatory region; this leads you to the detailed result page we introduced in the single query section. You can also find the analysis for the top variant of the tutorial here.
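
The ranking step described above ("the higher the score, the more evidence") can be pictured as a simple descending sort. The Python sketch below is our own illustration and does not show how RegulationSpotter computes the score; the class name ScoredVariant is hypothetical. It uses only the two scores quoted in this tutorial: 38 for the chr1:160001799 G>C variant and 35 for the chr1:27113734 C>T variant from the interactions section.

```python
from typing import NamedTuple


class ScoredVariant(NamedTuple):
    """A variant together with the score reported for it (illustrative)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    xscore: int


scored = [
    ScoredVariant("1", 27113734, "C", "T", 35),
    ScoredVariant("1", 160001799, "G", "C", 38),
]

# Highest score first: more evidence for a regulatory location.
ranked = sorted(scored, key=lambda v: v.xscore, reverse=True)
top = ranked[0]
print(f"top candidate: chr{top.chrom}:{top.pos} {top.ref}>{top.alt} "
      f"(XScore {top.xscore})")
```

A high score alone does not make a variant causative; it only tells you which locations deserve a closer look first.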

For more detailed information on RegulationSpotter's output, please also refer to our documentation.

Interactions view

Imagine that during your research, you became interested in another variant in your vcf file. You notice that a C to T SNP located on chromosome 1 at position 27113734 gets a rather high XScore of 35. It is not located within any of the candidate genes, but according to RegulationSpotter, there seems to be some interaction going on. Let's have a closer look at this! Please find the variant in RegulationSpotter's list (for example by searching for position 27113734) and click on extragenic results to be taken to the detailed results view. You can also click here to continue.

When scrolling through the detailed results page, you notice that RegulationSpotter considers this variant to be located in a likely functional region. Moreover, you receive all sorts of information about what is annotated for the location in various databases and datasets, as described above and in our documentation. But now, something is a little bit different:
RegulationSpotter found interaction data for the location and thus generates a link that displays this interaction as a graphic. Please click on show interactions as plot to try it out, or just find the plot here.

In the plot, you will see the variant symbolised as a thin red line. Interaction elements are depicted as blue rectangles. Genes in the region are shown as red rectangles, and pseudogenes are marked with a little green box. Moreover, you will find a link to explore the variant in Ensembl.
Below the plot, RegulationSpotter tells you more about its evidence for the interaction: in this case, the variant was found to be located in or in the vicinity of promoters for two genes, PIGV and ARID1A. As this was the case in datasets for three different cell lines, RegulationSpotter considers this information to be relevant for your quest to identify the disease-causing mutation.

Contact

We hope you have fun familiarizing yourself with RegulationSpotter! Enjoy playing around with our tutorial data set by trying out different settings. In case you discover bugs or have suggestions or questions, please write an e-mail to
Jana Marie Schwarz (jana-marie.schwarz AT charite.de) or to
Dominik Seelow (dominik.seelow AT charite.de).
We also appreciate hearing about your general experiences using RegulationSpotter.