MCscan Pipeline

Please refer to the documentation of the MCscan pipeline for details:

https://github.com/tanghaibao/jcvi/wiki/MCscan-%28Python-version%29

If you use the MCscan pipeline, please also cite its original paper:

Tang, H., Bowers, J. E., Wang, X., Ming, R., Alam, M., & Paterson, A. H. (2008). Synteny and collinearity in plant genomes. Science, 320(5875), 486-488.

And the LAST paper:

Kiełbasa, S. M., Wan, R., Sato, K., Horton, P., & Frith, M. C. (2011). Adaptive seeds tame genomic sequence comparison. Genome Research, 21(3), 487-493.

ShinySyn Usage

Introduction

ShinySyn can either visualize synteny blocks directly from MCscan output files (Main View) or invoke the MCscan pipeline (MCscan Pipeline), taking genome sequences (FASTA files) and gene coordinates (GFF3 files containing CDS annotation) as input.

This document compares Arabidopsis thaliana TAIR10 with Arabidopsis lyrata v2.1 as an example to demonstrate the usage of ShinySyn.

Usage Demo

Macro/micro-synteny

Dot Plot

Input Files

First, download the genome FASTA file and the gene GFF3 file for Arabidopsis thaliana TAIR10 from Phytozome:

  • Click the “Download” button and then “proceed to data”. Please remember to cite the genome’s original paper if you download the data and use it in your research.
  • Choose the genome FASTA file and the gene annotation GFF3 file, and click download.
  • Repeat the same steps for Arabidopsis lyrata v2.1.

MCscan Pipeline

With the input files ready, navigate to the MCscan Pipeline page and start MCscan with a few clicks. Note that all input files can be gzip-compressed (with a .gz suffix):

  1. Input query species name (e.g. Athaliana167)
  2. Input query fasta file (e.g. Athaliana_167_TAIR10.fa.gz)
  3. Input query gff3 file (e.g. Athaliana_167_gene_exons.gff3.gz)
  4. Input subject species name (e.g. Alyrata384)
  5. Input subject fasta file (e.g. Alyrata_384_v1.fa.gz)
  6. Input subject gff3 file (e.g. Alyrata_384_v2.1.gene.gff3.gz)
  7. Adjust cscore cut-off
  8. Click Run Pipeline button
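Before uploading, it can be useful to confirm that a (possibly gzip-compressed) FASTA file is readable. The following is a minimal Python sketch and not part of ShinySyn; the helper names `open_maybe_gzip` and `count_fasta_records` are hypothetical, and the only assumption is that compressed files carry a .gz suffix as described above.

```python
import gzip
from pathlib import Path


def open_maybe_gzip(path):
    """Open a text file, decompressing on the fly if the name ends in .gz."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_fasta_records(path):
    """Count FASTA records by counting header lines that start with '>'."""
    with open_maybe_gzip(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

For example, `count_fasta_records("Athaliana_167_TAIR10.fa.gz")` returns the number of sequences in the query genome file; a result of zero suggests the file is not a valid FASTA file.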

The default cscore cut-off is 0.7. As suggested in the MCscan documentation, a cut-off of 0.99 retrieves the reciprocal best hits (RBHs) of the orthologous genes. For a further explanation of cscore, refer to the discussion linked from the MCscan wiki page.

After the pipeline finishes, download all result files by clicking Download Result. The result is a tarball named mcscan_result.<time>.tgz, which can be extracted with tar on Linux or 7-Zip on Windows.
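To illustrate why a cut-off of 0.99 approximates reciprocal best hits, here is a small Python sketch. It assumes the definition given in the MCscan wiki, where the C-score of a hit is its score divided by the best score of either the query or the subject gene. The function name `filter_by_cscore` and the gene IDs are made up for illustration; this is not the MCscan implementation.

```python
from collections import defaultdict


def filter_by_cscore(hits, cutoff=0.7):
    """Keep hits whose C-score is at least `cutoff`.

    hits: iterable of (query, subject, score) tuples with positive scores.
    Assumed definition:
        C-score = score / max(best score of query, best score of subject)
    """
    hits = list(hits)
    best_query = defaultdict(float)
    best_subject = defaultdict(float)
    for query, subject, score in hits:
        best_query[query] = max(best_query[query], score)
        best_subject[subject] = max(best_subject[subject], score)

    kept = []
    for query, subject, score in hits:
        cscore = score / max(best_query[query], best_subject[subject])
        if cscore >= cutoff:
            kept.append((query, subject, score))
    return kept


# Hypothetical hits: (query gene, subject gene, alignment score)
hits = [
    ("AT1G01010", "AL1G11530", 100.0),
    ("AT1G01010", "AL2G20000", 80.0),
    ("AT1G02000", "AL2G20000", 90.0),
]
print(filter_by_cscore(hits, cutoff=0.7))   # keeps all three hits
print(filter_by_cscore(hits, cutoff=0.99))  # keeps only the first and third hit
```

With the cut-off at 0.99, the second hit (C-score 0.8) is dropped because each of its genes has a better partner elsewhere, which leaves only pairs that are the best hit in both directions.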

The result files include:

  • query BED file (e.g. Athaliana167.bed)
  • subject BED file (e.g. Alyrata384.bed)
  • anchor file (e.g. Athaliana167.Alyrata384.anchors)
  • anchor lifted file (e.g. Athaliana167.Alyrata384.lifted.anchors)
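As a quick sanity check after downloading, the tarball can be inspected for the four expected files. The sketch below uses Python's standard `tarfile` module; the function names are hypothetical, and because the internal directory layout of the tarball is not documented here, only base file names are compared.

```python
import tarfile
from pathlib import PurePosixPath


def expected_result_files(query, subject):
    """Return the four file names listed above for a query/subject pair."""
    return {
        f"{query}.bed",
        f"{subject}.bed",
        f"{query}.{subject}.anchors",
        f"{query}.{subject}.lifted.anchors",
    }


def missing_result_files(tarball_path, query, subject):
    """Return the expected file names that are absent from the tarball."""
    with tarfile.open(tarball_path, "r:gz") as tar:
        # Compare base names only, since the folder layout is an unknown.
        present = {PurePosixPath(name).name for name in tar.getnames()}
    return expected_result_files(query, subject) - present
```

Calling `missing_result_files(path, "Athaliana167", "Alyrata384")` on a complete result tarball should return an empty set.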

Main View

The inputs for ShinySyn’s visualization can be either output files generated externally by running the MCscan pipeline on the command line, or the output files returned in the previous section.

Main Page View

Macro-synteny

After entering the query/subject species names and uploading the four output files mentioned above, the macro-synteny plot can be generated with just one click. Two macro-synteny layouts are provided, parallel and circular, which can be switched in the setting block.

Parallel Layout:

Circular Layout:

Hovering over a ribbon (block) highlights it and displays detailed information, including the start/end genes in the query and subject. Hovering over a chromosome/contig highlights all ribbons associated with it. Hovering for more than 8 seconds makes the highlight persist.
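The start/end genes shown for each ribbon can be derived from the anchor file. The following Python sketch assumes the usual MCscan .anchors layout, in which each block is introduced by a line starting with "#" and each following line holds a query gene, a subject gene and a score. The function names are hypothetical, and this is not how ShinySyn itself is implemented.

```python
def parse_anchor_blocks(lines):
    """Split anchor lines into blocks of (query_gene, subject_gene) pairs."""
    blocks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # A '#' line marks the start of a new synteny block.
            blocks.append([])
            continue
        if not blocks:
            blocks.append([])
        query, subject = line.split()[:2]
        blocks[-1].append((query, subject))
    return [block for block in blocks if block]


def block_endpoints(block):
    """Return the first and last gene pair of a block, in file order."""
    return {
        "query_start": block[0][0],
        "query_end": block[-1][0],
        "subject_start": block[0][1],
        "subject_end": block[-1][1],
    }
```

Note that the endpoints follow file order, so for an inverted block the subject "start" gene may lie downstream of the subject "end" gene on the chromosome.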

Micro-synteny

Clicking a ribbon renders the micro-synteny of the selected region in the lower panel. A heatmap shows the gene density of the selected region (from the query species). Brushing a small region of the heatmap displays the gene pairs within it as a typical MCscan-style micro-synteny view, with the added ability to zoom in and out. The micro-synteny view updates automatically when a different “focused” region is chosen from the heatmap.
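One way to think about the gene density heatmap is as a count of genes per fixed-size window along the selected region. The sketch below computes such counts from BED lines (chromosome, start, end, gene name, ...). The window size, the use of gene start positions and the function name `gene_density` are all assumptions made for illustration; the binning ShinySyn actually uses is not described here.

```python
def gene_density(bed_lines, chrom, region_start, region_end, bin_size):
    """Count genes per bin, assigning each gene to the bin of its start."""
    n_bins = (region_end - region_start) // bin_size + 1
    counts = [0] * n_bins
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip malformed lines
        gene_chrom, start = fields[0], int(fields[1])
        if gene_chrom != chrom:
            continue
        if not (region_start <= start <= region_end):
            continue
        counts[(start - region_start) // bin_size] += 1
    return counts
```

Each element of the returned list corresponds to one heatmap cell, so brushing a range of cells translates back to a coordinate range on the query chromosome.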

By default, only the best hit among the subject genes is retained for each query anchor gene. Deselect Extract one best Subject in the setting block to disable this and keep all possible orthologous pairs, which retains the “multiple-to-multiple” relationships.
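The effect of keeping one best subject per query gene can be sketched in a few lines of Python. How ShinySyn ranks hits is not stated in this document, so the sketch assumes that "best" means the highest anchor score; the function name `best_subject_per_query` is hypothetical.

```python
def best_subject_per_query(pairs):
    """Keep, for each query gene, the subject gene with the highest score.

    pairs: iterable of (query, subject, score) tuples.
    Ties keep the pair seen first.
    """
    best = {}
    for query, subject, score in pairs:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return [(query, subject, score) for query, (subject, score) in best.items()]
```

Skipping this step leaves every (query, subject) pair in place, which corresponds to the “multiple-to-multiple” view described above.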

Additionally, a table lists all gene pairs in the selected macro-synteny block. The gene pairs are extracted from the .lifted.anchors file, which also contains “low quality anchor genes close to high quality anchors”. Any gene of interest can be searched, and clicking a row of the table updates the micro-synteny view with that anchor highlighted.
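The gene pair table can be reproduced outside the app by reading the .lifted.anchors file. The Python sketch below assumes that lifted (lower quality) anchors are marked by an "L" suffix on the score column, which is how MCscan output is commonly written but should be verified against your own files. The function names are hypothetical.

```python
def read_lifted_pairs(lines):
    """Parse anchor lines into dicts, flagging lifted (low quality) pairs."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and block separators
        query, subject, score = line.split()[:3]
        pairs.append({
            "query": query,
            "subject": subject,
            # Assumption: a trailing 'L' marks an anchor added by lifting.
            "lifted": score.endswith("L"),
            "score": float(score.rstrip("L")),
        })
    return pairs


def search_gene(pairs, gene_id):
    """Return all pairs in which gene_id appears as query or subject."""
    return [p for p in pairs if gene_id in (p["query"], p["subject"])]
```

For example, `search_gene(read_lifted_pairs(open(path)), "AT1G01010")` lists every pair involving that gene, mirroring the search box of the table.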

Colors

ShinySyn uses a tuned color scheme by default, which can be customized in the setting block.

Dot View

If Generate Dot View is selected in the setting block, a dot plot is generated at the same time. Navigate to the Dot View page to check the result.

A table containing all high quality anchors (from the .anchors file) is shown beside the plot. Selecting a small rectangular region zooms in on the dot plot and updates the table accordingly. Double-clicking resets the zoom level.
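The link between the zoom rectangle and the table can be pictured as a coordinate filter over the anchors. In the Python sketch below, each gene is placed at its order of appearance in the BED file, which is an assumption: ShinySyn may use base-pair positions or another coordinate system instead. The function names are hypothetical.

```python
def gene_order(bed_lines):
    """Map each gene name (BED column 4) to its order of appearance."""
    order = {}
    for line in bed_lines:
        fields = line.split("\t")
        if len(fields) >= 4:
            order.setdefault(fields[3].strip(), len(order))
    return order


def anchors_in_rectangle(anchors, query_order, subject_order, x_range, y_range):
    """Return anchors whose (query, subject) position lies in the rectangle.

    anchors: iterable of (query_gene, subject_gene) pairs.
    x_range, y_range: inclusive (min, max) bounds in gene-order units.
    """
    selected = []
    for query, subject in anchors:
        x = query_order.get(query)
        y = subject_order.get(subject)
        if x is None or y is None:
            continue  # gene missing from the BED file
        if x_range[0] <= x <= x_range[1] and y_range[0] <= y <= y_range[1]:
            selected.append((query, subject))
    return selected
```

Resetting the zoom corresponds to calling the filter with ranges that span all genes, which returns the full anchor list again.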