Genome Analysis
De-Novo Assembly, Gene Finding, Repeat Masking

Genome Analysis Module

The OmicsBox Genome Analysis module lets you characterize and analyze newly sequenced genomes, from raw reads to gene structures, in an efficient and user-friendly way.

Quality Control And Assessment

Use FastQC and Trimmomatic to perform quality control on your samples, filter reads, and remove low-quality bases.
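To make "removing low-quality bases" concrete, here is a minimal Python sketch of sliding-window quality trimming. It is a simplified illustration of the general idea, not Trimmomatic's actual implementation; the function names, the window size of 4, and the quality threshold of 20 are assumptions chosen for the example, and Phred+33 quality encoding is assumed.

```python
def phred33_scores(quality_string):
    """Convert a FASTQ quality string (assumed Phred+33) to integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def sliding_window_trim(sequence, quality_string, window=4, min_mean_quality=20):
    """Cut the read at the start of the first window whose mean quality
    falls below min_mean_quality. Illustrative defaults, not tool defaults."""
    scores = phred33_scores(quality_string)
    for start in range(0, max(len(scores) - window + 1, 0)):
        window_scores = scores[start:start + window]
        if sum(window_scores) / window < min_mean_quality:
            # Keep only the bases before the low-quality window.
            return sequence[:start], quality_string[:start]
    # No window dropped below the threshold: keep the read unchanged.
    return sequence, quality_string


if __name__ == "__main__":
    # 'I' encodes quality 40 and '#' encodes quality 2 under Phred+33.
    print(sliding_window_trim("ACGTACGT", "IIII####"))
```

Reads shorter than the window are returned unchanged in this sketch; a real tool would combine this step with adapter removal and a minimum-length filter.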

Genome Browser

Visualize your annotations as tracks that combine genome sequences (.fasta) with alignments (.bam), intron-exon structures (.gff), and variant data (.vcf).

De-Novo Assembly

The assembly feature reconstructs whole genome sequences without a reference genome or special hardware. Assemble sequencing data from both short- and long-read technologies with three different assemblers: ABySS, SPAdes, and Flye.

Repeat Masking and Gene Finding

Mask repeats of genome assemblies with RepeatMasker to improve downstream gene predictions.

Perform prokaryotic (Glimmer) and eukaryotic (Augustus) gene predictions to characterize genome structure. 
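As a rough illustration of what masking means for downstream gene prediction, the Python sketch below masks given repeat intervals in a sequence, either by lowercasing them (soft-masking) or by replacing them with N (hard-masking). The function name and the 0-based, half-open interval convention are assumptions for this example; it does not reproduce how RepeatMasker detects repeats or formats its output.

```python
def mask_repeats(sequence, repeat_intervals, soft=True):
    """Mask 0-based, half-open (start, end) intervals in a sequence.

    soft=True lowercases masked bases; soft=False replaces them with 'N'.
    Intervals are assumed to come from a separate repeat-detection step.
    """
    masked = list(sequence.upper())
    for start, end in repeat_intervals:
        # Clamp the interval to the sequence boundaries.
        for i in range(max(start, 0), min(end, len(masked))):
            masked[i] = masked[i].lower() if soft else "N"
    return "".join(masked)


if __name__ == "__main__":
    print(mask_repeats("ACGTACGTAC", [(2, 5)]))              # soft-masked
    print(mask_repeats("ACGTACGTAC", [(2, 5)], soft=False))  # hard-masked
```

Soft-masking keeps the underlying bases available, which is why gene predictors can often still use masked regions as hints rather than losing them entirely.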

Alignment and Polishing

Align short sequencing reads against large sequences with BWA or Bowtie 2, and polish draft assemblies built from long reads with Pilon.

Multi-Locus Sequence Typing (MLST)

Characterize bacterial isolates unambiguously. This procedure considers the alleles present in (usually) seven well-characterized housekeeping genes.
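The typing step can be pictured as a table lookup: each isolate gets one allele number per locus, and the combined profile maps to a sequence type. The Python sketch below assumes a made-up scheme; the locus names, profile table, and sequence type numbers are hypothetical and do not correspond to any real MLST database.

```python
# Hypothetical seven-locus scheme (names invented for illustration).
LOCI = ("geneA", "geneB", "geneC", "geneD", "geneE", "geneF", "geneG")

# Hypothetical profile table: allele numbers per locus -> sequence type.
PROFILES = {
    (1, 1, 1, 1, 1, 1, 1): 1,
    (1, 2, 1, 1, 3, 1, 1): 2,
}


def sequence_type(allele_calls):
    """Return the sequence type for a dict of locus -> allele number,
    or None when the allelic profile is not in the table."""
    profile = tuple(allele_calls[locus] for locus in LOCI)
    return PROFILES.get(profile)


if __name__ == "__main__":
    isolate = {"geneA": 1, "geneB": 2, "geneC": 1, "geneD": 1,
               "geneE": 3, "geneF": 1, "geneG": 1}
    print(sequence_type(isolate))
```

A profile that is missing from the table returns None here, which in practice would correspond to a novel or unassigned combination of alleles.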

  • DNA-Seq de novo assembly with ABySS 2
  • DNA-Seq de novo assembly with SPAdes
  • DNA-Seq de novo assembly with Flye
  • DNA-Seq alignment with BWA
  • DNA-Seq polishing with Pilon
  • Repeat masking with RepeatMasker
  • Eukaryotic gene finding with Augustus
  • Prokaryotic gene finding with Glimmer

Genome Browser

Visualize different file types in a side-scrolling view.

 

Statistics

Statistical charts and reports help you evaluate the genome assembly and characterization processes and support the biological interpretation of the results.
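One widely used assembly statistic is N50: the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the assembly size. The Python sketch below shows that calculation; whether and how OmicsBox reports this particular metric is an assumption here, and the function name is illustrative.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running total to half of the
        # assembly size defines N50.
        if running >= half_total:
            return length
    return 0


if __name__ == "__main__":
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

A higher N50 generally indicates a more contiguous assembly, but it says nothing about correctness, so it is best read alongside other evaluation reports.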

Exploratory Analysis

The rich user interface makes it easy to process large genome annotations. Gene annotations in General Feature Format (GFF) can be filtered, sorted, and combined with other result sets. Select the genes to display directly from the table.
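For readers who want to see what filtering and sorting GFF annotations involves outside the interface, here is a minimal Python sketch. It assumes the standard nine tab-separated GFF columns with 1-based inclusive coordinates; the Feature type and the function names are illustrative and not part of OmicsBox.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    strand: str
    attributes: str


def parse_gff(lines):
    """Parse GFF lines into Feature records, skipping comments and
    lines that do not have exactly nine tab-separated columns."""
    features = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        features.append(Feature(cols[0], cols[1], cols[2],
                                int(cols[3]), int(cols[4]), cols[6], cols[8]))
    return features


def select_genes(features, min_length=0):
    """Keep 'gene' features of at least min_length bases, sorted by
    sequence and start position."""
    genes = [f for f in features
             if f.type == "gene" and f.end - f.start + 1 >= min_length]
    return sorted(genes, key=lambda f: (f.seqid, f.start))


if __name__ == "__main__":
    example = [
        "##gff-version 3",
        "chr1\tAugustus\tgene\t500\t1500\t.\t+\t.\tID=g2",
        "chr1\tAugustus\tgene\t100\t200\t.\t-\t.\tID=g1",
    ]
    for gene in select_genes(parse_gff(example)):
        print(gene.seqid, gene.start, gene.end, gene.attributes)
```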

Workflows

Eukaryotic Genome Analysis

Generate a draft genome of a eukaryotic species by assembling DNA-Seq reads without additional prior information. Detect and mask repetitive sequences, and improve gene prediction by providing RNA-Seq data.

Prokaryotic Genome Analysis

Genomes of bacteria and other prokaryotic organisms can be assembled and characterized in a fast and sensitive way. Proceed with the functional annotation of the resulting gene sequences.

Long Reads Genome Analysis

Generate a draft genome by assembling DNA-Seq long reads with Flye, then use short reads to polish the contigs with BWA and Pilon. Detect and mask repetitive sequences, predict genes, and find homologs to fully characterize the assembled sequences.