Sequence alignment software

This list of sequence alignment software is a compilation of software tools and web portals used in pairwise sequence alignment and multiple sequence alignment. See structural alignment software for structural alignment of proteins.

Database search only

Name | Description | Sequence Type* | Link
FASTA | k-tuple local search | Both | EBI, GenomeNet, PIR (protein only)
BLAST | k-tuple local search (Basic Local Alignment Search Tool) | Both | NCBI, EBI, GenomeNet, PIR (protein only)
SSEARCH | Smith-Waterman search (more sensitive than FASTA) | Both | server
HMMer | Hidden Markov profile search | Protein/DNA | download (S. Eddy)
SAM | Hidden Markov profile search | Protein/DNA | SAM (K. Karplus, A. Krogh)
Combinatorial Extension | Structural alignment search | Protein | server
IDF | Inverse Document Frequency | Both | download
*Sequence Type: Protein or nucleotide
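
The "k-tuple local search" used by FASTA and BLAST can be illustrated with a minimal sketch. It is not the code of either program: the function names `index_ktuples` and `best_diagonal` and the default word length `k=3` are assumptions made for this example. The idea shown is only the seeding step, in which short exact words shared by the two sequences are counted per diagonal.

```python
from collections import defaultdict


def index_ktuples(sequence, k):
    """Map every length-k word in `sequence` to the positions where it starts."""
    index = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        index[sequence[start:start + k]].append(start)
    return index


def best_diagonal(query, subject, k=3):
    """Return (diagonal, hits) for the diagonal with the most shared k-tuples.

    A diagonal is subject_pos - query_pos. Many word hits on one diagonal
    suggest an ungapped region of similarity worth aligning more carefully.
    """
    subject_index = index_ktuples(subject, k)
    hits_per_diagonal = defaultdict(int)
    for q_pos in range(len(query) - k + 1):
        word = query[q_pos:q_pos + k]
        for s_pos in subject_index.get(word, ()):
            hits_per_diagonal[s_pos - q_pos] += 1
    if not hits_per_diagonal:
        return None, 0
    diagonal = max(hits_per_diagonal, key=hits_per_diagonal.get)
    return diagonal, hits_per_diagonal[diagonal]
```

For instance, `best_diagonal("ACGTACGT", "TTACGTACGTTT")` reports diagonal 2, where the query sits two positions into the subject. Real search tools follow this seeding step with extension and statistical scoring, which the sketch omits.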

Pairwise alignment

Name | Description | Sequence Type* | Alignment Type** | Link | Author | Year
BLASTZ | Seeded pattern-matching | Nucleotide | Local | download | Schwartz et al. | 2003
DNADot | Web-based dot-plot tool | Nucleotide | Global | server | R. Bowen | 1998
DOTLET | Java-based dot-plot tool | Both | Global | applet | M. Pagni and T. Junier | 1998
JAligner | Open source Java implementation of Smith-Waterman | Both | Local | JWS | A. Moustafa | 2005
LALIGN | Local similarity (same algorithm as SIM) | Both | Local (default) or global | server | B. Pearson | 1991 (algorithm)
matcher | Memory-optimized but slow dynamic programming (based on LALIGN) | Both | Local | server | I. Longden (modified from B. Pearson) | 1999
MCALIGN2 | Explicit models of indel evolution | DNA | Global | server | J. Wang et al. | 2006
MUMmer | Suffix-tree based | Nucleotide | Global | download | S. Kurtz et al. | 2004
needle | Needleman-Wunsch dynamic programming | Both | Global | server | A. Bleasby | 1999
Ngila | Logarithmic and affine gap costs and explicit models of indel evolution | Both | Global | download | R. Cartwright | 2007
PatternHunter | Seeded pattern-matching | Nucleotide | Local | download | B. Ma et al. | 2002-2004
ProbA (also propA) | Stochastic partition function sampling via dynamic programming | Both | Global | download | U. Mückstein | 2002
PyMOL | "align" command aligns sequence and applies it to structure | Protein | Global (by selection) | site | W. L. DeLano | 2007
REPuter | Suffix-tree based | Nucleotide | Local | download | S. Kurtz et al. | 2001
SEQALN | Various dynamic programming | Both | Local or global | server | M.S. Waterman and P. Hardy | 1996
SIM, GAP, NAP, LAP | Local similarity with varying gap treatments | Both | Local or global | server | X. Huang and W. Miller | 1990-1996
SIM | Local similarity | Both | Local | servers | X. Huang and W. Miller | 1991
SLIM Search | Ultra-fast blocked alignment | Both | Both | site | L. Bloksberg | 2004
stretcher | Memory-optimized but slow dynamic programming | Both | Global | server | I. Longden (modified from G. Myers and W. Miller) | 1999
tranalign | Aligns nucleic acid sequences given a protein alignment | Nucleotide | NA | server | G. Williams (modified from B. Pearson) | 2002
water | Smith-Waterman dynamic programming | Both | Local | server | A. Bleasby | 1999
wordmatch | k-tuple pairwise match | Both | NA | server | I. Longden | 1998
YASS | Seeded pattern-matching | Nucleotide | Local | server, download | L. Noe and G. Kucherov | 2003-2006
BioPerl dpAlign | Dynamic programming | Both | Both + ends-free | site | Y. M. Chan | 2003
*Sequence Type: Protein or nucleotide. **Alignment Type: Local or global
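
Several entries above (needle, for example) rely on Needleman-Wunsch dynamic programming for global alignment. The following is a minimal sketch of that technique, not the code of any listed tool. The function name `needleman_wunsch` and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for illustration; production tools typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align `a` and `b`; return (score, aligned_a, aligned_b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    def substitution(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of substitution, deletion, insertion.
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(aligned_a)), "".join(reversed(aligned_b))
```

With these assumed scores, aligning "ACGT" against "AGT" yields a score of 2 with the second sequence written as "A-GT". The table needs memory proportional to the product of the two lengths, which is why the list distinguishes memory-optimized variants such as stretcher.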

Multiple sequence alignment

Name | Description | Sequence Type* | Alignment Type** | Link | Author | Year
ABA | A-Bruijn alignment | Protein | Global | download | B. Raphael et al. | 2004
ClustalW | Progressive alignment | Both | Local or global | EBI, PBIL, EMBNet, GenomeNet | Thompson et al. | 1994
AMAP | Sequence annealing | Both | Global | server | A. Schwartz and L. Pachter | 2006
CodonCode Aligner | Multi alignment; ClustalW and Phrap support | Nucleotides | Local or global | download | P. Richterich et al. | 2003 (latest version 2007)
BAli-Phy | Tree+multi alignment; probabilistic/Bayesian; joint estimation | Both | Global | WWW+download | BD Redelings and MA Suchard | 2005 (latest version 2007)
DNA Baser | Multi alignment | Both | Local or global + post processing | DNA Baser (commercial) | M. Gabriel | 2005 (released)
Ed'Nimbus | Seeded filtration | Nucleotides | Local | server | P. Peterlongo et al. | 2006
Geneious | Progressive/iterative alignment; ClustalW plugin | Both | Local or global | download | A.J. Drummond et al. | 2005 / 2006
CHAOS/DIALIGN | Iterative alignment | Both | Local (preferred) | server | M. Brudno and B. Morgenstern | 2003
Kalign | Progressive alignment | Both | Global | server, EBI, MPItoolkit | T. Lassmann | 2005
PRRN/PRRP | Iterative alignment (especially refinement) | Protein | Local or global | PRRP, PRRN | Y. Totoki (based on O. Gotoh) | 1991 and later
POA | Partial order/hidden Markov model | Protein | Local or global | download | C. Lee | 2002
MSA | Dynamic programming | Both | Local or global | download | D.J. Lipman et al. | 1989 (modified 1995)
SAM | Hidden Markov model | Protein | Local or global | server | A. Krogh et al. | 1994 (most recent version 2002)
ProbCons | Probabilistic/consistency | Protein | Local or global | server | C. Do et al. | 2005
MULTALIN | Dynamic programming/clustering | Both | Local or global | server | F. Corpet | 1988
MAVID | Progressive alignment | Both | Global | server | N. Bray and L. Pachter | 2004
Multi-LAGAN | Progressive dynamic programming alignment | Both | Global | server | M. Brudno et al. | 2003
MUSCLE | Progressive/iterative alignment | Both | Local or global | server | R. Edgar | 2004
MAFFT | Progressive/iterative alignment | Both | Local or global | GenomeNet, MAFFT | K. Katoh et al. | 2005
PSAlign | Alignment preserving non-heuristic | Both | Local or global | download | S.H. Sze, Y. Lu, Q. Yang | 2006
SAGA | Sequence alignment by genetic algorithm | Protein | Local or global | download | C. Notredame et al. | 1996 (new version 1998)
T-Coffee | More sensitive progressive alignment | Both | Local or global | server | C. Notredame et al. | 2000
RevTrans | Combines DNA and protein alignment by back-translating the protein alignment to DNA | DNA/Protein (special) | Local or global | server | Wernersson and Pedersen | 2003 (newest version 2005)
*Sequence Type: Protein or nucleotide. **Alignment Type: Local or global

Genomics analysis

Name | Description | Sequence Type* | Link
SLAM | Gene finding, alignment, annotation (human-mouse homology identification) | Nucleotide | server
Mauve | Multiple alignment of rearranged genomes | Nucleotide | download
MGA | Multiple Genome Aligner | Nucleotide | download
Mulan | Local multiple alignments of genome-length sequences | Nucleotide | server
Sequerome | Profiling sequence alignment data with major servers/services | Nucleotide/peptide |
AVID | Pairwise global alignment with whole genomes | Nucleotide | server
SIBsim4 / Sim4 | A program designed to align an expressed DNA sequence with a genomic sequence, allowing for introns | Nucleotide | download
Shuffle-LAGAN | Pairwise glocal alignment of completed genome regions | Nucleotide | server
ACT (Artemis Comparison Tool) | Synteny and comparative genomics | Nucleotide | server
*Sequence Type: Protein or nucleotide


Motif finding

Name | Description | Sequence Type* | Link
MEME/MAST | Motif discovery and search | Both | server
BLOCKS | Ungapped motif identification from BLOCKS database | Both | server
eMOTIF | Extraction and identification of shorter motifs | Both | servers
Gibbs motif sampler | Stochastic motif extraction by statistical likelihood | Both | server (one of many implementations)
TEIRESIAS | Motif extraction and database search | Both | server
PRATT | Pattern generation for use with ScanProsite | Protein | server
ScanProsite | Motif database search tool | Protein | server
PHI-Blast | Motif search and alignment tool | Both | server
I-sites | Local structure motif library | Protein | server
*Sequence Type: Protein or nucleotide
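
Pattern-based tools such as PRATT and ScanProsite work with PROSITE-style patterns. As a rough illustration of how such a pattern can be matched against a protein sequence, the sketch below translates a simplified subset of the syntax into a Python regular expression. The helper names `prosite_to_regex` and `find_motif` are hypothetical, and the translation covers only residues, `x`, `[..]`, `{..}`, repeat counts, and the `<` and `>` anchors at the pattern ends; it is not how ScanProsite itself is implemented.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simplified PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchor at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchor at the C-terminus
            suffix, element = "$", element[:-1]
        repeat = ""
        if element.endswith(")"):  # repeat count such as x(3) or x(2,4)
            element, _, count = element[:-1].partition("(")
            repeat = "{" + count + "}"
        if element == "x":  # any residue
            core = "."
        elif element.startswith("{"):  # any residue except those listed
            core = "[^" + element[1:-1] + "]"
        else:  # a single residue or a [..] set of allowed residues
            core = element
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def find_motif(pattern, sequence):
    """Return (start, matched_text) pairs, including overlapping matches."""
    lookahead = "(?=(" + prosite_to_regex(pattern) + "))"
    return [(m.start(), m.group(1)) for m in re.finditer(lookahead, sequence)]
```

For example, the N-glycosylation site pattern `N-{P}-[ST]-{P}` becomes the regex `N[^P][ST][^P]`. The lookahead in `find_motif` is a design choice that lets overlapping occurrences be reported rather than only the leftmost non-overlapping ones.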




Benchmarking

Name | Link | Authors
BAliBASE | download | Thompson, Plewniak, Poch
Prefab | download | Edgar

External links

In bioinformatics, a sequence alignment is a way of arranging the primary sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.
Multiple sequence alignment (MSA) is a sequence alignment of three or more biological sequences, generally protein, DNA, or RNA. In general, the input set of query sequences is assumed to have an evolutionary relationship by which the sequences share a lineage and are descended from a common ancestor.
Structural alignment is a form of sequence alignment based on comparison of shape. These alignments attempt to establish equivalences between two or more polymer structures based on their shape and three-dimensional conformation.
FASTA is a DNA and protein sequence alignment software package first described (as FASTP) by David J. Lipman and William R. Pearson in 1985 in the article "Rapid and sensitive protein similarity searches".
JAligner is an open source Java implementation of the Smith-Waterman algorithm with Gotoh's improvement for biological local pairwise sequence alignment using the affine gap penalty model. It was written by Ahmed Moustafa.
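
The combination described for JAligner, Smith-Waterman local alignment with Gotoh's affine gap recurrences, can be sketched as follows. This is a score-only illustration and not JAligner's code: the function name `smith_waterman_affine` and the scoring values (match +2, mismatch -1, gap open 3, gap extend 1) are assumptions. Under this convention a gap of length L costs `gap_open + (L - 1) * gap_extend`.

```python
def smith_waterman_affine(a, b, match=2, mismatch=-1, gap_open=3, gap_extend=1):
    """Return the best local alignment score of `a` and `b` with affine gaps."""
    neg_inf = float("-inf")
    best = 0
    # prev_h: best score of any local alignment ending at the previous row.
    # prev_f: best score ending with a vertical gap (residue of `a` unmatched).
    prev_h = [0] * (len(b) + 1)
    prev_f = [neg_inf] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur_h = [0] * (len(b) + 1)
        cur_f = [neg_inf] * (len(b) + 1)
        e = neg_inf  # best score ending with a horizontal gap in this row
        for j in range(1, len(b) + 1):
            # Either extend an existing gap or open a new one.
            e = max(e - gap_extend, cur_h[j - 1] - gap_open)
            cur_f[j] = max(prev_f[j] - gap_extend, prev_h[j] - gap_open)
            diag = prev_h[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The zero floor is what makes the alignment local.
            cur_h[j] = max(0, diag, e, cur_f[j])
            best = max(best, cur_h[j])
        prev_h, prev_f = cur_h, cur_f
    return best
```

Keeping only two rows at a time makes the memory use linear in the length of `b`, at the cost of not being able to trace back the alignment itself; recovering the aligned residues would require storing the full tables or using a divide-and-conquer scheme.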
Clustal is a widely used multiple sequence alignment computer program. The latest version is 1.83. There are two main variations:
  • ClustalW: command line interface
  • ClustalX: This version has a graphical user interface.

AMAP is a multiple sequence alignment program based on a new approach to multiple alignment called sequence annealing. This approach consists of building up the multiple alignment one match at a time, thereby circumventing many of the problems of progressive alignment.
ProbCons is an open source program for probabilistic consistency-based multiple alignment of amino acid sequences. It is an efficient protein multiple sequence alignment program that has demonstrated a statistically significant improvement in accuracy compared with several leading alignment programs.
MAVID is a multiple sequence alignment program suitable for the alignment of large numbers of DNA sequences. The sequences can be small mitochondrial genomes or large genomic regions up to megabases long. The latest version is 2.0.4.
MUSCLE (multiple sequence comparison by log-expectation) is public domain, multiple sequence alignment software for protein and nucleotide sequences.
MAFFT is a multiple sequence alignment program for amino acid or nucleotide sequences.

MAFFT is freely available for academic use, without any warranty.

T-Coffee (Tree-based Consistency Objective Function For alignment Evaluation) is a multiple sequence alignment software using a progressive approach. It generates a library of pairwise alignments to guide the multiple sequence alignment.
Sequerome is a web-based sequence profiling tool developed by the Bioinformatics and Computational Biosciences Unit (BCBU) at Georgetown University. It profiles sequence alignment data by integrating it with major servers and services.
Sim4 is a nucleotide sequence alignment program akin to BLAST but specifically tailored to DNA to cDNA/EST (Expressed Sequence Tag) alignment (as opposed to DNA-DNA or protein-protein alignment). It was written by Florea et al.


This article is copied from an article on Wikipedia.org, the free encyclopedia created and edited by its online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of Wikipedia articles provide accurate and timely information, please do not assume the accuracy of any particular article. This article is distributed under the terms of the GNU Free Documentation License.