Sequence alignment software
This list of sequence alignment software is a compilation of software tools and web portals used in pairwise sequence alignment and multiple sequence alignment. See structural alignment software for structural alignment of proteins.
Database search only
| Name | Description | Sequence Type* | Link |
|---|---|---|---|
| FASTA | k-tuple local search | Both | EBI GenomeNet PIR (protein only) |
| BLAST | k-tuple local search (Basic Local Alignment Search Tool) | Both | NCBI EBI GenomeNet PIR (protein only) |
| SSEARCH | Smith-Waterman search (more sensitive than FASTA) | Both | server |
| HMMer | Hidden Markov profile search | Protein/DNA | download (S. Eddy) |
| SAM | Hidden Markov profile search | Protein/DNA | SAM (K. Karplus, A. Krogh) |
| Combinatorial Extension | Structural alignment search | Protein | server |
| IDF | Inverse Document Frequency | Both | Download |
| *Sequence Type: Protein or nucleotide | | | |
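
The "k-tuple local search" listed for FASTA and BLAST starts from short exact word matches shared by the query and a database sequence. The block below is a minimal sketch of that seeding idea only, not the implementation of either program. The function names (`kmer_index`, `seed_hits`, `best_diagonal`) and the default word length `k=3` are assumptions made for illustration.

```python
from collections import defaultdict


def kmer_index(seq, k):
    """Map each k-tuple (word of length k) to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def seed_hits(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where both sequences share a k-tuple."""
    index = kmer_index(subject, k)
    hits = []
    for q in range(len(query) - k + 1):
        # .get avoids creating empty entries for words absent from the subject
        for s in index.get(query[q:q + k], []):
            hits.append((q, s))
    return hits


def best_diagonal(hits):
    """Group hits by diagonal (subject_pos - query_pos); return (diagonal, count)."""
    counts = defaultdict(int)
    for q, s in hits:
        counts[s - q] += 1
    if not counts:
        return None
    return max(counts.items(), key=lambda item: item[1])


if __name__ == "__main__":
    hits = seed_hits("GATTACA", "TTGATTACAGG", k=3)
    print(hits)
    print(best_diagonal(hits))
```

Hits that fall on the same diagonal are candidates for a gap-free local match. A real search tool would extend and score such regions, which this sketch does not attempt.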
Pairwise alignment
| Name | Description | Sequence Type* | Alignment Type** | Link | Author | Year |
|---|---|---|---|---|---|---|
| BLASTZ | Seeded pattern-matching | Nucleotide | Local | download | Schwartz et al. | 2003 |
| DNADot | Web-based dot-plot tool | Nucleotide | Global | server | R. Bowen | 1998 |
| DOTLET | Java-based dot-plot tool | Both | Global | applet | M. Pagni and T. Junier | 1998 |
| JAligner | Open source Java implementation of Smith-Waterman | Both | Local | JWS | A. Moustafa | 2005 |
| LALIGN | Local similarity (same algorithm as SIM) | Both | Local (default) or global | server | B. Pearson | 1991 (algorithm) |
| matcher | Memory-optimized but slow dynamic programming (based on LALIGN) | Both | Local | server | I. Longden (modified from B. Pearson) | 1999 |
| MCALIGN2 | Explicit models of indel evolution | DNA | Global | server | J. Wang et al. | 2006 |
| MUMmer | Suffix-Tree based | Nucleotide | Global | download | S. Kurtz et al. | 2004 |
| needle | Needleman-Wunsch dynamic programming | Both | Global | server | A. Bleasby | 1999 |
| Ngila | Logarithmic and affine gap costs and explicit models of indel evolution | Both | Global | download | R. Cartwright | 2007 |
| PatternHunter | Seeded pattern-matching | Nucleotide | Local | download | B. Ma et al. | 2002-2004 |
| ProbA (also propA) | Stochastic partition function sampling via dynamic programming | Both | Global | download | U. Mückstein | 2002 |
| PyMOL | "align" command aligns sequence & applies it to structure | Protein | Global (by selection) | site | W. L. DeLano | 2007 |
| REPuter | Suffix-Tree based | Nucleotide | Local | download | S. Kurtz et al. | 2001 |
| SEQALN | Various dynamic programming | Both | Local or Global | server | M.S. Waterman and P. Hardy | 1996 |
| SIM, GAP, NAP, LAP | Local similarity with varying gap treatments | Both | Local or global | server | X. Huang and W. Miller | 1990-1996 |
| SIM | Local similarity | Both | Local | servers | X. Huang and W. Miller | 1991 |
| SLIM Search | Ultra-fast blocked alignment | Both | Both | site | L. Bloksberg | 2004 |
| stretcher | Memory-optimized but slow dynamic programming | Both | Global | server | I. Longden (modified from G. Myers and W. Miller) | 1999 |
| tranalign | Aligns nucleic acid sequences given a protein alignment | Nucleotide | NA | server | G. Williams (modified from B. Pearson) | 2002 |
| water | Smith-Waterman dynamic programming | Both | Local | server | A. Bleasby | 1999 |
| wordmatch | k-tuple pairwise match | Both | NA | server | I. Longden | 1998 |
| YASS | Seeded pattern-matching | Nucleotide | Local | server download | L. Noe and G. Kucherov | 2003-2006 |
| BioPerl dpAlign | Dynamic programming | Both | Both + Ends-free | site | Y. M. Chan | 2003 |
| *Sequence Type: Protein or nucleotide. **Alignment Type: Local or global | | | | | | |
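
Several entries in the table above are described as Needleman-Wunsch (global) or Smith-Waterman (local) dynamic programming. The following is a minimal sketch of both recurrences, assuming a simple linear gap cost. It is not the code of any listed program, and the function name `align_score` and the scoring values (`match=1`, `mismatch=-1`, `gap=-2`) are illustrative assumptions. Real tools typically use substitution matrices and often affine gap penalties.

```python
def align_score(a, b, match=1, mismatch=-1, gap=-2, local=False):
    """Best alignment score of a and b by dynamic programming (linear gap cost).

    local=False gives a global (Needleman-Wunsch style) score;
    local=True gives a local (Smith-Waterman style) score.
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment charges for leading gaps along the first row and column.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            cell = max(diag, up, left)
            if local:
                # Local alignment may restart anywhere, so scores never drop below 0.
                cell = max(cell, 0)
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]


if __name__ == "__main__":
    print(align_score("GATTACA", "GATACA"))                    # global score
    print(align_score("TTGATTACAGG", "GATTACA", local=True))   # local score
```

The only differences between the two modes are the boundary initialization, the floor at zero, and where the final score is read from: the bottom-right cell for global alignment, the maximum cell for local alignment. The sketch returns scores only. Recovering the alignment itself would need a traceback over the same table.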
Multiple sequence alignment
| Name | Description | Sequence Type* | Alignment Type** | Link | Author | Year |
|---|---|---|---|---|---|---|
| ABA | A-Bruijn alignment | Protein | Global | download | B. Raphael et al. | 2004 |
| ClustalW | Progressive alignment | Both | Local or Global | EBI PBIL EMBNet GenomeNet | Thompson et al. | 1994 |
| AMAP | Sequence annealing | Both | Global | server | A. Schwartz and L. Pachter | 2006 |
| CodonCode Aligner | Multi alignment; ClustalW & Phrap support | Nucleotides | Local or Global | download | P. Richterich et al. | 2003 (latest version 2007) |
| BAli-Phy | Tree + multiple alignment; probabilistic/Bayesian; joint estimation | Both | Global | WWW+download | BD Redelings and MA Suchard | 2005 (latest version 2007) |
| DNA Baser | Multi alignment | Both | Local or Global + Post processing | DNA Baser (commercial) | M. Gabriel | released 2005 |
| Ed'Nimbus | Seeded filtration | Nucleotides | Local | server | P. Peterlongo et al. | 2006 |
| Geneious | Progressive/Iterative alignment; ClustalW plugin | Both | Local or Global | download | A.J. Drummond et al. | 2005 / 2006 |
| CHAOS/DIALIGN | Iterative alignment | Both | Local (preferred) | server | M. Brudno and B. Morgenstern | 2003 |
| Kalign | Progressive alignment | Both | Global | server EBI MPItoolkit | T. Lassmann | 2005 |
| PRRN/PRRP | Iterative alignment (especially refinement) | Protein | Local or Global | PRRP PRRN | Y. Totoki (based on O. Gotoh) | 1991 and later |
| POA | Partial order/hidden Markov model | Protein | Local or Global | download | C. Lee | 2002 |
| MSA | Dynamic programming | Both | Local or Global | download | D.J. Lipman et al. | 1989 (modified 1995) |
| SAM | Hidden Markov model | Protein | Local or Global | server | A. Krogh et al. | 1994 (most recent version 2002) |
| ProbCons | Probabilistic/consistency | Protein | Local or Global | server | C. Do et al. | 2005 |
| MULTALIN | Dynamic programming/clustering | Both | Local or Global | server | F. Corpet | 1988 |
| MAVID | Progressive alignment | Both | Global | server | N. Bray and L. Pachter | 2004 |
| Multi-LAGAN | Progressive dynamic programming alignment | Both | Global | server | M. Brudno et al. | 2003 |
| MUSCLE | Progressive/iterative alignment | Both | Local or Global | server | R. Edgar | 2004 |
| MAFFT | Progressive/iterative alignment | Both | Local or Global | GenomeNet MAFFT | K. Katoh et al. | 2005 |
| PSAlign | Alignment preserving non-heuristic | Both | Local or Global | download | S.H. Sze, Y. Lu, Q. Yang. | 2006 |
| SAGA | Sequence alignment by genetic algorithm | Protein | Local or Global | download | C. Notredame et al. | 1996 (new version 1998) |
| T-Coffee | More sensitive progressive alignment | Both | Local or Global | server | C. Notredame et al. | 2000 |
| RevTrans | Combines DNA and protein alignment by back-translating the protein alignment to DNA | DNA/Protein (special) | Local or Global | server | Wernersson and Pedersen | 2003 (newest version 2005) |
| *Sequence Type: Protein or nucleotide. **Alignment Type: Local or global | | | | | | |
Genomics analysis
| Name | Description | Sequence Type* | Link |
|---|---|---|---|
| SLAM | Gene finding, alignment, annotation (human-mouse homology identification) | Nucleotide | server |
| Mauve | Multiple alignment of rearranged genomes | Nucleotide | download |
| MGA | Multiple Genome Aligner | Nucleotide | download |
| Mulan | Local multiple alignments of genome-length sequences | Nucleotide | server |
| Sequerome | Profiling sequence alignment data with major servers/services | Nucleotide/peptide | |
| AVID | Pairwise global alignment with whole genomes | Nucleotide | server |
| SIBsim4 / Sim4 | A program designed to align an expressed DNA sequence with a genomic sequence, allowing for introns | Nucleotide | download |
| Shuffle-LAGAN | Pairwise glocal alignment of completed genome regions | Nucleotide | server |
| ACT (Artemis Comparison Tool) | Synteny and comparative genomics | Nucleotide | server |
| *Sequence Type: Protein or nucleotide | | | |
Motif finding
| Name | Description | Sequence Type* | Link |
|---|---|---|---|
| MEME/MAST | Motif discovery and search | Both | server |
| BLOCKS | Ungapped motif identification from BLOCKS database | Both | server |
| eMOTIF | Extraction and identification of shorter motifs | Both | servers |
| Gibbs motif sampler | Stochastic motif extraction by statistical likelihood | Both | server (one of many implementations) |
| TEIRESIAS | Motif extraction and database search | Both | server |
| PRATT | Pattern generation for use with ScanProsite | Protein | server |
| ScanProsite | Motif database search tool | Protein | server |
| PHI-Blast | Motif search and alignment tool | Both | server |
| I-sites | Local structure motif library | Protein | server |
| *Sequence Type: Protein or nucleotide | | | |
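
PRATT and ScanProsite in the table above work with PROSITE-style patterns such as `N-{P}-[ST]-{P}`. As a rough illustration of what a pattern scan involves, the sketch below translates the common elements of that syntax into a Python regular expression. The function name `prosite_to_regex` is hypothetical, this is not how ScanProsite itself is implemented, and less common syntax (for example an anchor placed inside square brackets) is not handled.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression string."""
    pattern = pattern.rstrip(".")  # patterns are conventionally terminated by a period
    parts = []
    for element in pattern.split("-"):
        anchored_start = element.startswith("<")  # < anchors to the N-terminus
        anchored_end = element.endswith(">")      # > anchors to the C-terminus
        element = element.strip("<>")
        element = element.replace("x", ".")                      # x = any residue
        element = element.replace("{", "[^").replace("}", "]")   # {P} = any residue except P
        element = element.replace("(", "{").replace(")", "}")    # (2,4) = repeat range
        parts.append(("^" if anchored_start else "") + element + ("$" if anchored_end else ""))
    return "".join(parts)


if __name__ == "__main__":
    regex = prosite_to_regex("N-{P}-[ST]-{P}")
    print(regex)
    print(re.search(regex, "AANGSAA"))
```

The exclusion braces are rewritten before the repeat parentheses so that the `{` characters introduced for repeat counts are not mistaken for exclusions.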
Benchmarking
| Name | Link | Authors |
|---|---|---|
| BAliBASE | download | Thompson, Plewniak, Poch |
| Prefab | download | Edgar |
External links
- Pollard et al. (2004) (PubMed Central free fulltext): The authors discuss LAGAN, CHAOS, and Dialign as the most effective tools tested for certain uses.
This article is copied from an article on Wikipedia.org, the free encyclopedia created and edited by its online user community. The text was not checked or edited by anyone on our staff. Although the vast majority of Wikipedia articles provide accurate and timely information, please do not assume the accuracy of any particular article. This article is distributed under the terms of the GNU Free Documentation License.