Sequence pattern search

 

fuzzpro/fuzznuc/fuzztran: EMBOSS PROSITE-style pattern search in amino acid and nucleic acid sequences

fuzzpro, fuzznuc and fuzztran use PROSITE-style patterns to search protein sequences, nucleotide sequences and translated nucleotide sequences, respectively.
Patterns are specifications of a (typically short) length of sequence to be found. They can specify a search for an exact sequence or they can allow various ambiguities, matches to variable lengths of sequence and repeated subsections of the sequence.

The standard IUPAC one-letter codes for the amino acids are used. The symbol 'x' is used for a position where any amino acid is accepted.

Ambiguities are indicated by listing the acceptable amino acids for a given position between square brackets '[ ]'. For example: [ALT] stands for Ala or Leu or Thr. Ambiguities are also indicated by listing between a pair of curly brackets '{ }' the amino acids that are not accepted at a given position. For example: {AM} stands for any amino acid except Ala and Met.

Each element in a pattern is separated from its neighbor by a '-' (optional in fuzzpro). Repetition of an element of the pattern can be indicated by following that element with a numerical value or a numerical range in parentheses. Examples: x(3) corresponds to x-x-x, and x(2,4) corresponds to x-x or x-x-x or x-x-x-x.

When a pattern is restricted to the N-terminus of a sequence, it starts with a '<' symbol; when it is restricted to the C-terminus, it ends with a '>' symbol. A period ends the pattern (optional in fuzzpro).

For example, [DE](2)HS{P}X(2)PX(2,4)C matches two residues that are each Asp or Glu, followed by His, Ser, any residue except Pro, any two residues, Pro, two to four residues of any kind, and Cys.
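The pattern syntax described above can be illustrated by translating it into a regular expression. The following Python sketch is only an approximation for experimenting with patterns: the function names `prosite_to_regex` and `find_matches` are hypothetical, it is not how the EMBOSS programs are implemented, it treats both 'x' and 'X' as the wildcard, and it ignores mismatches and overlapping hits.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression.

    Illustrative only: 'x' and 'X' are both treated as "any residue".
    """
    p = pattern.strip().rstrip(".")   # the trailing period is optional
    p = p.replace("-", "")            # '-' separators are treated as optional
    out = []
    i = 0
    while i < len(p):
        ch = p[i]
        if ch == "<":                 # anchored at the N-terminus
            out.append("^")
        elif ch == ">":               # anchored at the C-terminus
            out.append("$")
        elif ch in "xX":              # any residue
            out.append(".")
        elif ch == "[":               # accepted residues
            j = p.index("]", i)
            out.append("[" + p[i + 1:j].upper() + "]")
            i = j
        elif ch == "{":               # rejected residues
            j = p.index("}", i)
            out.append("[^" + p[i + 1:j].upper() + "]")
            i = j
        elif ch == "(":               # repetition count or range
            j = p.index(")", i)
            out.append("{" + p[i + 1:j] + "}")
            i = j
        elif ch.isalpha():            # a literal residue
            out.append(ch.upper())
        else:
            raise ValueError(f"unexpected character {ch!r} in pattern")
        i += 1
    return "".join(out)


def find_matches(pattern: str, sequence: str):
    """Return (start, end, matched_text) tuples with 1-based coordinates."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.end(), m.group())
            for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    print(prosite_to_regex("[DE](2)HS{P}X(2)PX(2,4)C"))
    # prints [DE]{2}HS[^P].{2}P.{2,4}C
    print(find_matches("[DE](2)HS{P}X(2)PX(2,4)C", "MKDEHSAGGPLLLCKR"))
    # prints [(3, 14, 'DEHSAGGPLLLC')]
```

The test sequence `MKDEHSAGGPLLLCKR` is made up for the illustration. Because `re.finditer` reports non-overlapping matches only, this sketch can report fewer hits than a dedicated pattern search tool.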


Enter pattern here (raw text):

Enter sequence to match (FASTA):

or upload a file (please make sure the file is in valid FASTA format):
Search type: sequence type on which to perform the pattern match
Frames to use: fuzztran-specific option
Genetic code: fuzztran-specific option
Allowed mismatches: fuzzpro/fuzznuc/fuzztran option
Output option: fuzzpro/fuzznuc/fuzztran output option (available: default [custom summary display], excel, gff, pir, trace, dbmotif, feattable, motif, simple, tagseq)
Other common options
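For readers who want to run a comparable search from the command line, the options above correspond roughly to EMBOSS qualifiers. The Python sketch below only assembles an argument list. The helper name `build_fuzz_command` and the file names are hypothetical, and the mapping from this form to the qualifiers (`-pmismatch`, `-rformat`, `-frame`, `-table`) is an assumption based on the EMBOSS command-line documentation, not on how this page invokes the programs. Check each program's `-help` output before relying on it.

```python
from typing import List, Optional


def build_fuzz_command(program: str,
                       sequence_file: str,
                       pattern: str,
                       mismatches: int = 0,
                       report_format: Optional[str] = None,
                       frame: Optional[str] = None,
                       table: Optional[int] = None,
                       outfile: str = "fuzz.out") -> List[str]:
    """Assemble an EMBOSS fuzzpro/fuzznuc/fuzztran argument list.

    Qualifier names are assumed from the EMBOSS documentation;
    'fuzz.out' is a placeholder output file name.
    """
    if program not in ("fuzzpro", "fuzznuc", "fuzztran"):
        raise ValueError(f"unknown program: {program}")
    cmd = [program,
           "-sequence", sequence_file,
           "-pattern", pattern,
           "-pmismatch", str(mismatches),   # allowed mismatches
           "-outfile", outfile]
    if report_format:
        cmd += ["-rformat", report_format]  # e.g. gff, excel, simple
    if program == "fuzztran":
        # Frame and genetic code only apply to translated searches.
        if frame is not None:
            cmd += ["-frame", str(frame)]
        if table is not None:
            cmd += ["-table", str(table)]
    return cmd


if __name__ == "__main__":
    # The argument list can be passed to subprocess.run() if EMBOSS is installed.
    print(build_fuzz_command("fuzztran", "genome.fasta", "[DE](2)HS{P}X(2)PX(2,4)C",
                             mismatches=1, report_format="gff", frame="F", table=11))
```

Returning a list rather than a single string avoids shell quoting problems with characters such as '[', '{' and '(' that appear in patterns.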

Complete Genomes DataBase selection

(Javascript must be activated)
Genome list last revision on Wed Jul 21 2010 11:56:27

Full Phylogenetic Domain selection: Select full phylogenetic domains or open genomes window to make custom selection

ALL, BACTERIA, ARCHAEA, EUKARYA, THERMOPHILES
 
Taxonomy Name selection: Type valid taxonomy name - Suggested names appear as you type

         
Matches only full names
(Names for lineages are from NCBI Taxonomy data)
Taxa Tree selection: Browse taxonomy trees. Click on items to open/close nodes or open lineage selections

(Lineage data for tree construction is from NCBI Taxonomy data)

Y. Zivanovic, CNRS/UPS, 1998-2010