vefwholesale.blogg.se - Adding features to sequence in codoncode aligner

seqs: the maximum number of aligned sequences to keep.

E-value: the E-value threshold for saving hits.

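Three further parameters control the query execution; they include the E-value threshold and the maximum number of aligned sequences to keep, both described above.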

From external file: this option selects an external FASTA file to be used as the query file.

From selected file: this option uses one of the files selected in SEDA, chosen with the ‘File query’ combobox, as the query file.
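Because one BLAST query is executed for each sequence in the query FASTA file, whichever source is chosen, the file has to be broken into single-sequence queries at some point. The following is a minimal sketch of that step, not SEDA's own code; the function names `split_fasta` and `as_single_queries` are hypothetical.

```python
def split_fasta(text):
    """Return a list of (header, sequence) tuples, one per FASTA record."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines may wrap over several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def as_single_queries(text):
    """Format each record as its own one-sequence FASTA string."""
    return [f">{header}\n{sequence}\n" for header, sequence in split_fasta(text)]
```

Each string returned by `as_single_queries` could then be written to a temporary file and used as the query of one BLAST run.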

When a query is run against all the selected FASTA files, one BLAST database is first created for each selected FASTA file. Then, a single alias referencing all of these databases is created. Finally, each sequence in the FASTA file used as query source is executed against the alias. As a result, this mode creates as many output files as there are sequences in the query FASTA file. To create these output files, the sequences where hits were found are retrieved from the database.

Finally, the ‘Query configuration’ area controls how queries are performed. As explained before, you must first choose the query mode in the ‘Query against’ parameter. Secondly, you must choose the BLAST type that you want to perform using the ‘BLAST type’ parameter. Selecting the BLAST type has two effects: (i) the type of database is automatically determined, and (ii) if the blastx or tblastn types are selected, you can only select a query from an external file, because the selected files used to construct the database cannot be used as queries (blastx uses a database of proteins and a query of nucleotides, and tblastn uses a database of nucleotides and a query of proteins). Thirdly, the ‘Query source’ parameter selects the source of the query file, using either the ‘From selected file’ or the ‘From external file’ option.
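The "query against all selected files" workflow and the BLAST type rules can be sketched with the standard BLAST+ command-line tools (`makeblastdb`, `blastdb_aliastool` and the search programs). This is only an illustration under stated assumptions: how SEDA actually invokes BLAST is not described here, and the database names, output names and the helper names `build_commands` and `query_from_selected_files_allowed` are hypothetical. The sketch only builds the command lists; it does not run them.

```python
# Database type and query type implied by each BLAST program.
BLAST_TYPES = {
    "blastn": ("nucl", "nucl"),
    "blastp": ("prot", "prot"),
    "blastx": ("prot", "nucl"),   # protein database, nucleotide query
    "tblastn": ("nucl", "prot"),  # nucleotide database, protein query
    "tblastx": ("nucl", "nucl"),
}


def query_from_selected_files_allowed(blast_type):
    """Selected files build the database, so they can only be reused as
    queries when the database and query sequence types match."""
    db_type, query_type = BLAST_TYPES[blast_type]
    return db_type == query_type


def build_commands(fasta_files, query_files, blast_type, evalue,
                   max_target_seqs, alias="all_dbs"):
    """Build the commands: one database per FASTA file, one alias over
    all databases, then one search per single-sequence query file."""
    db_type, _ = BLAST_TYPES[blast_type]
    db_names = [f"{path}.db" for path in fasta_files]  # hypothetical naming
    commands = [
        ["makeblastdb", "-in", path, "-dbtype", db_type, "-out", name]
        for path, name in zip(fasta_files, db_names)
    ]
    commands.append([
        "blastdb_aliastool", "-dblist", " ".join(db_names),
        "-dbtype", db_type, "-out", alias, "-title", alias,
    ])
    for query in query_files:
        commands.append([
            blast_type, "-query", query, "-db", alias,
            "-evalue", str(evalue),
            "-max_target_seqs", str(max_target_seqs),
            "-outfmt", "6", "-out", f"{query}.hits.tsv",  # hypothetical name
        ])
    return commands
```

Each returned list could be passed to `subprocess.run(command, check=True)` on a machine where BLAST+ is installed.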

Verbose: in protein sequences, when this option is unselected, X is used for ambiguous positions in the consensus sequence. When it is selected, all amino acids in such positions are reported instead.

Reformat output file: specifies the format parameters of the output FASTA file containing the consensus sequence (see section Reformat file to learn more about this formatting).

The BLAST operation performs different BLAST queries using the selected FASTA files. Regarding the database to use in the queries, there are two possible modes: querying against all the selected FASTA files or querying against each FASTA file separately. Regarding the query, there are also two possibilities: using the sequences in one of the selected FASTA files as queries or using the sequences in an external FASTA file as queries. When performing this operation, one BLAST query is executed for each sequence in the query FASTA file. The figure below illustrates the process followed when a query against all selected FASTA files is performed.

Above threshold: considers all nucleotide (DNA) or amino acid (protein) bases with a frequency above the Minimum presence threshold at each position. Those positions where all base frequencies are below the Minimum presence threshold are represented by an N (nucleotide sequences) or X (protein sequences) in the consensus sequence.

Minimum presence: the minimum presence required for a given nucleotide or amino acid to be part of the consensus sequence. Read the Consensus bases description to understand how this option is used in each case.

Most frequent: considers the most frequent nucleotide (DNA) or amino acid (protein) base at each position. Those positions where the most frequent base is under the Minimum presence threshold are represented by an N (nucleotide sequences) or X (protein sequences) in the consensus sequence.
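The two consensus modes and the Minimum presence threshold can be illustrated with a short sketch. It is not SEDA's implementation and rests on several assumptions: the input sequences are already aligned and of equal length, a frequency exactly equal to the threshold counts as passing, nucleotide positions with several passing bases are written as N, and the bracketed notation used for verbose protein output is an illustrative choice. The function name `consensus` and its parameters are hypothetical.

```python
from collections import Counter


def consensus(seqs, mode="most_frequent", min_presence=0.5,
              protein=False, verbose=False):
    """Build a consensus string from equal-length, aligned sequences."""
    unknown = "X" if protein else "N"
    out = []
    for column in zip(*seqs):  # one tuple of bases per alignment position
        counts = Counter(column)
        total = len(column)
        if mode == "most_frequent":
            base, n = counts.most_common(1)[0]
            kept = [base] if n / total >= min_presence else []
        else:  # "above_threshold": keep every base that passes the threshold
            kept = sorted(b for b, n in counts.items()
                          if n / total >= min_presence)
        if len(kept) == 1:
            out.append(kept[0])
        elif kept and protein and verbose:
            # Assumption: report all passing amino acids as a bracketed group.
            out.append("[" + "".join(kept) + "]")
        else:
            # No base passed, or an ambiguous position in non-verbose mode.
            out.append(unknown)
    return "".join(out)
```

For example, `consensus(["ACGT", "ACGA", "ATGC"])` keeps A, C and G at the first three positions and writes N at the last one, where no base reaches a presence of 0.5.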