Department of Botany, Masaryk University, Faculty of Science
Kotlářská 2, CZ-611 37 Brno, Czech Republic, tel. +420 549491434, fax +420 541211214

 
The InDeVal insertion/deletion evaluation tool
by Jason A. Holt and Sierra D. Stoneberg Holt

 
A program for finding target regions in DNA sequences and for aiding in sequence comparison
InDeVal (with Poaceae trnL–F input files; ZIP file, 3.3 MB)
InDeVal Source Code (in Visual Basic 6.0; ZIP file, 57 kB)

Gaps caused by insertion and deletion events are an important feature of DNA sequence data. They can sometimes provide valuable phylogenetic information, but in other cases they can be misleading. Some regions of DNA are prone to repeated, overlapping insertion and deletion events, which can seriously hinder sequence alignment and evaluation. Comparing length variations across a wide spectrum of related species can improve their overall interpretation by indicating which indels may be highly homoplasious and what transitional states may have been involved in a complex of indel events. However, because indels complicate sequence alignment, pinpointing a specific indel region for comparison within an appropriately large number of sequences can be very time-consuming.

InDeVal is a computer program designed to aid in the interpretation of insertion and deletion events. Because these events tend to be restricted to specific length-variable regions, InDeVal searches new sequences for length-conserved regions using the LPAM (Length-Preserving Alignment Method) algorithm. It then compares each intervening length-variable region with a file containing all of its known variations. The program is supplied with input files based on the 530 Poaceae trnL–F sequences in the NCBI Entrez Nucleotides database, and the documentation includes instructions for preparing template and variable region files for other sequence regions. Although it was designed to find indel regions, InDeVal can be used to quickly target any region of particular interest (a specific codon or a given loop region from rDNA, for instance) that a researcher wishes to compare across sequences. InDeVal is available for Windows.
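
The general idea of locating a length-variable region through its length-conserved flanks can be sketched in a few lines of Python. This is only an illustration under stated assumptions and is not the LPAM algorithm or the InDeVal source (which is written in Visual Basic 6.0): the function names, the mismatch threshold, the toy sequences, and the table of known variants are all hypothetical. The flanks are matched without gaps, allowing a limited number of substitutions, and the sequence between them is looked up in a table of known variants.

```python
def find_conserved(seq, flank, max_mismatches=1, start=0):
    """Return the first index >= start where `flank` matches `seq` with at
    most `max_mismatches` substitutions, or -1 if there is no such position.
    No gaps are allowed: the flank is treated as length-conserved."""
    n = len(flank)
    for i in range(start, len(seq) - n + 1):
        mismatches = sum(a != b for a, b in zip(seq[i:i + n], flank))
        if mismatches <= max_mismatches:
            return i
    return -1


def extract_variable_region(seq, left_flank, right_flank, max_mismatches=1):
    """Return the sequence lying between the two flanks, or None if either
    flank cannot be located."""
    left = find_conserved(seq, left_flank, max_mismatches)
    if left < 0:
        return None
    region_start = left + len(left_flank)
    right = find_conserved(seq, right_flank, max_mismatches, start=region_start)
    if right < 0:
        return None
    return seq[region_start:right]


# Hypothetical table of known variants for one variable region.
KNOWN_VARIANTS = {
    "TTATTA": "variant A (hypothetical taxa 1 and 2)",
    "TTA": "variant B, 3-bp deletion (hypothetical taxon 3)",
}


def describe(seq, left_flank, right_flank):
    """Report which known variant, if any, the sequence carries."""
    region = extract_variable_region(seq, left_flank, right_flank)
    if region is None:
        return "flanking regions not found"
    return KNOWN_VARIANTS.get(region, f"new variant: {region}")
```

Raising or lowering `max_mismatches` in this sketch makes the flank search more or less tolerant of nucleotide substitutions; it does not model length variation inside the flanks.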
 


Stoneberg Holt, Sierra D. & Jason A. Holt. 2004. The InDeVal insertion/deletion evaluation tool: a program for finding target regions in DNA sequences and for aiding in sequence comparison. BMC Bioinformatics 5:173.
http://www.biomedcentral.com/content/pdf/1471-2105-5-173.pdf
DOI: 10.1186/1471-2105-5-173

Abstract

Background

The program InDeVal was originally developed to help researchers find known regions of insertion/deletion activity (with the exception of isolated single-base indels) in newly determined Poaceae trnL–F sequences and compare them with 533 previously determined sequences. It is supplied with input files designed for this purpose. More broadly, the program is applicable for finding specific target regions (referred to as "variable regions") in DNA sequence. A variable region is any specific sequence fragment of interest, such as an indel region, a codon or codons, or sequence coding for a particular RNA secondary structure.

Implementation
InDeVal input is DNA sequence and a template file (sequence flanking each variable region). Additional files contain the variable regions and user-defined messages about the sequence found within them (e.g., taxa sharing each of the different indel patterns).
Variable regions are found by determining the position of flanking sequence (referred to as "conserved regions") using the LPAM (Length-Preserving Alignment Method) algorithm. This algorithm was designed for InDeVal and is described here for the first time.
InDeVal output is an interactive display of the analyzed sequence, broken into user-defined units. Once the user is satisfied with the organization of the display, the information can be exported to an annotated text file.
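
As a rough illustration of how a template of several conserved regions can break a sequence into units and attach stored messages to the variable ones, consider the following Python sketch. It is not InDeVal's file format or display logic: the unit labels, the `messages` dictionary, and the example sequences are assumptions made for this example, and exact matching stands in for the LPAM algorithm for brevity.

```python
def split_into_units(seq, conserved_regions):
    """Split `seq` using an ordered list of conserved sequences.

    Returns (kind, text) tuples. Text between two conserved regions is
    labelled "variable" (it may be empty); text before the first or after
    the last conserved region is labelled "outside"."""
    units = []
    pos = 0
    for index, flank in enumerate(conserved_regions):
        hit = seq.find(flank, pos)
        if hit < 0:
            raise ValueError(f"conserved region {flank!r} not found")
        between = seq[pos:hit]
        if index == 0:
            if between:
                units.append(("outside", between))
        else:
            units.append(("variable", between))
        units.append(("conserved", flank))
        pos = hit + len(flank)
    if pos < len(seq):
        units.append(("outside", seq[pos:]))
    return units


def annotate(units, messages):
    """Render the units as annotated text, one unit per line."""
    lines = []
    for kind, text in units:
        line = f"{kind:<9} {text or '-'}"
        if kind == "variable":
            line += "  # " + messages.get(text, "no stored message")
        lines.append(line)
    return "\n".join(lines)


# Hypothetical template and messages.
template = ["ACGT", "GGAT", "CCGG"]
messages = {"TTA": "pattern shared by hypothetical taxa 1 and 2"}
report = annotate(split_into_units("CCACGTTTAGGATAACCGG", template), messages)
```

Writing `report` to a file would give a plain-text record of the analyzed sequence, with each variable region on its own line next to its message.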

Conclusions
InDeVal can find multiple variable regions simultaneously (28 indel regions in the Poaceae trnL–F files) and display user-selected messages specific to the sequence variants found. InDeVal output is designed to facilitate comparison between the analyzed sequence and previously evaluated sequence. The program's sensitivity to different levels of nucleotide and/or length variation in conserved regions can be adjusted. InDeVal is currently available for Windows in Additional file 1 or from http://www.sci.muni.cz/botany/elzdroje/indeval/ (this page).
 

This and related pages were last updated on November 26, 2004.

Any comments and suggestions are welcome at sierra@sci.muni.cz