Solanaceae Single Nucleotide Polymorphisms (SNPs)
Using assembled EST sequences from the PlantGDB PUTs database, we have identified high-confidence SNPs within the assemblies.
Our pipeline takes as input the multiple sequence alignment (MSA) of the member ESTs of each assembled PUT. It also adds back into each MSA those ESTs that are near-exact substring matches to PUT member ESTs. The PlantGDB assembly pipeline had previously excluded these sequences from the final assembly.
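The add-back step can be illustrated with a minimal Python sketch. It is not the pipeline's actual code: the function name `is_near_substring`, the ungapped comparison, and the `max_mismatches` threshold are all assumptions made for illustration, since the exact definition of "near-exact" is not given here.

```python
def is_near_substring(candidate, member, max_mismatches=1):
    """Return True if `candidate` matches some ungapped window of `member`
    with at most `max_mismatches` mismatching bases.

    Hypothetical helper: the threshold and the ungapped comparison are
    illustrative assumptions, not the pipeline's documented criteria.
    """
    n, m = len(candidate), len(member)
    if n == 0 or n > m:
        return False
    # Slide the candidate along the member sequence, one offset at a time.
    for start in range(m - n + 1):
        mismatches = 0
        for a, b in zip(candidate, member[start:start + n]):
            if a != b:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too many mismatches in this window
        else:
            # The inner loop finished without exceeding the threshold.
            return True
    return False
```

An excluded EST for which this returns true against any member EST of a PUT would be a candidate for re-insertion into that PUT's MSA.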
The requirements for SNP discovery were:
- sequence coverage of at least four reads per position
- stringent assembly of sequences
False SNP calls can arise from low-quality sequence or from the assembly of paralogous sequences into a single contig. To reduce them, we selected only SNPs for which at least two reads carry the alternative base relative to the consensus.
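The two numeric criteria above (at least four reads per position, at least two reads supporting the alternative base) can be sketched as a per-column filter over an MSA. This is a minimal illustration, not the pipeline itself: the function name `call_snps`, the use of `-` for gaps or uncovered positions, and taking the most frequent base as the consensus are assumptions.

```python
from collections import Counter

MIN_COVERAGE = 4   # at least four reads per position (from the text)
MIN_ALT_READS = 2  # at least two reads with the alternative base (from the text)


def call_snps(aligned_reads, min_coverage=MIN_COVERAGE, min_alt_reads=MIN_ALT_READS):
    """Return a list of (position, consensus_base, {alt_base: read_count}).

    `aligned_reads` are equal-length strings, one per EST in the MSA.
    Any character other than A, C, G, T (e.g. '-') is treated as no coverage.
    The consensus is assumed to be the most frequent base in the column.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    snps = []
    for pos in range(length):
        # Count only real bases in this alignment column.
        counts = Counter(
            read[pos].upper() for read in aligned_reads if read[pos].upper() in "ACGT"
        )
        coverage = sum(counts.values())
        if coverage < min_coverage:
            continue  # not enough reads at this position
        consensus, _ = counts.most_common(1)[0]
        # Keep alternative bases seen in enough reads to be credible.
        alts = {
            base: n
            for base, n in counts.items()
            if base != consensus and n >= min_alt_reads
        }
        if alts:
            snps.append((pos, consensus, alts))
    return snps


reads = ["ACGTA", "ACGTA", "ACTTA", "ACTTA", "ACGTA", "AC-TC"]
print(call_snps(reads))  # [(2, 'G', {'T': 2})]
```

In this toy alignment, position 2 is reported because T appears in two reads against a G consensus. The single C at position 4 is ignored as a likely sequencing error.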
Versioning information is available for the current data set.
Data from our SNP database is also available for download from our FTP site.