dc.contributor.author: Wang, J. (zh_CN)
dc.contributor.author: Zou, Q. (zh_CN)
dc.contributor.author: Guo, M. Z. (zh_CN)
dc.contributor.author: 王军 (zh_CN)
dc.date.accessioned: 2013-12-12T02:08:29Z
dc.date.available: 2013-12-12T02:08:29Z
dc.date.issued: 2010 (zh_CN)
dc.identifier.citation: Genetics and Molecular Research, 9(2):820-834 (zh_CN)
dc.identifier.issn: 1676-5680 (zh_CN)
dc.identifier.other: ISI:000280396600020 (zh_CN)
dc.identifier.uri: https://dspace.xmu.edu.cn/handle/2288/60540
dc.description: Chinese Natural Science Foundation [60932008, 60871092]; Natural Science Foundation of Heilongjiang Province in China [ZJG0705] (zh_CN)
dc.description.abstract: Abundant single nucleotide polymorphisms (SNPs) provide the most complete information for genome-wide association studies. However, because manual discovery of putative SNPs is a bottleneck and the original sequencing reads are often inaccessible, a more efficient and accurate computational method for automated SNP detection is needed. We propose a novel computational method to rapidly find true SNPs in publicly available EST (expressed sequence tag) databases; this method is implemented as SNPDigger. EST sequences are clustered and aligned. SNP candidates are then obtained according to a measure of redundant frequency. Several new informative biological features, such as the structural neighbor profiles and the physical position of the SNP, were extracted from EST sequences, and the effectiveness of these features was demonstrated. An ensemble classifier, which employs a carefully selected feature set, was included to handle the imbalanced training data. The sensitivity and specificity of our method both exceeded 80% for human genetic data in cross-validation. Our method enables detection of SNPs from the user's own EST dataset and can be used on species for which there is no genome data. Our tests showed that this method can effectively guide SNP discovery in ESTs and can help reduce the cost of biological analyses. (zh_CN)
dc.language.iso: en_US (zh_CN)
dc.source.uri: http://dx.doi.org/10.4238/vol9-2gmr765 (zh_CN)
dc.subject: SINGLE-NUCLEOTIDE POLYMORPHISMS (zh_CN)
dc.subject: TAG DATA (zh_CN)
dc.subject: DATABASE (zh_CN)
dc.subject: DISCOVERY (zh_CN)
dc.subject: PROGRAM (zh_CN)
dc.subject: MAP (zh_CN)
dc.title: Mining SNPs from EST sequences using filters and ensemble classifiers (zh_CN)
dc.type: Article (zh_CN)

