Extension of Partial Gene Transcripts by Iterative Mapping of RNA-Seq Raw Reads

Abstract

Many non-model organisms lack reference genomes, and sequencing and de novo assembly of an organism's transcriptome is an affordable way to characterize the coding component of its genome. Despite the advances that have made this possible, assembling a transcriptome without a known reference usually yields a mixture of full-length and partial gene transcripts. Downstream analysis of genes represented only by partial transcripts then often requires further laboratory work to obtain full-length sequences. We explored whether partial transcripts encoding genes of interest, taken from de novo assembled transcriptomes of a model and a non-model insect species, could be extended by iteratively mapping the raw transcriptome sequencing reads against them. Partial sequences encoding cytochrome P450s and carboxyl/cholinesterases were used in this analysis because these are large multigene families that exhibit significant variation in expression. We present an effective method to improve the contiguity of partial transcripts in silico that, in the absence of a reference genome, may be a quick and cost-effective alternative to extending them by laboratory experimentation. Our approach successfully extended incompletely assembled transcripts, often to full length. We validated these results both in silico and experimentally, using real-time PCR and sequencing.
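The general idea of iterative extension can be illustrated with a toy sketch. This is not the authors' pipeline: it assumes exact string overlaps instead of a real read mapper, ignores reverse-complement reads, sequencing errors and coverage, and all function names and parameters (`extend_right`, `extend_left`, `iterative_extend`, `min_overlap`, `max_rounds`) are hypothetical names chosen for illustration.

```python
def extend_right(contig, reads, min_overlap=4):
    """Extend `contig` at its right end by at most one read.

    Picks the read whose prefix has the longest exact match
    (>= min_overlap) to the contig's suffix and appends the
    unmatched tail of that read.
    """
    best_overlap, best_tail = 0, ""
    for read in reads:
        # Cap at len(read) - 1 so a matching read always adds >= 1 base.
        max_k = min(len(read) - 1, len(contig))
        for k in range(max_k, min_overlap - 1, -1):
            if contig.endswith(read[:k]):
                if k > best_overlap:
                    best_overlap, best_tail = k, read[k:]
                break  # longest overlap for this read has been found
    return contig + best_tail


def extend_left(contig, reads, min_overlap=4):
    """Mirror of extend_right: reverse the strings, extend, reverse back."""
    reversed_reads = [r[::-1] for r in reads]
    return extend_right(contig[::-1], reversed_reads, min_overlap)[::-1]


def iterative_extend(seed, reads, min_overlap=4, max_rounds=100):
    """Repeat right and left extension until the contig stops growing."""
    contig = seed
    for _ in range(max_rounds):  # bound the loop in case of repeats
        extended = extend_right(contig, reads, min_overlap)
        extended = extend_left(extended, reads, min_overlap)
        if extended == contig:
            break
        contig = extended
    return contig


# Toy data (made up): a 30-base "transcript", overlapping 10-base reads,
# and a partial transcript taken from its middle.
full = "ATGGCGTACGTTAGCCTAGGATCCGATTAA"
reads = [full[i:i + 10] for i in range(0, 21, 4)]
seed = full[10:18]

print(iterative_extend(seed, reads))
```

In each round the partial transcript grows by the overhang of one overlapping read per end, and the loop stops once no read extends it further. A practical implementation would presumably replace the exact-overlap test with alignments from a short-read mapper and rebuild the contig from the mapped reads in each iteration.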

Publication
IEEE/ACM Transactions on Computational Biology and Bioinformatics
Kumar Saurabh Singh
Assistant Professor