ABSTRACT
Background: Deconvolution is a mathematical process that resolves an observed function into the constituent elements from which the signal was formed. In biomedical research, deconvolution analysis is applied to obtain single cell-type-specific or tissue-specific signatures from a mixed signal. With recent developments in next-generation sequencing technology, RNA-seq has emerged as a fast and accurate method for obtaining transcriptomic profiles. Although most deconvolution algorithms assume linearity, few studies have investigated which RNA-seq quantification methods yield the optimal linear space for deconvolution analysis.
Results: Using a benchmark RNA-seq dataset, we investigated the linearity of abundance estimates from the seven most popular RNA-seq quantification methods at both the gene and isoform levels. Linearity was evaluated through parameter estimation, concordance analysis, and residual analysis based on a multiple linear regression model. The results show that count data give poor parameter estimates, large intercepts, and high inter-sample variability, whereas TPM values from Kallisto and Salmon show high linearity in all analyses.
Conclusions: Salmon and Kallisto TPM data give the best fit to the linear model studied. This suggests that TPM values estimated by Salmon and Kallisto are the ideal RNA-seq measurements for deconvolution studies.
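To make the linearity check concrete, the sketch below fits a multiple linear regression of a mixed-sample abundance vector on pure-sample abundance vectors and inspects the intercept, the estimated mixing coefficients, and the residuals. It is only an illustration of the general idea, not the study's actual pipeline: the data are simulated, and the function name `fit_mixture`, the two-component design, the 0.75/0.25 proportions, and the noise level are all assumptions introduced here.

```python
import numpy as np


def fit_mixture(pure, mixed):
    """Least-squares fit of mixed ~ b0 + pure @ b (hypothetical helper).

    pure  : (n_genes, n_components) abundances of the pure samples
    mixed : (n_genes,) abundances of the mixed sample
    Returns the intercept b0, the coefficients b, and the residuals.
    """
    # Design matrix with a leading column of ones for the intercept.
    X = np.column_stack([np.ones(pure.shape[0]), pure])
    coef, *_ = np.linalg.lstsq(X, mixed, rcond=None)
    residuals = mixed - X @ coef
    return coef[0], coef[1:], residuals


# Simulated data (assumption): two pure samples and one mixture of them.
rng = np.random.default_rng(0)
n_genes = 2000
pure = rng.lognormal(mean=3.0, sigma=1.0, size=(n_genes, 2))
true_props = np.array([0.75, 0.25])  # hypothetical mixing proportions
mixed = pure @ true_props + rng.normal(0.0, 0.5, size=n_genes)

intercept, props, residuals = fit_mixture(pure, mixed)
print("intercept:", intercept)
print("estimated proportions:", props)
print("residual standard deviation:", residuals.std())
```

Under a well-behaved linear measurement, the estimated coefficients should sit close to the known mixing proportions, the intercept should be near zero, and the residuals should show no systematic trend; large intercepts or poorly recovered coefficients would indicate a departure from linearity.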