Meta-Analysis Suggests That Intron Retention Can Affect Quantification of Transposable Elements from RNA-Seq Data / Gualandi, Nicolò; Iperi, Cristian; Esposito, Mauro; Ansaloni, Federico; Gustincich, Stefano; Sanges, Remo. - In: BIOLOGY. - ISSN 2079-7737. - 11:6(2022), pp. 1-14. [10.3390/biology11060826]

Meta-Analysis Suggests That Intron Retention Can Affect Quantification of Transposable Elements from RNA-Seq Data

Gualandi, Nicolò;Esposito, Mauro;Ansaloni, Federico;Gustincich, Stefano;Sanges, Remo
2022-01-01

Abstract

Simple Summary: Transposable elements (TEs) are repetitive sequences that make up more than one third of the human genome and originally had the ability to change their location within it. Owing to their repetitive nature, quantifying TEs is often challenging. RNA-seq is a useful tool for genome-wide TE quantification, but it also presents technical issues, including low read mappability and erroneous quantification caused by the transcription of TE fragments embedded in canonical transcripts. TE-derived fragments are found within the introns of most genes, which led to the hypothesis that intron retention (IR) can bias the quantification of TE expression. Through a meta-analysis of public RNA-seq datasets, we observe that IR can indeed affect TE quantification by increasing the number of reads mapped to intronic TE copies. Our work highlights a correlation between IR and RNA-seq measurements of TE expression that should be taken into account to achieve reliable TE quantification, especially in samples characterized by extensive IR, because differential IR might be confused with differential TE expression.

Transposable elements (TEs), also known as "jumping genes", are repetitive sequences capable of changing their location within the genome. They are key players in many biological processes in health and disease. Reliable quantification of their expression as transcriptional units is therefore crucial, and it requires distinguishing their independent expression from the transcription of their sequences as part of canonical transcripts. TE quantification faces several difficulties, the most important being low read mappability: the repetitive nature of TEs prevents unambiguous mapping of the reads that originate from their sequences. A large fraction of TE fragments localizes within introns, which led to the hypothesis that intron retention (IR) can be an additional source of bias, potentially affecting accurate TE quantification. IR occurs when introns, normally removed by the splicing machinery, are maintained in mature transcripts. IR is a widespread mechanism that affects many genes with cell type-specific patterns. We hypothesized that, in an RNA-seq experiment, reads derived from retained introns can bias the detection of independent RNA expression from the TEs that overlap those introns. In this study we performed a meta-analysis of public RNA-seq data from lymphoblastoid cell lines and showed that IR can affect TE quantification when established tools are run with default parameters. Reads mapped to intronic TEs were indeed associated with the expression of TEs and affected their correct quantification as independent transcriptional units. We confirmed these results using additional independent datasets, demonstrating that this bias does not appear in samples where IR is absent and that differential TE expression does not affect IR quantification. We concluded that IR causes the over-quantification of intronic TEs and that differential IR might be confused with differential TE expression. Our results should be taken into account for a correct quantification of TE expression from RNA-seq data, especially in samples in which IR is abundant.
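The bias described above can be made concrete with a small sketch that splits TE read counts into intronic and non-intronic copies, so that the intronic share can be inspected separately in samples with extensive IR. This is only an illustration under stated assumptions, not the procedure used in the paper: the function names (`overlaps`, `split_te_counts`), the input layout and the toy coordinates and counts are all hypothetical, and intervals are assumed to be half-open.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if half-open intervals [a_start, a_end) and [b_start, b_end) overlap."""
    return a_start < b_end and b_start < a_end


def split_te_counts(te_counts, te_coords, introns):
    """Sum TE read counts separately for intronic and non-intronic TE copies.

    te_counts: {te_id: read_count}           (hypothetical per-copy counts)
    te_coords: {te_id: (chrom, start, end)}  (hypothetical TE annotation)
    introns:   list of (chrom, start, end)   (hypothetical intron annotation)
    """
    # Group introns by chromosome so each TE is compared only with
    # introns on its own chromosome.
    introns_by_chrom = defaultdict(list)
    for chrom, start, end in introns:
        introns_by_chrom[chrom].append((start, end))

    totals = {"intronic": 0, "non_intronic": 0}
    for te_id, count in te_counts.items():
        chrom, start, end = te_coords[te_id]
        is_intronic = any(
            overlaps(start, end, i_start, i_end)
            for i_start, i_end in introns_by_chrom[chrom]
        )
        totals["intronic" if is_intronic else "non_intronic"] += count
    return totals


# Toy data, invented for illustration only.
introns = [("chr1", 100, 500)]
te_coords = {
    "Alu_1": ("chr1", 150, 450),   # lies inside the chr1 intron
    "L1_1": ("chr1", 900, 1500),   # outside any intron
    "Alu_2": ("chr2", 150, 450),   # same coordinates, different chromosome
}
te_counts = {"Alu_1": 40, "L1_1": 10, "Alu_2": 5}

totals = split_te_counts(te_counts, te_coords, introns)
print(totals)
```

A large intronic share in samples with abundant IR would be a reason to treat the TE counts with caution. A linear scan over introns is used here for readability; on a real annotation an interval index would likely be preferable.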
Year: 2022
Volume: 11
Issue: 6
First page: 1
Last page: 14
Article number: 826
DOI: 10.3390/biology11060826
PubMed: https://pubmed.ncbi.nlm.nih.gov/35741347/
Authors: Gualandi, Nicolò; Iperi, Cristian; Esposito, Mauro; Ansaloni, Federico; Gustincich, Stefano; Sanges, Remo
Files in this item:
File: biology-11-00826-v2-1.pdf
Access: open access
Description: publisher's PDF
Type: Publisher's version (PDF)
License: Creative Commons
Size: 3.65 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11767/132110
Citations
  • PubMed Central 5
  • Scopus 7
  • Web of Science 7