Karimi-Lab/TE_CHIP

TE_CHIP

Preprocessing

Notes:

This folder contains the main Bash script and the R scripts for each processing step. Details of each step are explained in the workflow figure and description below.

[Figure: preprocessing workflow]

The data_processing.sh script processes raw FASTQ files with the nf-core/rnavar pipeline, which performs quality control, alignment with STAR, recalibration, and variant calling with GATK4 HaplotypeCaller.

Gene and transposable element (TE) expression is quantified from the aligned BAM files using featureCounts v2.0, with genes annotated against GRCh38 and TEs against RepeatMasker. Gene counts are normalised to transcripts per million (TPM), and TE counts are normalised using a variance-stabilising transformation (VST) at the class and family levels.
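To make the TPM step concrete, here is a minimal sketch of the standard TPM formula for a single sample. It is an illustration only and is not taken from the R scripts in this folder; the function name `counts_to_tpm` and the example counts and feature lengths are hypothetical.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts for one sample to transcripts per million.

    counts     -- raw read counts per gene (e.g. from a featureCounts table)
    lengths_bp -- feature length of each gene in base pairs
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")

    # Step 1: length-normalise each count to reads per kilobase (RPK).
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]

    # Step 2: scale so that the values for the sample sum to one million.
    total_rpk = sum(rpk)
    if total_rpk == 0:
        return [0.0] * len(rpk)
    return [value / total_rpk * 1e6 for value in rpk]


# Hypothetical example: three genes whose counts scale with their lengths,
# so all three end up with the same TPM.
counts = [100, 200, 300]
lengths = [1000, 2000, 3000]
tpm = counts_to_tpm(counts, lengths)
```

Because TPM is rescaled within each sample, the values always sum to one million, which makes relative abundances comparable across samples with different sequencing depths.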

Variant Call Format (VCF) files generated by HaplotypeCaller are annotated for clinical relevance and mutation information using VEP, with reference to ClinVar (20241111) and COSMIC (v102). The annotated VCFs are then converted into MAF files and merged at the cohort level. Finally, to reduce artefacts and low-confidence calls, the filtering strategy described by Jakobsen et al. [1] is applied, retaining only high-confidence pathogenic somatic variants.

[1] Jakobsen, N. A. et al. Selective advantage of mutant stem cells in human clonal hematopoiesis is associated with attenuated response to inflammation and aging. Cell Stem Cell 31, 1127–1144.e1117 (2024).
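The cohort-level merge of per-sample MAF files described above can be pictured roughly as follows. This is a sketch under assumptions, not the repository's implementation: it assumes each MAF is tab-delimited, may start with `#` comment lines, and shares an identical header. The function name `merge_maf` and the two example records are hypothetical, and the filtering step is not shown.

```python
import csv
import io


def merge_maf(streams):
    """Concatenate per-sample MAF tables into one cohort-level table.

    streams -- iterable of open text streams, one per sample MAF
    Returns (header, rows), where rows holds the records of all samples.
    """
    header = None
    rows = []
    for stream in streams:
        # Skip leading comment lines such as "#version 2.4".
        lines = (line for line in stream if not line.startswith("#"))
        reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
        file_header = next(reader, None)
        if file_header is None:
            continue  # empty file: nothing to merge
        if header is None:
            header = file_header
        elif file_header != header:
            raise ValueError("MAF headers differ between samples")
        rows.extend(reader)
    return header, rows


# Hypothetical two-sample example using in-memory files.
sample_a = io.StringIO(
    "#version 2.4\nHugo_Symbol\tTumor_Sample_Barcode\nDNMT3A\tS1\n"
)
sample_b = io.StringIO(
    "#version 2.4\nHugo_Symbol\tTumor_Sample_Barcode\nTET2\tS2\n"
)
header, rows = merge_maf([sample_a, sample_b])
```

Checking that every file carries the same header before concatenating guards against silently misaligned columns in the merged table.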

Statistical_Analysis_Figures

This folder contains the scripts and data used to generate the figures for the project.

Notes:

  • Each figure has its corresponding script and data file(s).
  • For detailed explanations of the analyses, please refer to the Methods section and the figure legends in the manuscript.

Citation
