2023-06-21

ReguloML: Learning the regulatory code of Alzheimer’s Disease Genetics

Presentation by Chirag Lakhani

Apply DL to predict and interpret the functional effects of AD-associated variants on DNA regulation and RNA processing. How are they linked to pathways and networks?

  • AD is a polygenic disease.
  • Latest GWAS: Bellenguez et al, Nature Genetics, 2022
  • Controversy: drugs target particular pathways, but some have failed and they are expensive.

Categories of pathogenic variants:

  • coding variants
  • transcriptional variants (need to know the causal cell types): disrupt TF binding at cell-type-specific enhancers/promoters; modify the “epigenome” of a cell type (chromatin accessibility, histone modifications, 3D architecture, i.e. differential topology, possibly via disrupting CTCF?)
  • post-transcriptional variants (splicing, other activity)

Functional annotations help in finding the molecular features that best explain a polygenic trait. Cano-Gamez et al, Frontiers in Genetics, 2020.

Idea: Use DL delta scores to predict the functional effect of any variant. Input: nucleotide sequence + DL model (DeepSEA: chromatin, SpliceAI: splicing, etc.) \(\rightarrow\) Delta score: functional importance of the variant.
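
A minimal sketch of the delta-score idea, assuming a delta score is the model's prediction for the alternate allele minus its prediction for the reference allele. `toy_model` and its random weights are hypothetical stand-ins for a trained network; they are not the DeepSEA or SpliceAI APIs, and the sequence is invented.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) array; unknown bases stay all-zero."""
    arr = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            arr[i, j] = 1.0
    return arr

def toy_model(x, weights):
    """Hypothetical stand-in for a trained model: one linear score per output track."""
    return np.tensordot(x, weights, axes=([0, 1], [0, 1]))

def delta_score(ref_seq, pos, alt_base, predict):
    """Return predict(alt) - predict(ref) for a single-nucleotide substitution at 0-based pos."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return predict(one_hot(alt_seq)) - predict(one_hot(ref_seq))

rng = np.random.default_rng(0)
ref = "ACGTACGTAC"                               # invented reference sequence
weights = rng.normal(size=(len(ref), 4, 3))      # 3 hypothetical output tracks
predict = lambda x: toy_model(x, weights)

delta = delta_score(ref, 4, "G", predict)        # A>G substitution at position 4
print(delta)                                     # one delta score per track
```

With a real model, `predict` would be replaced by the trained network's forward pass, and the per-track deltas would usually be summarized (for example by taking the largest absolute change) before being used as an annotation.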

Microglia regulatory regions contain the largest percentage of AD GWAS heritability.

Ways to construct DL delta score annotations:

  • Integrate DL-based TF binding disruption scores with cell-type-specific regulatory genomic regions. – Enformer model trained on ENCODE cell types (K562, HepG2, etc.) – use epigenomic data from neuronal cell types.
  • Train DL models that predict cell-type-specific assay read counts as a function of genomic DNA sequence. – functional genomic assays such as ATAC-Seq, H3K27ac and H3K4me3 – cell types: microglia, astrocytes, oligodendrocytes and neurons – access to microglia ATAC-Seq data

Enformer: predicts 1000s of functional genomic assays from a long input sequence (~200kb). Input: DNA sequence \(\rightarrow\) convolution + transformer layers on human + mouse tracks \(\rightarrow\) Output: genomic tracks. Avsec et al, Nature Methods, 2021

Overlap regions of predicted TF disruption with promoter/enhancer regions. Enformer-predicted TF disruption helps localize the AD GWAS signal.
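
A rough sketch of the overlap step, assuming variants carry a precomputed disruption score and regulatory regions are half-open BED-style intervals. The coordinates, scores, threshold, and the helper names `build_index`, `in_regions` and `prioritize` are all invented for illustration.

```python
from collections import defaultdict

def build_index(regions):
    """Group half-open (chrom, start, end) regions by chromosome."""
    index = defaultdict(list)
    for chrom, start, end in regions:
        index[chrom].append((start, end))
    return index

def in_regions(index, chrom, pos):
    """True if 0-based pos lies inside any region on chrom (linear scan, fine for small inputs)."""
    return any(start <= pos < end for start, end in index.get(chrom, []))

def prioritize(variants, regions, threshold):
    """Keep variants with |score| >= threshold that fall inside a regulatory region."""
    index = build_index(regions)
    return [
        (chrom, pos, score)
        for chrom, pos, score in variants
        if abs(score) >= threshold and in_regions(index, chrom, pos)
    ]

# Invented example data: (chrom, 0-based position, predicted TF disruption score)
variants = [
    ("chr1", 150, 0.8),   # strong score, inside a region
    ("chr1", 500, 0.9),   # strong score, outside any region
    ("chr2", 120, -0.7),  # strong negative score, inside a region
    ("chr2", 130, 0.1),   # weak score, inside a region
]
# Invented promoter/enhancer intervals: (chrom, start, end), end exclusive
regions = [("chr1", 100, 200), ("chr2", 100, 200)]

hits = prioritize(variants, regions, threshold=0.5)
print(hits)
```

For genome-scale inputs an interval tree or a sorted-interval join would replace the linear scan.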

DL models can help with fine-mapping.

Training DL models of chromatin accessibility for microglia

  • Multi-task DL model using chromatin accessibility data from 4 individuals. DNA sequence \(\rightarrow\) ATAC-Seq read counts.
  • Pampari et al, ChromBPNet, 2023
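
A toy sketch of the multi-task setup (DNA sequence \(\rightarrow\) ATAC-Seq read counts, one output per individual). This is not ChromBPNet: the single convolution layer, max pooling, linear heads, Poisson loss, random weights, sequence and counts are all assumptions made only to show the shape of the problem.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) array; unknown bases stay all-zero."""
    arr = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            arr[i, j] = 1.0
    return arr

def forward(x, conv_filters, head_weights):
    """Shared motif-scanning layer followed by one linear head per task."""
    length, width = x.shape[0], conv_filters.shape[1]
    # Slide a window of `width` bases along the sequence (valid cross-correlation).
    windows = np.stack([x[i:i + width] for i in range(length - width + 1)])
    act = np.maximum(np.einsum("pka,fka->pf", windows, conv_filters), 0.0)  # ReLU
    pooled = act.max(axis=0)            # strongest match per filter
    return pooled @ head_weights        # predicted log counts, one per task

def poisson_nll(log_rate, counts):
    """Poisson negative log-likelihood, dropping the term that does not depend on the rate."""
    return float(np.mean(np.exp(log_rate) - counts * log_rate))

rng = np.random.default_rng(0)
seq = "ACGTTGCAACGTAGCT"                          # invented input sequence
conv_filters = rng.normal(size=(8, 5, 4))         # 8 filters of width 5 (shared)
head_weights = rng.normal(size=(8, 4)) * 0.1      # 4 tasks = 4 individuals
counts = np.array([12.0, 9.0, 15.0, 7.0])         # invented read counts

preds = forward(one_hot(seq), conv_filters, head_weights)
loss = poisson_nll(preds, counts)
print(preds.shape, loss)
```

Sharing the convolutional layer across tasks lets all four individuals inform the same sequence features, while the separate heads allow per-individual differences in signal.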

Notes:

  • Is there any mechanism other than TF regulation by which a variant can influence the outcome?