2023-05-10
Presentation by Brielin Brown
Multiset Correlation and Factor Analysis (MCFA)
MESA (Multi-Ethnic Study of Atherosclerosis) - NHLBI TOPMed - Multi-omic analysis, but also multi-phenotype, multi-tissue, multi-ancestry, multi-center. NHLBI TOPMed generates multi-omic data for tens of thousands of patients. MESA multi-omic pilot uses frozen blood samples:
- Plasma for proteins and metabolites.
- PBMCs (monocytes, T-cells) for gene expression.
- Whole blood for methylation.
\[X_m \sim N(0, I_{k_m})\] \[Z \sim N(0, I_d)\] \[Y_m \sim N(X_m L_m^T + ZW_m^T, \Phi_m)\]
Hidden private factors (specific to modality \(m\)) - \(X_m\), hidden shared factors (common to all modalities) - \(Z\), observed multimodal data - \(Y_m\)
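The generative model above can be sketched as a small simulation. This is only an illustration under stated assumptions, not the MCFA implementation: the function name `simulate_mcfa`, its parameters, the random Gaussian loadings, and the isotropic noise covariance \(\Phi_m = \sigma^2 I\) are all hypothetical choices made here, since the notes do not specify the structure of \(\Phi_m\) or how \(L_m\) and \(W_m\) are obtained.

```python
import numpy as np

def simulate_mcfa(n, feature_dims, private_dims, d, noise_sd=0.5, seed=0):
    """Draw samples from the model Y_m ~ N(X_m L_m^T + Z W_m^T, Phi_m).

    Assumption (not from the notes): Phi_m = noise_sd**2 * I, and the
    loadings L_m, W_m are drawn at random purely for illustration.
    """
    rng = np.random.default_rng(seed)
    # Shared hidden factors Z ~ N(0, I_d), one row per sample.
    Z = rng.standard_normal((n, d))
    Y, X = [], []
    for p_m, k_m in zip(feature_dims, private_dims):
        # Private hidden factors X_m ~ N(0, I_{k_m}).
        X_m = rng.standard_normal((n, k_m))
        # Hypothetical loadings: private (p_m x k_m) and shared (p_m x d).
        L_m = rng.standard_normal((p_m, k_m))
        W_m = rng.standard_normal((p_m, d))
        # Observed data = private part + shared part + Gaussian noise.
        noise = noise_sd * rng.standard_normal((n, p_m))
        Y.append(X_m @ L_m.T + Z @ W_m.T + noise)
        X.append(X_m)
    return Y, Z, X

# Two modalities with 50 and 30 features, 3 and 2 private factors, 4 shared.
Y, Z, X = simulate_mcfa(n=200, feature_dims=[50, 30], private_dims=[3, 2], d=4)
print([y.shape for y in Y], Z.shape)
```

Every modality shares the same `Z` but draws its own `X_m`, which is what separates cross-modality signal from modality-specific signal in this model.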
ChatGPT for rapid gene function interpretation
ChatGPT could successfully provide functions of genes, but gave wrong answers about the methylation markers.
Comparison to other methods
- Multi-modal auto-encoder. Could not find factors reliably.
- MOFA (Multi-omics factor analysis). Problem with unbalanced weighting: it puts more weight on the dataset with more features (e.g. the methylation data contain ~50,000 features, compared with ~20,000 for the expression data).
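One way to see why feature count matters is to look at how much of the total variance each dataset contributes once features are standardized. The sketch below is a generic illustration, not a description of MOFA's internals: it assumes an objective that sums over all features equally, uses random data, and scales the feature counts from the notes down by 100 (500 vs 200) while keeping the same 5:2 ratio. The names `sizes`, `blocks`, and `shares` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
# Feature counts scaled down 100x from the approximate figures in the notes.
sizes = {"methylation": 500, "expression": 200}

blocks = {}
for name, p in sizes.items():
    data = rng.standard_normal((n, p))
    # Standardize every feature to zero mean and unit variance.
    blocks[name] = (data - data.mean(axis=0)) / data.std(axis=0)

# Each standardized feature contributes variance 1, so a dataset's share
# of the total variance is just its share of the features.
total = sum(b.var(axis=0).sum() for b in blocks.values())
shares = {name: b.var(axis=0).sum() / total for name, b in blocks.items()}
print(shares)
```

Under these assumptions the methylation block accounts for 5/7 of the total variance, so any fit that treats all features equally is pulled toward it unless the datasets are reweighted.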
bimmer/inspre
Directed graph on 149 UK Biobank phenotypes is highly connected.