Deep learning of 2D-Restructured gene expression representations for improved low-sample therapeutic response prediction

Published in Computers in Biology and Medicine, 2023

Recommended citation: Cheng K P*, Shen W X*, Jiang Y Y, et al. Deep learning of 2D-Restructured gene expression representations for improved low-sample therapeutic response prediction[J]. Computers in Biology and Medicine, 2023, 164: 107245. https://www.sciencedirect.com/science/article/abs/pii/S0010482523007102

Clinical outcome prediction is important for stratified therapeutics. Machine learning (ML) and deep learning (DL) methods facilitate therapeutic response prediction from the transcriptomic profiles of cells and clinical samples. Clinical transcriptomic DL is challenged by the small sample sizes (34-286 subjects), high dimensionality (up to 21,653 genes), and unordered nature of clinical transcriptomic data. Established methods rely on ML algorithms that reach AUC/ACC values of 0.6-0.8. Low-sample DL algorithms are needed for enhanced prediction capability. Here, an unsupervised manifold-guided algorithm was employed to restructure transcriptomic data into ordered, image-like 2D representations, which were then learned efficiently with deep convolutional networks (ConvNets). Our DL models significantly outperformed the state-of-the-art (SOTA) ML models on 82% of 17 low-sample benchmark datasets (53% with >0.05 AUC/ACC improvement). They were also more robust than the SOTA models in cross-cohort prediction tasks and in identifying robust biomarkers and response-dependent variational patterns consistent with experimental indications.
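
The restructuring idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes classical multidimensional scaling (MDS) on gene-gene correlation distances as a simple stand-in for the manifold-guided step, plus a sort-based snap of the embedded genes onto a square grid. The function names gene_layout and to_images, the grid size, and the synthetic data are all hypothetical and chosen only for illustration. The ConvNet stage is omitted.

```python
import numpy as np

def gene_layout(X, side):
    """Assign each gene (column of X) to a unique cell of a side x side grid.

    Assumption: classical MDS on correlation distance replaces the
    manifold-guided embedding described in the abstract.
    Genes with constant expression would give NaN correlations and
    should be filtered out beforehand.
    """
    n = X.shape[1]
    assert n <= side * side, "grid too small for the number of genes"
    # Correlation distance between genes: similar genes are close together.
    dist = 1.0 - np.corrcoef(X, rowvar=False)
    # Classical MDS: double-center the squared distances, keep the top 2 axes.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (dist ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:2]
    coords = vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))
    # Snap to the grid: sort genes by x, cut into columns of `side` genes,
    # then order each column by y so neighbours in 2D stay neighbours.
    order_x = np.argsort(coords[:, 0], kind="stable")
    rows = np.empty(n, dtype=int)
    cols = np.empty(n, dtype=int)
    for c in range(side):
        chunk = order_x[c * side:(c + 1) * side]
        chunk = chunk[np.argsort(coords[chunk, 1], kind="stable")]
        rows[chunk] = np.arange(len(chunk))
        cols[chunk] = c
    return rows, cols

def to_images(X, rows, cols, side):
    """Place every sample's expression values on the shared gene grid."""
    imgs = np.zeros((X.shape[0], side, side), dtype=X.dtype)
    imgs[:, rows, cols] = X
    return imgs

# Synthetic stand-in for a low-sample cohort: 40 samples, 100 genes.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 100))
side = 10
rows, cols = gene_layout(X, side)
imgs = to_images(X, rows, cols, side)
print(imgs.shape)
```

The layout is computed once from the expression matrix alone, without response labels, and is shared by every sample, so each pixel always refers to the same gene. The resulting array has one image per sample and could be passed to a convolutional network in place of the unordered gene vector.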