Paper
Integrative multi-omics machine learning reveals novel driver genes associations in lung adenocarcinoma
Authors: Yuan, F; Huang, FM; Cao, XY; Zhang, YH; Feng, KY; Bao, YS; Huang, T; Cai, YD
Journal/Conference: BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS
Year: 2026
Category:
Abstract: Lung adenocarcinoma remains a major challenge in cancer research due to its complex molecular underpinnings. In this study, we developed an integrated machine learning framework to identify novel driver genes associated with lung adenocarcinoma by leveraging multi-omics data. We curated gene candidates from methylation, RNAseq, mutation, and miRNA levels, and mapped them onto a protein-protein interaction network from STRING to generate informative feature vectors using a node2vec method. Furthermore, they were also represented by GO and KEGG enrichment features. All features were then refined through a multi-step process, beginning with the Boruta algorithm for filtering and followed by the minimum redundancy maximum relevance method for ranking. An incremental feature selection strategy was employed to determine the optimal feature subsets, which were used to build predictive models with random forest and support vector machine classifiers. To address class imbalance, synthetic sampling was applied, and ten-fold cross-validation ensured model robustness. Consequently, we predicted 428, 105, 1039, and 1748 potential lung adenocarcinoma driver genes for RNAseq, methylation, mutation, and miRNA levels, respectively. Integrated analysis of overlapping gene sets further highlighted key candidates, including PQLC3, FAM192A, FAM83D, SPRED1, SFTPB, and TM4SF5, with high composite probability scores. Some identified genes may be the driver genes of lung adenocarcinoma and have some druggable potential. These findings provide new insights into the molecular mechanisms of lung adenocarcinoma and suggest promising targets for future diagnostic and therapeutic strategies.
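The incremental feature selection step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract reports random forest and support vector machine classifiers with ten-fold cross-validation and synthetic sampling, whereas this example swaps in a tiny nearest-centroid classifier scored by leave-one-out accuracy so that it runs with the standard library alone. The function names, the toy data, and the feature names `signal` and `noise` are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def incremental_feature_selection(
    ranked: Sequence[str],
    score_fn: Callable[[List[str]], float],
    step: int = 1,
) -> Tuple[List[str], float]:
    """Score the top-k ranked features for k = step, 2*step, ... and keep the best prefix."""
    if step < 1:
        raise ValueError("step must be >= 1")
    best_subset: List[str] = []
    best_score = float("-inf")
    for k in range(step, len(ranked) + 1, step):
        subset = list(ranked[:k])
        score = score_fn(subset)
        # Strict '>' keeps the smallest prefix when several prefixes tie.
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


def loo_centroid_accuracy(
    rows: Sequence[Dict[str, float]],
    labels: Sequence[int],
    columns: Sequence[str],
) -> float:
    """Leave-one-out accuracy of a nearest-centroid classifier restricted to `columns`.

    Stand-in (assumption) for the cross-validated classifiers described in the abstract.
    """
    correct = 0
    for i, (row, label) in enumerate(zip(rows, labels)):
        centroids: Dict[int, List[float]] = {}
        for cls in set(labels):
            # Class members excluding the held-out sample i.
            members = [r for j, (r, l) in enumerate(zip(rows, labels)) if j != i and l == cls]
            if members:
                centroids[cls] = [sum(m[c] for m in members) / len(members) for c in columns]

        def squared_distance(cls: int) -> float:
            return sum((row[c] - v) ** 2 for c, v in zip(columns, centroids[cls]))

        predicted = min(sorted(centroids), key=squared_distance)
        correct += int(predicted == label)
    return correct / len(rows)


# Toy data (hypothetical): 'signal' separates the two classes, 'noise' does not.
rows = [
    {"signal": 0.0, "noise": 5.0},
    {"signal": 0.2, "noise": -4.0},
    {"signal": 0.1, "noise": 3.0},
    {"signal": 1.0, "noise": -5.0},
    {"signal": 1.1, "noise": 4.0},
    {"signal": 0.9, "noise": -3.0},
]
labels = [0, 0, 0, 1, 1, 1]
ranked_features = ["signal", "noise"]  # assumed output of an upstream ranking step

best_subset, best_score = incremental_feature_selection(
    ranked_features,
    lambda cols: loo_centroid_accuracy(rows, labels, cols),
)
print(best_subset, best_score)
```

In a setting closer to the one the abstract describes, `score_fn` would wrap a cross-validated classifier, and the ranked list would come from the filtering and ranking steps rather than being written by hand.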
1874
Impact factor: 2.3