High-Dimensional Cancer Genomic Data Modeling Using Distributed Machine Learning Algorithms for Genetic and Transcriptomic Pattern Discovery

Authors

  • K. Anitha Vallurupalli Nageswara Rao Vignana Jyothi Institute of Engineering and Technology, Vignana Jyothi Nagar, Pragathi Nagar, Nizampet (S.O), Hyderabad, Telangana Author
  • Omar Elkalesh College of Engineering and Applied Sciences, American University of Kuwait, Kuwait. Author
  • Ali Bostani Associate Professor, College of Engineering and Applied Sciences, American University of Kuwait, Salmiya, Kuwait. Author
  • Dr. Chitra K Associate Professor, Dept of MCA, Dayananda Sagar Academy of Technology and Management, Udayapura, Bangalore -82, India. Author
  • Arivukkodi R Department of Computer Science, Meenakshi College of Arts and Science, Meenakshi Academy of Higher Education and Research, Chennai, Tamilnadu, India. Author
  • E. Elakkiya Assistant Professor, Department of Information Technology, St. Joseph's College of Engineering, Old Mahabalipuram Road, Chennai, Tamil Nadu, India. Author
  • Makhfirat Kibriyeva Department of Medicine, Termez University of Economics and Service, Termez, Uzbekistan. Author

DOI:

https://doi.org/10.4238/nq9ks444

Abstract

Rapid advances in high-throughput sequencing have produced a surge of large-scale cancer genomic and transcriptomic datasets. These data pose serious analytical challenges because they are high-dimensional, noisy, and shaped by multiple nonlinear interactions. Classical statistical methods may fail to discern biologically significant patterns in such data, which limits their usefulness in large-scale cancer genomics. In this paper, we introduce a distributed machine learning-based modelling system for analysing high-dimensional cancer genomic and transcriptomic data and extracting genetic and transcriptomic patterns related to cancer-associated biological processes. Publicly available cancer datasets containing gene expression and transcriptomic profiles were processed systematically through preprocessing, normalisation, and dimensionality reduction to reduce redundancy and noise. Machine learning algorithms were then implemented in a distributed computing framework to handle large feature spaces efficiently and to support scalable pattern discovery. The proposed system identified distinct gene expression patterns and transcriptomic signatures that correlated significantly with well-known cancer-associated pathways, including cell proliferation, apoptosis, and signal transduction pathways. Functional enrichment and statistical validation confirmed that the identified features were biologically relevant. Overall, the findings demonstrate that distributed machine learning methods can support comprehensive cancer genomic analysis without sacrificing biological interpretability. The approach offers a cost-effective, high-quality analytical strategy for revealing meaningful genetic and transcriptomic patterns, with potential value for cancer research and precision medicine.
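
The abstract does not name the specific algorithms or framework used, so the following is only an illustrative sketch of how one step of such a pipeline might look: normalisation followed by a simple variance-based feature filter, computed in a partitioned (map then merge) fashion. The log2(x + 1) transform, the variance filter, the thread pool standing in for a distributed framework, and all function names are assumptions made for illustration, not the authors' method.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_stats(partition):
    """Map step: per-gene count, sum and sum of squares for one partition.

    Each row is one sample and each column is one gene. Raw expression
    values are log2(x + 1) transformed before accumulation (an assumed
    normalisation). Partitions are assumed to be non-empty.
    """
    n_genes = len(partition[0])
    sums = [0.0] * n_genes
    squares = [0.0] * n_genes
    for sample in partition:
        for g, raw in enumerate(sample):
            value = math.log2(raw + 1.0)
            sums[g] += value
            squares[g] += value * value
    return len(partition), sums, squares


def merge_stats(a, b):
    """Reduce step: combine the partial statistics of two partitions."""
    n_a, s_a, q_a = a
    n_b, s_b, q_b = b
    return (
        n_a + n_b,
        [x + y for x, y in zip(s_a, s_b)],
        [x + y for x, y in zip(q_a, q_b)],
    )


def gene_variances(partitions, workers=2):
    """Population variance of each gene across all partitions."""
    # A thread pool stands in here for workers in a distributed framework.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_stats, partitions))
    total = partials[0]
    for part in partials[1:]:
        total = merge_stats(total, part)
    n, sums, squares = total
    # Var = E[x^2] - E[x]^2, clamped at zero against rounding error.
    return [max(q / n - (s / n) ** 2, 0.0) for s, q in zip(sums, squares)]


def top_variable_genes(partitions, k, workers=2):
    """Indices of the k genes with the highest variance, in ascending order."""
    variances = gene_variances(partitions, workers)
    ranked = sorted(range(len(variances)), key=lambda g: variances[g], reverse=True)
    return sorted(ranked[:k])
```

Because only counts, sums, and sums of squares cross partition boundaries, each worker needs to see just its own samples, which is what lets this kind of filter scale to wide expression matrices. A real pipeline would likely use a numerically more robust variance update and a proper dimensionality reduction method afterwards.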

Published

2026-01-06

Section

Articles

How to Cite

High-Dimensional Cancer Genomic Data Modeling Using Distributed Machine Learning Algorithms for Genetic and Transcriptomic Pattern Discovery. (2026). Genetics and Molecular Research, 25(1), 1-9. https://doi.org/10.4238/nq9ks444
