STATISTICAL ANALYSIS PIPELINES FOR LARGE-SCALE SINGLE-CELL SEQUENCING DATA INTERPRETATION

Authors

  • Durga B
  • Dr. Sathasivam Sivamalar
  • Uma Maheswari G
  • Antonibiya S
  • Saravanan Manoharan

DOI:

https://doi.org/10.4238/atwfbh90

Keywords:

Single-cell sequencing, scRNA-seq, statistical pipelines, bioinformatics, clustering, dimensionality reduction, transcriptomics, machine learning.

Abstract

Background: Single-cell sequencing technologies have become powerful tools for deciphering cellular heterogeneity in complex biological systems. However, large-scale single-cell datasets exhibit high dimensionality, sparsity, technical noise and batch effects, which make statistical interpretation computationally challenging.

Objective: The aim of this work is to develop and test statistical analysis pipelines, including preprocessing, normalization, clustering and dimensionality reduction methods, for the efficient interpretation of large-scale single-cell sequencing data.

Methodology: Using analysis toolkits such as Seurat and Scanpy, together with methods including PCA, UMAP and Leiden clustering, we analyzed publicly available scRNA-seq datasets comprising more than one million cells. Quality control filtering, normalization, differential expression analysis and visualization methods were employed to improve biological interpretation and computational scalability.
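
The first two steps named above, quality control filtering and normalization, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no thresholds, so `min_genes`, `target_sum` and the toy count matrix are assumptions chosen only for illustration, and the helper names `filter_cells` and `normalize_log1p` are hypothetical. In Scanpy, comparable steps are provided by `sc.pp.filter_cells`, `sc.pp.normalize_total` and `sc.pp.log1p`.

```python
import math


def filter_cells(counts, min_genes=2):
    """Keep cells (rows) in which at least `min_genes` genes have a nonzero count."""
    return [row for row in counts if sum(1 for c in row if c > 0) >= min_genes]


def normalize_log1p(counts, target_sum=10_000):
    """Scale each cell to `target_sum` total counts, then apply log(1 + x)."""
    normalized = []
    for row in counts:
        total = sum(row)
        # Guard against empty cells so the sketch never divides by zero.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([math.log1p(c * scale) for c in row])
    return normalized


# Toy matrix (assumed values): rows are cells, columns are genes.
raw = [
    [10, 0, 5, 0],
    [0, 0, 1, 0],  # only one gene detected, so QC removes this cell
    [3, 3, 3, 1],
]

kept = filter_cells(raw, min_genes=2)
norm = normalize_log1p(kept, target_sum=10_000)
```

Scaling every cell to the same total before the log transform removes differences in sequencing depth between cells, which is one common source of the technical noise the Background describes.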

Findings: The proposed pipeline reduced technical noise by ∼35% and improved clustering accuracy by 28% compared to conventional preprocessing. Scanpy had higher runtime efficiency, and Leiden clustering provided better separation of cell populations, with an adjusted Rand index (ARI) of 0.91.
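
For readers unfamiliar with the ARI reported above, the following is a minimal sketch of how the standard adjusted Rand index is computed from two label assignments. It uses the textbook pair-counting formula, not code from the study; the function name and the toy labels are assumptions for illustration. An ARI of 1 indicates identical partitions, and values near 0 indicate agreement no better than chance.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two clusterings of the same cells."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: partitions agree trivially

    # Pairs of cells grouped together in both, in the reference, and in the prediction.
    contingency = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(n, 2) for n in contingency.values())
    sum_rows = sum(comb(n, 2) for n in Counter(labels_true).values())
    sum_cols = sum(comb(n, 2) for n in Counter(labels_pred).values())

    expected = sum_rows * sum_cols / total_pairs  # agreement expected by chance
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. every cell in a single cluster
    return (sum_cells - expected) / (max_index - expected)


# Toy labels (assumed): cluster names differ, but the grouping is identical.
reference = ["T", "T", "B", "B"]
predicted = [1, 1, 0, 0]
ari_same = adjusted_rand_index(reference, predicted)

# A prediction that splits every reference cluster in half.
ari_split = adjusted_rand_index(reference, [0, 1, 0, 1])
```

Because the index depends only on which cells are grouped together, it is unaffected by how clusters are numbered, which makes it suitable for comparing Leiden cluster IDs against annotated cell types.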

Conclusion: Robust statistical pipelines greatly enhance the accuracy, scalability and reproducibility of large-scale single-cell sequencing data interpretation. Further development of advanced computational frameworks and machine learning approaches can improve biological discovery and clinical research applications.

Published

2026-03-20

Section

Articles

How to Cite

STATISTICAL ANALYSIS PIPELINES FOR LARGE-SCALE SINGLE-CELL SEQUENCING DATA INTERPRETATION. (2026). Genetics and Molecular Research. https://doi.org/10.4238/atwfbh90
