STATISTICAL ANALYSIS PIPELINES FOR LARGE-SCALE SINGLE-CELL SEQUENCING DATA INTERPRETATION

Durga B; Dr. Sathasivam Sivamalar; Uma Maheswari G; Antonibiya S; Saravanan Manoharan

doi:10.4238/atwfbh90

Keywords:

Single-cell sequencing, scRNA-seq, statistical pipelines, bioinformatics, clustering, dimensionality reduction, transcriptomics, machine learning.

Abstract

Background: Single-cell sequencing technologies have become powerful tools for deciphering cellular heterogeneity in complex biological systems. However, large-scale single-cell datasets are characterized by high dimensionality, sparsity, technical noise and batch effects, which make statistical interpretation computationally challenging.

Objective: This work develops and tests statistical analysis pipelines, covering preprocessing, normalization, clustering and dimensionality reduction, for the efficient interpretation of large-scale single-cell sequencing data.

Methodology: Using Seurat and Scanpy together with PCA, UMAP and Leiden clustering, we analyzed publicly available scRNA-seq datasets comprising more than one million cells. Quality control filtering, normalization, differential expression analysis and visualization methods were applied to improve biological interpretation and computational scalability.
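The quality control and normalization steps named in the methodology can be illustrated with a minimal sketch in plain Python. It is not the authors' pipeline: the toy count matrix, the `min_genes` threshold and the `target_sum` of 10,000 are assumptions chosen for illustration, and the helper names `filter_cells` and `normalize_log1p` are hypothetical. In Scanpy, the corresponding operations are provided by `sc.pp.filter_cells`, `sc.pp.normalize_total` and `sc.pp.log1p`.

```python
import math


def filter_cells(counts, min_genes=2):
    """Keep cells (rows) in which at least `min_genes` genes have a nonzero count."""
    return [row for row in counts if sum(1 for c in row if c > 0) >= min_genes]


def normalize_log1p(counts, target_sum=1e4):
    """Scale each cell to `target_sum` total counts, then apply log(1 + x)."""
    normalized = []
    for row in counts:
        total = sum(row)
        # Guard against empty cells; QC filtering normally removes them first.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([math.log1p(c * scale) for c in row])
    return normalized


# Toy cells-by-genes count matrix (hypothetical values, not from the study).
raw_counts = [
    [10, 0, 5, 5],  # cell 0: three genes detected
    [0, 0, 1, 0],   # cell 1: one gene detected, dropped by the QC filter
    [2, 2, 0, 4],   # cell 2: three genes detected
]

kept = filter_cells(raw_counts, min_genes=2)
log_norm = normalize_log1p(kept, target_sum=1e4)
```

Scaling every cell to the same total removes differences in sequencing depth before the log transform, so that downstream PCA and clustering are driven by relative expression rather than by library size.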

Findings: The proposed pipeline reduced technical noise by approximately 35% and improved clustering accuracy by 28% compared with conventional preprocessing. Scanpy showed higher runtime efficiency, and Leiden clustering provided better separation of cell populations, with an adjusted Rand index (ARI) of 0.91.
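For readers unfamiliar with the metric, the adjusted Rand index compares two labelings of the same cells and corrects the pair-counting agreement for chance; a value of 1 means the partitions are identical up to label names, and values near 0 indicate chance-level agreement. The following is a minimal sketch of the standard formula in plain Python. The function name and the toy labels are illustrative assumptions, not the authors' evaluation code or data; scikit-learn offers the same metric as `sklearn.metrics.adjusted_rand_score`.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to compare; treat as trivial agreement

    # Contingency table and its marginals, stored sparsely as counters.
    contingency = Counter(zip(labels_true, labels_pred))
    row_sums = Counter(labels_true)
    col_sums = Counter(labels_pred)

    # Numbers of item pairs that share a cluster under each view.
    sum_cells = sum(comb(c, 2) for c in contingency.values())
    sum_rows = sum(comb(c, 2) for c in row_sums.values())
    sum_cols = sum(comb(c, 2) for c in col_sums.values())

    expected = sum_rows * sum_cols / comb(n, 2)  # agreement expected by chance
    max_index = 0.5 * (sum_rows + sum_cols)      # upper bound of the index
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (sum_cells - expected) / (max_index - expected)


# Hypothetical reference cell types versus cluster assignments for six cells.
reference = ["T", "T", "T", "B", "B", "B"]
predicted = [0, 0, 1, 1, 1, 1]
ari = adjusted_rand_index(reference, predicted)
```

Because the index depends only on which cells are grouped together, relabeling the clusters (for example swapping 0 and 1) leaves the score unchanged.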

Conclusion: Robust statistical pipelines substantially improve the accuracy, scalability and reproducibility of interpreting large-scale single-cell sequencing data. Further development of advanced computational frameworks and machine learning approaches can improve biological discovery and clinical research applications.
