IECE Transactions on Internet of Things, 2024, Volume 2, Issue 1: 20-25

Free Access | Research Article | 12 February 2024
1 College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
2 School of Computer, BaoJi University of Arts and Sciences, Baoji 721016, China
* Corresponding author: Zilin Wang, email: [email protected]
Received: 13 December 2023, Accepted: 28 January 2024, Published: 12 February 2024  

Abstract
The volume and complexity of data in many fields, particularly biology, are growing rapidly, and existing analytical methods often struggle with high-dimensional data such as single-cell Hi-C data. To address this issue, we apply two unsupervised methods, Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE), to reduce the dimensionality of the data for visualization. We then assess how much information the reduced components retain by training a Linear Discriminant Analysis (LDA) classifier on them. Our findings indicate that these dimensionality reduction techniques capture and present information that is not readily apparent in the original high-dimensional data, which facilitates the visualization and interpretation of complex biological data. The LDA classifier's performance suggests that PCA and t-SNE preserve the information needed for accurate classification. In conclusion, our study demonstrates that PCA and t-SNE are powerful tools for visualizing and analyzing high-dimensional biological data, enabling researchers to gain insights that are difficult to obtain with traditional approaches.
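
The workflow summarized in the abstract (reduce to two dimensions with PCA or t-SNE, then score the reduced components with an LDA classifier) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes scikit-learn and NumPy, uses synthetic data in place of real single-cell Hi-C contact matrices, and the helper names `make_toy_data` and `embed_and_score`, the perplexity of 30, and the 5-fold cross-validation are all assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.manifold import TSNE
from sklearn.model_selection import cross_val_score


def make_toy_data(n_per_class=60, n_features=200, n_classes=3, seed=0):
    """Synthetic stand-in for flattened contact matrices (NOT real Hi-C data)."""
    rng = np.random.default_rng(seed)
    # One random center per hypothetical cell type, plus unit-variance noise.
    centers = rng.normal(scale=3.0, size=(n_classes, n_features))
    X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
    y = np.repeat(np.arange(n_classes), n_per_class)
    return X, y


def embed_and_score(X, y, seed=0):
    """Reduce X to 2-D with PCA and t-SNE, then score each embedding with LDA."""
    embeddings = {
        "pca": PCA(n_components=2, random_state=seed).fit_transform(X),
        "tsne": TSNE(n_components=2, perplexity=30, init="pca",
                     random_state=seed).fit_transform(X),
    }
    scores = {}
    for name, Z in embeddings.items():
        # Mean 5-fold cross-validated accuracy of LDA on the 2-D components.
        acc = cross_val_score(LinearDiscriminantAnalysis(), Z, y, cv=5)
        scores[name] = float(acc.mean())
    return embeddings, scores


if __name__ == "__main__":
    X, y = make_toy_data()
    _, scores = embed_and_score(X, y)
    for name, acc in scores.items():
        print(f"{name}: mean LDA accuracy = {acc:.3f}")
```

Because scikit-learn's t-SNE cannot project unseen samples, the sketch embeds all cells once and cross-validates only the classifier; the resulting accuracy is therefore a rough indicator of how separable the classes remain after reduction rather than a strict out-of-sample estimate.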

Graphical Abstract
Application of Dimension Reduction Methods to High-Dimensional Single-Cell 3D Genomic Contact Data

Keywords
Dimensionality reduction
Single-cell Hi-C
PCA
t-SNE
LDA

Cite This Article
APA Style
Wang, Z., Zhang, P., Sun, W., & Li, D. (2024). Application of Dimension Reduction Methods to High-Dimensional Single-Cell 3D Genomic Contact Data. IECE Transactions on Internet of Things, 2(1), 20–25. https://doi.org/10.62762/TIOT.2024.186430

Publisher's Note
IECE stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions
IECE or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
IECE Transactions on Internet of Things

ISSN: 2996-9298 (Online)

Email: [email protected]

Portico

All published articles are preserved here permanently:
https://www.portico.org/publishers/iece/

Copyright © 2024 Institute of Emerging and Computer Engineers Inc.