BACK TO INDEX
Publications about 'UMAP'
Articles in journal or book chapters
and D.A. Lauffenburger.
What cannot be seen correctly in 2D visualizations of single-cell 'omics data?.
Note: Submitted.Keyword(s): visualization,
Single-cell -omics datasets are high-dimensional and difficult to visualize. A common strategy for exploring such data is to create and analyze 2D projections. Such projections may be highly nonlinear, and implementation algorithms are designed with the goal of preserving aspects of the original highdimensional shape of data such as neighborhood relationships or metrics. However, important aspects of high-dimensional geometry are known from mathematical theory to have no equivalent representation in 2D, or are subject to large distortions, and will therefore be misrepresented or even invisible in any possible 2D representation. We show that features such as quantitative distances, relative positioning, and qualitative neighborhoods of high-dimensional data points will always be misrepresented in 2D projections. Our results rely upon concepts from differential geometry, combinatorial geometry, and algebraic topology. As an illustrative example, we show that even a simple single-cell RNA sequencing dataset will always be distorted, no matter what 2D projection is employed. We also discuss how certain recently developed computational tools can help describe the high-dimensional geometric features that will be necessarily missing from any possible 2D projections.
BACK TO INDEX
This material is presented to ensure timely dissemination of
scholarly and technical work. Copyright and all rights therein
are retained by authors or by other copyright holders.
Last modified: Fri Jul 29 12:19:44 2022
This document was translated from BibTEX by