Exploratory Visualization of Higher Dimensional Data


Visualization is an important asset to data analysis, both in communicating results and in explicating the analysis narrative which led to them. However, it is sometimes at its most powerful when used before committing to any analysis narrative, simply to explore the data with minimal prejudice. This is exploratory visualization, and its goal is to reveal structure in the data, especially unanticipated structure. Insights gained from exploratory visualization can inform, and possibly significantly affect, any subsequent analysis narrative. The size of modern data, both in dimensionality and in number of observations, poses a formidable challenge for exploratory visualization. First, dimensionality is limited to at most three physical dimensions, both by the visual system and by modern display technology. Second, the number of observations that can be individually displayed on any device is constrained by the size and resolution of its display screen. The challenge is to develop methods and tools that enable exploratory visualization of modern data in the face of such constraints. Some methods and software which we have designed to address this challenge will be presented in this talk (Hurley). Most of the talk will focus on the problem of exploring higher dimensional spaces, largely through defining, following, and presenting “interesting” low dimensional trajectories through high dimensional space. Both spatial and temporal strategies will be used to allow visual traversal of the trajectories. Software which facilitates exploration via these trajectories will be demonstrated (based mainly on the interactive and extendible exploratory visualization system called loon, and on zenplots, both of which are available as R packages from CRAN).
If time permits, our methodology (and software) for reducing the number of observations (without unduly compromising either the empirical distribution or important geometric features of the high dimensional point cloud) will also be presented.

Institute for Statistics and Mathematics
Wien University of Economics and Business, Vienna, Austria
R. Wayne Oldford
Professor of Statistics

My research interests include data visualization, exploratory data analysis, and interactive high dimensional data analysis.