IEEE VIS 2024 Content: VMC: A Grammar for Visualizing Statistical Model Checks

Ziyang Guo - Northwestern University, Evanston, United States

Alex Kale - University of Chicago, Chicago, United States

Matthew Kay - Northwestern University, Chicago, United States

Jessica Hullman - Northwestern University, Evanston, United States

Room: Bayshore V

2024-10-17T12:54:00Z
Figure: Example model check visualizations authored with VMC, using data from [46]. From left to right: checks comparing the density curves of model predictions and observed data for (A) the response variable and (B) a distributional parameter; follow-up checks conditional on the quantitative predictor, where VMC is used to specify (C) Hypothetical Outcome Plots and (D) a line + ribbon plot; (E) a facet check stratifying the random effects and (F) a multilevel check; further checks for the random effects specified by VMC, including (G) raincloud plots and (H) multiple-interval plots; and residual checks specified by VMC, including (I) residual plots revealing the heteroskedasticity of the model and (J) Q-Q plots validating the normality of the residuals.
Keywords

Model checking and evaluation; Uncertainty visualization; Grammar of Graphics

Abstract

Visualizations play a critical role in validating and improving statistical models. However, the design space of model check visualizations is not well understood, making it difficult for authors to explore and specify effective graphical model checks. VMC defines a model check visualization using four components: (1) samples of distributions of checkable quantities generated from the model, including predictive distributions for new data and distributions of model parameters; (2) transformations on observed data to facilitate comparison; (3) visual representations of distributions; and (4) layouts to facilitate comparing model samples and observed data. We contribute an implementation of VMC as an R package. We validate VMC by reproducing a set of canonical model check examples, and show that generating model checks with VMC reduces the edit distance between visualizations relative to existing visualization toolkits. The findings of an interview study with three expert modelers who used VMC highlight challenges and opportunities for encouraging exploration of correct, effective model check visualizations.
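To make the four components concrete, the following is a minimal conceptual sketch in Python. It is not the VMC API (VMC is an R package, and its functions are not shown here); every function name below is hypothetical. It also assumes a simple normal model fitted by plugging in the sample mean and standard deviation, which ignores parameter uncertainty that a full Bayesian check would propagate. The "visual representation" is reduced to the numbers a ribbon plot would draw.

```python
import random
import statistics


def predictive_draws(observed, n_draws=200, seed=0):
    """Component 1 (assumed stand-in): replicated datasets simulated from a
    normal model using plug-in estimates of the mean and standard deviation."""
    rng = random.Random(seed)
    mu = statistics.mean(observed)
    sigma = statistics.stdev(observed)
    return [[rng.gauss(mu, sigma) for _ in observed] for _ in range(n_draws)]


def transform(values):
    """Component 2: a transformation that eases comparison. Sorting turns
    each dataset into order statistics, as in a Q-Q style check."""
    return sorted(values)


def interval_band(draws, lower=0.05, upper=0.95):
    """Component 3: a representation of the distribution, here a pointwise
    interval (the data behind a ribbon) at each rank across all draws."""
    transformed = [transform(d) for d in draws]
    band = []
    for rank_values in zip(*transformed):
        ordered = sorted(rank_values)
        lo = ordered[int(lower * (len(ordered) - 1))]
        hi = ordered[int(upper * (len(ordered) - 1))]
        band.append((lo, hi))
    return band


def superpose(observed, band):
    """Component 4: a layout. Superposition pairs each observed order
    statistic with the model interval at the same rank."""
    return [
        {"observed": y, "lower": lo, "upper": hi, "inside": lo <= y <= hi}
        for y, (lo, hi) in zip(transform(observed), band)
    ]


def model_check(observed, n_draws=200, seed=0):
    """Compose the four components into one check."""
    draws = predictive_draws(observed, n_draws, seed)
    return superpose(observed, interval_band(draws))
```

Calling `model_check(data)` on a list of at least two numbers returns one row per observation. Rows where `inside` is false mark ranks at which the observed data fall outside the model's simulated range. Swapping `superpose` for a function that returns the observed values and the band as separate panels would correspond to changing only the layout component.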