IEEE VIS 2024 Content: Generating Analytic Specifications for Data Visualization from Natural Language Queries using Large Language Models

Generating Analytic Specifications for Data Visualization from Natural Language Queries using Large Language Models



 Subham Sah - UNC Charlotte, Charlotte, United States

Rishab Mitra - Georgia Institute of Technology, Atlanta, United States

 Arpit Narechania - Georgia Institute of Technology, Atlanta, United States

 Alex Endert - Georgia Institute of Technology, Atlanta, United States

 John Stasko - Georgia Institute of Technology, Atlanta, United States

 Wenwen Dou - UNC Charlotte, Charlotte, United States

 Screen-reader Accessible PDF

Recently, large language models (LLMs) have shown great promise in translating natural language (NL) queries into visualizations, but their “black-box” nature often limits explainability and debuggability. In response, we present a comprehensive text prompt that, given a tabular dataset and an NL query about the dataset, generates an analytic specification including (detected) data attributes, (inferred) analytic tasks, and (recommended) visualizations. This specification captures key aspects of the query translation process, affording both explainability and debuggability. For instance, it provides mappings from the detected entities to the corresponding phrases in the input query, as well as the specific visual design principles that determined the visualization recommendations. Moreover, unlike prior LLM-based approaches, our prompt supports conversational interaction and ambiguity detection capabilities. In this paper, we detail the iterative process of curating our prompt, present a preliminary performance evaluation using GPT-4, and discuss the strengths and limitations of LLMs at various stages of query translation.

Generating Analytic Specifications for Data Visualization from Natural Language Queries using Large Language Models

 Subham Sah - UNC Charlotte, Charlotte, United States

Rishab Mitra - Georgia Institute of Technology, Atlanta, United States

 Arpit Narechania - Georgia Institute of Technology, Atlanta, United States

 Alex Endert - Georgia Institute of Technology, Atlanta, United States

 John Stasko - Georgia Institute of Technology, Atlanta, United States

 Wenwen Dou - UNC Charlotte, Charlotte, United States

 Screen-reader Accessible PDF

 Download preprint PDF

 Room: Bayshore II

2024-10-14T16:00:00ZGMT-0600Change your timezone on the schedule page
2024-10-14T16:00:00Z

Fast forward

Abstract

Generating Analytic Specifications for Data Visualization from Natural Language Queries using Large Language Models

 Subham Sah - UNC Charlotte, Charlotte, United States

Rishab Mitra - Georgia Institute of Technology, Atlanta, United States

 Arpit Narechania - Georgia Institute of Technology, Atlanta, United States

 Alex Endert - Georgia Institute of Technology, Atlanta, United States

 John Stasko - Georgia Institute of Technology, Atlanta, United States

 Wenwen Dou - UNC Charlotte, Charlotte, United States

 Screen-reader Accessible PDF

 Download preprint PDF

 Room: Bayshore II

2024-10-14T16:00:00ZGMT-0600Change your timezone on the schedule page2024-10-14T16:00:00Z

Fast forward

Abstract

2024-10-14T16:00:00ZGMT-0600Change your timezone on the schedule page
2024-10-14T16:00:00Z