
Privacy-Preserving Visual Exploration of Healthcare Data with a Language Model Integration

Michael Grabner

grabner@tugraz.at

Peter Waldert

peter.waldert@tugraz.at

Benedikt Kantz

benedikt.kantz@tugraz.at

Tobias Schreck

tobias.schreck@tugraz.at

Institute of Visual Computing, Graz University of Technology, Austria

2026

Healthcare records, measurements and datasets can quickly become highly complex in terms of storage, anonymisation and interpretation. With Neurodegen-Vis, we present a practical toolkit to anonymise, visualise and help interpret medical data in a Visual Analytics (VA) dashboard, with seamless integration of a Large Language Model (LLM). Our tool primarily aims to facilitate the discovery of relationships within the data while lowering entry barriers for users such as medical doctors or researchers. The LLM integration can make the dashboard more accessible by clarifying and explaining the meaning of features and by giving an overview of the dashboard components and their interactions, and it can support data exploration by suggesting features to analyse and providing example prompts. Moreover, the language model integration can help the user interpret correlations in this specific dataset based on the meaning of the features involved. The anonymisation is carefully performed using techniques from Differential Privacy, as we demonstrate this tool on Parkinson's Disease (PD) patient data.

Keywords: Visual Analytics, Healthcare Data, Differential Privacy, Language Models

1. Introduction

1.1. Dataset

The dataset contains information on patients with Parkinson’s Disease. It comprises approximately […] features for 50 patients. It includes demographic data such as patient age and duration of PD diagnosis, alongside results from various cognitive tests and treatment details. Several columns contain redundant information, with some measurements represented both numerically and categorically. Additionally, cognitive tests were conducted using multiple methodologies, leading to potential domain shifts. Binary indicators in the dataset denote the use of specific test methods. After accounting […]

2. Related Work

The visualisation and analysis of high-dimensional data is a long-standing challenge in data science, and many approaches for interactive exploration of such data have been proposed to date. The Rank-by-Feature framework [1] was among the first approaches to identify and rank data features by their correlation, thereby supporting the user in selecting interesting and relevant data. It used a heatmap to show the feature importance. In general, tabular data can be interactively visualised with general-purpose tools like spreadsheet software and with specialised suites like Tableau or Microsoft Power BI. Lineup [2] is a tabular data visualisation designed specifically for comparing and ranking the rows of a multivariate table, with the user interactively finding appropriate weights for the different features. Correlation matrices and scatter plots are widely used to show an overview of pairwise correlations and are amenable to overviewing and searching [3]. Recently, Large Language Model technology has drastically advanced the capabilities of Natural Language Processing, and existing implementations like Llama or ChatGPT make it possible to integrate LLM-based Natural Language Interfaces (NLIs) into interactive visualisation applications. There are many ways in which LLMs can help users navigate and understand visualisations and data, as presented in the framework in [4].

In our work, we design a dashboard for exploring tabular data of numerical and categorical values. We make use of existing visualisations like correlation heatmaps, scatter plots and principal component plots. While these techniques are not novel, they can greatly aid user understanding and serve as an expert tool for data exploration. Our main contribution is the integration of an LLM interface to assist the user, and its application to an anonymised real-world health dataset. Our use case suggests that the dashboard design and the LLM integration are helpful for explaining the data and the dashboard to users and for guiding their exploration.

3. Anonymisation using Differential Privacy

In order to protect patients’ private data, our tool anonymises the incoming dataset before displaying it to a (potentially unauthorised) user. In a first step, names and other directly personally identifiable information are stripped away, which only requires users to specify the affected columns.

We then generate (up to an arbitrary amount of) synthetic tabular data based on the distribution of features of the input dataset. As such tabular data is often a mix of ordinal, categorical and continuous features, we have to treat them separately in the anonymisation phase. The categorical data is synthetically generated based on the Multiplicative Weights Exponential Mechanism (MWEM) [5], with an implementation provided in the third-party library smartnoise-sdk, developed by OpenDP. The continuous columns are sampled from their respective input distributions. For simplicity of this demonstrator, we assume a normal distribution in all numerical columns (also taking bounds and, for example, integer constraints into account). This still leaves our generation procedure with the challenge of preserving correlations. Correlations can be preserved using randomised-order, correlation-integrated sampling from the respective feature distributions. One starts by obtaining a (Pearson) correlation matrix $C \in \mathbb{R}^{n \times n}$ of the respective input features (in our case, a roughly 60x60 matrix) and passing it through the MWEM implementation to perturb it with noise, making it differentially private with respect to the input data. During the sampling procedure, in random order, the columns are sampled from their respective (normal) distributions parametrised earlier. Each sample value is then skewed according to the correlation between all previously sampled values and the feature correlation contained in $C$. The output dataset then approximately follows the input distribution and keeps correlations consistent. The dashboard and corresponding LLM interaction (and, of course, the figures included in this work) now only rely on the synthetically generated data.

Figure 3: The feature language z-component in the histogram view. The colour encodes the categorical feature z-diagnosis, upon selection in the dropdown on the top right.
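The pairwise case of the correlation-integrated sampling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the paper: the names `ColumnSpec`, `mulberry32`, `standardNormal` and `sampleCorrelatedPair` are hypothetical, only two columns are handled (the procedure in the text conditions on all previously sampled columns), and the correlation `rho` is taken as given, so the differentially private perturbation of the correlation matrix is not part of the sketch.

```typescript
// Hypothetical description of one numerical column: normal parameters plus
// the bounds and integer constraint mentioned in the text.
interface ColumnSpec {
  mean: number;
  sd: number;
  min: number;
  max: number;
  integer: boolean;
}

// Small deterministic PRNG (mulberry32) so that the sketch is reproducible.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Standard normal draw via the Box-Muller transform.
function standardNormal(rand: () => number): number {
  const u = 1 - rand(); // in (0, 1], avoids log(0)
  const v = rand();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// Map a standard-normal score to the column's scale, then apply constraints.
function finalise(z: number, spec: ColumnSpec): number {
  const raw = spec.mean + spec.sd * z;
  const clamped = Math.min(spec.max, Math.max(spec.min, raw));
  return spec.integer ? Math.round(clamped) : clamped;
}

// Sample n rows of two columns whose standard scores have correlation rho:
// the second score is a mix of the first score and fresh independent noise.
function sampleCorrelatedPair(
  n: number,
  specX: ColumnSpec,
  specY: ColumnSpec,
  rho: number,
  rand: () => number,
): Array<[number, number]> {
  if (rho < -1 || rho > 1) {
    throw new Error("rho must lie in [-1, 1]");
  }
  const noiseWeight = Math.sqrt(1 - rho * rho);
  const rows: Array<[number, number]> = [];
  for (let i = 0; i < n; i++) {
    const zx = standardNormal(rand);
    const zy = rho * zx + noiseWeight * standardNormal(rand);
    rows.push([finalise(zx, specX), finalise(zy, specY)]);
  }
  return rows;
}
```

Clamping and rounding slightly attenuate the correlation of the output, so the target value is only met approximately once tight bounds or integer constraints are active.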

4. Visualisation Approach

There is currently considerable interest in the Visual Analytics community in combining LLMs with VA interfaces. The framework we use to describe our dashboard to the Generative Pre-trained Transformer (GPT) was inspired by [4]. Hutchinson et al. briefly describe the challenges and opportunities that Large Language Models provide for Visual Analytics integrations [6].

In Neurodegen-Vis, a correlation heatmap (cf. Figure 1), scatter plots (cf. Figures 4 and 5), and histograms (cf. Figure 3) are used to provide an overview of the data and enable the exploration of variable dependencies. For cognitive z-scores, which were obtained using different test methods, the visualisation colour-codes these methods and shows average values in order to identify potential shifts due to methodological changes. Furthermore, a Principal Component Analysis (PCA) can be applied to detect underlying group structures. The results are presented as biplots for interpretability. The different clusters are colour-coded in the biplot and can also be selected as the colour coding in the scatter plot, to further analyse the influence of the original features and how they could be grouped.

4.1. Natural Language Interface

The LLM should enhance the understanding and exploration of the dashboard by drawing on several sources of knowledge: information about the datasets, feature descriptions, details about the dashboard itself, and general world knowledge. The user can interact with the LLM through a familiar chatbot interface. The LLM implementation can also highlight cells of the correlation heatmap and change the scatter plot view.

Can you give a short overview of the data and the dashboard?

The data consists of various features from patients with Parkinson’s Disease, including demographic information, cognitive assessments, and clinical visits. It assesses cognitive status, test results, disease duration, and other relevant factors. The Parkinson’s Disease Visual Analytics Dashboard consists of three main views: a Pearson Correlation Heatmap to explore correlations between features, a Feature Scatter Plot for visualising selected feature relationships, and a PCA Biplot to display data clustering and feature loadings. Users can interact with the heatmap to filter and drill down into specific feature relationships.

Example 2: The initial prompt, shown and answered when the application loads, provides an overview of the dashboard and the dataset.

The chatbot can explain the different features (see Example 4). This is helpful for clarifying ambiguous terms to experts and can help non-domain experts to understand the data. It is not only convenient to have the feature descriptions within the dashboard itself (rather than in a separate document), but with the chatbot’s world knowledge, the chatbot is also capable of providing more complex answers to follow-up questions and dashboard-specific queries. Moreover, it ensures that everyone who has access to the dashboard also has access to this knowledge base, reducing the chance of misunderstandings or inconsistencies. It also enables interactive explanation and exploration of the dashboard (see Example 2). It can explain individual views in detail (e.g. what a biplot is) as well as their interactions and the deeper meaning behind the visualisations on the dashboard.

How can I change the shown features in the scatter plot?

To change the shown features in the scatter plot, you need to interact with the Pearson Correlation Heatmap. Here’s how:

1. Click on a non-diagonal cell in the heatmap to select the corresponding feature pair (Feature X and Feature Y). This will update the scatter plot to display the selected features.
2. If you click on a diagonal cell in the heatmap, it will change the histogram instead of the scatter plot.

Additionally, you can select different colour encoding options for the scatter plot using the dropdown provided in the scatter plot view to visualise data based on different categories or clusters.

Example 3: Explaining the interaction between heatmap and scatter plot using the LLM integration.

As seen in Example 1, the chatbot also has access to results of the correlation analysis, enabling it to make more informed suggestions and assist in analysing and understanding the dashboard and data. There is an initial prompt that provides a brief overview of the data and the dashboard. Combined with the suggested follow-up prompts, this offers the user guidance on how to formulate effective queries and helps them craft meaningful prompts. The suggest features function supports users in exploring the dataset and identifying interesting aspects of the data. It not only proposes notable features but also highlights them visually. As shown in Figure 1, the corresponding cell in the correlation heatmap is highlighted, and the scatter plot updates to reflect the suggested features.

Figure 4: Scatter plot of two features in the anonymised dataset, right now coloured according to the k-means cluster results. The colouring can be changed to other categorical features through a dropdown.

Furthermore, when relevant to the chatbot’s response, elements of the visualisation are dynamically highlighted. Currently, cells in the correlation heatmap can be emphasised, as illustrated in Example 1 and Figure 2. This aspect is especially important to us, as the user should continue focusing on the numerical and graphical information to draw insights, rather than relying solely on the LLM’s analysis.

What is the feature overall_domain_sum and why could it be interesting to analyse?

The feature "overall_domain_sum" represents the total number of cognitive assessment tests in which a patient did not achieve adequate scores across various cognitive domains (such as attentional, executive, visuospatial, memory, and language). It is a cumulative measure reflecting the severity of cognitive impairment. Analysing this feature can be interesting because it helps to quantify the extent of cognitive deficits in patients with Parkinson’s disease. A higher overall domain sum indicates more significant cognitive impairment, which could impact the patient’s daily functioning and quality of life. It can also help in understanding correlations with other clinical measures, treatment responses, and overall disease progression, aiding in personalised medical decision-making.

Example 4: Feature explanation using the LLM.

5. Implementation

The dashboard was implemented as a React² application using TypeScript. Plots were created with the ObservableHQ³ and D3⁴ libraries to enhance interactivity. All calculations were performed in the TypeScript frontend, which was feasible because of the dataset’s small size (less than 100 KB). For the heatmap, the standard Pearson correlation was used, and in the scatter plot the k-means++ algorithm was applied. The chatbot uses the OpenAI API with the model "gpt-4o-mini". Replacing the default Language Model integration with an alternative model, most notably a locally hosted one, requires close to no effort: the user only needs to supply a different API endpoint configuration. The chatbot was given a system prompt instructing it to act as a helpful chatbot for a Visual Analytics dashboard and to give short and precise answers. Additionally, to gain domain knowledge, the chatbot was given the feature descriptions of all features in the data, the Pearson correlations and a JSON file which describes the components and interactions between the different views. The framework was adopted from Zhao et al. [4].
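To illustrate the kind of calculation that can run entirely in the frontend, the following sketch computes a Pearson correlation matrix in plain TypeScript. It is an illustration only: the `Column` type and the function names are hypothetical, and it assumes complete numerical columns without missing values, which a real dataset may not guarantee.

```typescript
// Hypothetical column representation: a feature name and its numeric values.
interface Column {
  name: string;
  values: number[];
}

// Pearson correlation coefficient of two equally long numeric columns.
function pearson(xs: number[], ys: number[]): number {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("pearson: columns must have equal length >= 2");
  }
  const n = xs.length;
  const meanX = xs.reduce((sum, v) => sum + v, 0) / n;
  const meanY = ys.reduce((sum, v) => sum + v, 0) / n;
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  xs.forEach((x, i) => {
    const dx = x - meanX;
    const dy = (ys[i] as number) - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  });
  const denom = Math.sqrt(sxx * syy);
  // A constant column has zero variance, so the coefficient is undefined.
  return denom === 0 ? NaN : sxy / denom;
}

// Full matrix in the order of the input columns; the diagonal is set to 1.
function correlationMatrix(columns: Column[]): number[][] {
  return columns.map((a) =>
    columns.map((b) => (a === b ? 1 : pearson(a.values, b.values))),
  );
}
```

For roughly 60 features this amounts to a few thousand coefficient computations, which is consistent with the statement that the small dataset makes client-side calculation feasible.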

Two dedicated functions enable the dynamic visual highlighting of specific cells in the correlation heatmap during chatbot conversations. The LLM is prompted to call these functions by passing relevant feature names as arguments when such highlighting would support answering a user’s query. These functions trigger visual updates in the dashboard to guide the user’s attention (cf. Figure 2). To ensure safety and stability, no code generated by the LLM is executed directly. Instead, only the two predefined functions can be called, and the output of the LLM is verified to ensure it contains only valid feature references. After each user query, the LLM is also prompted to generate a short, engaging follow-up question, which is then displayed to the user in the form of a ‘bubble’.
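The verification step described above could look roughly like the following sketch. The function names `highlightHeatmapCell` and `setScatterPlotFeatures`, the JSON shape of the call and the `validateToolCall` helper are all assumptions made for illustration; the paper only states that two predefined functions exist and that feature references are checked.

```typescript
// Hypothetical names for the two predefined dashboard functions.
type ToolName = "highlightHeatmapCell" | "setScatterPlotFeatures";

interface ToolCall {
  tool: ToolName;
  featureX: string;
  featureY: string;
}

const ALLOWED_TOOLS: ReadonlySet<string> = new Set<string>([
  "highlightHeatmapCell",
  "setScatterPlotFeatures",
]);

// Accept a raw LLM output only if it is valid JSON, names one of the two
// allowed functions and references known features. Nothing is executed here.
function validateToolCall(
  raw: string,
  knownFeatures: ReadonlySet<string>,
): ToolCall | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof parsed !== "object" || parsed === null) {
    return null;
  }
  const { tool, featureX, featureY } = parsed as Record<string, unknown>;
  if (typeof tool !== "string" || !ALLOWED_TOOLS.has(tool)) {
    return null; // unknown function name
  }
  if (typeof featureX !== "string" || typeof featureY !== "string") {
    return null;
  }
  if (!knownFeatures.has(featureX) || !knownFeatures.has(featureY)) {
    return null; // hallucinated or misspelled feature
  }
  return { tool: tool as ToolName, featureX, featureY };
}
```

Returning `null` instead of throwing lets the caller simply skip the highlighting and still show the textual answer when the LLM output does not pass the check.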

The suggest features function changes the view of the scatter plot and highlights the features on the heatmap. The LLM output needs to be a list of two features. This works robustly with a standard prompt, which strictly defines the format and emphasises that no additional text should be used. Afterwards, the chatbot should explain why the suggested features are interesting, which is shown to the user in the chatbot window (cf. Figure 1).
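A strictly formatted prompt and the matching parser for the two-feature output might be sketched as follows. Both the prompt wording and the choice of a JSON array as the output format are assumptions for illustration, since the paper only states that the output must be a list of two features without additional text; `buildSuggestPrompt` and `parseSuggestedFeatures` are hypothetical names.

```typescript
// Hypothetical prompt: fixes the output format and forbids extra text.
function buildSuggestPrompt(featureNames: string[]): string {
  return [
    "Suggest two features from the list below whose relationship is worth exploring.",
    `Features: ${featureNames.join(", ")}`,
    'Answer with a JSON array of exactly two feature names, for example ["a", "b"].',
    "Do not add any other text.",
  ].join("\n");
}

// Parse the LLM output; accept only two distinct, known feature names.
function parseSuggestedFeatures(
  output: string,
  knownFeatures: ReadonlySet<string>,
): [string, string] | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(output.trim());
  } catch {
    return null; // extra prose or malformed JSON
  }
  if (!Array.isArray(parsed) || parsed.length !== 2) {
    return null;
  }
  const [first, second] = parsed as unknown[];
  if (typeof first !== "string" || typeof second !== "string") {
    return null;
  }
  if (first === second) {
    return null; // a scatter plot of a feature against itself is not useful
  }
  if (!knownFeatures.has(first) || !knownFeatures.has(second)) {
    return null;
  }
  return [first, second];
}
```

A caller could retry the request once when the parser returns `null`, and only update the scatter plot and heatmap highlight after a pair has been accepted.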

² https://react.dev/
³ https://observablehq.com/documentation/cells/observable-javascript
⁴ https://d3js.org/

6. Use Case

We demonstrate the applicability of our system in a use case involving the exploration of relationships between attributes of patient information. This explorative approach is enabled through heatmaps and scatter plots, allowing the identification of linear dependencies. One such observation found through our system might be that cognitive performance declines with increasing age (insnpsi_age), while the duration of PD diagnosis has little effect on test scores. Figure 1 illustrates a scatter plot comparing the overall_domain_sum (number of cognitive z-tests where patients did not achieve an adequate score) with patient age. It reveals that patients who took the VOSP test exhibit different average scores, which could potentially skew the data. This should be accounted for in subsequent analyses.

In the PCA biplot, users can select features of interest to visualise their influence on the principal components. Scatter plots and histograms are dynamically generated by clicking on off-diagonal and diagonal cells of the heatmap, respectively, with the colour coding selectable via a dropdown menu.

Use of the LLM: We also present an example of how the LLM can assist users in understanding the data and navigating the dashboard. First, the user receives an overview of the dashboard through an initial prompt (see Example 2). With a follow-up prompt (Example 3), the user learns how to change the scatter plot view by interacting with the heatmap. While exploring the data, the chatbot can be used to clarify the meaning of specific features (see Example 4) and to help interpret correlation values. Additionally, through the suggest features function and example prompts, the user is guided toward potentially interesting aspects of the data. Highlighted cells allow the user to verify insights visually and encourage further interaction with the visualisations.

7. Discussion

7.1. Risks of LLM Integrations

While chatbots and LLMs can be highly useful, their integration also presents certain risks and challenges [7]. These include potentially misleading or incorrect explanations, hallucinations and the risk of users becoming overly reliant on the chatbot’s output rather than engaging with the visualisations directly.

One advantage of integrating an LLM into a Visual Analytics dashboard is that users can cross-check the chatbot’s responses against the underlying data and visualisations. Additionally, interactions with the chatbot can dynamically influence the visualisations, for example by highlighting a relevant cell in the correlation heatmap. This can help users to verify results and encourage them to engage with the visual interface rather than relying solely on textual explanations.

Currently, the scope of visualisation manipulation through chatbot interaction is limited. Expanding the range of supported interactions could help mitigate these risks further. However, while such enhancements may reduce the impact of these issues, they cannot eliminate them entirely. Users must remain aware of the limitations and capabilities of LLMs when interpreting their outputs.

7.2. Limitations

The current LLM integration can help users gain a broad overview of the visualisations and interactions (see Example 2). In its current version, it does not have direct access to the raw data and therefore cannot provide detailed information about the visualisations. However, we do provide statistical properties of the individual columns and the correlations between the different features.

The dashboard is designed using standard visualisations to help identify correlations and offer an overview of the data. However, it lacks more advanced, application-tailored tools such as zooming, filtering or brushing, which are necessary for more detailed analysis. Overall, the examples in this work show promising results; however, their effectiveness was not explicitly validated in user studies.

Reliance on Good Prompting: The LLM integration is mainly focused on the chatbot interface, which requires users to write effective prompts. While we provided some prompt guidance by suggesting follow-up prompts and an initial prompt, the user is still responsible for writing good prompts.

Reproducibility of LLMs: In our paper, we evaluated only the model “gpt-4o-mini” and did not evaluate the reproducibility of prompts or how answers change with minor changes in the questions, which can be problematic [8, 9]. For more general conclusions about the applicability and usefulness of the tool, more analysis of these concerns would be needed.

Although the generalisability of the paper is somewhat limited, and the effectiveness of the LLM has to be further evaluated, the paper still shows the potential of LLM integrations for Visual Analytics dashboards and applications.

8. Conclusion & Future Work

In this work, we presented a comprehensive data exploration dashboard integrated with a Large Language Model to make the dashboard more accessible. As our use case focuses on patient data, we also explained how to anonymise it beforehand, resulting in a synthetically generated dataset of 100 entries that matches the original dataset’s characteristics, which we then visualise in the dashboard.

In the current state, there is only limited interaction between the chatbot and the dashboard, namely through the suggest features and correlation heatmap highlighting functions. The latter works dynamically: with every prompt, it is checked whether highlighting or interacting with the dashboard would provide more insight, and the tool then does so if needed.

Despite certain problems, much of the interaction with the tool happens via prompting. As a user guidance measure, we provide prompt suggestions within the dashboard which meaningfully accelerate the dataset exploration process.

The dashboard could be generalised for different datasets, and in a perfectly integrated data environment, the most interesting dataset or data view could be loaded using the LLM. Future work should also evaluate the effectiveness and precision provided by the various available LLMs.

Acknowledgments

This work was supported by the HEREDITARY project as part of the European Union’s Horizon Europe research and innovation programme under grant agreement No GA 101137074. Part of this work has already been outlined in a technical report, Deliverable 5.1 [10].

Declaration on Generative AI

The author(s) have not employed any Generative AI tools.

Acronyms

DP: Differential Privacy
GPT: Generative Pre-trained Transformer
LLM: Large Language Model
LM: Language Model
MWEM: Multiplicative Weights Exponential Mechanism
NLIs: Natural Language Interfaces
PCA: Principal Component Analysis
PD: Parkinson’s Disease
VA: Visual Analytics
VOSP: Visual Object and Space Perception

[1] Seo, Shneiderman, Knowledge discovery in high-dimensional data: Case studies and a user survey for the rank-by-feature framework, IEEE Trans. Vis. Comput. Graph. 12 (2006) 311-322. doi:10.1109/TVCG.2006.50.

[2] Gratzl, Lex, Gehlenborg, Pfister, Streit, Lineup: Visual analysis of multi-attribute rankings, IEEE Trans. Vis. Comput. Graph. 19 (2013) 2277-2286. doi:10.1109/TVCG.2013.173.

[3] Chegini, Shao, Gregor, D. J. Lehmann, Andrews, T. Schreck, Interactive visual exploration of local patterns in large scatterplot spaces, Comput. Graph. Forum 37 (2018) 99-109. doi:10.1111/cgf.13404.

[4] Zhao, Zhang, Zhao, Wang, Shao, Turkay, Chen, Leva: Using large language models to enhance visual analytics, IEEE Transactions on Visualization and Computer Graphics 31 (2025) 1830-1847. doi:10.1109/tvcg.2024.3368060.

[5] Hardt, Ligett, McSherry, A simple and practical algorithm for differentially private data release, in: Guide Proceedings, volume 2, Curran Associates Inc., 2012, pp. 2339-2347. doi:10.5555/2999325.2999396.

[6] Hutchinson, Jianu, Slingsby, Madhyastha, Llm-assisted visual analytics: Opportunities and challenges, arXiv preprint arXiv:2409.02691 (2024).

[7] M. R. Morris, Prompting Considered Harmful, Commun. ACM 67 (2024) 28-30. doi:10.1145/3673861.

[8] Kosch, Feger, Prompt-Hacking: The New p-Hacking?, arXiv (2025). doi:10.48550/arXiv.2504.14571. arXiv:2504.14571.

[9] Kosch, Feger, Risk or Chance? Large Language Models and Reproducibility in HCI Research, interactions 31 (2024) 44-49. doi:10.1145/3695765.

[10] Schreck, Lengauer, Waldert, Kantz, Grabner, Schilcher, Percic, C. van Leeuwen, M. Lissandrini, A. Romanovych, Deliverable 5.1: Visualization components for sequences, networks, text, and high-dimensional data, 2024. doi:10.5281/zenodo.14628086.