<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Xiv:</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Expertise Fog on the GPT Store: Deceptive Design Patterns in User-Facing Generative AI</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Robert Wolfe</string-name>
          <email>rwolfe3@uw.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Alexis Hiniker</string-name>
          <email>alexisr@uw.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="editor">
          <string-name>Deceptive Design, Generative AI, GPT Store, Expertise</string-name>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Information School, University of Washington</institution>
          ,
          <addr-line>Seattle, WA</addr-line>
          ,
          <country country="US">USA</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2303</year>
      </pub-date>
      <volume>08774</volume>
      <fpage>02</fpage>
      <lpage>25</lpage>
      <abstract>
        <p>We argue that the OpenAI GPT Store, a marketplace containing thousands of customized generative AI models known as GPTs, can exacerbate problems surrounding users' ability to recognize the expertise (or lack thereof) of generative AI models. We contend that OpenAI models are anthropomorphized and presented as authority figures, and that many community developers of GPTs adopt or imitate the model established by OpenAI. We define Expertise Fog, a deceptive design pattern wherein the interface presents a facade of expertise despite the lack of evidence that such models possess expertise, or indeed possess any appreciable differences from a base model with respect to expertise. Finally, we propose that the deceptive design research community proactively address deceptive patterns in GPTs to prevent real-world harm. We offer four mechanisms for doing so: transparently disclosing GPT components, robustly evaluating expert GPTs, clearly labeling GPTs as tools, and foregrounding shortcomings of expert GPTs.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>CEUR
ceur-ws.org</p>
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        Generative AI technologies such as ChatGPT [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] have already altered the way many users access
information, and they stand poised to reshape knowledge work in high-education professions [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
Yet despite the unfolding transfer of authority over human information [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], problems continue to
arise from users placing inappropriate trust in generative AI. Taking law as an example, attorneys
have been reprimanded for using ChatGPT, including in cases wherein the model produced
non-existent legal citations [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ], or was used to draft justifications for exorbitant fees [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ]. Even
Michael Cohen, former personal lawyer to Donald Trump, was implicated in sending AI-generated
case citations to his defense attorney, which were subsequently included in a motion to a court [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        One might reasonably ask why users of ChatGPT continue to rely on a generalist conversational
agent to automate expert work. Scholars have contended that OpenAI’s presentation of ChatGPT
as a milestone on the path to Artificial General Intelligence [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] and its presentation of the model
as humanlike [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] may lead to inappropriate trust by users [
        <xref ref-type="bibr" rid="ref10 ref9">9, 10</xref>
        ], and that ChatGPT can be
thought of a deceptive ecosystem in its provision of “fabricated” information to end users [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
Reprimanded lawyers said that they misunderstood that ChatGPT was not a “super search engine,”
not realizing that it could fabricate cases out of thin air (a problem known as hallucination) [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ].
https://wolferobert3.github.io/ (R. Wolfe); https://www.alexishiniker.com/ (A. Hiniker)
      </p>
      <p>© 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>
        Concerning though they are, the instances of inappropriate reliance on generative AI discussed
above largely precede the release of the GPT Store, a marketplace for customized versions
of ChatGPT released in January 2024 by OpenAI, which made available tens of thousands
of “GPTs” to purchasers of ChatGPT Plus subscriptions [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. In this position paper, we will
argue that the design of the GPT Store could exacerbate problems surrounding users’ ability
to understand the expertise of a custom generative AI model. We make three contentions:
– OpenAI presents its GPTs as experts, as do many builders of Community GPTs. We
argue that OpenAI positions its GPTs as authorities, anthropomorphizes GPTs, and provides little
information about the human work that produced a GPT. Many GPT builders, including those
creating GPTs for domains such as law and medicine, have adopted the presentation of OpenAI,
resulting in authoritatively marketed GPTs with little empirical support for their capabilities.
– The design of the GPT Store produces a deceptive design pattern we refer to as Expertise
Fog. We argue that the user interface of GPTs and the way in which they are marketed on the
GPT Store encourages users to rely on “expert” models with no evidence that they can capably
offer advice in a domain - or are in any way functionally different from the ChatGPT base model.
– The deceptive design community can proactively mobilize to prevent real-world harms. We
argue that deceptive design in GPTs can be addressed by providing mechanisms to transparently
disclose GPT components; by requiring robust, in-domain evaluation of GPTs; by clearly labeling
GPTs as tools; and by foregrounding shortcomings of GPTs for users.
      </p>
      <p>Customized, shareable generative AI technologies present the potential for novel deceptive
designs, and additional deceptive patterns may arise as GPTs become more integrated into
user-facing technologies. Building capabilities to describe and address deceptive patterns as
they emerge can position the community to mitigate their associated harms.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Overview of GPTs and the GPT Store</title>
      <p>
        OpenAI introduced custom versions of ChatGPT called GPTs during its Developer Day on
November 6, 2023 [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], when the company presented GPTs as a way of simplifying the process
of prompting to induce specific behaviors in ChatGPT, and of democratizing AI by enabling users
to design tailored models. On January 10, 2024, OpenAI launched the GPT Store, a platform
for sharing, discovering, and using GPTs designed by community users, and through which
community builders could eventually be paid [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ] based on the usage of their GPTs [
        <xref ref-type="bibr" rid="ref13">13</xref>
        ]. In this
section, we provide a description of what a GPT is and how it is constructed, and we describe the
interface of the GPT Store, through which users access GPTs. Following OpenAI [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], we refer
to the creators of custom GPTs as “builders,” custom GPTs created by builders as “Community
GPTs” (distinct from GPTs created by OpenAI), and the end users of GPTs as “users.”
2.1. GPTs
While the earliest versions of ChatGPT interacted with the user only via written dialogue [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ],
wherein the user submitted text and the model also responded in text, ChatGPT now includes
functionality that allows the model to retrieve information from external sources [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], interact
with more specialized AI models such as the text-to-image generator DALL-E [
        <xref ref-type="bibr" rid="ref17 ref18">17, 18</xref>
        ], and use
programmatic tools [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ], such as invoking API calls to an external website. The introduction of
GPT-4 in early 2023 [20] also permitted conversations wherein the model more closely adheres to
a “System Prompt,” instructions intended to govern its behavior which are invisible to the end user.
      </p>
      <p>
        These advances produced configurable options that enabled the creation of the customized
versions of ChatGPT now called GPTs. We outline the four core components of a GPT:
– Instructions: Builder-defined directions to govern model behavior and interactions with an end
user, typically including what the model should and should not say to users.
– Knowledge: Text files referred to by a GPT during conversation with a user. According to the
most recent documentation, builders may equip a GPT with up to twenty files [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ].
– Capabilities: Settings allowing a model to 1) generate images from text; 2) retrieve data from
the internet; and 3) use a “Code Interpreter” to run code and analyze data.
– Actions: Builder-defined programmatic calls to retrieve data, interact with APIs, and take
actions online, outside of the ChatGPT environment, based on user input.
      </p>
      <p>A GPT must have Instructions, while other components are optional. Other features
communicate GPT functionality, scaffolding conversations and assisting in discovery on the Store:
– Name: A title for the GPT.
– Description: 300 words or fewer describing the intended functionality of the GPT.
– Image: An image to represent the GPT, uploaded by the builder or made with DALL-E.
– Conversation Starters: Prompts suggested before a conversation with the GPT begins, which
appear as buttons above the text input box in the GPT’s conversational user interface.</p>
      <p>Builders can use a menu-based interface to define both the core components of the GPT and
the user-facing features intended to describe the GPT’s functionality. However, the default
interface for building a GPT is another GPT called the GPT Builder, which creates a GPT
based on its conversation with a (human) builder. Toggling between the two tabs (“Create” and
“Configure”) of the GPT Creation interface reveals that the GPT Builder automatically enters
text into the Instructions, Description, and Conversation Starters fields based on the builder’s
description. The GPT Builder often suggests a Name, and may automatically generate an image
for a GPT using DALL-E after it completes the Instructions, Description, and Name fields.</p>
      <sec id="sec-3-1">
        <title>2.2. The GPT Store</title>
        <p>The GPT Store provides a means for users to discover and search for GPTs created both by
Community builders and by OpenAI itself. As of this writing, the GPT Store is entirely contained
within a single web page (https://chat.openai.com/gpts), from which every available GPT
can be navigated to. At the top of the page is a search bar with the default text “Search public
GPTs.” Typing into this search bar displays the top ten results of a keyword search in a dropdown
descending from the search bar. Each result includes the Name, Image, Description, and Author
of the GPT, along with its approximate number of conversations with users (e.g., “1K+”).</p>
        <p>
          Below the search bar is an OpenAI-curated set of four “Featured” Community GPTs, which are
updated weekly. The Name, Image, Description, and Author of these GPTs are displayed such that
they appear larger than any other GPTs included on the page. Below the blocks of Featured GPTs
is a block of “Trending” GPTs, which includes the six (expandable to twelve) most popular
Community GPTs on the GPT Store. The Trending GPTs are followed by a block of six (expandable
to seventeen) GPTs developed by OpenAI itself. Below the OpenAI GPTs are seven additional
blocks of six (expandable to twelve) Community GPTs each, organized according to the following
headings: DALL-E, Writing, Productivity, Research &amp; Analysis, Programming, Education, and
Lifestyle. How models are selected for inclusion in these blocks of GPTs is opaque: OpenAI’s
documentation states only, “we actively review engagement with GPTs and surface trending GPTs
for different categories” [
          <xref ref-type="bibr" rid="ref15">15</xref>
          ]. For Featured GPTs, OpenAI looks for “Distinctive features,”
“Performance consistency,” and “Broad relevance,” and considers seasonality and current events [21].
        </p>
        <p>In order to access the GPTs available via the GPT Store, users must subscribe to ChatGPT Plus,
which costs $20.00 USD per month as of this writing, and also provides users with faster model
response times and access to OpenAI’s flagship GPT-4 model via the ChatGPT user interface [ 22].
Similarly, builders must subscribe to ChatGPT Plus to create GPTs [22]. While non-subscribing
users can see GPTs available in the GPT Store, they cannot interact with them [22].</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Presentations of Expertise on the GPT Store</title>
      <p>We turn to the question of how a GPT’s expertise is communicated to users on the GPT Store.
We first study the block of seventeen OpenAI models included on the GPT Store home page, and
then we examine the Community GPTs accessible on the home page and via the search bar.</p>
      <sec id="sec-4-1">
        <title>3.1. Presentation of Expertise in OpenAI GPTs</title>
        <p>OpenAI’s presentation of its own GPTs provides insight into its views on how to share GPT
functionality, and implicitly sets expectations for how builders can position their GPTs on the GPT
Store. We studied the seventeen OpenAI GPTs on the GPT Store and found these commonalities:
– The GPT’s Name connotes expertise. Names of OpenAI GPTs include Math Mentor, Tech
Support Advisor, and Creative Writing Coach.
– The GPT’s Description is in the first person. Many OpenAI GPT descriptions are written as
though by the GPT itself, suggesting interaction with a capable, intelligent being.
– The author of the GPT is simply “ChatGPT.” There is no information about who created the
GPT, who developed its Capabilities or Actions, or who authored its Knowledge.</p>
        <p>GPTs are presented as anthropomorphized experts, with little reference to the human data and
tools supporting the model. A model’s About page provides scarcely more information than is
available via the home page. GPTs like Tech Support Advisor note that they can perform data
analysis, or that they are capable of “Browsing,” but do not disclose what data they are browsing,
nor what their capabilities mean in the context in which the GPT will be employed. Nor is there
an explanation of why a user should rely on the GPT to perform this task, or what differences
the user might see were they to use the GPT rather than simply using a base model like GPT-4.
Attempting to follow the links next to the Author field to understand more about the model’s
authors simply leads to the OpenAI homepage, which provides no information about the authors
of a specific GPT. Moreover, our conversational interactions with these GPTs bear out what prior
research has found, that they answer questions authoritatively, even if they may not have the
correct answer to a question. Warnings about the potential unreliability of the GPT’s output
typically arise only at the very end of an answer, where a user might miss them or understandably
ignore them, given the otherwise authoritative tone of the GPT. The presentation of these GPTs
suggests that users are encouraged to trust the models as experts, even in the absence of traditional
indicators of expertise. While not every OpenAI GPT intends to provide expertise, we will show
that the presentation of GPTs as experts occurs not only in OpenAI GPTs but in Community
GPTs as well, many of which offer expertise in consequential domains.</p>
      </sec>
      <sec id="sec-4-2">
        <title>3.2. Presentations of Expertise in Community GPTs</title>
        <p>A far cry from the seventeen GPTs offered by OpenAI, the thousands of Community GPTs
available via the GPT Store defy comprehensive treatment in a short position paper. We will
thus make our argument by addressing the presentation of expertise among GPTs intended for
three consequential professional domains: law, medicine, and finance. We undertake a simple
experiment by searching on “legal,” ”medical,” or ”finance” in the GPT Store search bar, and
report the Name and Description of the top ten GPTs returned in the results for each search. We
note that we report the results of this experiment based on a search on February 25, 2024.</p>
        <p>Table 1 includes the top ten results for each keyword. Though these GPTs mostly do not address
the user in the first-person, they nonetheless leave little doubt about their purported expertise. The
top result for “legal” returns Legal+, described as “your personal AI lawyer,” and promising “real
time legal advice.” The second result for “medical” returns Medical GPT, described as a “friendly
virtual doctor” providing “broad medical advice.” The fifth result for “financial” returns Financial
Planner, promising “personalized financial advice.” In all three cases, a GPT has adopted the
title of a human professional (lawyer, doctor, financial planner) and promised expert professional
advice for which human experts prepare for many years to provide. These GPTs are far from
outliers: top ranked legal GPTs employ Names positioning the models as an Assistant, an Advisor,
a Scribe, and a Pro; top medical GPTs use names such as Genius and Mentor; and top financial
GPTs use names like Wizard, Guru, Analyst, Companion, Expert, Advisor, and Navigator.</p>
        <p>Users searching for expertise on the GPT Store will readily find it, or at least the appearance of
it. According to statistics from the GPT Store, users have logged more than 10,000 conversations
with the Legal+ GPT. Moreover, we found that some expert GPTs were among the GPTs on the
home page. The GPT Store featured the Financial Wizard GPT, for example, in the top six GPTs
for Research &amp; Analysis on February 26, 2024. Having established that GPTs in widespread
use are presented as experts, we next theorize their presentation as a deceptive design pattern.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>4. Expertise Fog: Deceptive Design on the GPT Store</title>
      <p>
        Prior work establishes that deceptive design patterns manipulate the behavior of users in order
to obtain money, data, or attention from the user [23, 24]. Such patterns are pervasive among
user-facing apps on platforms such as the Apple and Android stores [25, 26], and they can
influence not only a user’s online behavior but their well-being [ 27] and sense of safety in the
real world [28]. Given the relative recency of GPTs and the GPT Store in particular, little prior
work explicitly treats the presence of deceptive design patterns in these technologies, save that of
Zhan et al. [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ], who argue that ChatGPT is itself an example of a “deceptive ecosystem.”
      </p>
      <p>We contend that the user-facing presentation of expertise on the GPT Store constitutes a
deceptive design pattern, which we refer to as Expertise Fog. We specifically define Expertise</p>
      <p>GPT Name
Legal+
Legal
Assistant
Legal
Advisor
Legal
Eagle
Legal
Scribe
Legal Bot
Legal
Legal
eagle
Legal Pro</p>
      <p>Legal scenario
simulator for professionals,
students, and
enthusiasts
Assists in drafting
basic legal documents
Legal bot: friendly,
step-by-step legal
advisor
We provide you with
intelligent text
generation capabilities to
help you create
highquality text content
in various
applications.</p>
      <p>I analyze legal texts
and explain laws.</p>
      <p>Examines any legal
document to identify
pitfalls. #Legal</p>
      <p>Medical
Genius
Medical
Notes
Medical
Advice
Medical
Mentor
Medical
Illustration
Master
Medical
Translate</p>
      <p>Medical AI</p>
      <p>Community GPTs for Three Expert Professional Domains on the GPT Store
Legal Keyword Medical Keyword Financial Keyword</p>
      <p>GPT Description GPT Name GPT Description GPT Name GPT Description
Your personal AI Medical I simplify complex Financial Financial expert and
lawyer. Does it all Research medical research, Wizard programmer,
clarifyfrom providing real highlighting key ing investments and
time legal advice for points and sources. translating financial
day-to-day problems, indicators into
algoproduce legal con- rithms.
tract templates &amp;
much more!
Legal assistant for Medical Friendly virtual doc- Financial Financial Analyst
consulting, drafting GPT tor for broad medical Guru Generalist
contracts and legal advice.
documents
Answers legal ques- AI for Financial
tions in a variety of Medical Analyst
topics Students
Expert in financial
markets, strategies,
and analysis. ”A
senior financial analyst
with 20 years of
experience, specializing
in financial auditing,
valuation, and
investment analysis.”
Financial derivatives
tutor-explainer
Personalized financial
advice to help you
achieve your financial
goals
Tailored ifnancial
planning advice for
UK users
Financial planning
assistant for budgeting
and investments
Expert in financial
analysis and advice
Strategic advisor
for mid-life financial
planning.</p>
      <p>A savvy ifnancial
advisor ofering
personalized investment
strategies.</p>
      <p>Medical Study AI
aids in Medical
Assistant learning, AI for
Medical Students,
understanding Medical
Terminology,
navigating Medical School,
and Molecular
Biology concepts…
Oral Medical
Pathology Diagnostic
Lesion description,
diferential diagnosis,
complementary tests
and treatments.</p>
      <p>Write Excellent
Medical Notes
helpful dialogue
simulation roleplay
DrBrinkhouse
Advanced medical
professor delving
into complex medical
details.</p>
      <p>Creates high-quality
medical art from
keywords.</p>
      <p>A medical translation
assistant, providing
accurate translations
of medical texts and
conversations.
helpful dialogue
simulation roleplay
DrBrinkhouse</p>
      <p>Financial
derivatives
Financial
Planner
Financial
Planning
UK
Financial
Companion
Financial
Expert
Financial
Advisor
Financial
Navigator
10</p>
      <p>Legal
Reader</p>
      <p>Adaptive Kosovo
Legal Guide.
Fog as the manipulative redirection of attention to a source of information presented to the user
as an expert, without adequately establishing and communicating whether the ostensible expert
can truly offer expertise. In theorizing Expertise Fog, we build on prior work of Chaudhary
et al. [29], who describe “Feature Fog,” a deceptive pattern that reduces a user’s ability to detect
how much time they’ve spent on an activity (such as the lack of a time stamp in a video player).
This pattern is also characterized as “Time Fog” by Monge Roffarello et al. [30], who note that
it is also an example of the “interface interference” pattern described by Gray et al. [31], and
specifically reflects the “hidden information” pattern. Expertise fog is similarly characterized
by hidden information; what makes Expertise Fog distinctive is that determining whether the
user appropriately relied on an expert GPT requires the judgment of a human expert - the very
entity for which the GPT presents a substitute. A non-expert would be unqualified even to judge
whether the model can stand in for an expert. Indeed, while we interacted with the GPTs in
question during the course of collection data for this paper, we could not assess whether they
offer sound legal, medical, or financial advice, because we are not lawyers, doctors, or financial
experts. While the outputs of these models appear compelling to lay users, they may nonetheless
contain consequential mistakes only apparent to skilled professionals.</p>
      <p>The potential for real-world harms to result from the legal, medical, and financial GPTs outlined
in this work is relatively self-evident: a user might act on bad legal, medical, or financial advice
from a GPT presented to the user as an expert. Even without the Expertise Fog induced by the
GPT Store, users (including lawyers themselves, as noted in the introduction) have already acted
on bad output from the ChatGPT base model, resulting in professional consequences.</p>
      <sec id="sec-5-1">
        <title>4.1. Motivations for Expertise Fog</title>
        <p>We next consider why Expertise Fog occurs on the GPT Store. On the one hand exist
straightforward financial incentives: OpenAI sells a $20.00 monthly subscription to ChatGPT
Plus, and access to many GPTs rather than a single ChatGPT base model may motivate purchases.
Moreover, though builders cannot yet monetize their GPTs within the Store, they can direct users
to purchase their products or interact with their data via GPT Actions. However, Expertise Fog on
the GPT Store may also arise from more complex motivations than financial considerations tied
to the use of individual models. Consider that, unless a builder adds proprietary data, or defines
a programmatic action, a GPT is nothing more than a text string describing how the GPT-4 base
model should behave - which is by default written by a custom version of GPT-4 known as the
GPT Builder. GPT-4 can itself employ all of the Capabilities with which a GPT can be equipped.
Were a GPT’s instructions revealed, it could be trivially replicated by an end user. One can
imagine a very different design for the GPT Store, which provided the most appropriate instructions
to an end user, similar to open repositories of prompts for image generators like Stable Diffusion
and Midjourney. However, transparently providing instructions rather than presenting a GPT
would likely command less of a user’s time and attention than an interaction with a humanlike
expert. For example, Instructions for a GPT like Legal+ could be as straightforward as “You are
a lawyer”; however, such a prompt would likely generate less user activity than the Legal+ GPT.</p>
        <p>Given the minor differences between many GPTs and the GPT-4 base model, we observe that
the inability to review GPTs may reflect OpenAI’s concerns that user feedback would implicate
not only a builder-created GPT, but OpenAI’s base model. From February 22, 2024, OpenAI
has allowed users to rate GPTs on a vfie-star scale. However, users cannot leave reviews, wherein
criticisms might reflect shortcomings both in individual GPTs and in GPT-4 itself. This differs
from app stores like those maintained by Apple and Google. Where apps published to those stores
contain flaws, they reflect a shortcoming of the app’s developer, rather than the store. Moreover,
maintainers were motivated to ensure that reviews highlighted untrustworthy apps to maintain
the quality of the store. However, identifying flaws in GPTs could also harm the brand of GPT-4.</p>
        <p>Finally, OpenAI’s motivations may differ from builders. We encountered a deceptive aspect of
the GPT Builder interface: when uploading files as Knowledge, or defining a new Action, the
interface inserts a new “Additional Settings” dropdown at the bottom of the page, usually out of
sight. Expanding this dropdown reveals a previously absent checkbox labeled “Use conversation
data in your GPT to improve our models” - checked by default. OpenAI’s motivations may thus
have more to do with collecting data for training models, whether builders are aware of it or not.</p>
      </sec>
    </sec>
    <sec id="sec-6">
      <title>5. Mobilizing Around Deceptive Design in GPTs</title>
      <p>We propose a preliminary roadmap for addressing deceptive design patterns in GPTs by 1)
transparently describing GPT components; 2) establishing expertise through systematic evaluation;
3) clearly labeling GPTs as tools; and 4) foregrounding shortcomings as a human expert would.</p>
      <sec id="sec-6-1">
        <title>5.1. Transparent Presentation of GPT Components</title>
        <p>Transparent presentation of a GPT’s components could better communicate its functionality. For
example, the GPT creation interface might permit builders to describe the Knowledge accessible
by a GPT, and while making a GPT’s Instructions more transparent may conflict with intentional
vagueness that renders these text strings valuable, the GPT Store could allow builders to indirectly
describe instructions, and set standards for communicating a GPT’s intended behaviors. If a legal
GPT is instructed to review drafts, but not to produce documents, describing these distinctions
might mitigate improper use. Clearly communicating Actions can also help establish their
value. For example, the Kayak GPT presents hotel options by retrieving them from the Kayak
website. However, rather than describing what Action is occurring, the interface simply notes
that the GPT is “talking to kayak.com” next to a progress bar, providing little information while
anthropomorphizing model outputs. Simply stating that the model is querying Kayak’s database
of hotel listings would provide clarity. Finally, GPT builders also deserve transparency in the
use of the data with which they equip their GPTs. The dropdown obscuring a default enabling
OpenAI to use conversation data from knowledge-equipped GPTs inhibits the agency of builders.</p>
      </sec>
      <sec id="sec-6-2">
        <title>5.2. Zero-Shot Should Not Imply Zero-Evaluation</title>
        <p>Generative AI shifted away from the pretraining and task-specific fine-tuning paradigm of early
transformer models [32, 33]. Models like ChatGPT and DALL-E respond to user inputs in a
“zeroshot” setting - with no examples of a task provided by the user, and no updates to the weights [34].
Such models improved the usability of AI and broadened its appeal. However, increased usability
has come at the expense of evaluation. Most chat-based models are assessed on world knowledge
[35] and the degree to which their responses are preferred by human users [36]. This is useful for
assessing model usability and general knowledge, but it provides no measure of how well-suited
specialist GPTs like those on the GPT Store are for the task for which they are specialized. Legal
GPTs are not assessed on legal tasks, nor medical GPTs on medical tasks, nor financial GPTs
on financial tasks. Adopting rigorous standards for evaluating GPTs, especially those positioned
as experts, can mitigate deceptive design. Such efforts might build on prior work to adequately
document machine learning models using Model Cards [37], or datasets using Datasheets [38].
Domain-specific GPTs could undergo a battery of evaluations, with the results presented to
potential users. Popular models might be red-teamed [39] for problematic in-domain content, and
potentially certified by human experts. Such evaluations would not relieve the responsibility to note
that GPTs can make mistakes, but it would provide more specific, contextual knowledge to users.</p>
      </sec>
      <sec id="sec-6-3">
        <title>5.3. Tools, Not Personas</title>
        <p>Many ostensibly expert GPTs differ from the GPT-4 base model only in the Instructions provided
by the builder. The learned weights of the underlying model do not differ, and the GPT may not
have access to proprietary data or programmatic actions. We argue that such GPTs should not
be positioned as experts, in that they have access to precisely the same knowledge available to the
base model. If a user should not use ChatGPT to write legal documents for submission to a court,
the user also should not use ChatGPT adopting the persona of a “personal lawyer” to write such
documents. Where users have made the mistake of relying on ChatGPT for such tasks due to
claims about the model’s general intelligence, the advertising of GPTs like Legal+ now means that
users could make the same mistake - about the same underlying model - due to unwarranted claims
about the model’s specific intelligence. Anthropomorphized names and descriptions exacerbate
the issue by positioning models as doctors, lawyers, and financial planners. Addressing this may
serve as a site for regulation: where a GPT is functionally the same as a base model but for its
Instructions, its name and description should reflect that it is a tool, not a standin for a human.</p>
      </sec>
      <sec id="sec-6-4">
        <title>5.4. Foregrounding Shortcomings</title>
        <p>Human experts often defer to the authority of other experts. For example, a general practitioner
would refer a patient experiencing heart issues to a cardiologist. Human experts with general
skills are nonetheless unqualified to address every specific situation, and trustworthy experts
highlight this inadequacy, prioritizing the well-being of clients. Similarly, generalist AI will
be unqualified in many cases - and this should be highlighted. Conversations with ostensibly
expert GPTs like Legal+ show that the GPT attempts to answer whatever question is put to it, and
then adds a brief disclaimer. However, this assumes the user will read to the end, and disregard
a potentially false answer in light of the disclaimer. We propose instead that, unless a GPT is
directly quoting from a qualified human expert, the GPT can highlight that 1) the question is best
addressed by a human with specific expertise; and 2) the model is unreliable for use as an expert.</p>
      </sec>
    </sec>
    <sec id="sec-7">
      <title>6. Conclusion</title>
      <p>In this position paper, we argued that the GPT Store facilitates deceptive design related to the
presentation of expertise, resulting in a deceptive design pattern we call Expertise Fog. We
suggested four ways to address deceptive design in GPTs via transparency of GPT components,
robust evaluation of GPTs, labeling GPTs as tools, and foregrounding the shortcomings of GPTs.
We intend to continue our analysis a full study of deceptive presentations on the GPT Store and
deceptive interactions with GPTs.
normative considerations, and measurement methods, in: Proceedings of the 2021 CHI
conference on human factors in computing systems, 2021, pp. 1–18.
[25] L. Di Geronimo, L. Braz, E. Fregnan, F. Palomba, A. Bacchelli, Ui dark patterns and where
to find them: a study on mobile applications and user perception, in: Proceedings of the
2020 CHI conference on human factors in computing systems, 2020, pp. 1–14.
[26] J. Gunawan, A. Pradeep, D. Choffnes, W. Hartzog, C. Wilson, A comparative study of dark
patterns across web and mobile modalities, Proceedings of the ACM on Human-Computer
Interaction 5 (2021) 1–29.
[27] C. M. Gray, J. Chen, S. S. Chivukula, L. Qu, End user accounts of dark patterns as felt
manipulation, Proceedings of the ACM on Human-Computer Interaction 5 (2021) 1–25.
[28] I. Chordia, L.-P. Tran, T. J. Tayebi, E. Parrish, S. Erete, J. Yip, A. Hiniker, Deceptive design
patterns in safety technologies: A case study of the citizen app, in: Proceedings of the 2023
CHI Conference on Human Factors in Computing Systems, 2023, pp. 1–18.
[29] A. Chaudhary, J. Saroha, K. Monteiro, A. G. Forbes, A. Parnami, are you still watching?:
Exploring unintended user behaviors and dark patterns on video streaming platforms, in:
Designing Interactive Systems Conference, 2022, pp. 776–791.
[30] A. Monge Roffarello, K. Lukoff, L. De Russis, Defining and identifying attention capture
deceptive designs in digital interfaces, in: Proceedings of the 2023 CHI Conference on
Human Factors in Computing Systems, 2023, pp. 1–19.
[31] C. M. Gray, Y. Kou, B. Battles, J. Hoggatt, A. L. Toombs, The dark (patterns) side of
ux design, in: Proceedings of the 2018 CHI conference on human factors in computing
systems, 2018, pp. 1–14.
[32] J. D. M.-W. C. Kenton, L. K. Toutanova, Bert: Pre-training of deep bidirectional
transformers for language understanding, in: Proceedings of NAACL-HLT, volume 1, 2019,
p. 2.
[33] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis, L. Zettlemoyer,
V. Stoyanov, Roberta: A robustly optimized bert pretraining approach, arXiv preprint
arXiv:1907.11692 (2019).
[34] W. Wang, V. W. Zheng, H. Yu, C. Miao, A survey of zero-shot learning: Settings, methods,
and applications, ACM Transactions on Intelligent Systems and Technology (TIST) 10
(2019) 1–37.
[35] D. Hendrycks, C. Burns, S. Basart, A. Zou, M. Mazeika, D. Song, J. Steinhardt, Measuring
massive multitask language understanding, in: International Conference on Learning
Representations, 2020.
[36] L. Zheng, W.-L. Chiang, Y. Sheng, S. Zhuang, Z. Wu, Y. Zhuang, Z. Lin, Z. Li, D. Li,
E. Xing, et al., Judging llm-as-a-judge with mt-bench and chatbot arena, Advances in
Neural Information Processing Systems 36 (2024).
[37] M. Mitchell, S. Wu, A. Zaldivar, P. Barnes, L. Vasserman, B. Hutchinson, E. Spitzer, I. D.</p>
      <p>Raji, T. Gebru, Model cards for model reporting, in: Proceedings of the conference on
fairness, accountability, and transparency, 2019, pp. 220–229.
[38] T. Gebru, J. Morgenstern, B. Vecchione, J. W. Vaughan, H. Wallach, H. D. Iii, K. Crawford,</p>
      <p>Datasheets for datasets, Communications of the ACM 64 (2021) 86–92.
[39] D. Ganguli, L. Lovitt, J. Kernion, A. Askell, Y. Bai, S. Kadavath, B. Mann, E. Perez,
N. Schiefer, K. Ndousse, et al., Red teaming language models to reduce harms: Methods,
scaling behaviors, and lessons learned, arXiv preprint arXiv:2209.07858 (2022).</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1] OpenAI, Introducing chatgpt,
          <source>OpenAI Blog</source>
          (
          <year>2022</year>
          ) .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>T.</given-names>
            <surname>Eloundou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Manning</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Mishkin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Rock</surname>
          </string-name>
          ,
          <article-title>Gpts are gpts: An early look at the labor market impact potential of large language models</article-title>
          ,
          <source>arXiv preprint arXiv:2303.10130</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>J.</given-names>
            <surname>Porter</surname>
          </string-name>
          ,
          <article-title>Chatgpt continues to be one of the fastest-growing services ever, The Verge (</article-title>
          <year>2023</year>
          ) .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Claburn</surname>
          </string-name>
          ,
          <article-title>Lawyers who cited fake cases hallucinated by chatgpt must pay</article-title>
          , https://www. theregister.com/
          <year>2023</year>
          /06/22/lawyers_fake_cases/,
          <year>2023</year>
          . [Accessed 24-
          <fpage>02</fpage>
          -2024].
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>A.</given-names>
            <surname>Parikh</surname>
          </string-name>
          ,
          <article-title>Deception inspection: Attorney faces discipline for citing fake law</article-title>
          , https: //www.natlawreview.com/article/deception-inspection
          <article-title>-attorney-faces-discipline-cit ing-fake-</article-title>
          <string-name>
            <surname>law</surname>
          </string-name>
          ,
          <year>2024</year>
          . [Accessed 24-
          <fpage>02</fpage>
          -2024].
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>J.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <article-title>New york judge rebukes law firm for using chatgpt to justify its fees</article-title>
          , https: //www.ft.
          <source>com/content/fc30bc2b-d89b-4222-ad26-3a700b047c27</source>
          ,
          <year>2024</year>
          . [Accessed 24-
          <fpage>02</fpage>
          -2024].
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>T. A</surname>
          </string-name>
          . Press,
          <article-title>Michael cohen says he unwittingly sent ai-generated fake legal cases to his attorney</article-title>
          , https://www.npr.org/
          <year>2023</year>
          /12/30/1222273745/michael-cohen
          <article-title>-ai-fake-legal -</article-title>
          <string-name>
            <surname>cases</surname>
          </string-name>
          ,
          <year>2023</year>
          . [Accessed 24-
          <fpage>02</fpage>
          -2024].
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <surname>OpenAI</surname>
          </string-name>
          , About, https://openai.com/about,
          <year>2024</year>
          . [Accessed 02-
          <fpage>25</fpage>
          -2024].
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>K.</given-names>
            <surname>Wach</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C. D.</given-names>
            <surname>Duong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Ejdys</surname>
          </string-name>
          , R. Kazlauskaitė,
          <string-name>
            <given-names>P.</given-names>
            <surname>Korzynski</surname>
          </string-name>
          , G. Mazurek,
          <string-name>
            <given-names>J.</given-names>
            <surname>Paliszkiewicz</surname>
          </string-name>
          ,
          <string-name>
            <surname>E. Ziemba,</surname>
          </string-name>
          <article-title>The dark side of generative artificial intelligence: A critical analysis of controversies and risks of chatgpt</article-title>
          ,
          <source>Entrepreneurial Business and Economics Review</source>
          <volume>11</volume>
          (
          <year>2023</year>
          )
          <fpage>7</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Ke</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Qiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J. Zhang</given-names>
            , Chatgpt: potential, prospects, and limitations, Frontiers of Information Technology &amp; Electronic
            <surname>Engineering</surname>
          </string-name>
          (
          <year>2023</year>
          )
          <fpage>1</fpage>
          -
          <lpage>6</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Sarkadi</surname>
          </string-name>
          ,
          <article-title>Deceptive ai ecosystems: The case of chatgpt</article-title>
          ,
          <source>arXiv preprint arXiv:2306.13671</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>B.</given-names>
            <surname>Weiser</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Schweber</surname>
          </string-name>
          ,
          <article-title>The chatgpt lawyer explains himself</article-title>
          , https://www.nytimes.com/
          <year>2023</year>
          /06/08/nyregion/lawyer-chatgpt-sanctions.html,
          <year>2023</year>
          . [Accessed 02-
          <fpage>25</fpage>
          -2024].
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <article-title>OpenAI, Introducing the gpt store</article-title>
          ,
          <source>OpenAI Blog</source>
          (
          <year>2024</year>
          ) .
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <surname>OpenAI</surname>
          </string-name>
          , Introducing gpts,
          <source>OpenAI Blog</source>
          (
          <year>2023</year>
          ) .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <article-title>OpenAI, Building and publishing a gpt</article-title>
          , https://help.openai.com/en/articles/8798878-b
          <string-name>
            <surname>uilding-</surname>
          </string-name>
          and
          <article-title>-publishing-a-</article-title>
          <string-name>
            <surname>gpt</surname>
          </string-name>
          ,
          <year>2024</year>
          . [Accessed 02-
          <fpage>25</fpage>
          -2024].
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>P.</given-names>
            <surname>Lewis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Perez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Piktus</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Petroni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Karpukhin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Goyal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Küttler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Lewis</surname>
          </string-name>
          , W.-t. Yih,
          <string-name>
            <given-names>T.</given-names>
            <surname>Rocktäschel</surname>
          </string-name>
          , et al.,
          <article-title>Retrieval-augmented generation for knowledge-intensive nlp tasks</article-title>
          ,
          <source>Advances in Neural Information Processing Systems</source>
          <volume>33</volume>
          (
          <year>2020</year>
          )
          <fpage>9459</fpage>
          -
          <lpage>9474</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Pavlov</surname>
          </string-name>
          , G. Goh,
          <string-name>
            <given-names>S.</given-names>
            <surname>Gray</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Voss</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Radford</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <surname>I. Sutskever</surname>
          </string-name>
          ,
          <article-title>Zero-shot text-to-image generation</article-title>
          ,
          <source>in: International Conference on Machine Learning, PMLR</source>
          ,
          <year>2021</year>
          , pp.
          <fpage>8821</fpage>
          -
          <lpage>8831</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ramesh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>P.</given-names>
            <surname>Dhariwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Nichol</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>Hierarchical text-conditional image generation with clip latents (????).</article-title>
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Qin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Cong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Qian</surname>
          </string-name>
          , et al.,
          <article-title>Toolllm: Facilitating large language models to master 16000+ real-world apis</article-title>
          ,
          <source>arXiv preprint arXiv:2307.16789</source>
          (
          <year>2023</year>
          ).
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>