<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>October</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Are LLMs and the Model Context Protocol Sufficient for Automating Web-Based Information Processing?</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Stephen Cranefield</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Computing, University of Otago</institution>
          ,
          <addr-line>Dunedin</addr-line>
          ,
          <country country="NZ">New Zealand</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>26</volume>
      <issue>2025</issue>
      <fpage>0000</fpage>
      <lpage>0001</lpage>
      <abstract>
        <p>More than two decades ago, the concept of the Semantic Web was motivated by a vision of personal assistant agents that fulfil users' goals by locating, accessing and reasoning about information and services on the web. Until the recent advent of agents that can act on instructions generated by large language models (LLMs), progress towards this vision has been slow. Now, the natural language understanding abilities of LLMs and their emerging ability to generate and follow instructions suggest that LLM-powered agents may provide a path towards developing such general-purpose assistant agents. This paper presents a case study of automating a multi-step web-based information seeking and filtering task using a large language model provided with tool access via the Model Context Protocol (MCP) and information about relevant web resources. We found this was possible, but only by providing the LLM with detailed information about how to use the resources. We discuss the reasons for this and how this requirement could be eliminated by providing discovery mechanisms and web page usage information intended for LLMs. We also propose the development of higher-level models of web resources in terms of information processing goals or tasks.</p>
      </abstract>
      <kwd-group>
        <kwd>Information processing</kwd>
        <kwd>World Wide Web</kwd>
        <kwd>Large Language Models (LLMs)</kwd>
        <kwd>Tools</kwd>
        <kwd>Model Context Protocol (MCP)</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Since the birth of the World Wide Web in the 1990s, it has become an indispensable means for finding
information, buying products and services, submitting service requests, communicating with our social
networks and satisfying many other information processing and communication goals. Although
the web was designed for people, it is no surprise that researchers and industry have long sought
to develop software to extract information from the web [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] or to serve as a personal assistant to
automate information-gathering from local and web-based resources [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. As far back as 2001, a popular
science magazine article introduced the concept of the Semantic Web by imagining the existence of
personal assistant agents that fulfil users’ goals by interacting with other assistants and web resources
[
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. However, until the advent of pre-trained large language models (LLMs) and their connection with
memory, planners, tools, etc., to form “LLM Agents” [
        <xref ref-type="bibr" rid="ref4 ref5 ref6">4, 5, 6</xref>
        ], little progress was made towards this
vision.
      </p>
      <p>
        Recently, researchers from the fields of Web Architecture and the Web of Things, the Semantic Web
and Linked Data, and Autonomous Agents and Multi-Agent Systems have come together to study
how agents can use and be part of the modern web, considered as a “homogeneous hypermedia fabric
that interconnects everything—devices, physical objects, documents, or abstract concepts” [
        <xref ref-type="bibr" rid="ref7 ref8">7, 8</xref>
        ]. This
motivates the development of hypermedia multi-agent systems: “systems of agents able to perceive,
decide, and act through the Web to achieve goals”.<sup>1</sup>
      </p>
      <p>
        On the other hand, a huge amount of activity is occurring in the IT industry to develop tools that
enable LLM-powered agents to interact with each other and both local and online resources. Several
specifications have been developed for how these interactions could work in a plug-and-play fashion
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. In this paper we focus on the Model Context Protocol (MCP),<sup>2</sup> which is currently generating
a large amount of activity, including a flood of blog posts about its architecture and use, and the
development of various industry and community “MCP servers” that provide LLMs with one or more
tools to perform computations or provide access to other software services. The MCP is based on a more
traditional architecture than that envisaged for hypermedia multi-agent systems: it allows client-server
communication (via JSON-RPC 2.0) between the application hosting the LLM and one or more MCP
servers that provide tools.
      </p>
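      <p>As a rough illustration of the client-server exchange described above, the following Python sketch builds the kind of JSON-RPC 2.0 request that an MCP host might send to ask a server to run a tool. The tools/call method and its name and arguments parameters follow our reading of the MCP specification; the tool name fetch, its url argument and the helper make_tool_call are assumptions made for illustration only.</p>
      <preformat><![CDATA[
```python
import json
from itertools import count

# JSON-RPC request ids should be unique within a session.
_ids = count(1)


def make_tool_call(tool_name, arguments):
    """Build a JSON-RPC 2.0 request asking an MCP server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Assumed tool name and argument; a real server advertises its own tools.
request = make_tool_call("fetch", {"url": "https://nzdirectory.co.nz/"})
print(json.dumps(request, indent=2))
```
]]></preformat>
      <p>The host serialises a message of this kind and sends it over the transport it shares with the server; the server's reply carries the same id so that the host can match it to the request.</p>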
      <p>
        In this paper, we evaluate how much effort is required (if it is possible at all) to prompt an LLM equipped
with relevant MCP servers to automate a specific web-based information-seeking and filtering task presented
by Berners-Lee et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. We found that this was possible, but only after iterated prompt development that resulted
in the provision of detailed usage information about the required web resources, information that the agent
could not obtain by examining the HTML of those resources. This level of guidance is not realistic in
practice: unless the task were frequently repeated, it would be more efficient to do the task yourself. We
present our observations about the causes of this problem and suggestions about what is missing to
achieve general-purpose web information agents based on LLMs.
      </p>
    </sec>
    <sec id="sec-2">
      <title>2. Case Study</title>
      <p>
        In a classic article from 2001, Berners-Lee et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] introduce the concept of the Semantic Web in the
context of two siblings (Lucy and Peter) seeking to jointly organise medical treatment for their mother.
Lucy has just taken her mother to see a doctor, who says that a specialist appointment is needed and
then a series of physical therapy sessions. Lucy uses an assistant agent on her “handheld Web browser”
to set up the appointments. The agent contacts the doctor’s agent to obtain details of the required
treatment and then prepares a shortlist of providers by looking up several online lists, and selecting
those that are approved by the mother’s insurance plan, within a certain distance of her home, and
highly rated on trusted rating services. It then obtains available appointment times from the providers’
web sites and cross-references these with Lucy’s and Pete’s free times, as they will share the chauffeuring
duties.
      </p>
      <p>The article highlights how the Semantic Web would enable all parties’ agents to access information
from web sites and communicate with each other using structured knowledge representations that
make reference to online ontologies defining the concepts and relationships between them, as well as
rules for reasoning about them. It also proposed that a service discovery framework would allow agents
and web-based services to advertise their functionality in semantic terms so they can be discovered and
made use of dynamically to solve problems such as the one above.</p>
      <p>
        It has been nearly 25 years since that article was published, and while Semantic Web technologies
are now used in niche areas, the vision of widespread Semantic Web enhancement of existing
human-oriented web sites has not come to pass, arguably due to the high effort required to develop ontologies
and annotate existing web sites with semantic information. Furthermore, the domain-independent
planning competence required by the agents in the scenario has not yet become available, at least for
consumer applications. However, the rapid advancement of large language models (LLMs) promises
to address both of these limitations: (i) their high-level skill at understanding and generating natural
language raises the question of whether they can mediate between agents and services that use different
terminology, and (ii) their ability to decompose instructions into sequential tasks suggests they may
eventually be able to perform the type of task decomposition illustrated in the scenario (although we
acknowledge that the general planning ability of LLMs has limitations [
        <xref ref-type="bibr" rid="ref10 ref11">10, 11</xref>
        ]). Furthermore, an
explosion of frameworks and tools for LLM agents has made it possible for LLM agents to execute steps
of a plan that require interaction with external resources.
      </p>
      <p>
        We have therefore developed an implementation of a simplified version of the scenario above, using
the Model Context Protocol (MCP), which Ehtesham et al. [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ] position in Stage 1 of their roadmap
for interoperability between LLM-powered agents and external services and tools. The Agent
Communication Protocol (ACP), Agent-to-Agent Protocol (A2A) and Agent Network Protocol (ANP) are
positioned above MCP at increasingly higher levels. Our aim is to evaluate what is possible with the
fast-emerging MCP. There are already a large number of commercially and community developed MCP
servers available, and MCP is supported by a number of environments for running LLMs with tool
access (“MCP Hosts”), such as Claude for Desktop and LM Studio. We identify some limitations of this
approach and make suggestions about how these can be eliminated or mitigated.
      </p>
      <p>We used LM Studio version 0.3.17 with the LLM qwen2.5-7b-instruct, one of three LLMs available
for LM Studio that have “native” tool support. We installed the Fetch MCP server,<sup>3</sup> which fetches a web
page and converts it to Markdown; the MCP Http Server,<sup>4</sup> which provides HTTP GET, POST, PUT and
DELETE tools; and the Mapbox MCP Server,<sup>5</sup> which includes a forward geocoding tool and a matrix
tool that “calculates travel times and distances between multiple points”.</p>
      <p>
        Our simplified version of the scenario from Berners-Lee et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] involves prompting an LLM to:
      </p>
      <list list-type="bullet">
        <list-item>
          <p>Find all physiotherapists in Dunedin (New Zealand) and then either filter the list to keep only
those that are listed by the nib insurance company as FirstChoice medical providers, or if none
are, keep the full list.</p>
        </list-item>
        <list-item>
          <p>Ask the user for their address.</p>
        </list-item>
        <list-item>
          <p>Find the provider that is closest to the address.</p>
        </list-item>
        <list-item>
          <p>Ask the user to phone the selected provider for the first three available appointment times, and
then enter these in this chat.</p>
        </list-item>
        <list-item>
          <p>Check the user’s calendar to see at which of the appointment times the user is listed as free.</p>
        </list-item>
      </list>
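      <p>The filtering and selection steps above can be sketched in Python as follows. This is a minimal sketch rather than the prompt-driven process used in the case study: the provider names, the rule of matching providers by case-insensitive name and the distance values are all hypothetical, and the distances are treated as plain numbers with no stated unit.</p>
      <preformat><![CDATA[
```python
def shortlist(providers, approved_names):
    """Keep approved providers, or fall back to the full list if none match."""
    approved = {name.casefold() for name in approved_names}
    kept = [p for p in providers if p["name"].casefold() in approved]
    return kept or list(providers)


def closest(providers, distances):
    """Return the provider with the smallest distance (distances: name -> number)."""
    return min(providers, key=lambda p: distances[p["name"]])


# Hypothetical data standing in for the directory and insurer search results.
providers = [{"name": "Physio A"}, {"name": "Physio B"}, {"name": "Physio C"}]
approved_names = ["physio b", "Physio C"]
distances = {"Physio A": 1.2, "Physio B": 4.0, "Physio C": 2.5}

candidates = shortlist(providers, approved_names)
best = closest(candidates, distances)
print(best["name"])  # Physio C
```
]]></preformat>
      <p>Physio A is nearest overall but is not on the approved list, so the closest approved provider is chosen instead; with an empty approved list, the full list would be kept.</p>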
      <p>These instructions, developed through a trial and error process, have already decomposed the user’s
goal into sequential steps to reduce the demand for planning by the LLM. No tool was provided for
calendar access in the last step, but without further instruction the LLM asked the user to provide their
free times and then selected the best appointment time.</p>
      <p>The following information about resources was also provided in the prompt:</p>
      <list list-type="bullet">
        <list-item>
          <p>https://nzdirectory.co.nz/search?query=&lt;BUSINESS_SEARCH_TERM&gt;&amp;region=&lt;REGION&gt;&amp;city=&lt;CITY&gt;. This is the search URL pattern for a directory of businesses in New Zealand. For
&lt;BUSINESS_SEARCH_TERM&gt; you can use a sufficiently discriminating prefix of the business or
service type. Given a &lt;CITY&gt; you MUST choose &lt;REGION&gt; to be whichever of the following
New Zealand regions contains the city: Northland, Auckland, Waikato, Bay of Plenty, Gisborne,
Hawke’s Bay, Taranaki, Manawatu-Whanganui, Wellington, Tasman, Nelson, Marlborough, West
Coast, Canterbury, Otago and Southland.</p>
          <p>Use your background knowledge to choose the correct region for the city. You must convert the
region and city to lowercase when including them in the search URL. Performing a GET on the
search URL returns an HTML page for human viewing, so use a tool that converts the result to a
more concise format like Markdown.</p>
        </list-item>
        <list-item>
          <p>https://www.nib.co.nz/find-a-provider/api/recommendations?searchTerm=&lt;SEARCH_TERM&gt; is
the search URL for the insurance company nib’s FirstChoice providers. The search term can be a
sufficiently distinguishing prefix of the medical speciality of interest (in lowercase). The results
are in JSON, so obtain the results using a tool that does not modify the format. Examine the JSON
to find the details for the returned providers.</p>
        </list-item>
      </list>
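      <p>To make the URL patterns above concrete, the following Python sketch fills in the two templates. It is an illustration under stated assumptions, not part of the case study: the function names and the small city-to-region table are hypothetical (in the prompt, the LLM is asked to choose the region from its background knowledge), and the sketch only constructs the URLs without sending any requests.</p>
      <preformat><![CDATA[
```python
from urllib.parse import urlencode

# Hypothetical lookup table; the prompt relies on the LLM's own knowledge instead.
CITY_TO_REGION = {
    "dunedin": "otago",
    "auckland": "auckland",
    "wellington": "wellington",
}


def nz_directory_url(search_term, city):
    """Fill in the NZ Directory search pattern with lowercase region and city."""
    city = city.lower()
    region = CITY_TO_REGION[city]
    query = urlencode({"query": search_term, "region": region, "city": city})
    return "https://nzdirectory.co.nz/search?" + query


def nib_url(search_term):
    """Fill in the nib provider search pattern with a lowercase search term."""
    query = urlencode({"searchTerm": search_term.lower()})
    return "https://www.nib.co.nz/find-a-provider/api/recommendations?" + query


print(nz_directory_url("physio", "Dunedin"))
print(nib_url("Physio"))
```
]]></preformat>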
      <p>With this level of detail, the LLM was able to successfully reach the end of the simplified scenario.
However, further prompt engineering and the introduction of a response verifier agent would be needed
to create a more reliable solution.</p>
      <p><sup>3</sup> https://github.com/modelcontextprotocol/servers/tree/main/src/fetch</p>
      <p><sup>4</sup> https://github.com/one-matrix/mcp-http</p>
      <p><sup>5</sup> https://github.com/mapbox/mcp-server</p>
      <p>
        The need for such a detailed prompt highlights the need for solutions to the following problems.
      </p>
      <p>
        <bold>Tool documentation, selection and calling.</bold> During the iterative development and testing of the
prompt above, we found that the Mapbox MCP Server’s matrix tool does not provide the MCP client
with any information about the units of the distances it returns. This did not affect task performance,
which only required finding the closest nib FirstChoice physiotherapist. However, it highlights the need
for a more expressive approach to documenting tools than MCP’s reliance on function names
and a textual description. Furthermore, we tested a number of alternative tools for making HTTP
requests, and only those that LM Studio and qwen2.5-7b-instruct used successfully were made available
to LM Studio for subsequent attempts. We also found it necessary to include hints about tool use in the
prompt. Tool learning, which includes the problems of tool selection and calling, is an active research
field [
        <xref ref-type="bibr" rid="ref12 ref13">12, 13</xref>
        ] and should alleviate these issues in the future.
      </p>
      <p>
        <bold>Discoverability and informed selection of services.</bold> Ideally, there should be no need to provide
web resources in the LLM prompt, as these should be discoverable dynamically by the LLM
agent from registries and/or catalogues. This could be achieved by extending the existing web indexing
and search infrastructure, for example, by developing a web standard in collaboration with search
engine providers that enables web search results to be enhanced with service descriptions. These could
be either in natural language, under the assumption that LLM agents will be able to interpret these, or
use a semantic representation such as a Web of Things Thing Description [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ]. It is an open question
whether the latter approach is likely to reach widespread use in the mainstream web or whether a
textual description will prove to be sufficient as the capabilities of LLMs improve.
      </p>
      <p>Alternatively, a separate ecosystem of registries and catalogues could be developed. The original
vision for web services included such infrastructure, which never came about (at least for public use).
However, the promise of LLM-powered agents as universal clients of web resources may create sufficient
demand for such infrastructure to be viable.</p>
      <p>
        Given that multiple web sites may provide the same information, perhaps with different search
mechanisms, levels of detail and information currency, there is also a need to support user comments,
ratings and usage tips for specific sites.
      </p>
      <p>
        <bold>Navigation from a single entry point.</bold> Once a useful web site has been identified, a key principle
of the web is that it should be possible to navigate the site by following hyperlinks that have sufficient
context to understand their purpose. This is Fielding’s concept of “hypermedia as the engine of
application state” [
        <xref ref-type="bibr" rid="ref15">15</xref>
        ], abbreviated as HATEOAS in REST API circles. While navigation through web
sites has become natural for people browsing the web, and including links labelled with their relation
type in REST API responses is considered best practice [
        <xref ref-type="bibr" rid="ref16">16</xref>
        ], understanding how to navigate through
web sites is challenging for LLMs when accessing web pages designed for people to read. The number of
tokens consumed when a web page in HTML is included in a prompt can overwhelm an LLM (especially
when there are large amounts of JavaScript), hence some web access tools such as the Fetch MCP
server convert web pages to Markdown. Furthermore, the allowed options for drop-down lists in the
input fields for forms may be dynamically created by JavaScript. This may also be the case for the URL
template used when the form is submitted.
      </p>
      <p>For example, the NZ Directory search URL given in the prompt above cannot be readily inferred
(at least by this author) by examining the search page at https://nzdirectory.co.nz/. It only becomes
evident when the search results are returned. Furthermore, the accepted values for the region input
element are not present in the HTML (including scripts). Only once a region has been entered does
the city input element appear, with its options populated using a POST request to the server. This
dynamic modification of a web page cannot be predicted or observed by an LLM. Therefore, our prompt
specified a standard list of New Zealand regions to choose from and stated that the selected region must
be provided in the search URL’s query parameters in lowercase (otherwise no results are returned).
Similarly, the nib “Find a Provider” web page has search input fields for the healthcare provider name
or speciality and for a suburb or region, and both are populated dynamically by GET requests to the
server every time a character is typed in the field. However, it turns out the second of these inputs
is not required as a search URL query parameter, so we did not prompt the LLM to use it.</p>
      <p>
        This impediment to LLM agents’ use of modern dynamic web sites could be addressed in several
ways:
      </p>
      <list list-type="bullet">
        <list-item>
          <p>Use an MCP server with a richer interface for an LLM to interact with and examine web sites,
such as the browser-use MCP server<sup>6</sup> [<xref ref-type="bibr" rid="ref17">17</xref>]. This and other similar tools run a headless browser,
which allows interaction and interrogation at the DOM level as well as screenshot capture. This
approach would allow a suitably competent LLM to explore a web site by trial and error (and
user-supplied heuristics) with the aim of satisfying the user’s goal.</p>
        </list-item>
        <list-item>
          <p>Build separate web resources, parallel to the human-facing ones, that are optimised for use by
LLM agents. This is envisaged by a proposal for a Markdown file, llms.txt, at the top level of a
web site to provide LLMs accessing the site with “brief background information, guidance, and
links to detailed markdown files” [<xref ref-type="bibr" rid="ref18">18</xref>]. The links should be to LLM-friendly Markdown versions
of existing web pages on the site.</p>
        </list-item>
        <list-item>
          <p>Alternatively, vastly expand the information that is available from web sites via web services.
This and the option above seem unlikely to be attractive to organisations until there is widespread
consumer use of LLM web agents, and in the meantime there will still be a need for agents that
can use human-oriented sites.</p>
        </list-item>
        <list-item>
          <p>Publish web site documentation designed to help agents navigate their way through the site.
This could include structured information such as site maps, stand-alone documents in natural
language and machine-readable annotations embedded within web pages. A limiting factor of
this approach is that it will take extra time and effort when developing and maintaining a web
site, which may be difficult for organisations to justify.</p>
        </list-item>
        <list-item>
          <p>Develop web standards (in collaboration with the creators of web page development frameworks)
for the use of server-push technology<sup>7</sup> to provide navigational information to LLM agents
as the content is dynamically updated by scripts. This would remove the requirement to maintain
documentation for the LLM separately from the web site front end.</p>
        </list-item>
      </list>
      <p>
        <bold>Reasoning at a goal or task level.</bold> The interface between an MCP tool and an LLM is specified
at a low level, in terms of input schemas, a (hopefully meaningful) tool name and a “human-readable
description”. Any further information about when it would be useful to use each tool must be provided
in a system or user LLM prompt. Beyond the scope of MCP, there is also a lack of description formats
(other than natural language) and ontologies for the information processing purpose of web resources.
Some steps in this direction have been made in the Web of Things Thing Description information
model [
        <xref ref-type="bibr" rid="ref14">14</xref>
        ], which provides a format for describing (amongst other features) the affordances offered
by a web-connected thing to inspect properties, perform actions and publish events. Hafiene et al.
[
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] discuss the dynamic use of information represented using the Hypermedia Multi-Agent Systems
(hMAS) ontology to enable an agent to discover a description of a thing’s affordance that will achieve
one of its goals.<sup>8</sup>
      </p>
      <p>
        The abstraction of resources and tools in terms of goals or tasks seems an important means of
facilitating high-level planning for the combined use of various tools and resources. Furthermore, reasoning
at this level would allow the use of techniques like Hierarchical Task Network (HTN) planning [
        <xref ref-type="bibr" rid="ref20">20</xref>
        ]
to make use of predefined decompositions of tasks into subtasks. This would allow successful recipes
for combining multiple resources and tools to be published and considered for use by LLM agents
equipped with a planner (or perhaps by the LLM itself). However, research is needed into goal or task
representations suited to the context of using the web to find and create information (such as our case
study of finding and selecting medical practitioners and making appointments). An HTN planning
approach to information processing on a personal computer was proposed by the author [
        <xref ref-type="bibr" rid="ref21">21</xref>
        ]. This
requires a data model of the problem domain so that information flows into, out of, and between tasks
involving different resources can be modelled. This cannot be provided for general-purpose agents that
may be asked to perform any web information processing task. However, textual descriptions may
suffice for LLM-powered agents.
      </p>
      <p><sup>6</sup> https://medium.com/towards-agi/how-to-setup-and-use-browser-use-mcp-server-8d0725440f31</p>
      <p><sup>7</sup> We note that MCP supports Server-Sent Events (SSE) for delivering notifications from tools to clients.</p>
      <p><sup>8</sup> The scenario also involves the agent joining a regulated agent organisation, but this is beyond the
scope of our discussion.</p>
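      <p>The idea of predefined task decompositions can be illustrated with a deliberately simplified Python sketch. It shows only the recursive expansion of compound tasks into primitive ones; a full HTN planner would also handle alternative methods per task, preconditions and world state. The task names and the METHODS table are hypothetical and loosely modelled on our case study.</p>
      <preformat><![CDATA[
```python
# Hypothetical recipes: each compound task maps to an ordered list of subtasks.
METHODS = {
    "book_physio": ["shortlist_providers", "choose_closest", "pick_time"],
    "shortlist_providers": ["search_directory", "filter_by_insurer"],
    "pick_time": ["ask_user_for_times", "check_calendar"],
}


def decompose(task):
    """Expand a task depth-first into a flat, ordered list of primitive tasks."""
    if task not in METHODS:  # primitive task: nothing further to decompose
        return [task]
    plan = []
    for subtask in METHODS[task]:
        plan.extend(decompose(subtask))
    return plan


plan = decompose("book_physio")
print(plan)
```
]]></preformat>
      <p>Each primitive task in the resulting plan would then need to be matched to a tool or web resource, which is where goal-level or task-level descriptions of resources would be used.</p>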
    </sec>
    <sec id="sec-3">
      <title>3. Conclusion</title>
      <p>
        We have presented a case study of automating a multi-step information seeking and filtering task using
a large language model, tool access via MCP, and information about relevant web resources that require
providing search inputs. Our aim was to evaluate how close this current technology could get to the
capabilities envisaged for Semantic Web agents by Berners-Lee et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] in 2001, under the premise that
LLMs’ language understanding abilities could remove the need for semantic mark-up on web sites. We
were able to implement our case study, but detailed guidance about resource use was required in the
LLM prompt. Our conclusion is that to facilitate the use of the web by LLM-powered agents, further
advances are needed in the areas of resource registries and catalogues and web site usage information
intended for LLMs. Furthermore, we believe that higher-level abstractions of web resources are needed
that model the information-processing goals or tasks they can be used to achieve and how these can be
composed to solve complex goals.
      </p>
    </sec>
    <sec id="sec-4">
      <title>Declaration on Generative AI</title>
      <p>While the use of a generative AI tool was the subject of this paper, the author has not employed any
generative AI tools to write the paper.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>O.</given-names>
            <surname>Etzioni</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Banko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Soderland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D. S.</given-names>
            <surname>Weld</surname>
          </string-name>
          ,
          <article-title>Open information extraction from the web</article-title>
          ,
          <source>Communications of the ACM</source>
          <volume>51</volume>
          (
          <year>2008</year>
          )
          <fpage>68</fpage>
          -
          <lpage>74</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1145/1409360.1409378</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>P.</given-names>
            <surname>Cohen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Cheyer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Horvitz</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>El Kaliouby</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Whittaker</surname>
          </string-name>
          ,
          <article-title>On the future of personal assistants</article-title>
          ,
          in:
          <source>Proceedings of the 2016 CHI Conference Extended Abstracts on Human Factors in Computing Systems</source>
          ,
          <publisher-name>Association for Computing Machinery</publisher-name>
          ,
          <year>2016</year>
          , pp.
          <fpage>1032</fpage>
          -
          <lpage>1037</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1145/2851581.2886425</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>T.</given-names>
            <surname>Berners-Lee</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Hendler</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Lassila</surname>
          </string-name>
          ,
          <article-title>The semantic web</article-title>
          ,
          <source>Scientific American</source>
          <volume>284</volume>
          (
          <year>2001</year>
          )
          <fpage>34</fpage>
          -
          <lpage>43</lpage>
          . URL: https://static.scientificamerican.com/sciam/cache/file/394EDA92-D03F-4110-B5AA4465CE486800.pdf#page=25.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>T.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Chang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Pei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. V.</given-names>
            <surname>Chawla</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Wiest</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <article-title>Large language model based multi-agents: a survey of progress and challenges</article-title>
          ,
          in:
          <source>Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence</source>
          ,
          <year>2024</year>
          . doi:
          <pub-id pub-id-type="doi">10.24963/ijcai.2024/890</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>L.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Ma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Feng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          , et al.,
          <article-title>A survey on large language model based autonomous agents</article-title>
          ,
          <source>Frontiers of Computer Science</source>
          <volume>18</volume>
          (
          <year>2024</year>
          )
          <fpage>1</fpage>
          -
          <lpage>26</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Xi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Guo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>He</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Hong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Jin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Fan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Xiong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Jiang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Weng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Qin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zheng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Qiu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Q.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Gui</surname>
          </string-name>
          ,
          <article-title>The rise and potential of large language model based agents: a survey</article-title>
          ,
          <source>Science China Information Sciences</source>
          <volume>68</volume>
          (
          <year>2025</year>
          ).
          doi:10.1007/s11432-024-4222-0.
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>O.</given-names>
            <surname>Boissier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ciortea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Harth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ricci</surname>
          </string-name>
          ,
          <article-title>Autonomous Agents on the Web (Dagstuhl Seminar 21072)</article-title>
          ,
          <source>Dagstuhl Reports</source>
          <volume>11</volume>
          (
          <year>2021</year>
          )
          <fpage>24</fpage>
          -
          <lpage>100</lpage>
          . doi:10.4230/DagRep.11.1.24.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>O.</given-names>
            <surname>Boissier</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ciortea</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Harth</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Ricci</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Vachtsevanou</surname>
          </string-name>
          ,
          <article-title>Agents on the Web (Dagstuhl Seminar 23081)</article-title>
          ,
          <source>Dagstuhl Reports</source>
          <volume>13</volume>
          (
          <year>2023</year>
          )
          <fpage>71</fpage>
          -
          <lpage>162</lpage>
          . doi:10.4230/DagRep.13.2.71.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>A.</given-names>
            <surname>Ehtesham</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Singh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G. K.</given-names>
            <surname>Gupta</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kumar</surname>
          </string-name>
          ,
          <article-title>A survey of agent interoperability protocols: Model context protocol (MCP), agent communication protocol (ACP), agent-to-agent protocol (A2A), and agent network protocol (ANP)</article-title>
          ,
          <year>2025</year>
          . arXiv:2505.02279.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>K.</given-names>
            <surname>Stechly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Valmeekam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kambhampati</surname>
          </string-name>
          ,
          <article-title>Chain of thoughtlessness? An analysis of CoT in planning</article-title>
          ,
          <year>2024</year>
          . URL: http://arxiv.org/abs/2405.04776, arXiv:2405.04776 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>S.</given-names>
            <surname>Kambhampati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Valmeekam</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Guan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Stechly</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Verma</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Bhambri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Saldyt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Murthy</surname>
          </string-name>
          ,
          <article-title>LLMs can't plan, but can help planning in LLM-Modulo Frameworks</article-title>
          ,
          <year>2024</year>
          . URL: http://arxiv.org/abs/2402.01817, arXiv:2402.01817 [cs].
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Qin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Hu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zeng</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Zhou</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Xiao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y. R.</given-names>
            <surname>Fung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Su</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Qian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Tian</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Liang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Shen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Ye</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Tang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Cong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Lu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Yan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Han</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Phang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Yang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Ji</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <article-title>Tool learning with foundation models</article-title>
          ,
          <source>ACM Computing Surveys</source>
          <volume>57</volume>
          (
          <year>2024</year>
          ). doi:10.1145/3704435.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>C.</given-names>
            <surname>Qu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Wei</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Cai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Yin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-R.</given-names>
            <surname>Wen</surname>
          </string-name>
          ,
          <article-title>Tool learning with large language models: a survey</article-title>
          ,
          <source>Frontiers of Computer Science</source>
          <volume>19</volume>
          (
          <year>2025</year>
          )
          <elocation-id>198343</elocation-id>
          . doi:10.1007/s11704-024-40678-2.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <collab>World Wide Web Consortium</collab>
          ,
          <article-title>Web of Things (WoT) Thing Description 1.1</article-title>
          , W3C Recommendation, https://www.w3.org/TR/wot-thing-description11/,
          <year>2023</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>R. T.</given-names>
            <surname>Fielding</surname>
          </string-name>
          ,
          <article-title>Architectural Styles and the Design of Network-based Software Architectures</article-title>
          ,
          <source>PhD dissertation</source>
          , University of California, Irvine,
          <year>2000</year>
          . URL: https://ics.uci.edu/~fielding/pubs/dissertation/top.htm.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fowler</surname>
          </string-name>
          , Richardson Maturity Model, https://martinfowler.com/articles/richardsonMaturityModel.html,
          <year>2010</year>
          . Accessed: 2025-07-01.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>E.</given-names>
            <surname>Milošević</surname>
          </string-name>
          ,
          <article-title>How to setup and use Browser Use MCP Server</article-title>
          ,
          <source>Towards AGI</source>
          (Medium),
          <year>2025</year>
          . URL: https://medium.com/towards-agi/how-to-setup-and-use-browser-use-mcp-server-8d0725440f31.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>J.</given-names>
            <surname>Howard</surname>
          </string-name>
          , The /llms.txt file, https://github.com/AnswerDotAI/llms-txt,
          <year>2024</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>N.</given-names>
            <surname>Hafiene</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. G.</given-names>
            <surname>Nardin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Boissier</surname>
          </string-name>
          ,
          <article-title>Knowledge level support for programming agents to interact in regulated online forums</article-title>
          , in:
          <string-name>
            <given-names>S.</given-names>
            <surname>Cranefield</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L. G.</given-names>
            <surname>Nardin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Lloyd</surname>
          </string-name>
          (Eds.),
          <source>Coordination, Organizations, Institutions, Norms, and Ethics for Governance of Multi-Agent Systems XVII</source>
          , Springer Nature Switzerland, Cham,
          <year>2025</year>
          , pp.
          <fpage>100</fpage>
          -
          <lpage>112</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>I.</given-names>
            <surname>Georgievski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Aiello</surname>
          </string-name>
          ,
          <article-title>HTN planning: Overview, comparison, and beyond</article-title>
          ,
          <source>Artificial Intelligence</source>
          <volume>222</volume>
          (
          <year>2015</year>
          )
          <fpage>124</fpage>
          -
          <lpage>156</lpage>
          . doi:10.1016/j.artint.2015.02.002.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>S.</given-names>
            <surname>Cranefield</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Moreale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>McKinlay</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Purvis</surname>
          </string-name>
          ,
          <article-title>Automating the interoperation of information processing tools</article-title>
          ,
          in:
          <source>32nd Annual Hawaii International Conference on System Sciences</source>
          , IEEE Computer Society
          ,
          <year>1999</year>
          . doi:10.1109/HICSS.1999.773089.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>