<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Elijah Cantu</string-name>
          <email>caelijah@umich.edu</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Gyunam Park</string-name>
          <email>g.park@tue.nl</email>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Electrical Engineering and Computer Science, University of Michigan</institution>
          ,
          <addr-line>Ann Arbor, MI</addr-line>
          ,
          <country country="US">United States</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Mathematics and Computer Science, Eindhoven University of Technology</institution>
          ,
          <addr-line>Eindhoven</addr-line>
          ,
          <country country="NL">the Netherlands</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <fpage>20</fpage>
      <lpage>24</lpage>
      <abstract>
        <p>Web browsing behavior offers a rich yet underexplored source of data for process mining (PM) research. However, existing web analytics tools primarily rely on server-side logs or unstructured browsing history, which lack the detailed and organized event logging needed for effective process analysis. To address this gap, we introduce Webel, a Chromium-based browser extension tailored for PM researchers. Webel captures structured web browsing events by automatically grouping related activities into coherent process traces while tracking tab relationships and navigation patterns. It supports multiple trace modeling strategies, including parent-child tab relationships, per-tab isolation, and session-wide grouping, providing flexible analytical perspectives. Webel also enables seamless export of logs in the XES format for immediate use in PM tools and offers optional Google Drive integration for cloud-based log synchronization.</p>
      </abstract>
      <kwd-group>
        <kwd>Process Mining</kwd>
        <kwd>Web Browsing Behavior</kwd>
        <kwd>User Journey</kwd>
        <kwd>Web Event Logging</kwd>
        <kwd>Browser Extension</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
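      <p>CEUR Workshop Proceedings (ceur-ws.org)</p>
      <table-wrap id="tab-metadata">
        <table>
          <thead>
            <tr><th>Metadata description</th><th>Value</th></tr>
          </thead>
          <tbody>
            <tr><td>Tool name</td><td>Webel</td></tr>
            <tr><td>Current version</td><td>1.0</td></tr>
            <tr><td>Legal code license</td><td>MIT</td></tr>
            <tr><td>Languages, tools and services used</td><td>Chrome Extensions API (Manifest V3)</td></tr>
            <tr><td>Supported operating environment</td><td>Chromium-based browsers</td></tr>
            <tr><td>Download/Demo URL</td><td>webels.org/install</td></tr>
            <tr><td>Documentation URL</td><td>webels.org</td></tr>
            <tr><td>Source code repository</td><td>github.com/elijahcantu/webel</td></tr>
            <tr><td>Screencast video</td><td>webels.org/demo</td></tr>
          </tbody>
        </table>
      </table-wrap>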
    </sec>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>
        PM research has traditionally focused on business processes derived from enterprise systems,
with limited exploration of user-level web browsing behavior as a data source. Although Web
server logs provide information on user interactions with specific websites [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], they do not
capture the entire user journey between multiple sites. Existing browser history tools and web
analytics platforms collect extensive browsing data but lack the semantic structure required for
PM analysis. Specifically, they do not organize events into meaningful traces with clear process
boundaries [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
        <fn><p>https://elijahcantu.com (E. Cantu); https://www.gyunam.com (G. Park)</p></fn>
        <fn><p>© 2025 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p></fn>
      </p>
      <p>
        This gap between raw browsing data and PM-ready event logs presents a significant barrier
to researchers analyzing web-based user behavior. Current tools cannot group events into logical
web browsing processes [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ], so researchers often need extensive data preprocessing, which can introduce
artificial trace boundaries that do not accurately reflect navigation behavior.
      </p>
      <p>Webel addresses these challenges by providing a browser extension that enables the automatic
grouping of web browsing events into coherent, semantically meaningful process instances.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Tool Innovations and Features</title>
      <p>Webel is a Chromium-based extension specifically designed for PM applications. Unlike
conventional web analytics tools that log events as flat, unstructured sequences, Webel implements
multiple trace modeling strategies to capture the semantic structure of user browsing behavior.
It preserves relationships between tabs and navigation events and exports the resulting event
logs directly in XES format, facilitating seamless integration with PM tools.</p>
      <sec id="sec-3-1">
        <title>2.1. Trace Modeling Strategies</title>
        <p>Webel supports three trace methods for grouping web browsing behavior into process instances.
Each offers a different analytical perspective:</p>
        <list list-type="bullet">
          <list-item>
            <p><bold>Parent-Child Tab Traces:</bold> This method groups browsing activities based on tab
relationships. When a user opens a new tab page, that tab starts a new trace as the parent. If a link
from this tab opens in another tab, the new tab inherits the parent’s trace ID. In this way, the method
captures the user’s main tasks along with any subtasks branching off.</p>
          </list-item>
          <list-item>
            <p><bold>Per-Tab Traces:</bold> Each browser tab is treated as a separate trace, regardless of how it
was opened. This allows for independent analysis of navigation behavior per tab and is
particularly useful when examining parallel exploration patterns.</p>
          </list-item>
          <list-item>
            <p><bold>Session-Based Traces:</bold> All activity during a single browser session is grouped into
a single trace. This models coarse-grained user behavior across a session, which is useful for
longitudinal task analysis or macro-level patterns.</p>
          </list-item>
        </list>
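        <p>The three strategies above can be illustrated with a minimal Python sketch. It is not Webel's implementation (the extension itself builds on the Chrome Extensions API), and the <monospace>TabOpened</monospace> record, its field names, and the trace ID formats are assumptions made only for illustration. The sketch shows how a single stream of tab-opening events can receive three different trace IDs at once.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TabOpened:
    """Hypothetical record: a tab was opened, optionally from another tab."""
    session_id: str
    tab_id: int
    opener_tab_id: Optional[int] = None  # None for a fresh new-tab page


def assign_trace_ids(events: List[TabOpened]) -> Dict[int, Dict[str, str]]:
    """Return, for each tab, its trace ID under each of the three strategies."""
    parent_child: Dict[int, str] = {}
    result: Dict[int, Dict[str, str]] = {}
    for ev in events:
        if ev.opener_tab_id is not None and ev.opener_tab_id in parent_child:
            # Child tab: inherit the trace ID of the tab it was opened from.
            parent_child[ev.tab_id] = parent_child[ev.opener_tab_id]
        else:
            # Fresh tab: it becomes the parent of a new trace.
            parent_child[ev.tab_id] = f"pc-{ev.tab_id}"
        result[ev.tab_id] = {
            "parent_child": parent_child[ev.tab_id],
            "per_tab": f"tab-{ev.tab_id}",
            "session": f"session-{ev.session_id}",
        }
    return result


# Tab 1 is a fresh tab, tab 2 is opened from tab 1, tab 3 from tab 2,
# and tab 4 is another fresh tab.
events = [
    TabOpened("s1", 1),
    TabOpened("s1", 2, opener_tab_id=1),
    TabOpened("s1", 3, opener_tab_id=2),
    TabOpened("s1", 4),
]
ids = assign_trace_ids(events)
```
]]></preformat>
        <p>In this example, tabs 1 to 3 share one parent-child trace while tab 4 starts its own; every tab has a distinct per-tab trace, and all four tabs fall into the same session trace.</p>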
      </sec>
      <sec id="sec-3-2">
        <title>2.2. Trace Export and Formats</title>
        <p>All three trace methods are logged simultaneously, so researchers can export and compare the
three variants in XES format using the Webel popup or the Google Drive sync. These logs are
compatible with PM tools such as ProM, Disco, Celonis, and pm4py, enabling immediate
analysis without format conversion.</p>
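        <p>To make the target format concrete, the following Python sketch serializes a small set of traces into a minimal XES document using only the standard library. It is an illustration under stated assumptions, not Webel's exporter: the trace ID, activity names, and timestamps are invented, and the XES extension declarations that a complete log would normally carry are omitted. The attribute keys <monospace>concept:name</monospace> and <monospace>time:timestamp</monospace> are the standard XES keys for names and timestamps.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple

# Trace ID -> list of (activity, ISO 8601 timestamp). Values are made up.
Traces = Dict[str, List[Tuple[str, str]]]


def to_xes(traces: Traces) -> str:
    """Serialize traces into a minimal XES document (no extension headers)."""
    log = ET.Element("log", {"xes.version": "1.0"})
    for trace_id, trace_events in traces.items():
        trace = ET.SubElement(log, "trace")
        # The trace-level concept:name identifies the process instance.
        ET.SubElement(trace, "string", {"key": "concept:name", "value": trace_id})
        for activity, timestamp in trace_events:
            event = ET.SubElement(trace, "event")
            ET.SubElement(event, "string", {"key": "concept:name", "value": activity})
            ET.SubElement(event, "date", {"key": "time:timestamp", "value": timestamp})
    return ET.tostring(log, encoding="unicode")


xes = to_xes({
    "tab-1": [
        ("Canvas", "2025-01-01T10:00:00.000+00:00"),
        ("Piazza", "2025-01-01T10:01:30.000+00:00"),
    ],
})
```
]]></preformat>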
      </sec>
      <sec id="sec-3-3">
        <title>2.3. Google Drive Integration</title>
        <p>Webel supports optional Google Drive synchronization. Users enter the ID or URL of a Google
Drive folder they can access, and the logs are then synced automatically to that folder using OAuth 2.0
and the Google Drive API. This feature facilitates cross-device analysis and shared research
workflows while retaining user control over storage and sharing.</p>
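        <p>Accepting either a folder ID or a folder URL implies some normalization step. The source does not describe how Webel performs it, so the following Python sketch is only one plausible approach. It assumes folder URLs of the common form <monospace>https://drive.google.com/drive/folders/&lt;ID&gt;</monospace> and assumes that IDs consist of letters, digits, hyphens, and underscores; the function name <monospace>extract_folder_id</monospace> is hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import re
from typing import Optional

# Assumed URL shape: https://drive.google.com/drive/folders/<ID>[?query]
_FOLDER_URL = re.compile(r"/folders/([A-Za-z0-9_-]+)")
# Assumed ID alphabet: letters, digits, hyphen, underscore.
_BARE_ID = re.compile(r"[A-Za-z0-9_-]+")


def extract_folder_id(user_input: str) -> Optional[str]:
    """Return the folder ID from a folder URL or a bare ID, else None."""
    text = user_input.strip()
    match = _FOLDER_URL.search(text)
    if match:
        return match.group(1)
    if _BARE_ID.fullmatch(text):
        return text
    return None
```
]]></preformat>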
      </sec>
      <sec id="sec-3-4">
        <title>2.4. Privacy and Control</title>
        <p>All data remains local unless explicitly exported or synced to Google Drive. The extension
popup lets users toggle logging on or off, trigger XES exports, or sync to Google Drive at
any time. If logging is disabled, reminder notifications and badge indicators prompt users to
re-enable it after they start a new browser session or clear their history.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>3. Case Study 1: Course Resource Navigation</title>
      <p>To evaluate Webel’s capabilities, we conducted a case study focused on how a user accessed
course-related resources on the Canvas platform, a widely used learning management system
in higher education, at the University of Michigan. In this scenario, the user visited the Canvas
page for each enrolled course and then attempted to reach all external resources linked from
that course page.</p>
      <p>Although this browsing pattern is common among students, it involves a complex sequence
of navigation steps that makes it well-suited for testing Webel’s trace logic. We recorded the
user’s browser sessions during this process and analyzed the resulting logs using Disco after
applying basic URL-based regular expressions to label each resource.</p>
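      <p>The labeling step can be sketched as follows. The rules used in the case study are not given in the source, so the patterns, domains, and labels below are illustrative assumptions only. The idea is to map each visited URL to an activity name by trying an ordered list of regular expressions and taking the first match.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import re
from typing import List, Pattern, Tuple

# Illustrative rules only; the first matching pattern wins.
LABEL_RULES: List[Tuple[Pattern[str], str]] = [
    (re.compile(r"^https?://([^/]+\.)?instructure\.com(/|$)"), "Canvas"),
    (re.compile(r"^https?://([^/]+\.)?piazza\.com(/|$)"), "Piazza"),
    (re.compile(r"^https?://([^/]+\.)?gradescope\.com(/|$)"), "Gradescope"),
    (re.compile(r"^https?://([^/]+\.)?zoom\.us(/|$)"), "Zoom"),
    (re.compile(r"^https?://([^/]+\.)?slack\.com(/|$)"), "Slack"),
    (re.compile(r"^https?://([^/]+\.)?google\.com(/|$)"), "Google Services"),
]


def label_url(url: str) -> str:
    """Map a URL to an activity label, or "Other" if no rule matches."""
    for pattern, label in LABEL_RULES:
        if pattern.search(url):
            return label
    return "Other"


labels = [
    label_url("https://umich.instructure.com/courses/1"),
    label_url("https://docs.google.com/document/d/x"),
    label_url("https://example.org/"),
]
```
]]></preformat>
      <p>Coarse rules like these give one activity per service; finer patterns on the URL path would yield more detailed activities at the cost of a larger process model.</p>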
      <p>Figure 1a–c present process models generated from the same set of browsing sessions, each
visualized using a different trace modeling strategy. The parent-child trace map captures how
external resource visits are organized under each course as distinct sub-tasks, revealing the
hierarchical structure of student navigation and how main activities generate subtasks. The
per-tab model provides insight into how the user structured their navigation across browser tabs,
showing clear separation of activity per course and highlighting parallel exploration behavior.
The session trace offers a comprehensive temporal view of the entire session, reflecting the
user’s sequential progress through a course and its associated resources, and capturing habitual
return points and overall workflow patterns.</p>
      <p>The generated models reveal several important insights. Canvas emerges as the central hub
across all perspectives, with substantial self-loop activity (228 returns), indicating frequent
returns as a primary “home base.” A dual-hub pattern also appears, with Google Services
(148 visits) serving as a secondary integration hub, reflecting the role of third-party tools in
educational workflows.</p>
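      <p>Counts such as self-loops and visits are read from the process map, which is built on directly-follows relations between consecutive activities in each trace. As a hedged illustration of that idea (not Disco's algorithm, and with invented traces rather than the case study data), the relation can be counted as follows.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter
from typing import Dict, List


def directly_follows(traces: Dict[str, List[str]]) -> Counter:
    """Count how often one activity is directly followed by another."""
    counts: Counter = Counter()
    for activities in traces.values():
        # Pair each activity with its immediate successor in the same trace.
        for source, target in zip(activities, activities[1:]):
            counts[(source, target)] += 1
    return counts


# Invented example traces (activity labels per trace ID).
traces = {
    "t1": ["Canvas", "Canvas", "Google Services", "Canvas"],
    "t2": ["Canvas", "Piazza", "Canvas", "Canvas"],
}
dfg = directly_follows(traces)
# A self-loop is a directly-follows pair whose source equals its target.
self_loops = {a: n for (a, b), n in dfg.items() if a == b}
```
]]></preformat>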
      <p>Resource usage shows clear functional distinctions: lecture recordings (30 visits) and Slack
(14) are heavily used, while assessment platforms like Gradescope (18) and discussion forums
like Piazza (51) see moderate activity. Synchronous tools such as Zoom (4) and office hours
queues (8–9) have lower but consistent usage.</p>
      <p>Taken together, the three trace perspectives provide complementary views: parent-child traces
reveal task hierarchies and branching behavior, per-tab traces emphasize parallel exploration,
and session traces illustrate sequential workflows and macro-level patterns. This case study
highlights Webel’s flexibility in supporting multiple trace perspectives out of the box, allowing
researchers to choose the most appropriate level of abstraction for their analysis without
requiring manual segmentation or complex preprocessing.</p>
      <fig id="fig-1">
        <label>Figure 1</label>
        <caption>
          <p>Comparing navigation flows using three trace modeling strategies: (a) parent-child tab process map, (b) per-tab process map, (c) per-session process map.</p>
        </caption>
      </fig>
    </sec>
    <sec id="sec-5">
      <title>4. Case Study 2: Canvas Files User Journey</title>
      <p>To demonstrate Webel’s applicability to single-tab navigation analysis, we examined a focused
scenario within Canvas’s file management system. This study was conducted entirely within a
single tab and browser session, so we applied the per-tab trace method. Since Canvas does not
support subfolder previews, users must repeatedly enter and exit individual folders to explore
the file hierarchy. This behavior results in a navigation pattern that is well-suited for analyzing
inefficiencies within isolated browsing contexts.</p>
      <p>
        We recorded browsing events during typical file navigation tasks and applied more
fine-grained URL-based regular expressions to label each navigation. The resulting event log was
analyzed using Disco and Cortado [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. The process model in Figure 2 reveals frequent loops
between subfolder and file preview activities, quantifying the repetitive overhead imposed by
Canvas’s interface design. Figure 3 shows multiple process variants, each containing
redundant cycles that highlight the absence of efficient file browsing paths. This focused analysis
demonstrates Webel’s practical value for identifying specific usability bottlenecks. Webel can
transform raw browsing data into actionable insights for user experience improvement.
      </p>
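      <p>Process variants and redundant cycles of this kind can be approximated directly from a labeled event log. The following Python sketch is illustrative only: the traces and activity names are invented, and the revisit count is a simple stand-in for redundancy rather than a metric reported by Disco or Cortado. A variant is taken to be a distinct sequence of activities, and a revisit is any event whose activity already occurred earlier in the same trace.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import Counter
from typing import Dict, List


def variant_counts(traces: Dict[str, List[str]]) -> Counter:
    """Count traces per variant, i.e. per distinct activity sequence."""
    return Counter(tuple(activities) for activities in traces.values())


def revisits(activities: List[str]) -> int:
    """Count events whose activity already occurred earlier in the trace."""
    seen = set()
    count = 0
    for activity in activities:
        if activity in seen:
            count += 1
        seen.add(activity)
    return count


# Invented traces; activity names are assumptions for illustration.
traces = {
    "a": ["Files", "Subfolder", "File preview", "Subfolder", "File preview"],
    "b": ["Files", "Subfolder", "File preview", "Subfolder", "File preview"],
    "c": ["Files", "Subfolder", "File preview"],
}
variants = variant_counts(traces)
revisit_counts = {trace_id: revisits(acts) for trace_id, acts in traces.items()}
```
]]></preformat>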
    </sec>
    <sec id="sec-6">
      <title>5. Conclusion</title>
      <p>The tool demonstrates robust support for logging web browsing events, and its XES export
functionality has been verified for compatibility with leading PM platforms. Webel focuses
exclusively on Chromium-based browsers, and logging activities outside the browser is out of
scope for the current design. Future work includes adding support for Safari.</p>
      <p>Multiple trace modeling strategies in Webel address the need for flexible process instance
identification in web browsing and allow researchers to select the most appropriate level of
abstraction for their research questions without manual preprocessing.</p>
    </sec>
    <sec id="sec-7">
      <title>6. Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used ChatGPT and Claude for grammar and
spelling checks. After using these services, the authors reviewed and edited the content as
needed and take full responsibility for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>H.</given-names>
            <surname>Sarirah Husin</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Ismail</surname>
          </string-name>
          ,
          <article-title>Process mining approach to analyze user navigation behavior of a news website</article-title>
          , in:
          <source>Proceedings of the 4th International Conference on Information Science and Systems</source>
          , ICISS '21,
          <publisher-name>Association for Computing Machinery</publisher-name>
          ,
          <publisher-loc>New York, NY, USA</publisher-loc>
          ,
          <year>2021</year>
          , p.
          <fpage>7</fpage>
          -
          <lpage>12</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1145/3459955.3460593</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>N.</given-names>
            <surname>Bhattacharya</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Gwizdka</surname>
          </string-name>
          ,
          <article-title>Yasbil: Yet another search behaviour (and) interaction logger</article-title>
          , in:
          <source>Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval</source>
          , SIGIR '21,
          <publisher-name>Association for Computing Machinery</publisher-name>
          ,
          <publisher-loc>New York, NY, USA</publisher-loc>
          ,
          <year>2021</year>
          , p.
          <fpage>2585</fpage>
          -
          <lpage>2589</lpage>
          . doi:
          <pub-id pub-id-type="doi">10.1145/3404835.3462800</pub-id>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <collab>GESIS - Leibniz Institute for the Social Sciences</collab>
          ,
          <source>Gesis web tracking</source>
          ,
          <year>2025</year>
          . URL:
          <ext-link ext-link-type="uri" xlink:href="https://www.gesis.org/en/services/planning-studies-and-collecting-data/collecting/gesis-web-tracking">https://www.gesis.org/en/services/planning-studies-and-collecting-data/collecting/gesis-web-tracking</ext-link>
          , accessed: [insert your access date].
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>D.</given-names>
            <surname>Schuster</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. J.</given-names>
            <surname>van Zelst</surname>
          </string-name>
          ,
          <string-name>
            <given-names>W. M.</given-names>
            <surname>van der Aalst</surname>
          </string-name>
          ,
          <article-title>Cortado: A dedicated process mining tool for interactive process discovery</article-title>
          ,
          <source>SoftwareX</source>
          <volume>22</volume>
          (
          <year>2023</year>
          )
          <elocation-id>101373</elocation-id>
          . doi:
          <pub-id pub-id-type="doi">10.1016/j.softx.2023.101373</pub-id>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>