<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>First Workshop on Computational Design and Computer-aided Creativity</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Marching The Manifold - Visualizing Semantic Relationships in CLIP's Latent Space</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tobias Bongartz</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Hochschule Bielefeld</institution>,
          <addr-line>Lampingstraße 3, 33649 Bielefeld</addr-line>,
          <country country="DE">Germany</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>23</volume>
      <abstract>
        <p>Latent spaces form the mathematical foundation of modern AI image generation systems, yet their high-dimensional complexity remains inaccessible to the design practitioners who could benefit most from understanding them. The Bachelor thesis "Marching The Manifold" transforms the abstract 768-dimensional embedding space of CLIP (Contrastive Language-Image Pre-training) [1] into an interactive, galaxy-like 3D visualization in which 235,886 embedded words are semantically clustered. The web interface is built with Vite, leveraging JavaScript and HTML alongside React and Three.js for real-time rendering. A custom-developed Latent Navigator tool, programmed in Python, employs beam search with FAISS [2]-optimized nearest-neighbor calculations to identify meaningful pathways through this space, which users explore via a purpose-built physical controller. As users traverse these pathways, image sets pre-generated with the diffusion models FLUX.1 [dev] [3] and PixelWave Flux.1-dev 03 [4] show corresponding imagery at each point of the pathway, revealing the continuous semantic transitions between concepts. The resulting system visualizes semantic relationships as navigable paths through the latent space. Transitions between concepts manifest as mostly coherent visual transformations in the generated images, while the celestial metaphor provides users with an intuitive frame of reference. This visualization approach bridges the gap between technical AI implementation and creative practice by enabling designers to develop intuitive mental models of latent spaces through direct exploration. The project demonstrates how thoughtful interaction design can transform abstract mathematical structures into accessible tools for creative exploration.</p>
      </abstract>
      <kwd-group>
        <kwd>Latent Space Navigation</kwd>
        <kwd>Computational Visualization</kwd>
        <kwd>CLIP Embeddings</kwd>
        <kwd>Physical Computing</kwd>
        <kwd>AI Transparency</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Video Documentation</title>
      <p>A video documentation can be seen at: https://youtu.be/L29iY3emqsA</p>
    </sec>
    <sec id="sec-2">
      <title>2. The Custom Controller</title>
      <p>A 3D rendering of the physical controller can be seen at: https://youtu.be/Vok3NN-SeLs</p>
    </sec>
    <sec id="sec-3">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the author used Perplexity and Claude 3.7 Sonnet Thinking in order to improve writing style and to check grammar and spelling. After using these tools, the author reviewed and edited the content as needed and takes full responsibility for the publication's content.</p>
    </sec>
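The abstract describes a Latent Navigator that runs beam search with FAISS-optimized nearest-neighbor lookups to find pathways between concepts in CLIP's word-embedding space. The following is a minimal NumPy-only sketch of that idea, not the thesis implementation: the brute-force cosine search here stands in for a FAISS index (e.g. `IndexFlatIP`), and the function name, scoring weights, and parameters are illustrative assumptions.

```python
import numpy as np

def beam_search_path(emb, start, goal, beam_width=3, steps=8):
    """Find a semantic pathway of word indices from `start` to `goal`.

    `emb` is a (num_words, dim) matrix of unit-normalized embeddings.
    At each step we interpolate a target between the start and goal
    vectors and extend each beam with the nearest unvisited words.
    (Illustrative stand-in for a FAISS-accelerated search.)
    """
    beams = [(0.0, [start])]  # (cumulative score, path so far)
    for t in range(1, steps + 1):
        # Interpolation point sliding from start toward goal
        alpha = t / steps
        target = (1 - alpha) * emb[start] + alpha * emb[goal]
        target = target / np.linalg.norm(target)
        candidates = []
        for score, path in beams:
            last = emb[path[-1]]
            # Favor words close to BOTH the previous word and the target;
            # a FAISS inner-product index would perform this lookup.
            sims = emb @ (0.5 * last + 0.5 * target)
            sims[path] = -np.inf  # never revisit words on this path
            for idx in np.argsort(sims)[-beam_width:]:
                candidates.append((score + sims[idx], path + [int(idx)]))
        # Keep only the highest-scoring partial paths
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    best = beams[0][1]
    if best[-1] != goal:
        best.append(goal)  # force the path to terminate at the goal word
    return best

# Usage with random stand-in embeddings (the thesis uses 768-d CLIP vectors)
rng = np.random.default_rng(0)
emb = rng.normal(size=(50, 8))
emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
path = beam_search_path(emb, start=0, goal=1)
```

The blended scoring term is one simple way to keep consecutive steps semantically close while still making progress toward the goal; the pre-generated image sets would then be keyed to the word indices along the returned path.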
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>[1] <string-name><given-names>Alec</given-names> <surname>Radford</surname></string-name>, Ilya Sutskever, Jong Wook Kim, Gretchen Krueger, and Sandhini Agarwal, "<article-title>CLIP: Connecting Text and Images</article-title>" (blog), January 5, <year>2021</year>, https://openai.com/index/clip/.</mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>[2] <string-name><given-names>Matthijs</given-names> <surname>Douze</surname></string-name> et al., "<article-title>The Faiss Library</article-title>" (arXiv, September 6, <year>2024</year>), https://doi.org/10.48550/arXiv.2401.08281.</mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>[3] "black-forest-labs/FLUX.1-dev · Hugging Face," accessed January 1, <year>2025</year>, https://huggingface.co/black-forest-labs/FLUX.1-dev.</mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>[4] "PixelWave - FLUX.1-Dev 03 | Flux Checkpoint | Civitai," December 14, <year>2024</year>, https://civitai.com/models/141592/pixelwave.</mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>