<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">ScaDANN: A Scalable Disk-Based Graph Indexing Method for ANN</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Hyein</forename><surname>Gu</surname></persName>
							<email>hyein99@kaist.ac.kr</email>
							<affiliation key="aff0">
								<orgName type="institution">Korea Advanced Institute of Science and Technology</orgName>
								<address>
									<country key="KR">Republic of Korea</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Min-Soo</forename><surname>Kim</surname></persName>
							<email>minsoo.k@kaist.ac.kr</email>
							<affiliation key="aff0">
								<orgName type="institution">Korea Advanced Institute of Science and Technology</orgName>
								<address>
									<country key="KR">Republic of Korea</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">ScaDANN: A Scalable Disk-Based Graph Indexing Method for ANN</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">3E6B94EDF3C29A4F6BF0A2B494801A4F</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2025-04-23T18:37+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Graph</term>
					<term>ANN</term>
					<term>Nearest neighbor search</term>
					<term>Vector search</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Approximate Nearest Neighbor (ANN) indexing constitutes a fundamental component of modern vector-based databases, facilitating efficient and accurate information retrieval for applications such as retrieval-augmented generation. However, scaling graph-based ANN indices to billion-scale datasets poses substantial challenges, including high memory demands and inefficiencies in handling partitioned graphs. To address these issues, we propose ScaDANN, a scalable disk-based graph indexing method tailored for large-scale datasets under limited memory conditions. ScaDANN introduces two novel techniques: overlapping block-level insertion and grid block merge, which enable the efficient construction of unified graph indices while preserving high search performance. Our approach achieves notable advancements in index construction speed, search accuracy, and memory efficiency, establishing ScaDANN as a robust and effective solution for scalable ANN indexing in resource-limited environments.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Large Language Models (LLMs) have revolutionized natural language processing by excelling in tasks such as text generation <ref type="bibr" target="#b0">[1]</ref>, summarization <ref type="bibr" target="#b1">[2]</ref>, and question answering <ref type="bibr" target="#b2">[3]</ref>. Their ability to leverage pre-trained knowledge has facilitated applications across diverse domains, including healthcare <ref type="bibr" target="#b3">[4]</ref>, education <ref type="bibr" target="#b4">[5]</ref>, and customer service <ref type="bibr" target="#b5">[6]</ref>. However, despite their strengths, LLMs face critical limitations, such as hallucination <ref type="bibr" target="#b6">[7]</ref>, where they generate plausible but incorrect outputs, outdated knowledge due to static training data, and privacy concerns when processing sensitive information <ref type="bibr" target="#b7">[8]</ref>.</p><p>Retrieval-Augmented Generation (RAG) <ref type="bibr" target="#b8">[9]</ref> has been introduced as a promising framework to mitigate these limitations by integrating LLMs with external knowledge retrieval systems. By leveraging vector databases (VectorDBs) <ref type="bibr" target="#b9">[10]</ref>, RAG retrieves relevant and up-to-date information to enhance the accuracy and reliability of generated responses. This framework not only reduces the impact of hallucination but also addresses privacy concerns by accessing only the information necessary for a given query.</p><p>Approximate Nearest Neighbor (ANN) indexing serves as a foundational component in RAG systems, enabling efficient information retrieval from VectorDBs. 
ANN indexes can be broadly categorized into clustering-based <ref type="bibr" target="#b10">[11,</ref><ref type="bibr" target="#b11">12,</ref><ref type="bibr" target="#b12">13]</ref>, hash-based <ref type="bibr" target="#b13">[14,</ref><ref type="bibr" target="#b14">15,</ref><ref type="bibr" target="#b15">16]</ref>, and graph-based methods <ref type="bibr" target="#b16">[17,</ref><ref type="bibr" target="#b17">18,</ref><ref type="bibr" target="#b18">19,</ref><ref type="bibr" target="#b19">20,</ref><ref type="bibr" target="#b20">21]</ref>. Clustering-based methods partition datasets into smaller clusters, facilitating memory-efficient index construction but often incurring longer construction times and reduced search accuracy. Hash-based methods employ hash functions to compress datasets and reduce dimensionality, which accelerates construction and search processes but at the cost of reduced accuracy due to compression. Graph-based methods, in contrast, represent vectors as nodes and their relationships as edges, connecting nodes based on proximity <ref type="bibr" target="#b21">[22]</ref>. These methods construct neighbor lists through vector operations between nodes (as illustrated in Figure <ref type="figure" target="#fig_0">1</ref>) and are particularly effective for large-scale datasets, offering superior search accuracy and performance despite higher memory demands during construction.</p><p>The need for scalable ANN indexing becomes increasingly apparent as data is segmented into smaller chunks to meet the input constraints of LLMs. For instance, dividing a 100 MB document into 4 KB chunks suitable for LLM input results in approximately 25,600 chunks, a 25,600-fold increase in data chunks relative to the original document count. 
This exponential growth in the number of chunks significantly increases the size of the ANN index, necessitating the development of methods capable of efficiently handling billion-scale datasets.</p><p>Building large-scale ANN indexes, however, poses significant challenges <ref type="bibr" target="#b22">[23,</ref><ref type="bibr" target="#b23">24]</ref>. First, the substantial size of datasets and indexes requires considerable memory resources. Second, achieving competitive search performance on disk-based systems, which are necessary for handling such datasets, is particularly challenging. To address these issues, we propose ScaDANN, a Scalable Disk-based Graph Indexing Method for ANN, specifically designed to overcome these limitations.</p><p>This study pursues two primary objectives: (1) to enable the construction of large-scale ANN indexes in memory-constrained environments and (2) to enhance search performance through the use of merged graph structures that improve efficiency and accuracy during the search process. By focusing on these goals, we provide a scalable and efficient solution for constructing and querying ANN indexes in scenarios where both large-scale datasets and memory limitations are critical challenges.</p><p>This work makes the following key contributions:</p><p>• We propose a graph-based ANN method that leverages grid blocks to efficiently construct and scale billion-scale indexes in memory-constrained environments. • Our approach outperforms existing disk-based methods, achieving index construction speeds 1.3 to 1.5 times faster and search speeds 1.2 to 1.4 times faster, while maintaining comparable accuracy. 
• We demonstrate that our method enables the construction of single indexes up to 10 times larger than those of existing approaches within the same memory constraints.</p><p>These contributions mark a significant advancement in ANN indexing, particularly in terms of efficiency and scalability for large-scale, memory-constrained applications.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Preliminaries</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.1.">ANN search</head><p>Nearest Neighbor (NN) search identifies the data point(s) in a dataset closest to a given query point based on a defined distance metric, such as Euclidean distance or cosine similarity. While exact NN search computes the distance between the query and every data point to determine the nearest neighbor, this approach becomes computationally infeasible for large datasets. ANN search addresses this challenge by finding points that are approximately close to the true nearest neighbors, significantly reducing computational costs. Although ANN sacrifices some accuracy, it delivers faster query response times, which is particularly advantageous in high-dimensional and large-scale datasets.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Definition 1 (ANN).</head><p>Given a dataset 𝐷 = {𝑥 1 , 𝑥 2 , … , 𝑥 𝑛 } and a query point 𝑞, the task of nearest neighbor search (whether exact or approximate) is to find the point 𝑥 𝑁 𝑁 ∈ 𝐷 such that the distance 𝑑(𝑞, 𝑥 𝑁 𝑁 ) is minimized, where 𝑑(⋅, ⋅) represents a distance metric. In the case of ANN search, the algorithm guarantees that the returned point 𝑥 * satisfies:</p><formula xml:id="formula_0">𝑑(𝑞, 𝑥 * ) ≤ (1 + 𝜖) ⋅ 𝑑(𝑞, 𝑥 𝑁 𝑁 ),</formula><p>where 𝜖 is a small approximation factor, and 𝑥 𝑁 𝑁 is the exact nearest neighbor of the query point 𝑞.</p><p>To evaluate ANN algorithms, two primary metrics are used: Recall and Queries Per Second (QPS). Recall quantifies the effectiveness of an algorithm in retrieving true nearest neighbors, defined as the ratio of true nearest neighbors retrieved to the total number of true nearest neighbors. Higher recall indicates greater accuracy but may increase search time. QPS measures the system's throughput, i.e., the number of queries processed per second. Higher QPS reflects greater efficiency, which is critical for large-scale and real-time applications.</p></div>
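To make Definition 1 and the Recall metric concrete, the following sketch computes an exact nearest neighbor by brute force and checks the (1 + 𝜖) bound. The helper names (`exact_nn`, `recall_at_k`) and the random data are illustrative, not part of any benchmarked system.

```python
import numpy as np

def exact_nn(data, q):
    # Exact NN search: compute the distance from q to every point (O(n) per query).
    d = np.linalg.norm(data - q, axis=1)
    i = int(np.argmin(d))
    return i, float(d[i])

def recall_at_k(retrieved_ids, true_ids, k):
    # Recall@k: fraction of the true top-k neighbors that were retrieved.
    return len(set(retrieved_ids[:k]).intersection(true_ids[:k])) / k

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 16))
q = rng.normal(size=16)

_, d_nn = exact_nn(data, q)
eps = 0.1
# Definition 1: an ANN answer x* is acceptable when d(q, x*) does not
# exceed (1 + eps) * d(q, x_NN).
d_approx = 1.05 * d_nn   # a hypothetical ANN result within the bound
assert (1 + eps) * d_nn >= d_approx
```

An ANN algorithm trades a small, bounded increase in distance for a large reduction in the number of distance computations per query.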
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.">Existing graph ANN indexing methods</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.1.">Vamana graph</head><p>Vamana <ref type="bibr" target="#b24">[25]</ref> is a widely used graph-based ANN indexing algorithm and serves as the foundational graph structure in many recent ANN studies. Its construction is based on two key techniques: Greedy Search and Robust Pruning.</p><p>Greedy Search enables efficient exploration of nearest neighbors within the graph. Beginning from an initial node, the algorithm iteratively identifies and prioritizes nodes closest to the query. It maintains a priority queue of visited nodes, extracting the nearest unvisited node at each step, adding its neighbors to the queue, and repeating this process until all relevant nodes are explored.</p><p>Robust Pruning optimizes the graph by removing unnecessary edges discovered during Greedy Search. This technique utilizes the Sparse Neighborhood Graph (SNG) property <ref type="bibr" target="#b25">[26]</ref> and applies a distance threshold parameter, alpha (𝛼), to determine whether an edge should be retained. Specifically, an edge between nodes 𝑝 and 𝑝′ is removed if the distance between them is at least 𝛼 times the distance between 𝑝′ and an already selected node 𝑝*:</p><formula xml:id="formula_1">𝛼 ⋅ 𝑑(𝑝 * , 𝑝 ′ ) ≤ 𝑑(𝑝, 𝑝 ′ ) (𝛼 &gt; 1)</formula><p>By retaining longer edges under relaxed conditions, Vamana enhances search efficiency while maintaining graph connectivity.</p></div>
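The two Vamana primitives can be sketched as follows. This is a simplified single-threaded illustration: the function names, the beam-trimming details, and the candidate ordering are our own, not the reference implementation.

```python
import numpy as np

def greedy_search(graph, vecs, q, start, beam=4):
    # GreedySearch: repeatedly expand the closest unexpanded candidate,
    # enqueue its neighbors, and keep only the `beam` best candidates.
    dist = lambda i: float(np.linalg.norm(vecs[i] - q))
    frontier = {start: dist(start)}
    expanded = set()
    while True:
        todo = [v for v in frontier if v not in expanded]
        if not todo:
            break
        v = min(todo, key=frontier.get)
        expanded.add(v)
        for u in graph[v]:
            frontier.setdefault(u, dist(u))
        frontier = dict(sorted(frontier.items(), key=lambda kv: kv[1])[:beam])
    return sorted(frontier, key=frontier.get), expanded

def robust_prune(p, candidates, vecs, alpha=1.2, R=4):
    # RobustPrune: starting from the closest candidate, drop any p_prime
    # that a kept p_star dominates, i.e. alpha * d(p_star, p_prime) does
    # not exceed d(p, p_prime); with alpha above 1, longer edges survive,
    # which preserves connectivity across sparse regions.
    d = lambda a, b: float(np.linalg.norm(vecs[a] - vecs[b]))
    cand = sorted(set(candidates) - {p}, key=lambda c: d(p, c))
    kept = []
    while cand:
        if len(kept) >= R:
            break
        p_star = cand.pop(0)
        kept.append(p_star)
        cand = [c for c in cand if alpha * d(p_star, c) > d(p, c)]
    return kept
```

In Vamana, the set of nodes visited by `greedy_search` becomes the candidate pool that `robust_prune` filters down to at most R out-neighbors per node.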
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.2.">DiskANN</head><p>DiskANN <ref type="bibr" target="#b24">[25]</ref> extends the Vamana graph to a disk-based environment, making it well-suited for large-scale datasets. It employs a divide-and-conquer approach, partitioning the dataset into subgraphs and merging them into a unified structure.</p><p>Figure <ref type="figure" target="#fig_0">1</ref> illustrates the partitioning and merging process of DiskANN. <ref type="bibr" target="#b0">(1)</ref> The dataset is first divided into multiple overlapping partitions using a clustering technique, with each node assigned to at least two partitions. (2) Independent subgraphs are constructed within each partition. (3) These subgraphs are then merged by randomly selecting nodes from the partitioned graphs, followed by merging and pruning operations. This process updates the neighbor lists with merged results, ensuring connectivity between partitions while reducing memory and computational costs. DiskANN's partitioning and merging strategy enables scalable graph construction but introduces challenges in maintaining search performance across partitions.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.2.3.">Batch insertion for graph construction</head><p>Efficient parallelization is essential for graph construction in large-scale datasets. Batch insertion, as introduced in ParlayANN <ref type="bibr" target="#b26">[27]</ref>, addresses the challenge of concurrently adding nodes to a graph while ensuring conflict-free operations. Conflicts are avoided by grouping nodes into independent batches and processing each batch in isolation, ensuring that nodes within a batch do not interfere with each other during insertion.</p><p>The batch insertion process occurs in two steps. First, nodes in a batch are independently connected to their nearest neighbors within the existing graph. These connections are established in parallel, as nodes within a batch do not share dependencies, preventing conflicts during edge creation. This step ensures that all newly added nodes are integrated into the graph without disrupting its structure. Second, reversed edges are added to the graph. A reversed edge points from each selected neighbor back to the newly inserted node, making the new connections bidirectional.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>By separating the insertion of neighbors and the addition of reversed edges into distinct steps, batch insertion guarantees structural consistency while leveraging parallel processing. This approach is particularly effective in large-scale datasets, as it minimizes computational overhead while maintaining the integrity and quality of the graph. Moreover, the absence of interdependencies between batches ensures that the graph remains stable throughout the construction process, enabling efficient and scalable graph-based ANN indexing.</p></div>
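The two-phase procedure can be sketched as follows. `find_neighbors` stands in for a greedy-search call and is a hypothetical callback, so this illustrates the conflict-avoidance idea rather than ParlayANN's actual code.

```python
from collections import defaultdict

def batch_insert(graph, batches, find_neighbors, max_degree=8):
    for batch in batches:
        # Phase 1: each node in a batch picks forward neighbors
        # independently; nodes in the same batch share no dependencies,
        # so this loop is safe to run in parallel.
        fwd = {v: find_neighbors(v, graph) for v in batch}
        for v, nbrs in fwd.items():
            graph[v] = list(nbrs)[:max_degree]
        # Phase 2: reversed edges are grouped per target node and applied
        # in a separate pass, so no adjacency list is mutated concurrently
        # with phase 1.
        rev = defaultdict(list)
        for v, nbrs in fwd.items():
            for u in nbrs:
                rev[u].append(v)
        for u, backs in rev.items():
            graph[u] = list(dict.fromkeys(graph.get(u, []) + backs))[:max_degree]
    return graph

# Insert nodes 2 and 3 as one batch into a 2-node graph; the stand-in
# neighbor finder always returns node 0.
g = batch_insert({0: [1], 1: [0]}, [[2, 3]], lambda v, graph: [0])
assert g[2] == [0] and g[0] == [1, 2, 3]
```

A real implementation would also re-prune any node whose degree bound is exceeded in phase 2; the truncation here is a simplification.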
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">ScaDANN Method</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Grid partitioning</head><p>Figure <ref type="figure" target="#fig_2">3</ref> illustrates the structure of Graph ANN using a grid format, where each node in the graph consists of a base vector and a neighbor list. Grid partitioning divides large-scale datasets into grid blocks, with rows representing source vertices and columns representing destination vertices. This approach organizes datasets into manageable segments, enabling memory-and disk-efficient graph construction. For example, constructing a Vamana graph for a 1-billion-point dataset with 100 dimensions, requiring 400GB for vector storage, is infeasible under memory constraints of 256GB  or 512GB. However, grid partitioning ensures scalability by allowing the dataset to be partitioned into grid blocks, making it feasible to construct the graph even when the dataset size exceeds the available memory.</p></div>
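A minimal sketch of the grid-block addressing, assuming equal-sized partitions (the paper's dynamic partitioning would size them adaptively; the function name is ours):

```python
def grid_block(src, dst, part_size):
    # Rows index the source partition, columns the destination partition.
    # Diagonal blocks (row == col) hold intra-partition subgraphs; the
    # off-diagonal blocks hold edges between partitions.
    return (src // part_size, dst // part_size)

n, num_parts = 12, 3
part_size = n // num_parts                    # 4 vertices per partition
assert grid_block(1, 2, part_size) == (0, 0)  # diagonal: inside p0
assert grid_block(9, 5, part_size) == (2, 1)  # off-diagonal: p2 to p1
```

Because every edge maps to exactly one block, blocks can be loaded, filled, and offloaded independently, which is what makes construction feasible when the dataset exceeds main memory.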
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Each grid block captures intra- and inter-partition relationships. A grid block where the source and destination partitions are the same forms a subgraph, which follows the Vamana graph structure. Diagonal blocks represent subgraphs within individual partitions, while off-diagonal blocks denote connections between partitions. For example, the red block (𝑝2, 𝑝1) in Figure <ref type="figure" target="#fig_2">3</ref> corresponds to connections from partition 𝑝2 (green) to partition 𝑝1 (blue). This structure improves scalability and ensures conflict-free parallel processing during node insertion.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Overlapping block-level insertion</head><p>The overlapping block-level insertion method incrementally constructs the graph by processing data in grid block units. This method employs overlapping blocks, where a block from the previous step is reused in the next step, allowing two blocks to be processed simultaneously. As detailed in Algorithm 1, the process begins with the construction of the graph for the first partition (𝐵 1 = 0). This step (line 3) generates the subgraph for the first partition, ensuring that the intra-partition edges are established within the initial grid block.</p><p>In subsequent steps, pairs of overlapping blocks are processed to extend the graph. For each step, the data corresponding to the current blocks is loaded into memory to construct the graph (lines 9-11). Particularly, line 11 involves filling four grid blocks by constructing the subgraphs for two partitions and establishing connections between them. This ensures that intra- and inter-partition edges are created simultaneously, improving efficiency and maintaining the global structure of the graph. Once the graph is constructed, the earlier block is offloaded from memory (lines 12-15), and the next block is loaded (lines 5-10).</p><p>By reusing overlapping blocks during the insertion process, ScaDANN integrates graph construction and merging into a single operation, eliminating the need for a separate, computationally expensive merging step. This approach ensures that inter-block connections are captured as the graph is built, thereby avoiding the creation of disjoint subgraphs. Additionally, the overlapping mechanism significantly reduces I/O overhead by combining insertion and merging, which are traditionally distinct steps.</p><p>Figure <ref type="figure">2</ref> illustrates the construction process. Initially, blocks 0 and 1 are loaded into memory, and edges between the two blocks are created. 
In the second step, block 0 is offloaded, block 1 is retained, and block 2 is loaded. The graph is extended based on connections between blocks 1 and 2. Finally, in the third step, block 1 is offloaded, block 2 is retained, and block 0 is reloaded to extend the graph further by connecting blocks 2 and 0. This iterative process efficiently captures all relevant connections between blocks, enabling the construction of a unified graph while minimizing memory usage and I/O costs.</p></div>
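The load/offload schedule described above can be sketched as a sliding window over blocks; the wrap-around pairing generalizes the three-block example of Figure 2 and is our reading of the method, not code from the paper.

```python
def overlap_schedule(num_blocks):
    # Each step retains the previous block and loads one new block; the
    # final step wraps around so the last and first blocks are also merged.
    return [(b, (b + 1) % num_blocks) for b in range(num_blocks)]

# Three blocks reproduce the steps of Figure 2:
# step 1 is (0, 1), step 2 is (1, 2), step 3 is (2, 0) with block 0 reloaded.
assert overlap_schedule(3) == [(0, 1), (1, 2), (2, 0)]

# At most two blocks are ever resident; count the resulting block loads.
resident, loads = [], 0
for a, b in overlap_schedule(3):
    for blk in (a, b):
        if blk not in resident:
            if len(resident) == 2:
                resident.pop(0)      # offload the older block
            resident.append(blk)
            loads += 1
assert loads == 4   # blocks 0, 1, 2 loaded once each; block 0 reloaded once
```

Keeping one block resident across consecutive steps is what halves the block traffic relative to loading both members of every pair from disk.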
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">Grid block merge</head><p>The grid block merge process establishes connections between two blocks loaded into memory and ensures bidirectional connectivity through reverse edges. Reverse edges, where the source and destination vertices are swapped, maintain symmetric inter-block connections, ensuring consistent and efficient search performance.</p><p>In conventional methods, merging requires all data to be loaded into memory to calculate distances between nodes. ScaDANN overcomes this limitation by storing distance information directly within the neighbor list of each node. Because distances are stored alongside neighbors, block 1 retains the neighbor list formed in step 1. For example, after step 1, 𝑣 5 in block 1 (green) has already stored 𝑣 3 and 𝑣 2 from block 0 (yellow) as its neighbors. This allows distance comparisons between existing and new neighbors without reloading offloaded data, reducing memory usage and improving scalability.</p><p>Figure <ref type="figure" target="#fig_3">4</ref> illustrates the merging process during step 2 in Figure <ref type="figure">2</ref>, where blocks 1 and 2 are loaded into memory while block 0 remains stored on disk. When updating the neighbor list of 𝑣 5 , the algorithm uses pre-stored distance data from step 1 to compare block 0 neighbors with block 2 neighbors, avoiding redundant distance calculations. As a result, 𝑣 5 's neighbor list is updated in place using the stored distances.</p><p>The merging process follows the Vamana graph construction method, incorporating a greedy search to identify candidate neighbors and a robust pruning step to refine the neighbor list. Inter-block edges are established by adding neighbors from the second block to the neighbor list of the first block. The reverse edges are then added, ensuring bidirectional connectivity, where nodes in the first block are also linked as neighbors of nodes in the second block. 
This comprehensive approach ensures a robust and efficient merging process.</p></div>
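A sketch of the distance-reusing merge: each neighbor entry carries its pre-computed distance, so candidates from an offloaded block can be ranked against new ones without touching disk. The tuple layout, function name, and example distances are all illustrative.

```python
def merge_neighbors(stored, new_cands, max_degree=4):
    # `stored`: (neighbor_id, distance) pairs carried over from earlier
    # steps; the neighbor's vector may sit in an offloaded block, but its
    # distance was recorded while that block was still resident.
    # `new_cands`: (neighbor_id, distance) pairs from the block in memory.
    merged = {}
    for nid, d in stored + new_cands:
        if merged.get(nid, float("inf")) > d:
            merged[nid] = d
    # Rank by distance with no vector reloads, then truncate to the degree
    # bound (a stand-in for Vamana's robust pruning step).
    return sorted(merged.items(), key=lambda kv: kv[1])[:max_degree]

# v5 kept v3 and v2 (from the now-offloaded block 0) with their distances;
# block 2 supplies new candidates. All distance values here are made up.
stored = [(3, 0.9), (2, 1.4)]
new_cands = [(8, 0.7), (9, 1.1)]
assert merge_neighbors(stored, new_cands, 3) == [(8, 0.7), (3, 0.9), (9, 1.1)]
```

A full merge would additionally emit the reverse edges for each kept inter-block neighbor, as described above.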
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.4.">Complexity</head><p>Table <ref type="table">1</ref> compares the time and space complexities of Vamana, DiskANN, and ScaDANN in the index-building process, focusing on scalability and efficiency. The space complexity considers storage requirements, including neighbor lists.</p><p>Vamana constructs a single large-scale graph, resulting in a time complexity of 𝑂(𝑆 1.16 ) and a space complexity of 𝑂(𝑆 ⋅ 𝑅). Since it does not partition the dataset, the entire graph must fit in memory, making it less scalable for large-scale data. Both DiskANN and ScaDANN partition the dataset to construct subgraphs. Partitioning reduces memory usage and enables disk-based processing. The time complexity is inversely proportional to the number of partitions, as increasing 𝑃 allows for more localized graph construction, reducing the computational cost.</p><p>In DiskANN, each node is assigned to multiple overlapping partitions (typically 𝐾 = 2 ), as shown in Figure <ref type="figure" target="#fig_0">1</ref>. Since DiskANN performs clustering to determine partitions, the build time complexity includes the clustering time ( 𝐶 ). In contrast, ScaDANN utilizes dynamic partitioning, where the data structure size is determined based on the dataset size. This process does not require separate clustering and is treated as an internal computational step, which is why it is not included in the time complexity. Unlike DiskANN, which incurs overhead due to overlapping partitions, ScaDANN eliminates these redundancies, streamlining subgraph construction and merging. This results in a computationally efficient process that scales well with increasing dataset size.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Methods</head><p>Vamana: build time 𝑂(𝑆 1.16 ), space 𝑂(𝑆 ⋅ 𝑅). DiskANN: build time 𝑂(𝐾 1.16 ⋅ 𝑃 −0.84 ⋅ 𝑆 1.16 + 𝐶), space 𝑂(𝑆 ⋅ 𝑅 ⋅ 𝐾 ⋅ 𝑃 −1 ). ScaDANN: build time 𝑂(𝑃 −0.84 ⋅ 𝑆 1.16 ), space 𝑂(𝑆 ⋅ 𝑅 ⋅ 𝑃 −1 ).</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 1</head><p>Comparison of time and space complexities for Vamana, DiskANN, and ScaDANN, where 𝑆 is the dataset size, 𝑅 is the maximum degree of the graph, 𝐾 is the number of overlapping partitions per node, 𝑃 is the number of partitions, and 𝐶 is the clustering time.</p></div>
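The asymptotic ratios in Table 1 can be checked numerically; constant factors and DiskANN's clustering term 𝐶 are ignored, and the parameter values below are arbitrary illustrations.

```python
# Arbitrary illustrative parameters.
S = 1e9     # dataset size
P = 10      # number of partitions
K = 2       # DiskANN's overlapping partitions per node
R = 64      # maximum graph degree

vamana_time  = S ** 1.16
diskann_time = (K ** 1.16) * (P ** -0.84) * (S ** 1.16)   # plus clustering C
scadann_time = (P ** -0.84) * (S ** 1.16)

# Removing the K-fold overlap makes ScaDANN's build K ** 1.16 times
# cheaper than DiskANN's (about 2.2x for K = 2), before even counting C.
assert round(diskann_time / scadann_time, 3) == round(K ** 1.16, 3)
assert vamana_time > diskann_time > scadann_time

# Space: both partitioned methods shrink peak storage by a factor of P,
# but DiskANN pays the K-fold duplication for its overlapping partitions.
assert (S * R * K / P) == K * (S * R / P)
```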
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.">Experiments</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.">Experimental setup</head><p>The experiments were conducted to evaluate the performance of ScaDANN under controlled conditions. The hardware environment included two Intel Xeon Gold 6326 16-core processors and 1.9 TB of main memory, while the software environment was based on Ubuntu 20.04.6 LTS. To simulate memory-constrained scenarios, the available memory was deliberately restricted to 256 GB during the experiments. Under these constraints, both Vamana and ParlayANN encountered Out of Memory (OOM) errors, making them unsuitable for comparison. As a result, our evaluation focused on comparing ScaDANN with DiskANN, a state-of-the-art disk-based ANN indexing method.</p><p>Table <ref type="table">2</ref> provides a summary of the datasets used in the experiments, including their data types and dimensions. The datasets used were BIGANN, Microsoft SPACEV, Yandex DEEP, and Yandex Text-to-Image, each posing unique challenges for ANN indexing. BIGANN consists of 1 billion 128-dimensional SIFT descriptors from a large-scale image dataset, serving as a benchmark for high-dimensional similarity search. Microsoft SPACEV includes 1 billion 100-dimensional document and query vectors reflecting real-world web search scenarios. Yandex DEEP comprises 1 billion 96-dimensional image descriptors designed to evaluate deep learning-based image representations. Yandex Text-to-Image contains 1 billion multi-modal embeddings, where image embeddings serve as the database and text embeddings form the query set, representing a cross-modal retrieval task with distinct distributions for database and query vectors.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Dataset</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 2</head><p>Summary of datasets used in the experiments.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>The bottom graph in Figure <ref type="figure" target="#fig_5">5</ref> shows the memory usage during the index construction process. Both DiskANN and ScaDANN employed dynamic partitioning, which adjusts the number of partitions based on available memory capacity and efficiently divides the data. Since the graph size and memory usage vary depending on the degree of the neighbor list, dynamic partitioning enables optimized partitioning by adapting to these variations. Notably, ScaDANN required a higher number of partitions than DiskANN. This is because ScaDANN stores additional distance information within the graph, which increases memory consumption per block. As a result, ScaDANN divides the data into more partitions than DiskANN to accommodate the additional memory usage. For instance, on the BIGANN dataset, ScaDANN used around 200 GB of memory for indexing; DiskANN also remained under the 256 GB limit but required fewer partitions to store its graph.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Results and Analysis</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.1.">Billion-scale index in limited memory</head><p>We conducted a detailed comparison of the index building process with DiskANN, as illustrated in Figure <ref type="figure">6</ref>. In the subgraph construction phase, our method completed approximately twice as fast as DiskANN, with the BIGANN dataset showing a significant speedup. However, the subgraph merging phase took longer in our approach. This is due to the finer-grained distance comparisons performed during the merging process, which operates at the level of grid blocks. Unlike DiskANN, which uses a more straightforward merging process, ScaDANN's overlapping block approach requires additional computations during the merging phase to ensure accurate distance calculations between partitioned blocks.</p><p>In terms of I/O costs, our method demonstrated greater efficiency. This improvement can be attributed to the use of overlapping block techniques, which optimize data loading between memory and disk. Specifically, ScaDANN's approach minimizes unnecessary disk reads and efficiently manages memory, leading to a faster construction process despite the increased memory overhead. This is evident in the I/O time breakdown in Figure <ref type="figure">6</ref>, where ScaDANN exhibits a more balanced distribution of time between subgraph building, partitioning, and merging, compared to DiskANN, which spends a significant portion of time in I/O operations.</p><p>In the search-performance plots that follow, the x-axis shows Recall@10 and the y-axis QPS; higher placement indicates lower latency at a given recall level, so a method positioned higher achieves faster query processing while maintaining similar accuracy.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.2.">Disk-based search performance</head><p>Figure <ref type="figure" target="#fig_7">7</ref> compares the disk-based search performance of ScaDANN and DiskANN. The experiments were carried out using the disk-based search method for evaluation. Our proposed ScaDANN method demonstrated better search performance than DiskANN for most datasets in terms of QPS, especially for the BIGANN-1B dataset. As shown in the graph, ScaDANN achieved a higher QPS while maintaining competitive recall rates. For example, ScaDANN processed queries at a faster rate than DiskANN for BIGANN and MSSPACEV-1B, achieving a significant speedup while still maintaining similar or slightly better recall at top-10 results. However, for the DEEP-1B dataset, ScaDANN showed a slight performance drop when Recall@10 exceeded 95%. This discrepancy can be attributed to the disk search method used, which is specifically optimized for DiskANN.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.">Ablation study</head></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.1.">Impact of partition size on index construction</head><p>Figure <ref type="figure">8</ref> illustrates the effects of varying the number of partitions on the BIGANN 100M dataset. Increasing the number of partitions led to a significant increase in building time, with approximately a 70% increase per step. However, RAM usage decreased by around 40% per step, demonstrating the trade-off between memory efficiency and construction time. Despite the changes in partition numbers, there were no significant differences observed in QPS relative to Recall@10 across the tested configurations. This indicates that increasing the partition number primarily affects the construction phase without adversely impacting search performance.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.3.2.">In-memory graph construction experiments</head><p>The in-memory graph construction experiments on BIGANN-100M and Yandex Text-to-Image-100M compare ParlayANN, ScaDANN, and DiskANN in terms of indexing time and query performance. Since ParlayANN encounters out-of-memory (OOM) issues at the billion-scale, the experiments were conducted on 100 million data points without partitioning for ScaDANN and DiskANN. ScaDANN achieves indexing speeds comparable to or faster than ParlayANN by storing distances within the graph, reducing redundant computations. Both methods also benefit from batch insertion, leading to faster construction than DiskANN. ScaDANN not only achieves a faster build time but also maintains strong search performance, striking a balance between efficient indexing and high query throughput.  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion</head><p>This study proposed ScaDANN, a scalable disk-based graph indexing method for billion-scale ANN datasets in memory-constrained environments. By introducing overlapping block-level insertion and grid block merge, ScaDANN mitigates the high memory demands and inefficiencies of existing methods: these techniques enable seamless integration of blocks and efficient merging of partitioned graphs while minimizing I/O overhead by reusing stored distance information. Experimental results showed that ScaDANN builds indexes up to 1.5× faster and delivers up to 1.4× higher query throughput than DiskANN, while maintaining competitive accuracy across diverse datasets. Furthermore, within the same memory constraints, ScaDANN can construct a single index 10× larger than existing approaches, establishing it as an efficient and scalable solution for real-world applications that require high-performance ANN indexing.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1:</head><label>1</label><figDesc>Figure 1: DiskANN partitioning and merging process.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Graph ANN with grid format. The grid format represents the structure of Graph ANN, where each node consists of a base vector and a neighbor list.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Grid block merge process with distance information.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_5"><head>Figure 5</head><label>5</label><figDesc>Figure 5 presents the results of indexing at billion scale under a 256GB memory constraint. The top graph in Figure 5 illustrates the index building time across different datasets. ScaDANN significantly reduced the index-building time compared to DiskANN across all datasets. For example, on the BIGANN dataset, ScaDANN completed the indexing task in 10.7 hours versus 16.5 hours for DiskANN, approximately a 35% reduction in build time. This efficiency improvement can be attributed to ScaDANN's optimized data partitioning and overlapping block techniques, which minimize disk I/O and improve memory access patterns. The bottom graph in Figure 5 shows the memory usage during the index construction process. Both DiskANN and ScaDANN employed dynamic partitioning, which adjusts the number of partitions based on available memory capacity and efficiently divides the data. Since the graph size and memory usage vary depending on the degree of the neighbor list, dynamic partitioning enables optimized partitioning by adapting to these variations. Notably, ScaDANN required a</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_6"><head>Figure 5: Figure 6:</head><label>56</label><figDesc>Figure 5: Billion-scale index building experiment results, including building time and memory usage, conducted under a 256GB memory constraint. The memory usage values indicate the number of partitions dynamically determined during the indexing process.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_7"><head>Figure 7:</head><label>7</label><figDesc>Figure 7: Disk-based search performance. The graph compares Recall@10 (x-axis) and QPS (y-axis), where higher placement indicates lower latency at a given recall level. A method positioned higher in the graph achieves faster query processing while maintaining similar accuracy.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_8"><head>Figure 8:</head><label>8</label><figDesc>Figure 8: Experimental results of BIGANN 100M index building with varying numbers of partitions.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_9"><head>Figure 9 :</head><label>9</label><figDesc>Figure 9: In-memory experimental results of BIGANN 100M and Yandex Text-to-Image 100M. The numbers in parentheses next to each method indicate the index building time in minutes.</figDesc></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Figure 2</head><label>2</label><figDesc>Figure 2: ScaDANN building process with three partitions and corresponding grid blocks at each step. The graph is constructed by loading two partitions into memory at a time, ensuring efficient graph generation. In the next step, the previously used partition is overlapped with a new partition to continue the building process. A reversed edge is defined as an edge where the source vertex and destination vertex are swapped, ensuring bidirectional connectivity between nodes. For example, if an edge connects node 𝑣 𝑖 (source) to node 𝑣 𝑗 (destination), the reversed edge connects 𝑣 𝑗 back to 𝑣 𝑖 . During this step, the neighbors of each newly added node are updated with reversed edges, completing the bidirectional connections required for robust graph-based indexing.</figDesc></figure>
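The reversed-edge step described above can be sketched as follows; the adjacency-dict representation and node IDs are illustrative assumptions, not the paper's actual on-disk layout:

```python
def add_reversed_edges(graph):
    """For every edge (src -> dst), insert the reversed edge (dst -> src),
    making all connections bidirectional."""
    out = {v: set(nbrs) for v, nbrs in graph.items()}
    for src, nbrs in graph.items():
        for dst in nbrs:
            out.setdefault(dst, set()).add(src)
    return out

g = {0: {1, 2}, 1: set(), 2: {0}}
bidir = add_reversed_edges(g)
print(bidir[1])  # node 1 gains the reversed edge back to node 0
```

In the actual merge, the updated neighbor lists would then be re-pruned to the maximum degree; this sketch shows only the connectivity step.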
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head>Algorithm 1: ScaDANN Construction</head><label></label><figDesc>Input: block size 𝑃 and dataset 𝐷. Output: Graph.</figDesc><table><row><cell>1: /* Initialize first block and interval */ 𝐵 1 = 0; 𝐵 2 = 1; 𝑖𝑡𝑣 = 1; 𝑁 𝐵𝑅 𝑖 = ∅;</cell></row><row><cell>2: /* Read dataset for the first block */ 𝐷 1 = ReadDataset(𝐵 1 );</cell></row><row><cell>3: 𝑁 𝐵𝑅 1 = vamana(𝐵 1 , 𝑁 𝐵𝑅 1 );</cell></row><row><cell>4: while 𝑖𝑡𝑣 &lt; 𝑃 do</cell></row><row><cell>5: 𝐵 2 ≡ (𝐵 1 + 𝑖𝑡𝑣) (mod 𝑃);</cell></row><row><cell>6: if 𝐵 1 = 𝐵 2 then</cell></row><row><cell>7: /* Increase interval size */ 𝑖𝑡𝑣 = 𝑖𝑡𝑣 * 2;</cell></row><row><cell>8: continue</cell></row><row><cell>9: /* Read the second dataset and graph */ 𝐷 2 = ReadDataset(𝐵 2 );</cell></row><row><cell>10: 𝑁 𝐵𝑅 2 = ReadGraph(𝐵 2 );</cell></row><row><cell>11: 𝑁 𝐵𝑅 1 , 𝑁 𝐵𝑅 2 = vamana((𝐷 1 , 𝐷 2 ), (𝑁 𝐵𝑅 1 , 𝑁 𝐵𝑅 2 ));</cell></row><row><cell>12: WriteGraph(𝑁 𝐵𝑅 1 , 𝐵 1 );</cell></row><row><cell>13: 𝐷 1 = 𝐷 2 ;</cell></row><row><cell>14: 𝑁 𝐵𝑅 1 = 𝑁 𝐵𝑅 2 ;</cell></row><row><cell>15: 𝐵 1 = 𝐵 2 ;</cell></row><row><cell>16: WriteGraph(𝑁 𝐵𝑅 1 , 𝐵 1 );</cell></row></table></figure>
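Algorithm 1 can be read as a sliding-pair schedule over the 𝑃 blocks: the block just built stays resident in memory and is paired with the block 𝑖𝑡𝑣 positions ahead, and the interval doubles between passes. A sketch of that schedule only (ReadDataset, ReadGraph, vamana, and WriteGraph are omitted, and the per-interval stopping rule, left implicit in the listing, is assumed here to be one full pass over the blocks):

```python
def pairing_schedule(num_blocks):
    """Return the order in which block pairs (B1, B2) share memory during
    construction: B2 = (B1 + itv) mod P, the newer block stays resident,
    and the interval itv doubles after each full pass."""
    pairs, itv = [], 1
    while itv < num_blocks:
        b1 = 0
        for _ in range(num_blocks):   # assumed: one full pass per interval
            b2 = (b1 + itv) % num_blocks
            pairs.append((b1, b2))    # blocks b1 and b2 are built jointly
            b1 = b2                   # overlap: keep the second block loaded
        itv *= 2                      # widen the interval (line 7)
    return pairs

# with P = 3 blocks, every ordered pair of distinct blocks meets once
print(pairing_schedule(3))
```

Because only two blocks are ever resident at once, peak memory stays bounded regardless of the total number of partitions.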
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_3"><head></head><label></label><figDesc>Neighbor list update during grid block merge. Each node stores its neighbor list as (neighbor ID, distance) pairs under a maximum degree of 3; after the added neighbors are merged, the updated neighbor list of 𝑣 1 contains 𝑣 11 , 𝑣 3 , and 𝑣 2 .</figDesc></figure>
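The neighbor-list update above can be sketched as follows: because each list stores (ID, distance) pairs, merged candidates can be pruned back to the degree bound without recomputing any distance. The node IDs and distance values below mirror the figure but are illustrative:

```python
def update_neighbor_list(current, added, max_degree=3):
    """Merge added neighbors into a node's (id, distance) list and keep
    only the max_degree closest, reusing the stored distances."""
    merged = dict(current)                  # id -> stored distance
    for nid, dist in added:
        merged[nid] = min(dist, merged.get(nid, dist))
    # prune by smallest stored distance; no distance is recomputed
    return sorted(merged.items(), key=lambda p: p[1])[:max_degree]

# one existing neighbor plus three candidates from the merge; the
# farthest candidate (distance 473) is pruned by the degree bound
print(update_neighbor_list([(4, 236.0)],
                           [(11, 219.0), (3, 387.0), (2, 473.0)]))
```

Keeping distances alongside IDs is what lets the grid block merge avoid re-reading base vectors from disk during pruning.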
		</body>
		<back>
			<div type="references">

				<listBibl>

<biblStruct xml:id="b0">
	<analytic>
		<title level="a" type="main">A survey of knowledge-enhanced text generation</title>
		<author>
			<persName><forename type="first">W</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zhu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Hu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Ji</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Jiang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys</title>
		<imprint>
			<biblScope unit="volume">54</biblScope>
			<biblScope unit="issue">11s</biblScope>
			<biblScope unit="page" from="1" to="38" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b1">
	<monogr>
		<author>
			<persName><forename type="first">H</forename><surname>Jin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Meng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Tan</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2403.02901</idno>
		<title level="m">A comprehensive survey on process-oriented automatic text summarization with exploration of LLM-based methods</title>
				<imprint>
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">ToolQA: A dataset for LLM question answering with external tools</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhuang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Zhang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Neural Information Processing Systems</title>
		<imprint>
			<biblScope unit="volume">36</biblScope>
			<biblScope unit="page" from="50117" to="50143" />
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<analytic>
		<title level="a" type="main">Evaluating the feasibility of ChatGPT in healthcare: an analysis of multiple clinical and research scenarios</title>
		<author>
			<persName><forename type="first">M</forename><surname>Cascella</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Montomoli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Bellini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Bignami</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Medical Systems</title>
		<imprint>
			<biblScope unit="volume">47</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page">33</biblScope>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">Detecting LLM-generated text in computing education: Comparative study for ChatGPT cases</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">S</forename><surname>Orenstrakh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Karnalim</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><forename type="middle">A</forename><surname>Suarez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Liut</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">IEEE 48th Annual Computers, Software, and Applications Conference (COMPSAC)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2024">2024. 2024</date>
			<biblScope unit="page" from="121" to="126" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><surname>Pandya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Holia</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2310.05421</idno>
		<title level="m">Automating customer service using LangChain: Building custom open-source GPT chatbot for organizations</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b6">
	<monogr>
		<author>
			<persName><forename type="first">L</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Zhong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Peng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Feng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Qin</surname></persName>
		</author>
		<title level="m">A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note>ACM Transactions on Information Systems</note>
</biblStruct>

<biblStruct xml:id="b7">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Yao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Duan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Cai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Sun</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Zhang</surname></persName>
		</author>
		<title level="m">A survey on large language model (LLM) security and privacy: The good, the bad, and the ugly, High-Confidence Computing</title>
				<imprint>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page">100211</biblScope>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<analytic>
		<title level="a" type="main">Retrieval-augmented generation for knowledge-intensive NLP tasks</title>
		<author>
			<persName><forename type="first">P</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Perez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Piktus</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Petroni</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Karpukhin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Küttler</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W.-T.</forename><surname>Yih</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Rocktäschel</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Neural Information Processing Systems</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="page" from="9459" to="9474" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Survey of vector database management systems</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">J</forename><surname>Pan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">The VLDB Journal</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="issue">5</biblScope>
			<biblScope unit="page" from="1591" to="1615" />
			<date type="published" when="2024">2024</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">SPANN: Highly-efficient billion-scale approximate nearest neighborhood search</title>
		<author>
			<persName><forename type="first">Q</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Zhao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Yang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Wang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Neural Information Processing Systems</title>
		<imprint>
			<biblScope unit="volume">34</biblScope>
			<biblScope unit="page" from="5199" to="5212" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">Hierarchical clustering-based graphs for large scale approximate nearest neighbor search</title>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">V</forename><surname>Munoz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">A</forename><surname>Gonçalves</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Dias</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><forename type="middle">D S</forename><surname>Torres</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Pattern Recognition</title>
		<imprint>
			<biblScope unit="volume">96</biblScope>
			<biblScope unit="page">106970</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<monogr>
		<author>
			<persName><forename type="first">S</forename><surname>Yu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Engels</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Shun</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2312.03940</idno>
		<title level="m">PECANN: Parallel efficient clustering with graph-based approximate nearest neighbor search</title>
				<imprint>
			<date type="published" when="2023">2023</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">Product quantization for nearest neighbor search</title>
		<author>
			<persName><forename type="first">H</forename><surname>Jegou</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Douze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Schmid</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Pattern Analysis and Machine Intelligence</title>
		<imprint>
			<biblScope unit="volume">33</biblScope>
			<biblScope unit="issue">1</biblScope>
			<biblScope unit="page" from="117" to="128" />
			<date type="published" when="2010">2010</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Locally optimized product quantization for approximate nearest neighbor search</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Kalantidis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Avrithis</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</title>
				<meeting>the IEEE Conference on Computer Vision and Pattern Recognition</meeting>
		<imprint>
			<date type="published" when="2014">2014</date>
			<biblScope unit="page" from="2321" to="2328" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">The inverted multi-index</title>
		<author>
			<persName><forename type="first">A</forename><surname>Babenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Lempitsky</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Pattern Analysis and Machine Intelligence</title>
		<imprint>
			<biblScope unit="volume">37</biblScope>
			<biblScope unit="issue">6</biblScope>
			<biblScope unit="page" from="1247" to="1260" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Efficient k-nearest neighbor graph construction for generic similarity measures</title>
		<author>
			<persName><forename type="first">W</forename><surname>Dong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Moses</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Li</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 20th International Conference on World Wide Web</title>
				<meeting>the 20th International Conference on World Wide Web</meeting>
		<imprint>
			<date type="published" when="2011">2011</date>
			<biblScope unit="page" from="577" to="586" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Fu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Xiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Cai</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1707.00143</idno>
		<title level="m">Fast approximate nearest neighbor search with the navigating spreading-out graph</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Approximate nearest neighbor algorithm based on navigable small world graphs</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Malkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ponomarenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Logvinov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Krylov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information Systems</title>
		<imprint>
			<biblScope unit="volume">45</biblScope>
			<biblScope unit="page" from="61" to="68" />
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs</title>
		<author>
			<persName><forename type="first">Y</forename><forename type="middle">A</forename><surname>Malkov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">A</forename><surname>Yashunin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Transactions on Pattern Analysis and Machine Intelligence</title>
		<imprint>
			<biblScope unit="volume">42</biblScope>
			<biblScope unit="issue">4</biblScope>
			<biblScope unit="page" from="824" to="836" />
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Fast approximate kNN graph construction for high-dimensional data via recursive Lanczos bisection</title>
		<author>
			<persName><forename type="first">J</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>-R. Fang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Saad</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Machine Learning Research</title>
		<imprint>
			<biblScope unit="volume">10</biblScope>
			<biblScope unit="issue">9</biblScope>
			<date type="published" when="2009">2009</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Xu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Yue</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Wang</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2101.12631</idno>
		<title level="m">A comprehensive survey and experimental comparison of graph-based approximate nearest neighbor search</title>
				<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b22">
	<analytic>
		<title level="a" type="main">New trends in high-d vector similarity search: AI-driven, progressive, and distributed</title>
		<author>
			<persName><forename type="first">K</forename><surname>Echihabi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Zoumpatianos</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Palpanas</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Proceedings of the VLDB Endowment</title>
		<imprint>
			<biblScope unit="volume">14</biblScope>
			<biblScope unit="issue">12</biblScope>
			<biblScope unit="page" from="3198" to="3201" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">Results of the NeurIPS&apos;21 challenge on billion-scale approximate nearest neighbor search</title>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">V</forename><surname>Simhadri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Williams</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Aumüller</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Douze</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Babenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Baranchuk</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Q</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Hosseini</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Krishnaswamy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Srinivasa</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">NeurIPS 2021 Competitions and Demonstrations Track</title>
				<imprint>
			<publisher>PMLR</publisher>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="177" to="189" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">DiskANN: Fast accurate billion-point nearest neighbor search on a single node</title>
		<author>
			<persName><forename type="first">S</forename><forename type="middle">Jayaram</forename><surname>Subramanya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Devvrit</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">V</forename><surname>Simhadri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Krishnaswamy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Kadekodi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Advances in Neural Information Processing Systems</title>
		<imprint>
			<biblScope unit="volume">32</biblScope>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Approximate nearest neighbor queries in fixed dimensions</title>
		<author>
			<persName><forename type="first">S</forename><surname>Arya</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><forename type="middle">M</forename><surname>Mount</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">SODA</title>
		<imprint>
			<biblScope unit="volume">93</biblScope>
			<biblScope unit="page" from="271" to="280" />
			<date type="published" when="1993">1993</date>
			<publisher>Citeseer</publisher>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">ParlayANN: Scalable and deterministic parallel graph-based approximate nearest neighbor search algorithms</title>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">D</forename><surname>Manohar</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Shen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Blelloch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Dhulipala</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Gu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">V</forename><surname>Simhadri</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Sun</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming</title>
		<meeting>the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming</meeting>
		<imprint>
			<date type="published" when="2024">2024</date>
			<biblScope unit="page" from="270" to="285" />
		</imprint>
	</monogr>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
