An Empirical Comparison of FAISS and FENSHSES for Nearest Neighbor Search in Hamming Space

Cun (Matthew) Mu† (Walmart Labs, Hoboken, NJ; matthew.mu@jet.com)
Binwei Yang† (Walmart Labs, Sunnyvale, CA; BYang@walmartlabs.com)
Zheng (John) Yan (Walmart Labs, Hoboken, NJ; john@jet.com)

† C. Mu and B. Yang contributed equally to this work.

ABSTRACT
In this paper, we compare the performance of FAISS and FENSHSES on nearest neighbor search in Hamming space–a fundamental task with ubiquitous applications in today's eCommerce. Comprehensive evaluations are made in terms of indexing speed, search latency and RAM consumption. This comparison is conducted towards a better understanding of the trade-offs between nearest neighbor search systems implemented in main memory and those implemented in secondary memory, a question largely unaddressed in the literature.

CCS CONCEPTS
• Information systems → Nearest-neighbor search; Image search; • Applied computing → Online shopping; • Software and its engineering → Memory management

KEYWORDS
Nearest neighbor search, FAISS, FENSHSES, Hamming space, binary codes, vector similarity search, full-text search engines, Elasticsearch

ACM Reference Format:
Cun (Matthew) Mu, Binwei Yang, and Zheng (John) Yan. 2019. An Empirical Comparison of FAISS and FENSHSES for Nearest Neighbor Search in Hamming Space. In Proceedings of the SIGIR 2019 Workshop on eCommerce (SIGIR 2019 eCom), 3 pages.

Copyright © 2019 by the paper's authors. Copying permitted for private and academic purposes. In: J. Degenhardt, S. Kallumadi, U. Porwal, A. Trotman (eds.): Proceedings of the SIGIR 2019 eCom workshop, July 2019, Paris, France, published at http://ceur-ws.org

1 INTRODUCTION
Nearest neighbor search (NNS) within semantic embeddings (a.k.a. vector similarity search) has become common practice in ubiquitous eCommerce applications, including text search based on neural ranking models [3, 12], content-based image retrieval [16, 22], collaborative filtering [7], large-scale product categorization [8] and fraud detection [18]. While vector similarity search can substantially boost search relevancy by understanding customers' intents more semantically, it presents a major challenge: how to conduct nearest neighbor search among millions or even billions of high-dimensional vectors in a real-time and cost-effective manner.

The fundamental trade-off between search latency and cost-effectiveness naturally classifies nearest neighbor search solutions into two broad categories.

NNS solutions implemented in main memory. This type of NNS solution has been extensively studied and explored in the field of information retrieval (IR). As a result, the majority of widely used systems on today's software market (e.g., Spotify's Annoy [2], Facebook's FAISS [9] and Microsoft's SPTAG [5, 21]) fall into this category.

NNS solutions implemented in secondary memory. In contrast, the second type of NNS solution has been delivered only recently through active efforts from both academia and industry [1, 11, 14, 16, 19, 20] to empower full-text search engines (e.g., Elasticsearch and Solr) with the capability of finding nearest neighbors. By leveraging inverted-index-based information retrieval systems and cutting-edge engineering designs from these full-text search engines, such solutions are capable of economically reducing RAM consumption [1], coherently supporting multi-modal search [16] and being extremely well prepared for production deployment [20]. However, some critical performance questions have not been quantitatively answered in the literature:

• How much RAM can these full-text search based solutions save?
• How much search latency do these solutions sacrifice in order to reduce RAM consumption?

In this paper, we shed light on the above questions through a case study on the task of nearest neighbor search in Hamming space (i.e., the space of binary codes). This task is an extremely important subclass of NNS, as learning and representing textual, visual and acoustic data with compact, semantic binary vectors is by now a mature technology and common practice in IR systems. In particular, eBay recently built its entire visual search system [22] upon finding nearest neighbors within binary embeddings generated by deep neural network models.

We choose one representative solution from each category–FAISS (Facebook AI Similarity Search) from Facebook's AI Research lab [9], and FENSHSES (Fast Exact Neighbor Search in Hamming Space on Elasticsearch) from the search and catalog teams at Walmart Labs [14, 15]–and evaluate their performance in finding nearest neighbors within binary codes.

2 FAISS vs. FENSHSES
We compare the performance of FAISS and FENSHSES from three key perspectives: time spent in indexing, search latency and RAM consumption.

Data generation. Our dataset B is generated from 2.8 million images selected from Walmart.com's home catalog through pHash [6, 10]–one of the most effective perceptual hash schemes for generating fingerprints of multimedia files (e.g., images, audio and video)–with code lengths m ∈ {64, 256, 1024, 4096}. Note that vector similarity search based on pHash has been widely used in a variety of visual tasks, including forensic image recognition [17], duplicate image detection [4] and copyright protection [13].
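The paper stops at citing pHash [6, 10], so as a point of reference, here is a minimal sketch of how such binary fingerprints could be produced with the open-source imagehash package; the package choice, the image folder and the mapping from hash_size to code length m are our assumptions, not details taken from the experiments above.

```python
# Hypothetical data-generation sketch: pHash fingerprints for a folder of
# images, following the cited pHash scheme [6, 10] but using the open-source
# `imagehash` package (pip install imagehash pillow numpy) as a stand-in.
from pathlib import Path

import imagehash
import numpy as np
from PIL import Image

# hash_size h yields an (h*h)-bit code, so h in {8, 16, 32, 64} roughly
# matches the paper's m in {64, 256, 1024, 4096}.
HASH_SIZE = 8  # m = 64 bits


def phash_bits(path: Path) -> np.ndarray:
    """Return one image's pHash fingerprint as a flat 0/1 array of length m."""
    with Image.open(path) as img:
        code = imagehash.phash(img, hash_size=HASH_SIZE)
    return code.hash.flatten().astype(np.uint8)


# Dataset B: one m-bit binary code per catalog image.
B = np.stack([phash_bits(p) for p in sorted(Path("images").glob("*.jpg"))])
print(B.shape)  # (num_images, HASH_SIZE ** 2)
```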
Settings. For FAISS, we use its binary flat index with five threads. For a fair comparison, we deploy FENSHSES accordingly, creating its Elasticsearch index with five shards and zero replicas. The remaining configurations are left at their default and suggested values. Both FAISS and FENSHSES are set up and tested on the same Microsoft Azure virtual machine.
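For concreteness, the snippet below illustrates the two setups just described: a brute-force binary index in FAISS restricted to five OpenMP threads, and an Elasticsearch index requested with five shards and zero replicas through the official Python client. The index name and toy data are ours; the paper does not publish its exact configuration.

```python
# Illustrative setup for the two systems (index name and toy data are ours).
import faiss  # pip install faiss-cpu
import numpy as np
from elasticsearch import Elasticsearch

m = 64                            # code length in bits
faiss.omp_set_num_threads(5)      # five threads, matching the experiments

# FAISS binary flat index: stores packed codes of m bits (m // 8 bytes) each.
index = faiss.IndexBinaryFlat(m)
bits = np.random.randint(0, 2, size=(1000, m), dtype=np.uint8)  # toy dataset
index.add(np.packbits(bits, axis=1))

# The comparable Elasticsearch deployment: five shards, zero replicas.
# (body= keeps compatibility with the 7.x client generation.)
es = Elasticsearch("http://localhost:9200")
es.indices.create(
    index="fenshses-demo",
    body={"settings": {"number_of_shards": 5, "number_of_replicas": 0}},
)
```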
Speed in indexing. During the indexing phase, FAISS indexes the data into main memory (i.e., RAM), while FENSHSES indexes the data into secondary memory (e.g., hard disk). As a consequence, FAISS is much faster than FENSHSES at data indexing (see Table 1). On the other hand, whenever the process is killed and needs a restart, FAISS has to repeat this procedure to re-index the data into RAM, whereas FENSHSES can keep using the index it has already built on disk without re-indexing.

# of Bits   FAISS (sec.)   FENSHSES (sec.)
64          18.5           75.5
256         37.7           140.2
1024        111.9          369.5
4096        397.3          1300.9

Table 1: Indexing time consumption. FAISS is about four times faster than FENSHSES in creating the index for nearest neighbor search.

Search latency. We randomly select 10,000 binary codes from B to act as query codes. For each query code q, we instruct FAISS and FENSHSES to find all r-neighbors of q in B, namely

    B_H(q, r) := { b ∈ B : d_H(b, q) ≤ r },    (2.1)

where d_H(b, q) := Σ_{i=1}^{m} 1{b_i ≠ q_i} denotes the Hamming distance between binary codes b and q, and r ≥ 0 is the Hamming radius.

# of Bits   r     FAISS (ms)   FENSHSES (ms)
64          3     34.0         5.8
            7     37.0         25.7
            11    42.7         117.7
256         15    42.9         7.8
            31    42.8         22.5
            47    45.4         77.7
1024        63    79.2         31.6
            127   81.9         89.9
            191   90.4         250.0
4096        255   222.7        134.2
            511   223.2        612.5
            767   223.3        1797.5

Table 2: Search latency. FENSHSES is quite competitive for r-neighbor search when the Hamming radius r is small, while the performance of FAISS is quite robust with respect to r. This gives FAISS and FENSHSES different edges for the task of NNS.

As shown in Table 2, FENSHSES is quite competitive for small radii r. This is because FENSHSES fully leverages Elasticsearch's inverted index to first conduct sub-code filtering, so that only a subset of B needs to be considered for Hamming distance computation–a filter that is most effective when r is small. In contrast, FAISS scans every binary code in B, so its search latency is almost invariant with respect to r. For applications (e.g., near-duplicate image detection and visual search) where we care most about nearest neighbors within a small radius, FENSHSES can thus be in a more favorable position than FAISS.
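To make Eq. (2.1) and the full-scan behavior concrete, here is a literal NumPy rendering of r-neighbor search by exhaustive scan over bit arrays; FAISS's binary flat index performs the same scan, only over packed codes with hardware-accelerated popcounts. The function names are ours.

```python
# Exhaustive r-neighbor search: Eq. (2.1) computed literally over bit arrays.
import numpy as np


def hamming_distances(B: np.ndarray, q: np.ndarray) -> np.ndarray:
    """d_H(b, q) = sum over i of 1{b_i != q_i}, for every code b in B."""
    return (B != q).sum(axis=1)


def r_neighbors(B: np.ndarray, q: np.ndarray, r: int) -> np.ndarray:
    """Row indices of B_H(q, r) = {b in B : d_H(b, q) <= r}."""
    return np.flatnonzero(hamming_distances(B, q) <= r)


B = np.random.randint(0, 2, size=(1000, 64), dtype=np.uint8)  # toy dataset
q = B[0]
print(r_neighbors(B, q, r=3))  # always contains 0, since d_H(q, q) = 0
```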
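The sub-code filtering just mentioned rests on a pigeonhole argument: if an m-bit code is cut into s > r segments and d_H(b, q) ≤ r, then b must agree with q exactly on at least one segment. The stand-alone sketch below mimics that two-phase flow (term lookup, then exact verification) with a plain Python dictionary standing in for Elasticsearch's inverted index; all names are ours, and the real system issues term queries against Elasticsearch as described in [14].

```python
# Two-phase r-neighbor search: sub-code filtering, then exact verification.
# A plain-Python stand-in for the inverted-index machinery of FENSHSES [14].
from collections import defaultdict

import numpy as np


def build_subcode_index(B: np.ndarray, s: int) -> dict:
    """Map (segment id, sub-code bytes) -> set of row ids containing it."""
    index = defaultdict(set)
    for j, segment in enumerate(np.array_split(B, s, axis=1)):
        for i, sub in enumerate(segment):
            index[(j, sub.tobytes())].add(i)
    return index


def r_neighbor_search(B, index, q, r: int, s: int) -> np.ndarray:
    assert s > r  # pigeonhole: some segment of any r-neighbor matches exactly
    # Phase 1: candidates sharing at least one exact sub-code with q.
    candidates = set()
    for j, sub in enumerate(np.array_split(q, s)):
        candidates |= index.get((j, sub.tobytes()), set())
    # Phase 2: exact Hamming distances on the (small) candidate set only.
    ids = np.fromiter(candidates, dtype=np.int64, count=len(candidates))
    return ids[(B[ids] != q).sum(axis=1) <= r]


B = np.random.randint(0, 2, size=(1000, 64), dtype=np.uint8)
index = build_subcode_index(B, s=4)
print(r_neighbor_search(B, index, B[0], r=3, s=4))  # contains 0
```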
RAM consumption. Since FAISS is implemented in main memory, its RAM consumption inevitably rises with the size of dataset B, as shown in Table 3. In contrast, by leveraging the highly optimized disk-based index mechanics behind full-text search engines, FENSHSES consumes a much smaller amount of RAM when conducting nearest neighbor search. This property makes FENSHSES more cost-effective, and thus especially suitable for big-data applications.

# of Bits   r     FAISS (GB)   FENSHSES (GB)
64          3     2.2          1.6
            7     2.2          1.6
            11    2.2          1.6
256         15    2.3          1.6
            31    2.3          1.6
            47    2.3          1.6
1024        63    2.9          1.6
            127   2.9          1.6
            191   2.9          1.6
4096        255   4.9          1.6
            511   4.9          1.6
            767   4.9          1.6

Table 3: Main memory (RAM) consumption. The RAM consumed by FAISS grows substantially with the size of dataset B. In contrast, FENSHSES consumes a constant amount of RAM, much smaller than that consumed by FAISS.

3 CONCLUSION
In this case study, we compare FAISS and FENSHSES for the task of nearest neighbor search in Hamming space. By evaluating their performance in terms of indexing speed, search latency and RAM consumption, we hope practitioners can now better understand the pros and cons of main memory based NNS solutions and secondary memory based ones, and thus make the best choices for their own applications (at least for NNS systems over binary codes). In the future, we will compare FAISS and FENSHSES under a wider range of applications; moreover, we will also go beyond Hamming space to evaluate vector similarity search systems for general NNS problems.

ACKNOWLEDGEMENT
We are grateful to three anonymous reviewers for their helpful suggestions and comments, which substantially improved the paper. CM would like to thank Jun Zhao and Guang Yang for insightful discussions on FENSHSES. BY would like to thank Alessandro Magnani for helpful discussions on pHash and its related applications, and Zuzar Nafar for his support of this study.

REFERENCES
[1] G. Amato, P. Bolettieri, F. Carrara, F. Falchi, and C. Gennaro. 2018. Large-Scale Image Retrieval with Elasticsearch. In SIGIR.
[2] E. Bernhardsson. 2018. Annoy: Approximate Nearest Neighbors in C++/Python. https://pypi.org/project/annoy/ Python package version 1.13.0.
[3] E. P. Brenner, J. Zhao, A. Kutiyanawala, and Z. Yan. 2018. End-to-End Neural Ranking for eCommerce Product Search. In SIGIR eCommerce Workshop.
[4] A. Chaudhuri, P. Messina, S. Kokkula, A. Subramanian, A. Krishnan, S. Gandhi, A. Magnani, and V. Kandaswamy. 2018. A Smart System for Selection of Optimal Product Images in E-Commerce. In Big Data.
[5] Q. Chen, H. Wang, M. Li, G. Ren, S. Li, J. Zhu, J. Li, C. Liu, L. Zhang, and J. Wang. 2018. SPTAG: A library for fast approximate nearest neighbor search. https://github.com/Microsoft/SPTAG
[6] C. Zauner. 2010. Implementation and Benchmarking of Perceptual Image Hash Functions. Master's thesis. Upper Austria University of Applied Sciences, Hagenberg Campus.
[7] M. Deshpande and G. Karypis. 2004. Item-based top-N recommendation algorithms. ACM Transactions on Information Systems (TOIS) 22, 1 (2004), 143–177.
[8] H. Hu, R. Zhu, Y. Wang, W. Feng, X. Tan, and J. Huang. 2018. A Best Match KNN-based Approach for Large-scale Product Categorization. In SIGIR eCommerce Data Challenge.
[9] J. Johnson, M. Douze, and H. Jégou. 2017. Billion-scale similarity search with GPUs. arXiv preprint arXiv:1702.08734 (2017).
[10] E. Klinger and D. Starkweather. 2010. pHash: the open source perceptual hash library. http://www.phash.org/apps (accessed 2016-05-19).
[11] M. Lux and O. Marques. 2013. Visual Information Retrieval Using Java and LIRE. Vol. 25. Morgan & Claypool Publishers.
[12] A. Magnani, F. Liu, M. Xie, and S. Banerjee. 2019. Neural Product Retrieval at Walmart.com. In WWW Workshop on eCommerce and NLP. ACM.
[13] R. Mehta, N. Kapoor, S. Sourav, and R. Shorey. 2019. Decentralised Image Sharing and Copyright Protection using Blockchain and Perceptual Hashes. In COMSNETS.
[14] C. Mu, J. Zhao, G. Yang, B. Yang, and Z. Yan. 2019. Empowering Elasticsearch with Exact and Fast r-Neighbor Search in Hamming Space. arXiv preprint arXiv:1902.08498 (2019).
[15] C. Mu, J. Zhao, G. Yang, B. Yang, and Z. Yan. 2019. Fast and Exact Nearest Neighbor Search in Hamming Space on Full-Text Search Engines. In SISAP.
[16] C. Mu, J. Zhao, G. Yang, J. Zhang, and Z. Yan. 2018. Towards Practical Visual Search Engine Within Elasticsearch. In SIGIR eCommerce Workshop.
[17] A. Peter, T. Hartmann, S. Müller, and S. Katzenbeisser. 2012. Privacy-preserving architecture for forensic image recognition. In WIFS.
[18] A. Raghava-Raju. 2017. Predicting Fraud in Electronic Commerce: Fraud Detection Techniques in E-Commerce. International Journal of Computer Applications 171, 2 (2017).
[19] M. Ruzicka, V. Novotny, P. Sojka, J. Pomikalek, and R. Rehurek. 2017. Flexible Similarity Search of Semantic Vectors Using Fulltext Search Engines. In ISWC HSSUES Workshop.
[20] J. Rygl, J. Pomikalek, R. Rehurek, M. Ruzicka, V. Novotny, and P. Sojka. 2017. Semantic Vector Encoding and Similarity Search Using Fulltext Search Engines. In RepL4NLP Workshop.
[21] J. Wang, J. Wang, G. Zeng, Z. Tu, R. Gan, and S. Li. 2012. Scalable k-NN graph construction for visual descriptors. In CVPR.
[22] F. Yang, A. Kale, Y. Bubnov, L. Stein, Q. Wang, H. Kiapour, and R. Piramuthu. 2017. Visual Search at eBay. In KDD.