<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <article-id pub-id-type="doi">10.1137/1.9781611977653.CH99</article-id>
      <title-group>
        <article-title>QuIPU: Evaluating Actual Privacy of Obfuscated Queries</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Francesco Luigi De Faveri</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Guglielmo Faggioli</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Nicola Ferro</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Information Engineering, University of Padova</institution>
          ,
          <addr-line>Padova</addr-line>
          ,
          <country country="IT">Italy</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2025</year>
      </pub-date>
      <volume>1</volume>
      <fpage>16</fpage>
      <lpage>19</lpage>
      <abstract>
        <p>When Information Retrieval (IR) models are applied to and trained on sensitive and personal information, users’ privacy is at risk. While mechanisms have been presented to safeguard user privacy, the effectiveness of these privacy protections is generally evaluated by studying the relation between performance on a downstream task and the parameters of the mechanisms, e.g., the privacy budget ε in Differential Privacy (DP). This often results in only a partial understanding of the relation between formal privacy and the privacy experienced by the user, i.e., the actual privacy. In this paper, we discuss the Query Inference for Privacy and Utility (QuIPU) framework, a novel evaluation methodology designed to assess actual privacy based on the risk that an “honest-but-curious” IR system may correctly guess the original query from the obfuscated queries received. The QuIPU framework constitutes the first endeavour to quantify actual privacy for IR tasks, extending beyond the partial comparison of formal privacy parameters. Our findings show that formal privacy parameters do not necessarily correspond to actual privacy, resulting in cases where, despite identical privacy parameters, two systems reveal differing actual privacy levels.</p>
      </abstract>
      <kwd-group>
        <kwd>Evaluation Measures</kwd>
        <kwd>Differential Privacy</kwd>
        <kwd>Information Retrieval</kwd>
        <kwd>Information Security</kwd>
        <kwd>Privacy Risks</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>
        Privacy is a fundamental right preserved by the Universal Declaration of Human Rights, Article 12.
Natural Language Processing (NLP) and Information Retrieval (IR) algorithms are trained and tested
on textual datasets consisting of queries, documents, reviews, and posts from online social media. In
such a large amount of textual data, personal user profiles and personal opinions on different matters,
such as politics and religion [
        <xref ref-type="bibr" rid="ref2 ref3">2, 3</xref>
        ], along with sensitive information needs, can raise privacy concerns about
user interactions with such systems. Specifically, through the analysis of browser search histories
and retrieved documents, malicious actors can reveal private information, including an individual’s
salary and medical conditions [
        <xref ref-type="bibr" rid="ref4 ref5">4, 5</xref>
        ]. Heuristic strategies [
        <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
        ] have been proposed for preserving privacy
in document retrieval tasks. From another perspective, progress in NLP has demonstrated the potential of
Differential Privacy (DP) [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ] in the release of privacy-preserving text for different applications, including
text classification [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ], authorship anonymization [
        <xref ref-type="bibr" rid="ref10">10</xref>
        ], and query obfuscation [
        <xref ref-type="bibr" rid="ref11">11</xref>
        ].
      </p>
      <p>
        Obfuscating a query concerns safeguarding the original information need of a user in such a manner
that the resulting obfuscated queries can retrieve relevant documents while not fully disclosing that
information need. For example, the query “how tumour grows” may be transformed into the
obfuscated alternatives “how cancer grows”, “how infection spreads”, “how leukemia evolves”.
Focusing on the mechanisms’ privacy parameters represents only a preliminary way to measure privacy.
Several attempts to assess the privacy provided have been proposed by adapting information security
measures based on entropy [
        <xref ref-type="bibr" rid="ref12">12</xref>
        ] or on syntactic and semantic similarities [
        <xref ref-type="bibr" rid="ref13 ref14">13, 14</xref>
        ] between the original and
obfuscated texts. However, all these measures limit the evaluation of the actual privacy reached by the
mechanism [
        <xref ref-type="bibr" rid="ref15 ref16 ref17">15, 16, 17</xref>
        ]. After observing the obfuscated queries “how cancer grows”, “how infection
spreads”, “how leukemia evolves”, an adversarial system can quickly discover that the actual user
information need concerns cancer diseases and their spreading, using available query logs to generate
potential guesses of the original query. Nevertheless, for some privacy budget parameters such obfuscated
queries come with mathematical guarantees of privacy, and most state-of-the-art measures would
consider them properly obfuscated. In addition, since such queries still retain a semantic
meaning similar to the original, they would probably yield high retrieval utility, giving a false impression
of privacy while achieving high retrieval results. Therefore, limiting the privacy analysis to the formal
mechanism parameters does not quantify the user’s risk in submitting the obfuscated queries [
        <xref ref-type="bibr" rid="ref18">18</xref>
        ].
      </p>
      <p>
        In this study, we discuss the Query Inference for Privacy and Utility (QuIPU) framework [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ], an
evaluation model developed to measure the actual privacy and utility trade-off provided by a mechanism
against potential information leakage in an obfuscation protocol. The QuIPU score is computed by
assessing the risk posed by a malicious adversary trying to correctly infer the original user need.
Therefore, to estimate this score, the Query Inference Attack (QuIA), a variation of the inference attacks
known as Membership Inference Attack (MIA) [
        <xref ref-type="bibr" rid="ref19">19</xref>
        ] tailored to a query obfuscation protocol, is used
against the obfuscated queries submitted to the system. Such an attack considers the probability
that the original query is successfully inferred from a query log by an IR system after analysing the
alternative queries received, obfuscated under different configurations, i.e., using different formal
privacy parameters of an obfuscation mechanism. The measure takes into account the trade-off between
privacy and utility, extending beyond the configuration parameters of the obfuscation mechanisms by
calculating a modified version of the Area Under the Curve (AUC) on the risk-versus-utility trend. Our
findings show that formal privacy does not necessarily imply actual privacy, explicitly showing that
there is a high probability of a correct query guess even for low values of the privacy parameter.
      </p>
      <p>The paper is structured as follows: Section 2 presents the Related Works and Background, introducing
the different measures used in privacy evaluation and the background on the query obfuscation
protocol. Section 3 presents the formal definition of QuIPU, showing the phases required to evaluate
the actual textual privacy provided by a mechanism. Finally, Section 4 reports the results and discussion
of the formal and actual privacy analysis performed on different obfuscation mechanisms.</p>
    </sec>
    <sec id="sec-2">
      <title>2. Related Works and Background</title>
      <sec id="sec-2-1">
        <title>2.1. Related Works</title>
        <p>
          Different metrics have been proposed to organize the available privacy measures [
          <xref ref-type="bibr" rid="ref20 ref21">20, 21</xref>
          ]. Wagner and
Eckhoff [
          <xref ref-type="bibr" rid="ref20">20</xref>
          ] systematically classified over eighty privacy metrics, offering a comprehensive framework
for assessing privacy across different domains, e.g., communication, databases, and social networks.
The survey highlights the significance of identifying the specific aspect of privacy that a metric aims to
quantify, suggesting nine guiding aspects for selecting the appropriate privacy measures. Specifically, the
authors stress the importance of considering the adversary’s knowledge and capability when evaluating
privacy. In addition, Sousa and Kern [
          <xref ref-type="bibr" rid="ref21">21</xref>
          ] described how different mechanisms developed for NLP tasks
provide privacy for textual data and which threats can arise in such scenarios. Moreover, Habernal [
          <xref ref-type="bibr" rid="ref22">22</xref>
          ]
discussed the importance of not relying strictly on the formal analysis of DP and its application to NLP
tasks, but of pushing research towards concrete measurements of the privacy provided to texts.
        </p>
        <p>
          Traditional methods for evaluating privacy primarily focus on estimating the failure rates of
obfuscation mechanisms [
          <xref ref-type="bibr" rid="ref12">12</xref>
          ] or assessing the similarities between original and obfuscated texts [
          <xref ref-type="bibr" rid="ref11 ref23">23, 11</xref>
          ]. On
the one hand, uncertainty measures such as N_w and S_w [
          <xref ref-type="bibr" rid="ref24 ref25">24, 25</xref>
          ] estimate, respectively, the probability that a term w
remains unchanged after obfuscation and the minimum cardinality of the set of words to which w is
mapped by the mechanism. However, such measures do not capture whether the mechanism
replaces the original term with a closely related one. On the other hand, the similarity between the
original and obfuscated texts is commonly estimated using metrics like the Jaccard index or cosine
similarity between sentence embeddings computed by a Transformer, drawing inspiration from the
use of BERTScore to assess the quality of generated texts [
          <xref ref-type="bibr" rid="ref14">14</xref>
          ]. Meisenbacher et al. [
          <xref ref-type="bibr" rid="ref23">23</xref>
          ]
proposed the PUC score, a weighted mean of uncertainty, similarity, and preserved-utility measures.
The score is tuned by a parameter that adjusts the focus on utility
or privacy, allowing the user to decide whether to prioritize the former or the latter. However, none
of the above measures offer insights into the actual privacy afforded to the texts, nor do they assess
the adversarial potential to infer the original meaning of the obfuscated text. Specifically, previous
studies [
          <xref ref-type="bibr" rid="ref26">26</xref>
          ] criticized the reliance on formal privacy analysis solely based on the privacy budget ε
parameter. DP mechanisms employing configurations where ε &gt; 1 lack a comprehensive analysis
of actual privacy guarantees, raising concerns about the sufficiency of the privacy protection methods
employed¹. In addition, Damie et al. [
          <xref ref-type="bibr" rid="ref5">5</xref>
          ] introduced a novel indicator to assess the risk of successful
query recovery attacks within searchable encryption protocols. The study revealed that, even without
additional background knowledge, an adversary can recover the original queries with a success rate of
85%, encouraging the analysis of privacy measures under realistic attack scenarios.
        </p>
      </sec>
      <sec id="sec-2-2">
        <title>2.2. Background</title>
        <p>
          The research community has widely studied the application of privacy measures when performing NLP
and IR tasks [
          <xref ref-type="bibr" rid="ref27 ref28 ref29">27, 28, 29</xref>
          ]. The scenario discussed in this study assumes that users are willing to
pay part of the utility during the document retrieval phase to defend the privacy of their search
activity with the Information Retrieval system. The system is considered non-cooperative, as it does not
actively contribute to protecting user privacy, e.g., it does not provide any private online API to mask
the information need of the user. Figure 1 illustrates the general query obfuscation protocol, the focus
of application of QuIPU in IR [
          <xref ref-type="bibr" rid="ref11 ref30">11, 30</xref>
          ]. The process considers two distinct domains. On the user (safe)
side, the original query is generated by the user and privatized using an obfuscation mechanism, i.e., an
algorithm that, given an original sensitive query q, generates n non-sensitive obfuscated queries that
(theoretically) prevent the unveiling of the original information need. After this initial obfuscation
process, the obfuscated queries are sent to the IR system without explicitly disclosing the user’s
information need. During this step, the user sets the parameters, i.e., the formal privacy guarantees,
considering the utility lost on the task [
          <xref ref-type="bibr" rid="ref31 ref32">31, 32</xref>
          ]. On the (unsafe) IR system side, relevant documents are retrieved by
the “honest” system considering the obfuscated queries received, thus potentially ranking the truly
relevant documents lower. Finally, the documents are returned to the user for the final post-processing
step of the protocol. To prevent a “curious” IR system from discovering the actual query, the obfuscation
methods employed are divided into two families of mechanisms, based either on heuristics or on ε-DP.
        </p>
        <p>
          Heuristics Obfuscation. To protect privacy in IR tasks, non-formal obfuscation methods were
proposed [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ]. Arampatzis et al. [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] employed the WordNet [
          <xref ref-type="bibr" rid="ref33">33</xref>
          ] database to replace original terms
within the query with synonyms, hypernyms, and holonyms. The obfuscation was performed based
on a hierarchical degree, i.e., the level parameter, aligned with the desired obfuscation the user aims to
achieve. Such an approach was further extended by Fröbe et al. [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ]. In more detail, their obfuscation
approach first retrieves the top-k documents from a local corpus. Then, using a sliding window, the
sequences of n terms within such documents are taken as candidate obfuscation queries, removing
those queries that contain synonyms and holonyms. Using the top-k documents retrieved locally as
pseudo-relevant, the queries submitted are the ones that achieve the highest nDCG.
¹The DP configurations with ε &gt; 1 deviate from the “theoretically secure” privacy setting, i.e., a strong assurance about the
formal privacy introduced; see the DP definition [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ].
        </p>
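        <p>As a toy illustration of the WordNet-style replacement described above, the following sketch substitutes query terms with related words drawn from a tiny hand-made lexicon. The lexicon, the function name, and the sampling strategy are hypothetical stand-ins for the actual WordNet-based mechanisms, not the implementations of AEA or FEA.</p>

```python
import random

# Hypothetical related-term sets standing in for WordNet synonyms/hypernyms.
LEXICON = {
    "tumour": ["cancer", "leukemia", "infection"],
    "grows": ["spreads", "evolves", "develops"],
}

def obfuscate(query, n, seed=0):
    """Generate n obfuscated variants of `query` by replacing each term
    with a random related word; terms absent from the lexicon are kept."""
    rng = random.Random(seed)
    variants = []
    for _ in range(n):
        terms = [rng.choice(LEXICON.get(t, [t])) for t in query.split()]
        variants.append(" ".join(terms))
    return variants
```

        <p>For example, obfuscate("how tumour grows", 3) yields three variants in which “tumour” and “grows” are replaced by lexicon neighbours, hiding the original terms while staying in the same topical neighbourhood.</p>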
        <p>
          Differential Privacy (DP) Obfuscation. Dwork et al. [
          <xref ref-type="bibr" rid="ref8">8</xref>
          ] introduced the ε-DP framework to
formalize the privacy guarantees when releasing data. Given a privacy budget ε ∈ R⁺ and any pair of
neighbouring datasets D, D′, i.e., datasets that differ in only one entry, an obfuscation mechanism ℳ
is ε-DP if it satisfies the inequality Pr[ℳ(D) ∈ O] ≤ e^ε · Pr[ℳ(D′) ∈ O] ∀O ⊆ Im(ℳ). DP introduces
calibrated noise during output computation using the privacy budget ε, which controls the balance
between data privacy and utility. The adoption of the DP framework for metric spaces, and therefore for
NLP tasks, was proposed in [
          <xref ref-type="bibr" rid="ref34">34</xref>
          ]. Metric-DP extends the traditional DP definition by ensuring that
the probability of obfuscating two distinct points x, x′ is proportional to the distance d(x, x′) between
them. The DP formal framework has enabled the privacy research community to propose different
strategies based on noisy sampling [
          <xref ref-type="bibr" rid="ref30 ref35 ref36 ref37">35, 36, 37, 30</xref>
          ] and perturbed word embeddings [
          <xref ref-type="bibr" rid="ref24 ref25 ref38">24, 25, 38</xref>
          ].
        </p>
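        <p>To make the ε-DP inequality concrete, the following minimal sketch (our illustration, not part of the discussed mechanisms) uses randomized response on a single bit: the true bit is kept with probability e^ε/(e^ε + 1), so the worst-case ratio of output probabilities for two neighbouring inputs is exactly e^ε, the bound appearing in the definition above.</p>

```python
import math

def output_prob(true_bit, reported_bit, eps):
    # Randomized response: keep the true bit with probability
    # p = e^eps / (e^eps + 1), flip it otherwise.
    p_keep = math.exp(eps) / (math.exp(eps) + 1.0)
    return p_keep if reported_bit == true_bit else 1.0 - p_keep

def privacy_loss_ratio(reported_bit, eps):
    # Worst-case ratio Pr[M(D) = o] / Pr[M(D') = o] over the two
    # neighbouring inputs 0 and 1; equals e^eps for randomized response.
    a = output_prob(0, reported_bit, eps)
    b = output_prob(1, reported_bit, eps)
    return max(a, b) / min(a, b)
```

        <p>A smaller ε pushes the keep probability towards 1/2 (more noise, more privacy), while a larger ε makes the report nearly deterministic, mirroring the privacy-utility balance controlled by the privacy budget.</p>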
      </sec>
    </sec>
    <sec id="sec-3">
      <title>3. The QuIPU Evaluation Framework</title>
      <p>We present the Query Inference for Privacy and Utility (QuIPU) framework: we describe the threat model
for an obfuscation protocol and the settings of QuIA, and finally report the risk evaluation of the attack.</p>
      <sec id="sec-3-1">
        <title>3.1. Defining the Threat Model</title>
        <p>In this scenario, the adversary is represented by the IR system, which aims to understand the original
user information need. In the query obfuscation protocol, the natural entry point for inferring the original
queries is the set of obfuscated queries the system receives. The mechanism parameters, e.g., the ε privacy
budget of the DP obfuscation mechanisms, do not guarantee with absolute certainty that the
original text is changed (or changed enough). Therefore, such queries may leak the real
information need. In addition, for the same parameters, different obfuscation strategies may produce
texts with different obfuscation degrees. For instance, the effect of the parameter ε depends on the
specific mechanism used [
          <xref ref-type="bibr" rid="ref39 ref40">39, 40</xref>
          ]. As a result, two DP mechanisms (one based on embedding perturbation
and the other sampling-based), both parametrized with ε = 3, could lead to a situation where
one method achieves actual obfuscation while the other achieves only formal obfuscation. Therefore,
the IR system aims to extract as much information as possible from the received queries, previously
obfuscated on the user side, using this knowledge to infer the real text.</p>
        <p>Consequently, the threat of a successful query inference stems not only from the obfuscation failure
of the mechanism but also from the additional knowledge about the queries possessed by the adversary.
The IR system might exploit the queries in its logs [
          <xref ref-type="bibr" rid="ref41 ref42">41, 42</xref>
          ]: by classifying the
information needs carried by the obfuscated queries received against the information in its logs, it aims
to improve the chances of correctly guessing the original user query. Note that if the original query is
not an extremely long-tail one, it is reasonable to assume that the original information need has been
previously submitted to the IR system, and thus the attack can succeed with high probability.</p>
        <p>Finally, a critical remark must be made regarding the use of cryptographic primitives in the protocol of the
scenario we are analysing. Eavesdroppers or man-in-the-middle adversaries do not significantly threaten
the user or the system. Cryptography can be employed while exchanging queries and documents
between the client, i.e., the user, and the server, i.e., the IR system, ensuring confidentiality among the
internal parties of the protocol and security against external auditors. However, confidentiality does
not imply privacy: if the IR system aims to disclose the user’s original query, cryptographic techniques
alone are insufficient to safeguard privacy against an internal adversary.</p>
      </sec>
      <sec id="sec-3-2">
        <title>3.2. Introducing the Query Inference Attack (QuIA)</title>
        <sec id="sec-3-2-1">
          <title>Algorithm 1: The Query Inference Attack (QuIA).</title>
          <p>Data: Q_obf (the obfuscated queries q′), Q_logs (the query log), E (a transformer encoder).</p>
          <p>Result: Ranked list ℒ of the queries in the log.</p>
          <p>1 Encode E(Q_obf) = {E(q′) ∈ ℝᵈ} and E(Q_logs) = {E(q) ∈ ℝᵈ};
2 Define q̂ as the centroid of the vectors in E(Q_obf);
3 Compute S = [cos(q̂, E(q)), E(q) ∈ E(Q_logs)];
4 Define ℒ = [(s, q), s ∈ S, q ∈ Q_logs];
5 Sort ℒ in descending order considering the similarity score s;
6 return ℒ;</p>
          <p>
            The class of attacks known as Membership Inference Attack (MIA) was introduced by Shokri et
al. [
            <xref ref-type="bibr" rid="ref19">19</xref>
            ] to investigate the information leakage stemming from the output of machine learning models.
The attack is defined under the assumption that the attacker sees a data record but has no information
about either the model parameters or the actual model architecture, i.e., a so-called black-box scenario.
The attack is successful when the attacker can correctly guess whether the data is in the training set.
          </p>
          <p>In an obfuscation protocol, the Query Inference Attack (QuIA) uses the received obfuscated queries
and the query logs to generate a ranked list of queries from the logs based on their similarity with the
user information need. Similarly to the black-box MIA scenario, the assumption is that the IR system
knows neither the obfuscation mechanism used on the user side nor its privacy parameters.
Algorithm 1 reports the pseudo-code of the attack: the system receives the set
of obfuscated queries Q_obf and knows its query logs Q_logs. Firstly, it uses a Transformer [
          <xref ref-type="bibr" rid="ref43">43</xref>
          ] encoder E
to obtain the embeddings of the queries in the two sets². Once the texts in Q_obf are encoded, it calculates
the centroid q̂ of the vectors in E(Q_obf), to capture the average contextual similarities among the
obfuscated queries received. The system computes the cosine similarity between the embeddings of the
queries from the logs E(q) ∈ E(Q_logs) and the centroid q̂ to understand which queries from the logs
most closely represent the average information need carried by the obfuscated queries, and saves the
scores into the list S. The algorithm generates a ranked list ℒ of the queries in the logs q ∈ Q_logs by
sorting the pairs (s, q) in descending order based on the similarities s ∈ S. In case of ineffective
obfuscation, most likely, the higher a query from the logs is ranked in ℒ, the better it fits the user
information need.</p>
        </sec>
      </sec>
      <sec id="sec-3-3">
        <title>3.3. Assessing the QuIPU Risk Modelling</title>
        <p>Privacy is strictly linked with the definition of risk [
          <xref ref-type="bibr" rid="ref44">44</xref>
          ], i.e., the possibility that an action or event
generates consequences that have an impact on what users value; in this scenario, the disclosure of
sensitive information. The higher the risk, the lower the privacy. For example, DP obfuscation mechanisms
offer the possibility of balancing privacy and utility by tuning the privacy budget ε. However, the
framework does not provide any assurance against inference attacks [
          <xref ref-type="bibr" rid="ref45">45</xref>
          ]. To overcome this limitation,
we need a formal definition of the risk against inference in the obfuscation protocol. After the QuIA
algorithm has returned the ranked list ℒ, the IR system is tasked to guess the original query. This
inference is based on the computed ranking, which considers the similarities between the obfuscated
queries received (potentially leaking information) and the system’s query logs (auxiliary knowledge for
a correct guess of the original user query). At this point, the IR system strategy to guess the correct
query is sequential: knowing that the first query is the most similar to the average information need
carried by the obfuscated queries, it represents the best choice for the guess. If the first query in the
list ℒ is the correct query, the attack is successful, and there is a 100% risk of correct inference. On the
other hand, if the first one is not the correct guess, the adversary tries the second query in the list,
and so on, until the original query is guessed, with a decreasing risk of success. Therefore, the risk Rₖ of a
successful QuIA in k guesses can be defined as the probability that the IR system correctly guesses q̄ as
the original query q after seeing the sets Q_obf and Q_logs, i.e., Rₖ = P[{q̄ = q} ∩ {g ≤ k} | Q_obf, Q_logs],
where g is the number of attempts needed and k the maximum number of guessing attempts the IR system
is willing to take. The upper bound for the value of k is determined by the size of the set Q_logs. However,
determining the precise threshold k and assessing the risk the user faces is impossible without access to
the IR system’s internal data and kind of attack. Therefore, we propose to model the malicious IR system
with three kinds of attackers, representing relevant use cases: i) the “lazy” attacker, i.e., one that looks
only at the top position of the ranked list ℒ and makes only one guess; ii) the “active” attacker, i.e., an
adversary that selects the top-k queries and checks only those to see whether the guess is correct; and
iii) the “motivated” attacker, i.e., one that tries all the queries until the original one has been found. To
model the probability of the risk a user faces against each of such attackers, we propose to use proxy
indicators computed on where the original query appears in the ranked list ℒ: Precision at 1 (P@1) for
the lazy attacker, Recall at k (R@k) for the active attacker, and Reciprocal Rank (RR) for the motivated
attacker.
²E(Q_obf) and E(Q_logs) indicate the sets of text embeddings, and E(q′), E(q) the individual vector
embeddings of the queries.</p>
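        <p>The three attacker models map onto standard rank-based indicators. The following is a minimal sketch of our reading of the proxies named above, computed from the 1-based rank of the original query in ℒ:</p>

```python
def lazy_risk(rank):
    # "Lazy" attacker, Precision at 1: success only if the original
    # query is ranked first.
    return 1.0 if rank == 1 else 0.0

def active_risk(rank, k):
    # "Active" attacker, Recall at k with a single relevant query:
    # success if the original query appears in the top-k positions.
    return 0.0 if rank > k else 1.0

def motivated_risk(rank):
    # "Motivated" attacker, Reciprocal Rank: the attacker tries queries
    # in order, so the risk decays with the rank of the original query.
    return 1.0 / rank
```

        <p>Averaging these indicators over a query set gives the risk R that is paired with the utility U (e.g., nDCG) in the risk-utility plane.</p>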
        <p>Drawing inspiration from the usual ROC AUC, Figure 2 illustrates the evaluation plane that links
the risk R of a successful QuIA and the utility U measured on a set of queries Q_obf(θ) – an
effectiveness measure such as nDCG in the IR case – obfuscated with a certain formal parameter θ – the
ε parameter in the case of DP. In the risk-utility plane, the Risk-Utility Boundary line, i.e., the diagonal,
separates two regions in which the risk-utility trend f(R, U) can lie: i) above the line, where the
utility U exceeds the risk R, and ii) below the line, where U is less than R. Therefore, the QuIPU score in
Equation 1 considers the pairs (R, U) estimated by submitting the set of obfuscated queries Q_obf(θ).</p>
        <p>QuIPU = 2(A⁺ + A⁻) = 2 ∫_{A⁺} f(R, U) dδ + 2 ∫_{A⁻} f(R, U) dδ   (1)</p>
        <p>where dδ represents an infinitesimal variation along the Risk-Utility Boundary line, and the factor 2 is
introduced to map the score from the [−1/2, 1/2] interval to the [−1, 1] interval. The integrals are
calculated with respect to the diagonal of the plane, such that regions where the curve lies below this
diagonal, i.e., A⁻, are assigned negative values, indicating that the risk R is greater than the utility U.
Conversely, positive values are computed for regions where the utility U exceeds the risk R, i.e., A⁺.</p>
        <p>In Figure 2, four points are defined. The No Utility Point shows the case where both risk and utility are
reduced to 0. It depicts the situation where the obfuscation mechanism fully modifies the original query,
completely stopping a QuIA. However, the user completely renounces the effectiveness of the task, i.e.,
the submitted queries fail to retrieve any relevant documents. The No Privacy Point illustrates the
effect of not using the obfuscation protocol. The queries are not obfuscated, meaning the original query
is fully exposed to the IR system, resulting in 100% risk. Yet, utility is fully achieved, as the system
uses the original query to retrieve the full list of relevant documents. Finally, the Optimal Point and
the Trash Point represent the best and worst cases theoretically obtainable. In the first, the obfuscation
mechanism provides complete protection against Query Inference attacks, i.e., 0% risk, while maintaining
maximum utility. The user’s information needs are entirely met during retrieval without exposing
any information about the original query. The second is the opposite of the optimal point, i.e., the
mechanism neither obfuscates the query nor can its queries retrieve any relevant documents. This
case can happen in a “fully-dishonest” scenario, e.g., a phishing IR system [
        <xref ref-type="bibr" rid="ref46">46</xref>
        ].</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>4. Empirical Evaluation</title>
      <p>We present the experimental setup and compare the results observed for the privacy analysis using the
formal parameters and the QuIPU score. Further analysis, the data used, and the code are publicly available³.</p>
      <sec id="sec-4-1">
        <title>4.1. Experiments Setup</title>
        <p>
          We test the QuIPU framework on three different TREC collections: Deep Learning 2019 (DL ’19) [
          <xref ref-type="bibr" rid="ref47">47</xref>
          ] and
Deep Learning 2020 (DL ’20) [
          <xref ref-type="bibr" rid="ref48">48</xref>
          ], based on the MS MARCO passages corpus and containing 43 and 54 queries
respectively, and Robust 2004 (Robust ’04) [
          <xref ref-type="bibr" rid="ref49">49</xref>
          ], which relies on disks 4 and 5 of the TIPSTER corpus
and contains 249 queries. As obfuscation mechanisms, we consider those available in the pyPANTERA
framework [
          <xref ref-type="bibr" rid="ref50">50</xref>
          ], i.e., CMP, Mahalanobis, their Vickrey variants, CusText, SanText, TEM, and WBB.
As privacy budget ε, we followed the parametrization reported in the original papers, which is also
the one used by the pyPANTERA package and other recent experiments [
          <xref ref-type="bibr" rid="ref11">11</xref>
          ]. In detail, we select
ε ∈ {1, 5, 10, 12.5, 15, 17.5, 20, 25, 30, 50}. The heuristics obfuscation mechanisms, i.e., AEA [
          <xref ref-type="bibr" rid="ref6">6</xref>
          ] and
FEA [
          <xref ref-type="bibr" rid="ref7">7</xref>
          ], use different synonym levels {3, 4, 5} and sliding window sizes {12, 14, 16}, respectively.
We generated 50 obfuscation variants for each query and mechanism configuration. Finally, the IR
system used as re-ranker in the post-retrieval phase of the protocol is the neural dense model Contriever,
as used in [
          <xref ref-type="bibr" rid="ref11 ref30">11, 30</xref>
          ]. The honest side of the IR system, i.e., the part that performs the document
retrieval task, is also the Contriever model, while the curious part of the system uses as encoder model
distilbert-base-uncased [
          <xref ref-type="bibr" rid="ref51">51</xref>
          ]. We use two models for the two tasks to obtain unbiased results, in line
with [
          <xref ref-type="bibr" rid="ref11 ref30">11, 30</xref>
          ]. To simulate a realistic scenario for the curious IR system performing the QuIA, we use as query
logs the AOL collection⁴, from which 750k queries were selected and added to the original ones.
        </p>
      </sec>
      <sec id="sec-4-2">
        <title>4.2. Privacy Analysis Using the Mechanisms’ Parameters</title>
        <p>
          The traditional privacy analysis evaluates the utility as a function of the formal privacy parameters,
e.g., ε. Figure 3 reports the nDCG@10 versus the formal privacy parameters on the three
collections analysed. Note that the x-axis, representing the PrivacyParameter, reports both
the values of the ε parameter of the DP mechanisms and the parameters of the heuristics [
          <xref ref-type="bibr" rid="ref6 ref7">6, 7</xref>
          ]. From
this traditional perspective, it emerges that mechanisms
based on a DP strategy, like TEM or SanText, achieve high effectiveness even for low values of the privacy
parameter ε. On the other hand, obfuscation mechanisms based on embedding perturbation strategies
perform with high effectiveness only if the formal parameter is high. Finally, among the heuristics, AEA
shows high nDCG@10 while FEA obtains the worst results. These results convey a misleading sense of
privacy: high retrieval results do not imply actual privacy, as they may simply indicate that the submitted
queries remain close to the originals.
        </p>
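For reference, the utility measure reported in Figure 3, nDCG@10, follows the standard definition; a minimal self-contained sketch (not the paper's evaluation code) is:

```python
import math

def dcg(rels):
    # Discounted cumulative gain of a ranked list of relevance grades.
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels))

def ndcg_at_k(ranked_rels, k=10):
    # Normalize by the DCG of the ideal (descending) ordering of the grades.
    ideal = sorted(ranked_rels, reverse=True)
    idcg = dcg(ideal[:k])
    return dcg(ranked_rels[:k]) / idcg if idcg > 0 else 0.0

print(round(ndcg_at_k([3, 2, 3, 0, 1, 2]), 4))  # -> 0.9608
```

In TREC-style evaluations such scores are computed per query against the relevance judgements and then averaged over the topic set.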
      </sec>
      <sec id="sec-4-3">
        <title>4.3. Privacy Analysis Using the QuIPU Score</title>
        <p>Table 1 reports the QuIPU scores obtained by analysing the Risk vs. Utility on each set of obfuscated
queries of the collections. The results show three distinct patterns that can be traced back to the three
different obfuscation strategies. The Sampling-based mechanisms show weaker defences against the
three attackers, especially against the “active” one, i.e., their QuIPU scores are more negative. In contrast,
the Embedding Perturbation mechanisms are designed to protect user information from attackers,
yielding higher QuIPU scores even against a “motivated” attacker. This suggests that, when using
DP obfuscation mechanisms, a user who wants strong actual privacy guarantees against the
QuIA should select an obfuscation mechanism that perturbs the word embeddings of the queries. Finally,
the heuristic strategies obtain an almost null QuIPU score due to the stable risk and utility they achieve; FEA
reaches a slightly positive QuIPU score against the three attackers, implying that it is impractical for an
attacker to guess the original query even when “motivated” to do so.</p>
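The intuition behind reading Table 1, a trade-off between utility and inference risk, can be illustrated with a deliberately simplified sketch. The exact QuIPU score is defined in Section 3 of the paper and is not reproduced in this excerpt; the function names and the difference-based formula below are hypothetical stand-ins, chosen only to show why a positive value indicates a favourable Utility-Privacy trade-off.

```python
def inference_risk(guesses):
    # Risk = fraction of original queries the curious system recovered,
    # given (guess, original) pairs, one per user query.
    return sum(g == o for g, o in guesses) / len(guesses)

def tradeoff_score(utility, risk, baseline_utility, baseline_risk):
    # Hypothetical trade-off: risk reduction minus utility loss with respect
    # to submitting the original queries unprotected.
    return (baseline_risk - risk) - (baseline_utility - utility)

# Obfuscation drops utility from 0.55 to 0.40 but risk from 0.90 to 0.20:
# the mechanism removes far more risk than it costs in utility.
print(tradeoff_score(0.40, 0.20, 0.55, 0.90) > 0)  # -> True
```

Under this reading, a mechanism scores positively whenever it lowers the attacker's success probability by more than it degrades retrieval effectiveness.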
        <sec id="sec-4-3-1">
          <title>3https://github.com/Kekkodf/QuIPU_Framework 4https://ir-datasets.com/aol-ia.html</title>
          <p>0.7
0.6
00.5
1
CG@0.4
D
tyn0.3
il
itU0.2
in terms of  = nDCG@10, and the risk  of a successful Query Inference considering diferent adversary models.
Positive values correspond to a better Utility-Privacy trade-of, cf. Section 3.</p>
          <p>Obfuscation</p>
          <p>Strategy
Sampling
Embedding
Perturbation
Heuristics</p>
          <p>Mechanism
CusText
SanText
TEM
WBB
CMP
Mahalanobis
VickreyCMP
VickreyMhl
AEA
FEA</p>
          <p>Lazy Attacker</p>
          <p>Active Attacker</p>
          <p>Motivated Attacker
DL’19</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>5. Conclusion and Future Work</title>
      <p>Assessing the privacy guarantees provided to users during IR tasks remains an open challenge. Relying
solely on a formal privacy analysis of the mechanism parameters is insufficient for concretely
evaluating the privacy of obfuscation mechanisms. In this study, we introduced the QuIPU framework,
a new benchmark designed to assess the actual privacy provided to queries in an obfuscation protocol. We
empirically evaluated the risk that an “honest-but-curious” IR system can accurately infer the original
query from the obfuscated ones it receives, using queries from its logs. Our findings demonstrate that
strong formal privacy guarantees do not necessarily imply actual privacy protection. In future work,
we plan to explore additional proxy measures and investigate their correlation with the QuIPU score. In
addition, we intend to explore the capabilities of Large Language Models in determining whether or
not a query has been sufficiently obfuscated, adopting such models as defensive mechanisms against a
successful QuIA.</p>
    </sec>
    <sec id="sec-6">
      <title>Declaration on Generative AI</title>
      <p>During the preparation of this work, the authors used Grammarly for Readability and Spelling checks.
After using this tool, the authors reviewed and edited the content as needed and took full responsibility
for the publication’s content.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>F.</given-names>
            <surname>L. De Faveri</surname>
          </string-name>
          , G. Faggioli,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <article-title>Measuring actual privacy of obfuscated queries in information retrieval</article-title>
          ,
          <source>in: Advances in Information Retrieval: 47th European Conference on Information Retrieval</source>
          ,
          <string-name>
            <surname>ECIR</surname>
          </string-name>
          <year>2025</year>
          , Lucca, Italy, April 6-
          <issue>10</issue>
          ,
          <year>2025</year>
          , Proceedings,
          <string-name>
            <surname>Part</surname>
            <given-names>I</given-names>
          </string-name>
          , SpringerVerlag, Berlin, Heidelberg,
          <year>2025</year>
          , p.
          <fpage>49</fpage>
          -
          <lpage>66</lpage>
          . URL: https://doi.org/10.1007/978-3-
          <fpage>031</fpage>
          -88708-
          <issue>6</issue>
          _4. doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>031</fpage>
          -88708-
          <issue>6</issue>
          _
          <fpage>4</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <given-names>H.</given-names>
            <surname>Le</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Maragh</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Ekdale</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>High</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Havens</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Z.</given-names>
            <surname>Shafiq</surname>
          </string-name>
          ,
          <article-title>Measuring political personalization of google news search</article-title>
          ,
          <source>in: The World Wide Web Conference</source>
          , WWW '19,
          <string-name>
            <surname>Association</surname>
          </string-name>
          for Computing Machinery, New York, NY, USA,
          <year>2019</year>
          , p.
          <fpage>2957</fpage>
          -
          <lpage>2963</lpage>
          . URL: https://doi.org/10.1145/3308558.3313682. doi:
          <volume>10</volume>
          .1145/3308558.3313682.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <given-names>E.</given-names>
            <surname>Mustafaraj</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Lurie</surname>
          </string-name>
          ,
          <string-name>
            <surname>C. Devine,</surname>
          </string-name>
          <article-title>The case for voter-centered audits of search engines during political elections</article-title>
          ,
          <source>in: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency</source>
          , FAT* '
          <volume>20</volume>
          ,
          <string-name>
            <surname>Association</surname>
          </string-name>
          for Computing Machinery, New York, NY, USA,
          <year>2020</year>
          , p.
          <fpage>559</fpage>
          -
          <lpage>569</lpage>
          . URL: https://doi.org/10.1145/3351095.3372835. doi:
          <volume>10</volume>
          .1145/3351095.3372835.
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <given-names>S.</given-names>
            <surname>Bavadekar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A. M.</given-names>
            <surname>Dai</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Davis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Desfontaines</surname>
          </string-name>
          , I. Eckstein,
          <string-name>
            <given-names>K.</given-names>
            <surname>Everett</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Fabrikant</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Flores</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Gabrilovich</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Gadepalli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Glass</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Huang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Kamath</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Kraft</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Kumok</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Marfatia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Mayer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Pearce</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I. M.</given-names>
            <surname>Perera</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Ramachandran</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Raman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Roessler</surname>
          </string-name>
          , I. Shafran,
          <string-name>
            <given-names>T.</given-names>
            <surname>Shekel</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Stanton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Stimes</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Sun</surname>
          </string-name>
          , G. Wellenius,
          <string-name>
            <given-names>M.</given-names>
            <surname>Zoghi</surname>
          </string-name>
          ,
          <string-name>
            <surname>Google</surname>
            <given-names>COVID</given-names>
          </string-name>
          <article-title>-19 search trends symptoms dataset: Anonymization process description (version 1</article-title>
          .0), CoRR abs/
          <year>2009</year>
          .01265 (
          <year>2020</year>
          ). URL: https://arxiv.org/abs/
          <year>2009</year>
          .01265. arXiv:
          <year>2009</year>
          .01265.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <given-names>M.</given-names>
            <surname>Damie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Hahn</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Peter</surname>
          </string-name>
          ,
          <article-title>A highly accurate query-recovery attack against searchable encryption using non-indexed documents</article-title>
          , in: M. D.
          <string-name>
            <surname>Bailey</surname>
          </string-name>
          , R. Greenstadt (Eds.),
          <source>30th USENIX Security Symposium, USENIX Security</source>
          <year>2021</year>
          ,
          <year>August</year>
          11-
          <issue>13</issue>
          ,
          <year>2021</year>
          ,
          <string-name>
            <given-names>USENIX</given-names>
            <surname>Association</surname>
          </string-name>
          ,
          <year>2021</year>
          , pp.
          <fpage>143</fpage>
          -
          <lpage>160</lpage>
          . URL: https://www.usenix.org/conference/usenixsecurity21/presentation/damie.
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>A.</given-names>
            <surname>Arampatzis</surname>
          </string-name>
          , G. Drosatos,
          <string-name>
            <given-names>P.</given-names>
            <surname>Efraimidis</surname>
          </string-name>
          ,
          <article-title>A versatile tool for privacy-enhanced web search</article-title>
          , in: P. Serdyukov,
          <string-name>
            <given-names>P.</given-names>
            <surname>Braslavski</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. O.</given-names>
            <surname>Kuznetsov</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Kamps</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. M.</given-names>
            <surname>Rüger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E.</given-names>
            <surname>Agichtein</surname>
          </string-name>
          ,
          <string-name>
            <given-names>I.</given-names>
            <surname>Segalovich</surname>
          </string-name>
          , E. Yilmaz (Eds.),
          <source>Advances in Information Retrieval - 35th European Conference on IR Research</source>
          , ECIR
          <year>2013</year>
          , Moscow, Russia, March
          <volume>24</volume>
          -27,
          <year>2013</year>
          . Proceedings, volume
          <volume>7814</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2013</year>
          , pp.
          <fpage>368</fpage>
          -
          <lpage>379</lpage>
          . URL: https://doi.org/10.1007/978-3-
          <fpage>642</fpage>
          -36973-5_
          <fpage>31</fpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>642</fpage>
          -36973-5\_
          <fpage>31</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <given-names>M.</given-names>
            <surname>Fröbe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>E. O.</given-names>
            <surname>Schmidt</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Hagen</surname>
          </string-name>
          ,
          <article-title>Eficient query obfuscation with keyqueries</article-title>
          , in: J.
          <string-name>
            <surname>He</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Unland</surname>
            ,
            <given-names>E. S.</given-names>
          </string-name>
          <string-name>
            <surname>Jr.</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          <string-name>
            <surname>Tao</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <string-name>
            <surname>Purohit</surname>
          </string-name>
          , W. van den Heuvel, J. Yearwood, J. Cao (Eds.), WI-IAT '21: IEEE/WIC/ACM International Conference on Web Intelligence,
          <source>Melbourne VIC Australia, December 14 - 17</source>
          ,
          <year>2021</year>
          , ACM,
          <year>2021</year>
          , pp.
          <fpage>154</fpage>
          -
          <lpage>161</lpage>
          . URL: https://doi.org/10.1145/3486622.3493950. doi:
          <volume>10</volume>
          .1145/3486622.3493950.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          [8]
          <string-name>
            <given-names>C.</given-names>
            <surname>Dwork</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>McSherry</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Nissim</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Smith</surname>
          </string-name>
          ,
          <article-title>Calibrating noise to sensitivity in private data analysis</article-title>
          , in: S. Halevi, T. Rabin (Eds.),
          <source>Theory of Cryptography</source>
          , Springer Berlin Heidelberg, Berlin, Heidelberg,
          <year>2006</year>
          , pp.
          <fpage>265</fpage>
          -
          <lpage>284</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          [9]
          <string-name>
            <given-names>O.</given-names>
            <surname>Feyisetan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Kasiviswanathan</surname>
          </string-name>
          ,
          <article-title>Private release of text embedding vectors</article-title>
          , in: Y.
          <string-name>
            <surname>Pruksachatkun</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Ramakrishna</surname>
            ,
            <given-names>K.-W.</given-names>
          </string-name>
          <string-name>
            <surname>Chang</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Krishna</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          <string-name>
            <surname>Dhamala</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Guha</surname>
            ,
            <given-names>X.</given-names>
          </string-name>
          Ren (Eds.),
          <source>Proceedings of the First Workshop on Trustworthy Natural Language Processing</source>
          , Association for Computational Linguistics, Online,
          <year>2021</year>
          , pp.
          <fpage>15</fpage>
          -
          <lpage>27</lpage>
          . URL: https://aclanthology.org/
          <year>2021</year>
          .trustnlp-
          <volume>1</volume>
          .3. doi:
          <volume>10</volume>
          .18653/v1/
          <year>2021</year>
          .trustnlp-
          <volume>1</volume>
          .3.
        </mixed-citation>
      </ref>
      <ref id="ref10">
        <mixed-citation>
          [10]
          <string-name>
            <given-names>H.</given-names>
            <surname>Bo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. H. H.</given-names>
            <surname>Ding</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. C. M.</given-names>
            <surname>Fung</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Iqbal</surname>
          </string-name>
          ,
          <article-title>ER-AE: diferentially private text generation for authorship anonymization</article-title>
          , in: K.
          <string-name>
            <surname>Toutanova</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Rumshisky</surname>
            ,
            <given-names>L.</given-names>
          </string-name>
          <string-name>
            <surname>Zettlemoyer</surname>
            , D. HakkaniTür, I. Beltagy,
            <given-names>S.</given-names>
          </string-name>
          <string-name>
            <surname>Bethard</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          <string-name>
            <surname>Cotterell</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <string-name>
            <surname>Chakraborty</surname>
          </string-name>
          , Y. Zhou (Eds.),
          <source>Proceedings of the</source>
          <year>2021</year>
          <article-title>Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2021, Online</article-title>
          , June 6-11,
          <year>2021</year>
          , Association for Computational Linguistics,
          <year>2021</year>
          , pp.
          <fpage>3997</fpage>
          -
          <lpage>4007</lpage>
          . URL: https://doi.org/10.18653/v1/
          <year>2021</year>
          .naacl-main.
          <volume>314</volume>
          . doi:
          <volume>10</volume>
          .18653/V1/
          <year>2021</year>
          .NAACL-MAIN.
          <year>314</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref11">
        <mixed-citation>
          [11]
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <article-title>Query obfuscation for information retrieval through diferential privacy</article-title>
          , in: N.
          <string-name>
            <surname>Goharian</surname>
            ,
            <given-names>N.</given-names>
          </string-name>
          <string-name>
            <surname>Tonellotto</surname>
            ,
            <given-names>Y.</given-names>
          </string-name>
          <string-name>
            <surname>He</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          <string-name>
            <surname>Lipani</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <string-name>
            <surname>McDonald</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          <string-name>
            <surname>Macdonald</surname>
          </string-name>
          , I. Ounis (Eds.),
          <source>Advances in Information Retrieval - 46th European Conference on Information Retrieval</source>
          ,
          <string-name>
            <surname>ECIR</surname>
          </string-name>
          <year>2024</year>
          , Glasgow, UK, March
          <volume>24</volume>
          -28,
          <year>2024</year>
          , Proceedings,
          <string-name>
            <surname>Part</surname>
            <given-names>I</given-names>
          </string-name>
          , volume
          <volume>14608</volume>
          of Lecture Notes in Computer Science, Springer,
          <year>2024</year>
          , pp.
          <fpage>278</fpage>
          -
          <lpage>294</lpage>
          . URL: https://doi.org/10.1007/978-3-
          <fpage>031</fpage>
          -56027-9_
          <fpage>17</fpage>
          . doi:
          <volume>10</volume>
          .1007/978-3-
          <fpage>031</fpage>
          -56027-9\_
          <fpage>17</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref12">
        <mixed-citation>
          [12]
          <string-name>
            <given-names>S.</given-names>
            <surname>Clauß</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Schifner</surname>
          </string-name>
          ,
          <article-title>Structuring anonymity metrics</article-title>
          , in: A.
          <string-name>
            <surname>Juels</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          <string-name>
            <surname>Winslett</surname>
            ,
            <given-names>A</given-names>
          </string-name>
          . Goto (Eds.),
          <source>Proceedings of the 2006 Workshop on Digital Identity Management</source>
          , Alexandria,
          <string-name>
            <surname>VA</surname>
          </string-name>
          , USA, November 3,
          <year>2006</year>
          , ACM,
          <year>2006</year>
          , pp.
          <fpage>55</fpage>
          -
          <lpage>62</lpage>
          . URL: https://doi.org/10.1145/1179529.1179539. doi:
          <volume>10</volume>
          .1145/1179529.1179539.
        </mixed-citation>
      </ref>
      <ref id="ref13">
        <mixed-citation>
          [13]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Kang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Liu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Niu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>X.</given-names>
            <surname>Tong</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Zhang</surname>
          </string-name>
          , W. Wang,
          <article-title>Input perturbation: A new paradigm between central and local diferential privacy</article-title>
          , CoRR abs/
          <year>2002</year>
          .08570 (
          <year>2020</year>
          ). URL: https://arxiv.org/abs/
          <year>2002</year>
          . 08570. arXiv:
          <year>2002</year>
          .08570.
        </mixed-citation>
      </ref>
      <ref id="ref14">
        <mixed-citation>
          [14]
          <string-name>
            <given-names>T.</given-names>
            <surname>Zhang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Kishore</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Wu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K. Q.</given-names>
            <surname>Weinberger</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Artzi</surname>
          </string-name>
          , Bertscore:
          <article-title>Evaluating text generation with BERT</article-title>
          ,
          <source>in: 8th International Conference on Learning Representations, ICLR</source>
          <year>2020</year>
          ,
          <string-name>
            <given-names>Addis</given-names>
            <surname>Ababa</surname>
          </string-name>
          , Ethiopia,
          <source>April 26-30</source>
          ,
          <year>2020</year>
          , OpenReview.net,
          <year>2020</year>
          . URL: https://openreview.net/forum?id= SkeHuCVFDr.
        </mixed-citation>
      </ref>
      <ref id="ref15">
        <mixed-citation>
          [15]
          <string-name>
            <given-names>J.</given-names>
            <surname>Domingo-Ferrer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Sánchez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Blanco-Justicia</surname>
          </string-name>
          ,
          <article-title>The limits of diferential privacy (and its misuse in data release and machine learning</article-title>
          ),
          <source>Commun. ACM</source>
          <volume>64</volume>
          (
          <year>2021</year>
          )
          <fpage>33</fpage>
          -
          <lpage>35</lpage>
          . URL: https: //doi.org/10.1145/3433638. doi:
          <volume>10</volume>
          .1145/3433638.
        </mixed-citation>
      </ref>
      <ref id="ref16">
        <mixed-citation>
          [16]
          <string-name>
            <given-names>J.</given-names>
            <surname>Mattern</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Weggenmann</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Kerschbaum</surname>
          </string-name>
          ,
          <article-title>The limits of word level differential privacy</article-title>
          , in: M. Carpuat, M. de Marneffe, I. V. M. Ruíz (Eds.),
          <source>Findings of the Association for Computational Linguistics: NAACL 2022</source>
          , Seattle, WA, United States, July 10-15,
          <year>2022</year>
          , Association for Computational Linguistics, 2022, pp.
          <fpage>867</fpage>-<lpage>881</lpage>
          . URL: https://doi.org/10.18653/v1/2022.findings-naacl.65. doi: 10.18653/V1/2022.FINDINGS-NAACL.65.
        </mixed-citation>
      </ref>
      <ref id="ref17">
        <mixed-citation>
          [17]
          <string-name>
            <given-names>F. L.</given-names>
            <surname>De Faveri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <article-title>Beyond the parameters: Measuring actual privacy in obfuscated texts</article-title>
          , in: K. Roitero, M. Viviani, E. Maddalena, S. Mizzaro (Eds.),
          <source>Proceedings of the 14th Italian Information Retrieval Workshop</source>
          , Udine, Italy, September 5-6,
          <year>2024</year>
          , volume
          <volume>3802</volume>
          of CEUR Workshop Proceedings, CEUR-WS.org, 2024, pp.
          <fpage>53</fpage>-<lpage>57</lpage>
          . URL: https://ceur-ws.org/Vol-3802/paper5.pdf.
        </mixed-citation>
      </ref>
      <ref id="ref18">
        <mixed-citation>
          [18]
          <string-name>
            <given-names>G.</given-names>
            <surname>Duncan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Keller-McNulty</surname>
          </string-name>
          ,
          <string-name>
            <given-names>L.</given-names>
            <surname>Stokes</surname>
          </string-name>
          ,
          <article-title>Disclosure risk vs. data utility: The R-U confidentiality map</article-title>
          ,
          <source>A Los Alamos National Laboratory Technical Report LA-UR-01-6428</source>
          (
          <year>2001</year>
          )
          <fpage>1</fpage>
          -
          <lpage>30</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref19">
        <mixed-citation>
          [19]
          <string-name>
            <given-names>R.</given-names>
            <surname>Shokri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Stronati</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Song</surname>
          </string-name>
          ,
          <string-name>
            <given-names>V.</given-names>
            <surname>Shmatikov</surname>
          </string-name>
          ,
          <article-title>Membership inference attacks against machine learning models</article-title>
          , in:
          <source>2017 IEEE Symposium on Security and Privacy, SP 2017</source>
          , San Jose, CA, USA, May 22-26,
          <year>2017</year>
          , IEEE Computer Society, 2017, pp.
          <fpage>3</fpage>-<lpage>18</lpage>
          . URL: https://doi.org/10.1109/SP.2017.41. doi: 10.1109/SP.2017.41.
        </mixed-citation>
      </ref>
      <ref id="ref20">
        <mixed-citation>
          [20]
          <string-name>
            <given-names>I.</given-names>
            <surname>Wagner</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Eckhoff</surname>
          </string-name>
          ,
          <article-title>Technical privacy metrics: A systematic survey</article-title>
          ,
          <source>ACM Comput. Surv.</source>
          <volume>51</volume>
          (<year>2018</year>)
          <fpage>57:1</fpage>-<lpage>57:38</lpage>
          . URL: https://doi.org/10.1145/3168389. doi: 10.1145/3168389.
        </mixed-citation>
      </ref>
      <ref id="ref21">
        <mixed-citation>
          [21]
          <string-name>
            <given-names>S.</given-names>
            <surname>Sousa</surname>
          </string-name>
          ,
          <string-name>
            <given-names>R.</given-names>
            <surname>Kern</surname>
          </string-name>
          ,
          <article-title>How to keep text private? A systematic review of deep learning methods for privacy-preserving natural language processing</article-title>
          ,
          <source>Artif. Intell. Rev.</source>
          <volume>56</volume>
          (<year>2023</year>)
          <fpage>1427</fpage>-<lpage>1492</lpage>
          . URL: https://doi.org/10.1007/s10462-022-10204-6. doi: 10.1007/S10462-022-10204-6.
        </mixed-citation>
      </ref>
      <ref id="ref22">
        <mixed-citation>
          [22]
          <string-name>
            <given-names>I.</given-names>
            <surname>Habernal</surname>
          </string-name>
          ,
          <article-title>When differential privacy meets NLP: the devil is in the detail</article-title>
          , in: M. Moens, X. Huang, L. Specia, S. W. Yih (Eds.),
          <source>Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021</source>
          , Virtual Event / Punta Cana, Dominican Republic, 7-11 November,
          <year>2021</year>
          , Association for Computational Linguistics, 2021, pp.
          <fpage>1522</fpage>-<lpage>1528</lpage>
          . URL: https://doi.org/10.18653/v1/2021.emnlp-main.114. doi: 10.18653/V1/2021.EMNLP-MAIN.114.
        </mixed-citation>
      </ref>
      <ref id="ref23">
        <mixed-citation>
          [23]
          <string-name>
            <given-names>S. J.</given-names>
            <surname>Meisenbacher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Nandakumar</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Klymenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Matthes</surname>
          </string-name>
          ,
          <article-title>A comparative analysis of word-level metric differential privacy: Benchmarking the privacy-utility trade-off</article-title>
          , in: N. Calzolari, M. Kan, V. Hoste, A. Lenci, S. Sakti, N. Xue (Eds.),
          <source>Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC/COLING 2024</source>
          , 20-25 May,
          <year>2024</year>
          , Torino, Italy, ELRA and ICCL, 2024, pp.
          <fpage>174</fpage>-<lpage>185</lpage>
          . URL: https://aclanthology.org/2024.lrec-main.16.
        </mixed-citation>
      </ref>
      <ref id="ref24">
        <mixed-citation>
          [24]
          <string-name>
            <given-names>O.</given-names>
            <surname>Feyisetan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B.</given-names>
            <surname>Balle</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Drake</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Diethe</surname>
          </string-name>
          ,
          <article-title>Privacy- and utility-preserving textual analysis via calibrated multivariate perturbations</article-title>
          , in: J. Caverlee, X. B. Hu, M. Lalmas, W. Wang (Eds.),
          <source>Proceedings of the 13th International Conference on Web Search and Data Mining</source>
          , ACM,
          <year>2020</year>
          , pp.
          <fpage>178</fpage>-<lpage>186</lpage>
          . doi: 10.1145/3336191.3371856.
        </mixed-citation>
      </ref>
      <ref id="ref25">
        <mixed-citation>
          [25]
          <string-name>
            <given-names>Z.</given-names>
            <surname>Xu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Aggarwal</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Feyisetan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Teissier</surname>
          </string-name>
          ,
          <article-title>A differentially private text perturbation method using regularized Mahalanobis metric</article-title>
          , in:
          <source>Proceedings of the Second Workshop on Privacy in NLP</source>
          , Association for Computational Linguistics,
          <year>2020</year>
          . doi: 10.18653/v1/2020.privatenlp-1.2.
        </mixed-citation>
      </ref>
      <ref id="ref26">
        <mixed-citation>
          [26]
          <string-name>
            <given-names>A.</given-names>
            <surname>Blanco-Justicia</surname>
          </string-name>
          ,
          <string-name>
            <given-names>D.</given-names>
            <surname>Sánchez</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Domingo-Ferrer</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Muralidhar</surname>
          </string-name>
          ,
          <article-title>A critical review on the use (and misuse) of differential privacy in machine learning</article-title>
          ,
          <source>ACM Comput. Surv.</source>
          <volume>55</volume>
          (<year>2023</year>)
          <fpage>160:1</fpage>-<lpage>160:16</lpage>
          . URL: https://doi.org/10.1145/3547139. doi: 10.1145/3547139.
        </mixed-citation>
      </ref>
      <ref id="ref27">
        <mixed-citation>
          [27]
          <string-name>
            <given-names>S.</given-names>
            <surname>Zimmerman</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Thorpe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Fox</surname>
          </string-name>
          ,
          <string-name>
            <given-names>U.</given-names>
            <surname>Kruschwitz</surname>
          </string-name>
          ,
          <article-title>Investigating the interplay between searchers' privacy concerns and their search behavior</article-title>
          , in: B. Piwowarski, M. Chevalier, É. Gaussier, Y. Maarek, J. Nie, F. Scholer (Eds.),
          <source>Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2019</source>
          , Paris, France, July 21-25,
          <year>2019</year>
          , ACM, 2019, pp.
          <fpage>953</fpage>-<lpage>956</lpage>
          . URL: https://doi.org/10.1145/3331184.3331280. doi: 10.1145/3331184.3331280.
        </mixed-citation>
      </ref>
      <ref id="ref28">
        <mixed-citation>
          [28]
          <string-name>
            <given-names>Y.</given-names>
            <surname>Zhao</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <article-title>A survey on differential privacy for unstructured data content</article-title>
          ,
          <source>ACM Comput. Surv.</source>
          <volume>54</volume>
          (<year>2022</year>)
          <fpage>207:1</fpage>-<lpage>207:28</lpage>
          . URL: https://doi.org/10.1145/3490237. doi: 10.1145/3490237.
        </mixed-citation>
      </ref>
      <ref id="ref29">
        <mixed-citation>
          [29]
          <string-name>
            <given-names>O.</given-names>
            <surname>Klymenko</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Meisenbacher</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Matthes</surname>
          </string-name>
          ,
          <article-title>Differential privacy in natural language processing: the story so far</article-title>
          , in: O. Feyisetan, S. Ghanavati, P. Thaine, I. Habernal, F. Mireshghallah (Eds.),
          <source>Proceedings of the Fourth Workshop on Privacy in Natural Language Processing</source>
          , Association for Computational Linguistics, Seattle, United States,
          <year>2022</year>
          , pp.
          <fpage>1</fpage>-<lpage>11</lpage>
          . URL: https://aclanthology.org/2022.privatenlp-1.1. doi: 10.18653/v1/2022.privatenlp-1.1.
        </mixed-citation>
      </ref>
      <ref id="ref30">
        <mixed-citation>
          [30]
          <string-name>
            <given-names>F. L.</given-names>
            <surname>De Faveri</surname>
          </string-name>
          ,
          <string-name>
            <given-names>G.</given-names>
            <surname>Faggioli</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N.</given-names>
            <surname>Ferro</surname>
          </string-name>
          ,
          <article-title>Words Blending Boxes. Obfuscating Queries in Information Retrieval using Differential Privacy</article-title>
          ,
          <source>CoRR</source>
          abs/2405.09306 (<year>2024</year>). URL: https://doi.org/10.48550/arXiv.2405.09306. doi: 10.48550/ARXIV.2405.09306. arXiv: 2405.09306.
        </mixed-citation>
      </ref>
      <ref id="ref31">
        <mixed-citation>
          [31]
          <string-name>
            <given-names>C.</given-names>
            <surname>Clifton</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Tassa</surname>
          </string-name>
          ,
          <article-title>On syntactic anonymity and differential privacy</article-title>
          , in: C. Y. Chan, J. Lu, K. Nørvåg, E. Tanin (Eds.),
          <source>Workshops Proceedings of the 29th IEEE International Conference on Data Engineering, ICDE 2013</source>
          , Brisbane, Australia, April 8-12,
          <year>2013</year>
          , IEEE Computer Society, 2013, pp.
          <fpage>88</fpage>-<lpage>93</lpage>
          . URL: https://doi.org/10.1109/ICDEW.2013.6547433. doi: 10.1109/ICDEW.2013.6547433.
        </mixed-citation>
      </ref>
      <ref id="ref32">
        <mixed-citation>
          [32]
          <string-name>
            <given-names>J.</given-names>
            <surname>Hsu</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Gaboardi</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Haeberlen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S.</given-names>
            <surname>Khanna</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Narayan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>B. C.</given-names>
            <surname>Pierce</surname>
          </string-name>
          ,
          <string-name>
            <given-names>A.</given-names>
            <surname>Roth</surname>
          </string-name>
          ,
          <article-title>Differential privacy: An economic method for choosing epsilon</article-title>
          , in:
          <source>IEEE 27th Computer Security Foundations Symposium, CSF 2014</source>
          , Vienna, Austria, 19-22 July,
          <year>2014</year>
          , IEEE Computer Society, 2014, pp.
          <fpage>398</fpage>-<lpage>410</lpage>
          . URL: https://doi.org/10.1109/CSF.2014.35. doi: 10.1109/CSF.2014.35.
        </mixed-citation>
      </ref>
      <ref id="ref33">
        <mixed-citation>
          [33]
          <string-name>
            <given-names>G. A.</given-names>
            <surname>Miller</surname>
          </string-name>
          ,
          <article-title>WordNet: A lexical database for English</article-title>
          ,
          <source>Commun. ACM</source>
          <volume>38</volume>
          (<year>1995</year>)
          <fpage>39</fpage>-<lpage>41</lpage>
          . URL: https://doi.org/10.1145/219717.219748. doi: 10.1145/219717.219748.
        </mixed-citation>
      </ref>
      <ref id="ref34">
        <mixed-citation>
          [34]
          <string-name>
            <given-names>K.</given-names>
            <surname>Chatzikokolakis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M. E.</given-names>
            <surname>Andrés</surname>
          </string-name>
          ,
          <string-name>
            <given-names>N. E.</given-names>
            <surname>Bordenabe</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Palamidessi</surname>
          </string-name>
          ,
          <article-title>Broadening the scope of differential privacy using metrics</article-title>
          , in: E. D. Cristofaro, M. K. Wright (Eds.),
          <source>Privacy Enhancing Technologies - 13th International Symposium, PETS 2013</source>
          , Bloomington, IN, USA, July 10-12,
          <year>2013</year>
          . Proceedings, volume
          <volume>7981</volume>
          of Lecture Notes in Computer Science, Springer, 2013, pp.
          <fpage>82</fpage>-<lpage>102</lpage>
          . URL: https://doi.org/10.1007/978-3-642-39077-7_5. doi: 10.1007/978-3-642-39077-7_5.
        </mixed-citation>
      </ref>
      <ref id="ref35">
        <mixed-citation>
          [35]
          <string-name>
            <given-names>S.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>F.</given-names>
            <surname>Mo</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Chen</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.-Y.</given-names>
            <surname>Nie</surname>
          </string-name>
          ,
          <string-name>
            <given-names>C.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>J.</given-names>
            <surname>Cui</surname>
          </string-name>
          ,
          <article-title>A customized text sanitization mechanism with differential privacy</article-title>
          , in: A. Rogers, J. Boyd-Graber, N. Okazaki (Eds.),
          <source>Findings of the Association for Computational Linguistics: ACL 2023</source>
          , Association for Computational Linguistics, Toronto, Canada,
          <year>2023</year>
          , pp.
          <fpage>5747</fpage>-<lpage>5758</lpage>
          . URL: https://aclanthology.org/2023.findings-acl.355. doi: 10.18653/v1/2023.findings-acl.355.
        </mixed-citation>
      </ref>
      <ref id="ref36">
        <mixed-citation>
          [36]
          <string-name>
            <given-names>X.</given-names>
            <surname>Yue</surname>
          </string-name>
          ,
          <string-name>
            <given-names>M.</given-names>
            <surname>Du</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Y.</given-names>
            <surname>Li</surname>
          </string-name>
          ,
          <string-name>
            <given-names>H.</given-names>
            <surname>Sun</surname>
          </string-name>
          ,
          <string-name>
            <given-names>S. S. M.</given-names>
            <surname>Chow</surname>
          </string-name>
          ,
          <article-title>Differential privacy for text analytics via natural text sanitization</article-title>
          , in: C. Zong, F. Xia, W. Li, R. Navigli (Eds.),
          <source>Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021</source>
          , Association for Computational Linguistics, Online,
          <year>2021</year>
          , pp.
          <fpage>3853</fpage>-<lpage>3866</lpage>
          . URL: https://aclanthology.org/2021.findings-acl.337. doi: 10.18653/v1/2021.findings-acl.337.
        </mixed-citation>
      </ref>
      <ref id="ref37">
        <mixed-citation>
          [37]
          <string-name>
            <given-names>R. S.</given-names>
            <surname>Carvalho</surname>
          </string-name>
          ,
          <string-name>
            <given-names>T.</given-names>
            <surname>Vasiloudis</surname>
          </string-name>
          ,
          <string-name>
            <given-names>O.</given-names>
            <surname>Feyisetan</surname>
          </string-name>
          ,
          <string-name>
            <given-names>K.</given-names>
            <surname>Wang</surname>
          </string-name>
          ,
          <article-title>TEM: high utility metric differential privacy on text</article-title>
          , in: S. Shekhar, Z. Zhou, Y. Chiang, G. Stiglic (Eds.),
          <source>Proceedings of the 2023 SIAM International Conference on Data Mining, SDM 2023</source>
          , Minneapolis-St. Paul Twin Cities, MN, USA, April 27-29,
          <year>2023</year>
          , SIAM, 2023, pp.
          <fpage>883</fpage>-<lpage>890</lpage>
          . URL: https://doi.org/10.1137/1.9781611977653.ch99.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>