=Paper=
{{Paper
|id=Vol-3631/paper6
|storemode=property
|title=Investigating Phishing Attacks using the Registration Data Access Protocol (RDAP)
|pdfUrl=https://ceur-ws.org/Vol-3631/paper6.pdf
|volume=Vol-3631
|authors=Hauke Jan Lübbers
|dblpUrl=https://dblp.org/rec/conf/apwg-eu/Lubbers23
}}
==Investigating Phishing Attacks using the Registration Data Access Protocol (RDAP)==
<pdf width="1500px">https://ceur-ws.org/Vol-3631/paper6.pdf</pdf>
<pre>
                                Investigating Phishing Attacks using the Registration Data
                                Access Protocol (RDAP)
                                Hauke Jan Lübbers
                                CSIS Security Group A/S, Vestergade 2B, 1456 Copenhagen, Denmark


                                                                       Abstract
                                                                       The Registration Data Access Protocol (RDAP) is a successor to the WHOIS protocol and enables programmatic access to the
                                                                       registration data of internet resources. Using RDAP, we investigated phishing attacks observed over four days, focusing on
                                                                       DNS domain names. In this paper, we present the opportunities and problems identified in the process. We find that low RDAP
                                                                       adoption among ccTLD registry operators, strict rate limiting, differences in data representation, and the (un-)availability
                                                                       of data due to privacy regulations continue to be hindrances to the widespread use of RDAP in cybercrime investigations.
                                                                       While these issues are currently preventing security researchers from solely relying on RDAP for accessing domain name
                                                                       registration data, we recognize its potential as a valuable data enrichment source for investigating phishing attacks at scale.

                                                                       Keywords
                                                                       Registration Data Access Protocol (RDAP), WHOIS, domain names, phishing, cybercrime investigation


APWG.EU Technical Summit and Researchers Sync-Up 2023, Dublin, Ireland, June 21 & 22, 2023
hjl@csis.com (H. J. Lübbers)
ORCID: 0009-0007-3481-7616 (H. J. Lübbers)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License
Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073

1. Introduction

Phishing is a common attack vector employed by threat actors. While the attackers’ motives and
sophistication may vary, phishing attacks continue to be a relatively simple method for threat
actors to gather credentials for later exploitation [1].

When investigating phishing attacks, a common first step is to look up the domain registration
data, or "WHOIS information", of the involved domain names suspected of hosting a phishing
website or of sending phishing emails [2]. This information can include details of the
registrant, the time of registration and renewals, the registrar, and nameservers used. With it,
phishing attacks might be correctly identified as such, classified, attributed to previously
observed threat actors based on similar modi operandi, and actively defended against by sending
takedown requests to the given abuse contacts.

The commonly used method of looking up domain registration data is the established WHOIS
protocol, first standardized in 1982 [3]. Being a relatively old protocol, it comes with several
shortcomings. These were addressed in the standardization of a new protocol: the Registration
Data Access Protocol (RDAP) [4] [5].

However, RDAP is a comparatively new standard and presents some challenges: It is not available
for all Top Level Domains (TLDs) and, if available, is often provided with restrictive terms of
service. Registration data is often redacted due to privacy regulations. In this paper, we
investigate how these issues affect the use of RDAP in the context of phishing attack
investigations.

2. On the Registration Data Access Protocol

RDAP is a Registration Data Directory Service (RDDS) standard that enables programmatic access
to information about different internet resources, such as DNS domain names, DNS name servers,
IP addresses, and Autonomous System Numbers (ASNs). The protocol was first standardized by the
Internet Engineering Task Force in March 2015 and has since been extended [6].

RDAP defines a RESTful API to be provided by TLD registry operators and domain registrars.
Accessible via HTTP over TLS, these APIs can be queried for registration details of, for
example, DNS domain names, as shown in Listing 1 [7] [5].

Listing 1: Example RDAP API request to query information on example.com
GET https://rdap.registry.example/v1/domain/example.com

The expected response, assuming that the RDAP service has information about the queried object,
is given in JSON following a schema defined by RFC 9083 [8]. Thus, the response should be given
in a machine-readable format under the MIME type application/rdap+json.

With RFC 9224, the IETF standardized a way to identify the authoritative RDAP service for the
internet resource that should hold valid registration information on a domain name, IP address,
or ASN. The so-called "bootstrap services" associate TLDs, IP ranges, and ASN ranges with their
respective authoritative RDAP services [9]. Bootstrap services are provided by the Internet
Assigned Numbers Authority (IANA) [10].
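The request in Listing 1 and the handling of its JSON response can be sketched in a few lines of
Python. This is a minimal illustration, not the client used in this paper: the function names
and the sample response are hypothetical, and the response is a hand-written, heavily abridged
object that only follows the RFC 9083 layout (a list of "events", each with an "eventAction" and
an "eventDate"). No network access is performed.

```python
import json
from datetime import datetime
from urllib.parse import quote

RDAP_MEDIA_TYPE = "application/rdap+json"


def build_domain_query(base_url, domain):
    """Return the lookup URL and request headers for a domain query."""
    url = base_url.rstrip("/") + "/domain/" + quote(domain.lower())
    return url, {"Accept": RDAP_MEDIA_TYPE}


def event_dates(rdap_response):
    """Map each eventAction of an RDAP response to its parsed eventDate."""
    dates = {}
    for event in rdap_response.get("events", []):
        # Older Python versions do not accept a trailing "Z" in fromisoformat().
        timestamp = event["eventDate"].replace("Z", "+00:00")
        dates[event["eventAction"]] = datetime.fromisoformat(timestamp)
    return dates


# Hand-written sample for illustration only (assumption, not registry data).
SAMPLE_RESPONSE = json.loads("""
{
  "objectClassName": "domain",
  "ldhName": "example.com",
  "events": [
    {"eventAction": "registration", "eventDate": "1995-08-14T04:00:00Z"},
    {"eventAction": "expiration", "eventDate": "2024-08-13T04:00:00Z"}
  ]
}
""")

url, headers = build_domain_query("https://rdap.registry.example/v1/", "Example.COM")
registered = event_dates(SAMPLE_RESPONSE)["registration"]
```

In a real client, the URL and headers would be passed to an HTTP library and the response body
decoded as JSON before calling event_dates.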


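The bootstrap lookup for domain names can be illustrated as follows. The sketch assumes the
layout of the DNS bootstrap file described in RFC 9224, in which "services" is a list of pairs
of the form [entries, base URLs], and it selects the longest label-wise match. The bootstrap
excerpt, the host names, and the function name rdap_base_urls are made up for this example and
use reserved TLDs only.

```python
import json

# Illustrative excerpt in the layout of the DNS bootstrap file (assumption:
# these entries are invented and do not appear in IANA's published file).
BOOTSTRAP = json.loads("""
{
  "version": "1.0",
  "services": [
    [["example"], ["https://rdap.registry.example/v1/"]],
    [["test", "invalid"], ["https://rdap.other.example/"]]
  ]
}
""")


def rdap_base_urls(domain, bootstrap):
    """Return the base URLs of the authoritative RDAP service, or []."""
    labels = domain.lower().rstrip(".").split(".")
    # Try the longest suffix first so that the longest match wins.
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        for entries, urls in bootstrap["services"]:
            if candidate in entries:
                return urls
    return []
```

An empty result corresponds to a TLD without an officially announced RDAP service, in which
case a client has to fall back to another data source such as WHOIS.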
Some registry operators do not hold all of the required domain name registration data for all
their registered domain names ("thin" registries). Instead, they direct RDDS queries to the
sponsoring registrar through which the domain name was registered [11]. In the case of RDAP,
this is done in the link section of the RDAP response, in which a link to the RDAP API endpoint
of the sponsoring registrar is given with the relation-type rel "related" and the (MIME-)type
"application/rdap+json".

With its properties of machine-readability, transport-encryption, and internationalization,
RDAP is a significant improvement over its RDDS predecessor, the WHOIS protocol. WHOIS does not
define structured responses and instead returns unstandardized plain text, is not
transport-encrypted, and does not standardize handling of encodings other than ASCII, which is
a problem for languages with non-ASCII character sets [4].

3. Domain name registration data in the fight against phishing

Domain name registration data, accessed either via WHOIS or RDAP, can be applied in the fight
against phishing for multiple purposes: The age of a domain name, as given by its registration
date and subsequent renewal dates, is often used as an indicator of its trustworthiness, with
older domains being considered more trustworthy than more recently registered ones. This
practice has led some threat actors to wait for a while between registering a domain name and
starting to use it for malicious purposes, in order to "age" their domain names and thereby give
them a higher chance of evading detection [12] [2].

If a domain name is confirmed to be involved in a phishing attack, its age can be used to
differentiate between domain names that were registered solely for this malicious purpose and
benign domain names merely pointing to a host that was compromised by the threat actors [13].
A third option to be considered is the misuse of legitimate file- or web-hosting services for
phishing purposes. This classification is important, as the counter-actions taken by security
researchers differ based on the type of phishing at hand.

While the registrant data provided by threat actors is often falsified, it can be used to
cluster domains that were registered in bulk, and thereby help to detect new phishing domains
[14] [15]. This assumes that the registrant data is publicly available, which often is not the
case, as discussed in section 5.4.2 "Data Redaction". In the absence of registrant data, other
information on a domain name’s registration process, like the sponsoring registrar or reseller
used, can help to cluster attacks through similar modi operandi, and potentially attribute them
to the same threat actor, often in combination with other data.

Finally, the domain name registration data should contain the abuse contacts of the sponsoring
registrar. If a domain name is deemed to be registered with malicious intent, security
researchers can report it to the registrar, which should have a procedure in place to react to
these reports and suspend the domain name [15].

4. Methodology

To understand whether RDAP can be used to gather domain name registration data to be
subsequently employed in the fight against phishing as described in Section 3, we built a
software system that analyzes the availability and data quality of domain name registration
data and tested it against a realistic workload of domain names suspected to be involved in
phishing.

4.1. The "rdapper" software system

The purpose of the "rdapper" software system was to correctly process requests for registration
data of domain names following the RDAP standard, temporarily store results in a database for
caching purposes, and schedule the requests to RDAP services in order to comply with the
providers’ terms of service.

Key operations of the system were configured to send telemetry data to a monitoring and
visualization system. This allowed us to oversee the operation of the system and extract
metrics after the experiment had concluded.

The RDAP standard, which is heavily based on well-established web standards and technologies,
makes it straightforward to implement an RDAP client that identifies the authoritative RDAP
service for a given internet resource [9], sends an HTTP request to the RESTful API [7] [5],
and parses the JSON response [8]. But in this scenario, we would be ignoring the rate limits
that are defined in the "Terms of Service" or "Acceptable Use" policies of the many RDAP
services the client might connect to. To adhere to these rate limits, we built a scheduling
system for our RDAP queries. It took into account the time of the last lookup sent to the
respective RDAP host and possible back-off requests in the form of HTTP responses with status
code 429 ("Too many requests") and their Retry-After response header values if they were
defined by the server. This delay was individually configurable for each RDAP host. When
building such a scheduling system, it is necessary to track queries per RDAP host, not per RDAP
service, as some RDAP services for different TLDs are hosted on the same server and track RDAP
queries across all services.

We chose a default delay of five minutes between queries to the same host based on the longest
observed required delay when we began this project. After each run of the experiment, we
identified the RDAP services that had accumulated the longest delay between the request of
domain registration data and the execution of the RDAP query. We then tried to find stated rate
limits in the Terms of Service documents of these registry operators or registrars. If we could
not find any public rate limit information, we would reach out to the RDAP service provider and
ask them for safe rate limits for their RDAP API. Then, we would adjust the delays for those
RDAP hosts accordingly.

The system identified itself to the RDAP service providers by sending request headers with a
unique User-Agent and an email address in the From header, in order to give RDAP service
providers the ability to contact us should our queries cause issues or breach any terms of
service (footnote 1). We did not rotate our clients’ IP addresses or make any other attempt to
evade potential terms of service enforcement by the RDAP service providers.
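The per-host scheduling described in Section 4.1 can be sketched as follows. This is a
simplified illustration under stated assumptions, not the "rdapper" implementation: the class
HostScheduler, its methods, and the header values are hypothetical, only the delay-seconds form
of Retry-After is handled (the HTTP-date form is ignored), and the clock is injectable so the
logic can be exercised without waiting.

```python
import time
from urllib.parse import urlsplit

DEFAULT_DELAY = 300.0  # seconds, the five-minute default described above

# Hypothetical identification headers; the values are placeholders.
IDENTIFYING_HEADERS = {
    "User-Agent": "rdap-client-sketch/0.1",
    "From": "researcher@example.org",
}


class HostScheduler:
    """Track the earliest permitted time of the next query per RDAP host."""

    def __init__(self, default_delay=DEFAULT_DELAY, clock=time.monotonic):
        self.default_delay = default_delay
        self.clock = clock
        self.delays = {}        # host -> individually configured delay
        self.next_allowed = {}  # host -> earliest time of the next query

    def wait_time(self, url):
        """Seconds to wait before the host serving this URL may be queried."""
        host = urlsplit(url).hostname
        return max(0.0, self.next_allowed.get(host, 0.0) - self.clock())

    def record_response(self, url, status, retry_after=None):
        """Update the host's schedule after a response has been received."""
        host = urlsplit(url).hostname
        delay = self.delays.get(host, self.default_delay)
        # Honor a back-off request if it asks for a longer pause.
        if status == 429 and retry_after and retry_after.strip().isdigit():
            delay = max(delay, float(retry_after))
        self.next_allowed[host] = self.clock() + delay
```

Keying the schedule on the host name rather than on the full base URL reflects the observation
that several RDAP services for different TLDs may share one server and one rate limit.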
4.2. Analyzed domain names

As part of this experiment, we analyzed 53,270 unique domain names. All domain names were
labeled by CSIS Security Group’s Cyber Intelligence Platform to be potentially involved in
phishing attacks. This does not mean that all of the domains were confirmed to host phishing
pages: they include (1) false positive reports, (2) domain names from URLs referred to in
phishing emails that are not malicious, (3) domain names solely registered for phishing
purposes, (4) domain names of compromised benign websites, and (5) domain names of popular
file- or web-hosting services misused for phishing attacks.

Data from four individual days (footnote 2) was used to aggregate all domain names suspected to
be involved in phishing over the course of the past 24 hours (relative to the individual day).
The TLD distribution in the test dataset is shown in Table 3. The domain names were analyzed by
the "rdapper" system on the following days, again over the course of 24 hours. This was done to
simulate a realistic workload of a system that processes RDAP requests for an anti-phishing
system.

Footnote 1: We did not receive any emails.
Footnote 2: The data collection days were May 10th 2023, May 22nd 2023, May 31st 2023, and June
5th 2023, with the processing days being the following day respectively.

Figure 1: Visualization of the 50 most popular TLDs based on approximate numbers of registered
domain names [16], grouped by TLD type and official RDAP support as of May 12th 2023.

5. Four days of phishing: The results

We analyzed the RDAP availability and, if available, the results of 53,270 unique domains that
were suspected to be involved in phishing attacks over the four selected days. In the
following, we highlight and comment on the challenges that we observed related to the use of
RDAP to fight phishing.

5.1. RDAP coverage in the domain name ecosystem

gTLD registry operators and registrars are required by ICANN to provide an RDAP service as of
the 26th of August 2019 [17]. ccTLD registry operators are not required to implement RDAP, but
a number of them do support it, as shown in Figure 1.

At the time of writing, 1192 of all 1479 Top Level Domains (80.59%) have an authoritative RDAP
service assigned in IANA’s RDAP bootstrap file for Domain Name System registrations [10] [18].
In addition to these officially announced RDAP services, we identified seven RDAP services for
ccTLDs which were not published in the IANA bootstrap file. Some of these services are in a
testing phase and state this as the reason for their absence from the bootstrap file. These
unofficial RDAP services bring the total of RDAP-supporting ccTLDs up to 34, or 11% of all
ccTLDs.

Table 1
RDAP coverage for active (assigned) TLDs according to IANA’s RDAP bootstrap file for Domain
Name System registrations, by type, as of May 11, 2023.

         Type of TLD                           Active TLDs   Official RDAP support   RDAP support percentage
         Generic & generic-restricted TLDs            1155                    1155                      100%
         Sponsored & infrastructure TLDs                15                      10                    66.67%
         Country code TLDs                             309                      27                     8.73%
         Total                                        1479                    1192                    80.59%

These percentages show how far registry operators have come in adopting RDAP. But for
practitioners, these numbers need to be adjusted by the distribution of domain names registered
across the TLDs or, even better, by the distribution of domains of interest across TLDs; in our
case these are domain names suspected of being connected to phishing attacks.

Using approximate numbers of registered domains for each TLD based on domainnamestats.com data
accessed
on May 12th 2023 resulted in a calculated coverage of 67.45% of registered domain names
having an official authoritative RDAP service assigned to them [16]. This coverage can be
expected when the distribution of TLDs in a workload is similar to the distribution of all
registered domain names across TLDs.
   For our workload of domain names suspected to be involved in phishing, the distribution
across TLDs differs slightly from the general population. In practice, and including the
seven unofficial RDAP services, we reached a coverage of 78.86% of our domain name test
dataset, the TLD distribution of which can be seen in Table 3.

5.2. RDAP service availability

Depending on the setup as a thin or thick registry, the registration data lookup for one
domain name might involve RDAP queries to one or two RDAP services run by a registry
operator or a registrar. Generic and generic-restricted TLD registry operators and
registrars are required by ICANN to operate RDAP services [17]. However, these services are
not profit centers and are not "mission-critical" for most paying customers of these
organizations. During our experiment, we saw 399 authoritative RDAP services run by registry
operators and 229 RDAP services run by registrars, which we were pointed to by RDAP
responses of registry operators.
   We observed two RDAP services run by registry operators that continuously returned
HTTP 500 ("Internal server error") responses. One of those services also started to serve an
expired TLS certificate for around five days, and returned to serving HTTP 500 responses
afterwards. Both RDAP services have since been fixed.
   Of the 229 RDAP services run by registrars, we observed eight that served an invalid or
expired TLS certificate or that were only available via HTTP without TLS on port 80.
   25 registrar-run RDAP services were consistently unavailable, meaning that they never
served valid RDAP responses. For services that we had contacted on two or fewer distinct
days, we manually confirmed their unavailability after the experiment phase had concluded,
before categorizing them as consistently unavailable. Through this method, we excluded six
registrar-run services from this list.
   One registrar-run RDAP service was served behind a bot detection service of a content
delivery network provider, making programmatic access impossible.
   One registrar had restricted access to the individual domain name lookup endpoint, an
example of which is shown in 1. When notified about this, they pointed us to their RDAP
endpoint to search for domains instead. This is not a viable solution, as the thin registry
RDAP service continues to redirect queries to the RDAP query endpoint for individual domain
lookups.

5.3. RDAP service rate limiting

As described in Section 4.1, scheduling RDAP queries to comply with the terms of service of
the RDAP services and, in particular, the rate limits is a crucial aspect when developing an
RDAP client to operate at a certain scale. We found that our default delay of 300 seconds
between requests generally worked and, for the most part, did not result in back-off
responses or even permanent IP blocks by RDAP service providers.
   We identified rate limits either in public terms of service documents or in other parts
of the web presence of eight RDAP service providers. One of those has since removed the
document, but the rate limits of the remaining seven RDAP service providers are listed in
Table 4.
   We contacted eleven RDAP service providers, prioritizing those who had accumulated the
longest processing delays during our experiment. Two of those, both gTLD registry operators,
refused to share the rate limits they enforce. Four shared their rate limits after being
contacted, with the highest rate limit being one query per second and the lowest five
queries per minute. In four cases, we never got an answer to our contact attempts via contact
forms on the organizations' websites or support email addresses. In one case, we did not
manage to reach people with knowledge about the RDAP service of that organization.
   The lowest rate limit we observed was determined through experimentation, as the
registrar did not react to our contact attempts: Their RDAP service was configured to allow
just one query per hour per IP before responding with an HTTP 429 ("Too many requests")
response.
Table 2
RDAP service availability. One host may host multiple RDAP APIs for different TLDs.

 Type of RDAP service   Observed hosts   Invalid TLS certificates   Unavailable or invalid API   Total        RDAP queries
 Registry operator                 399                          0                            2   2 (0.5%)     46 (0.1% of 42,014)
 Registrar                         229                          8                           25   33 (14.4%)   1,101 (4.3% of 25,593)
 Total                             628                          8                           27   35 (5.6%)    1,147 (1.7% of 67,607)


[Figure 2 node labels: Total Domains: 53,270; RDAP Support: 42,010; No RDAP Support: 11,260;
RDAP Response: 40,040; API Error: 1,970; Domain Registration Data: 26,834;
Not registered: 13,206]

Figure 2: Visualization of RDAP availability and results for domains in the test set.

5.4. RDAP responses

In total, we sent 67,607 queries to RDAP services. This number exceeds the number of queried
domain names because 25,873 (48.57%) of all domain names in the test dataset caused us to
make two or more queries, the first to the TLD registry operator and the following ones to
the sponsoring registrars that we were pointed to in the RDAP response of the registry
operator. We had assumed that only thin WHOIS registries would include links to the RDAP
services of sponsoring registrars, but this was incorrect: Some thick WHOIS registries, like
CentralNic for the .xyz TLD, include links to the RDAP service of the sponsoring registrars,
too.
   In 13 cases (six .org and seven .info domain names) we found "related" links to a third
RDAP service in the response of the second RDAP service. None of the RDAP services behind
these links responded to our queries.
   Of the 42,010 domain names with RDAP support, 13,206 (31.44%) were not registered,
according to the received RDAP responses. This might be an artifact of the test dataset,
which includes domain names that were parsed from spam emails and SMS. As the goal of this
process is to extract any potential link from the messages, some of the extracted domain
names, while being technically valid domain names, might not actually be registered. It
might also be caused by the fact that we were querying the registration data of these
domains 24 hours after they were reported, and by that time, they might have already been
suspended by the registry operator or sponsoring registrar. However, in these cases, the
status of the domain name is usually changed to "server hold" if the action was taken by the
registry operator, and "client hold" if the action was taken by the registrar [19]. We
observed 1,364 domain names with a "server hold" status, 2,447 domain names with a "client
hold" status, and 437 domain names with both statuses, in total 7.97%.³
   Another explanation could be that some RDAP services might have a significant delay
between the registration of a domain name and the update of the registration data database,
on which the Registration Data Directory Services rely. We investigate this option in the
following section.

³ This includes variations like "client_hold" and "ServerHold".

5.4.1. Data Freshness

46,034 of the recorded RDAP responses include an event called "last update of RDAP
database", which, according to the ICANN RDAP Response Profile specification, must contain
"a value equal to the timestamp when the RDAP database was last updated" [20].
   We compared this self-reported timestamp to the time of our system storing the RDAP
result. Our storage timestamp is recorded just after the HTTP response of the RDAP service
has been received. Because of that, and to account for minor time-keeping offsets between
our database server and the RDAP servers, we allowed for a 10-second buffer time.
   866 RDAP database update timestamps were excluded because they lay significantly in the
future without specifying a time zone. 1,220 timestamps were excluded because the RDAP
service was likely implemented incorrectly, as the supposed timestamp of the RDAP database
update always matched the "last changed" event, stating when the information about the
object was last changed, or the "registration" event of the domain name [8].
   36,260 RDAP database update timestamps (82.5% of the available and correct timestamps)
were within 30 seconds of our request, which indicates that the respective providers run
their RDAP service on a "live" registration data database. In this case, the time of the
request is taken as the time of the last RDAP database update.
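
   The timestamp comparison from Section 5.4.1 can be sketched as follows. This is an
illustration under stated assumptions, not our analysis code: the function names and the
category labels are hypothetical, and timestamps without a time zone are simply interpreted
as UTC here. The "events", "eventAction", and "eventDate" members are defined by the RDAP
response standard [8]; the 10-second buffer and the 30-second window are taken from the
text.

```python
from datetime import datetime, timedelta, timezone

BUFFER = timedelta(seconds=10)       # tolerance for clock offsets (Section 5.4.1)
LIVE_WINDOW = timedelta(seconds=30)  # updates this close to the request count as "live"


def database_update_time(rdap_response: dict):
    """Return the "last update of RDAP database" event time, or None if absent."""
    for event in rdap_response.get("events", []):
        if event.get("eventAction") == "last update of RDAP database":
            # Older Python versions do not parse a trailing "Z", so rewrite it.
            return datetime.fromisoformat(event["eventDate"].replace("Z", "+00:00"))
    return None


def classify_freshness(rdap_response: dict, stored_at: datetime) -> str:
    """Compare the self-reported update time with the time the response was stored."""
    updated = database_update_time(rdap_response)
    if updated is None:
        return "missing"
    if updated.tzinfo is None:
        # Assumption for this sketch: treat naive timestamps as UTC.
        updated = updated.replace(tzinfo=timezone.utc)
    if updated > stored_at + BUFFER:
        return "future"
    if stored_at - updated <= LIVE_WINDOW:
        return "live"
    return "delayed"
```

   The sketch does not cover the second exclusion criterion, namely services whose database
update timestamp always equals the "last changed" or "registration" event. Detecting that
requires looking at all responses of one service rather than at a single response.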
5.4.2. Data Redaction

The General Data Protection Regulation of the European Union (GDPR), which took effect in
2018, required ICANN to update its policies for gTLD registry operators, in order to enable
them to stay compliant both with the law in these jurisdictions and their responsibilities
as registry operators [21]. As the GDPR also applies to data controllers not based in the EU
who are storing data of EU citizens, this also affected non-EU registry operators and
registrars [22]. ICANN agreed on a "Temporary Specification for gTLD Registration Data"
specifying which registration data must be redacted by gTLD registry operators and
registrars [21].
   As it is not always possible to programmatically distinguish between redacted, removed,
and unavailable registration data, we check for the existence of markers required by
ICANN's RDAP Response Profile specification [20] to indicate truncation or redaction of
entity objects.⁴
   17,716 RDAP responses for 16,546 distinct domain names contained at least one entity
that was truncated or redacted. This constitutes 61.66% of the domain names for which we got
successful RDAP responses. Not all remaining RDAP responses necessarily contain unobfuscated
registrant information, as some RDAP service providers do not declare the information as
redacted, even if it is obfuscated.
   Apart from attempts to cluster domain names registered with malicious intent via the
registrant information, another use case for RDAP in the fight against phishing is the
identification of abuse contacts of the sponsoring registrar. For 19,250 unique domains, we
got a thick RDAP response containing a contact entity with an "abuse" role. This constitutes
71.72% of all domains for which we got successful thick RDAP responses.

⁴ Required in the specification is a remark of type "object truncated due to
  authorization", but we also included variations like "object redacted due to
  authorization" and "object redacted due to privacy laws".

5.4.3. Schema adherence

We focused on parsing three sections of the RDAP response: the events, to gain information
on the age of the domain name; the related entities, to identify the abuse contact; and the
links, to follow potential "related" links to the RDAP service of the sponsoring registrar.
   We covered the correctness and plausibility of the "Last update of RDAP database" event
in section 5.4.1.
   1,503 RDAP responses did not use jCard, a JSON representation [23] for the vCard
standard [6], which the RDAP response standard prescribes [8]. These responses were sent by
two registrar-run RDAP services. Instead, entity objects were represented using a similar
schema, which might more closely represent the underlying data structures of their RDDS
backend. An anonymized example of such an entity object representation is shown in Appendix
Listing 2.
   Another two registrar-run RDAP services returned HTTP responses in which JSON arrays, as
used in the RDAP standard to list the events, were instead represented by JSON objects with
the array indexes as names and the array elements as the respective values [8]. These two
registrars were only responsible for eight domain names in our dataset.

6. Conclusion

Based on the collected data, we conclude that RDAP can be a useful source of registration
data of domain names for the fight against phishing, but it cannot currently be the only
source for this type of information. This is mainly due to the slow adoption among ccTLD
registry operators.
   The RDAP standard itself is well-suited to cover the registration data needs of security
researchers, and we observed acceptable adherence to the standard among RDAP service
providers, also taking into account data freshness.
   Restrictive terms of service or acceptable use policies of RDAP service providers present
a challenge to the adoption of RDAP by security researchers. These policies were, in many
cases, seemingly copied from the WHOIS terms of service and do not reflect the nature of
RDAP as a machine-readable API that allows for more fine-grained access control compared to
WHOIS. This often results in very aggressive, IP-based rate limits for RDAP queries, which
hinders the fight against phishing, as threat actors often register domain names in bulk.
   Because of privacy regulations, data on the actual registrant of the domain name is
commonly not available to unauthenticated RDAP clients. This is not an RDAP-specific issue
and also affects other RDDS standards. But in contrast to WHOIS, the RDAP standard enables
RDAP service operators to implement delegated authorization mechanisms, like OAuth 2.0,
using a number of centralized identity providers.
   These providers could ensure that interested parties have a legitimate use case for
accessing domain name registration data. Instead of having to prove the legitimacy of their
registration data access request to each individual RDAP service provider, security
researchers would just have to prove this to a smaller number of identity providers.
   Because of RDAP’s extensive use of established web standards and its resulting
extensibility, we think that it has the potential to become an important tool in the fight
against phishing.

References

 [1] Anti-Phishing Working Group, Phishing activity trends report, 4th quarter 2022, 2023.
     URL: https://docs.apwg.org/reports/apwg_trends_report_q4_2022.pdf.
 [2] A. Oest, Y. Safei, A. Doupé, G.-J. Ahn, B. Wardman, G. Warner, Inside a phisher’s mind:
     Understanding the anti-phishing ecosystem through phishing kit analysis, in: 2018 APWG
     Symposium on Electronic Crime Research (eCrime), 2018, pp. 1–12.
     doi:10.1109/ECRIME.2018.8376206.
 [3] K. Harrenstien, V. White, NICNAME/WHOIS, RFC 812, 1982.
     URL: https://www.rfc-editor.org/info/rfc812. doi:10.17487/RFC0812.
 [4] L. Daigle, WHOIS Protocol Specification, RFC 3912, 2004.
     URL: https://www.rfc-editor.org/info/rfc3912. doi:10.17487/RFC3912.
 [5] S. Hollenbeck, A. Newton, Registration Data Access Protocol (RDAP) Query Format,
     RFC 9082, 2021. URL: https://www.rfc-editor.org/info/rfc9082. doi:10.17487/RFC9082.
 [6] A. Newton, S. Hollenbeck, JSON Responses for the Registration Data Access Protocol
     (RDAP), RFC 7483, 2015. URL: https://www.rfc-editor.org/info/rfc7483.
     doi:10.17487/RFC7483.
 [7] A. Newton, B. Ellacott, N. Kong, HTTP Usage in the Registration Data Access Protocol
     (RDAP), RFC 7480, 2015. URL: https://www.rfc-editor.org/info/rfc7480.
     doi:10.17487/RFC7480.
 [8] S. Hollenbeck, A. Newton, JSON Responses for the Registration Data Access Protocol
     (RDAP), RFC 9083, 2021. URL: https://www.rfc-editor.org/info/rfc9083.
     doi:10.17487/RFC9083.
 [9] M. Blanchet, Finding the Authoritative Registration Data Access Protocol (RDAP)
     Service, RFC 9224, 2022. URL: https://www.rfc-editor.org/info/rfc9224.
     doi:10.17487/RFC9224.
[10] IANA, Bootstrap service registry for domain name space, 2023.
     URL: https://www.iana.org/assignments/rdap-dns/rdap-dns.xhtml, accessed May 11, 2023.
[11] S. Liu, I. Foster, S. Savage, G. M. Voelker, L. K. Saul, Who is .com? Learning to parse
     WHOIS records, in: Proceedings of the 2015 Internet Measurement Conference, 2015,
     pp. 369–380.
[12] G. Aaron, R. Rasmussen, Global phishing survey: Trends and domain name use in 2016,
     2016. URL: https://docs.apwg.org/reports/APWG_Global_Phishing_Report_2015-2016.pdf.
[13] S. Maroofi, M. Korczyński, C. Hesselman, B. Ampeau, A. Duda, Comar: Classification of
     compromised versus maliciously registered domains, in: 2020 IEEE European Symposium on
     Security and Privacy (EuroS&P), 2020, pp. 607–623. doi:10.1109/EuroSP48549.2020.00045.
[14] T. Vissers, J. Spooren, P. Agten, D. Jumpertz, P. Janssen, M. Van Wesemael,
     F. Piessens, W. Joosen, L. Desmet, Exploring the ecosystem of malicious domain
     registrations in the .eu tld, in: M. Dacier, M. Bailey, M. Polychronakis,
     M. Antonakakis (Eds.), Research in Attacks, Intrusions, and Defenses, Springer
     International Publishing, Cham, 2017, pp. 472–493.
[15] G. Aaron, L. Chapin, D. Piscitello, C. Strutt, Phishing landscape 2021, 2021.
     URL: https://interisle.net/PhishingLandscape2021.pdf.
[16] Domain Name Stat, LLC, Domain name registrations, by tld, 2023.
     URL: https://domainnamestat.com/statistics/tld/others, accessed May 12, 2023.
[17] ICANN, Registration data access protocol timeline, 2018.
     URL: https://www.icann.org/resources/pages/rdap-background-2018-08-31-en,
     accessed May 11, 2023.
[18] IANA, Root zone database, 2023. URL: https://www.iana.org/domains/root/db,
     accessed May 11, 2023.
[19] S. Hollenbeck, Extensible Provisioning Protocol (EPP) Domain Name Mapping, RFC 5731,
     2009. URL: https://www.rfc-editor.org/info/rfc5731. doi:10.17487/RFC5731.
[20] ICANN, RDAP Response Profile v2.1, 2019.
     URL: https://www.icann.org/en/system/files/files/rdap-response-profile-15feb19-en.pdf.
[21] ICANN, Temporary Specification for gTLD Registration Data, 2018.
     URL: https://www.icann.org/en/system/files/files/gtld-registration-data-temp-spec-17may18-en.pdf.
[22] European Commission, Regulation (EU) 2016/679 of the European Parliament and of the
     Council of 27 April 2016 on the protection of natural persons with regard to the
     processing of personal data and on the free movement of such data, and repealing
     Directive 95/46/EC (General Data Protection Regulation), 2016.
[23] P. Kewisch, jCard: The JSON Format for vCard, RFC 7095, 2014.
     URL: https://www.rfc-editor.org/info/rfc7095. doi:10.17487/RFC7095.

     A. Appendix

     Listing 2: Shortened and anonymized example of a non-standard
                RDAP entity object representation

     "entities": [
       {
         "objectClassName": "entity",
         "vcardArray": {
           "properties": [
             {
               "name": "FN",
               "value": {
                 "stringValue": "Domain Administrator",
                 "typeName": "text"
               }
             },
             {
               "name": "ADR",
               "value": {
                 "components": [
                   {
                     "name": "street",
                     "value": {
                       "values": [
                         {
                           "stringValue": "1337 Lowland Ave.",
                           "typeName": "text"
                         },
                         {
                           "stringValue": "PMB# 333",
                           "typeName": "text"
                         }
                       ],
                       "typeName": "text"
                     }
                   },
                   {
                     "name": "locality",
                     "value": {
                       "values": [
                         {
                           "stringValue": "Example City",
                           "typeName": "text"
                         }
                       ],
                       "typeName": "text"
                     }
                   }
                 ],
                 "typeName": "text"
               }
             },
             {
               "name": "TEL",
               "parameters": {},
               "value": {
                 "stringValue": "tel:+0.13371337",
                 "typeName": "uri"
               }
             },
             {
               "name": "EMAIL",
               "value": {
                 "stringValue": "example@example.com",
                 "typeName": "text"
               }
             }
           ]
         },
         "roles": [
           "REGISTRANT"
         ]
       }
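
     A client can detect the two schema deviations described in Section 5.4.3 before
     parsing a response. The following is a minimal sketch; the function names are
     hypothetical and the checks are deliberately shallow. It only relies on two facts:
     jCard represents a vCard as a two-element array whose first element is the string
     "vcard" [23], and the RDAP response standard lists events, entities, and links as JSON
     arrays [8].

```python
def index_object_to_list(value):
    """Turn {"0": a, "1": b} into [a, b]; return any other value unchanged."""
    if isinstance(value, dict) and value and all(key.isdecimal() for key in value):
        # Sort numerically so that "10" comes after "2".
        return [value[key] for key in sorted(value, key=int)]
    return value


def uses_jcard(entity: dict) -> bool:
    """True if "vcardArray" has the jCard shape ["vcard", [property, ...]]."""
    vcard = entity.get("vcardArray")
    return (
        isinstance(vcard, list)
        and len(vcard) == 2
        and vcard[0] == "vcard"
        and isinstance(vcard[1], list)
    )
```

     For the entity in Listing 2, uses_jcard returns False, because "vcardArray" is a JSON
     object with a "properties" member rather than a jCard array. Applying
     index_object_to_list to the "events" member of a response leaves standard-conforming
     arrays untouched and only rewrites index-keyed objects.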
Table 3
Top 50 TLDs in the test data set and their (in some cases unofficial) RDAP support.

        TLD                        Number of domain names        Percentage of dataset    RDAP Support
        .com                                            19943                   37.44 %   RDAP Support
        .top                                             3322                    6.24 %   RDAP Support
        .xyz                                             2348                    4.41 %   RDAP Support
        .net                                             1643                    3.08 %   RDAP Support
        .cn                                              1431                    2.69 %   No RDAP
        .org                                             1049                    1.97 %   RDAP Support
        .info                                            1019                    1.91 %   RDAP Support
        .ru                                               908                    1.70 %   No RDAP
        .tk                                               844                    1.58 %   No RDAP
        .online                                           748                    1.40 %   RDAP Support
        .ml                                               730                    1.37 %   No RDAP
        .br                                               724                    1.36 %   RDAP Support
        .site                                             677                    1.27 %   RDAP Support
        .de                                               603                    1.13 %   Unofficial RDAP Support
        .ga                                               601                    1.13 %   No RDAP
        .shop                                             567                    1.06 %   RDAP Support
        .cf                                               547                    1.03 %   No RDAP
        .pl                                               536                    1.01 %   No RDAP
        .uk                                               512                    0.96 %   RDAP Support
        .in                                               509                    0.96 %   No RDAP
        .live                                             482                    0.90 %   RDAP Support
        .gq                                               432                    0.81 %   No RDAP
        .co                                               421                    0.79 %   No RDAP
        .au                                               382                    0.72 %   No RDAP
        .stream                                           363                    0.68 %   RDAP Support
        .cc                                               360                    0.68 %   RDAP Support
        .us                                               348                    0.65 %   Unofficial RDAP Support
        .fr                                               347                    0.65 %   RDAP Support
        .icu                                              338                    0.63 %   RDAP Support
        .club                                             325                    0.61 %   RDAP Support
        .cyou                                             303                    0.57 %   RDAP Support
        .it                                               268                    0.50 %   No RDAP
        .eu                                               267                    0.50 %   No RDAP
        .id                                               246                    0.46 %   RDAP Support
        .bid                                              239                    0.45 %   RDAP Support
        .nl                                               233                    0.44 %   No RDAP
        .buzz                                             221                    0.41 %   RDAP Support
        .me                                               218                    0.41 %   No RDAP
        .click                                            212                    0.40 %   RDAP Support
        .space                                            206                    0.39 %   RDAP Support
        .za                                               184                    0.35 %   No RDAP
        .win                                              178                    0.33 %   RDAP Support
        .ca                                               177                    0.33 %   RDAP Support
        .io                                               173                    0.32 %   No RDAP
        .store                                            171                    0.32 %   RDAP Support
        .asia                                             164                    0.31 %   RDAP Support
        .cloud                                            163                    0.31 %   RDAP Support
        .pw                                               163                    0.31 %   RDAP Support
        .biz                                              159                    0.30 %   RDAP Support
        .cl                                               157                    0.29 %   No RDAP
        277 other TLDs                                   3690                    6.93 %   RDAP Support
        142 other TLDs                                   2419                    4.54 %   No RDAP
        Total RDAP Support                              42010                   78.86 %   RDAP Support
        Total No RDAP Support                           11260                   21.14 %   No RDAP
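
The coverage figure for a workload follows directly from the per-TLD counts and the set of
TLDs with an RDAP service. The sketch below assumes the bootstrap document format from
RFC 9224 [9], in which "services" is a list of pairs of TLD labels and base URLs. The
function names and the example URLs are hypothetical, and the official bootstrap registry
does not contain the unofficial RDAP services that we included in our experiment.

```python
def supported_tlds(bootstrap: dict) -> set:
    """Collect all TLD labels listed in an RFC 9224 bootstrap document."""
    return {tld.lower() for tlds, _urls in bootstrap["services"] for tld in tlds}


def rdap_coverage(domains_per_tld: dict, rdap_tlds: set) -> float:
    """Percentage of domain names whose TLD has an RDAP service."""
    total = sum(domains_per_tld.values())
    covered = sum(
        count
        for tld, count in domains_per_tld.items()
        if tld.lstrip(".").lower() in rdap_tlds  # Table 3 writes TLDs with a leading dot
    )
    return 100 * covered / total


# Hypothetical bootstrap excerpt; real entries point to the operators' own hosts.
bootstrap = {
    "services": [
        [["com", "net"], ["https://rdap.example/"]],
        [["top"], ["https://rdap.example/top/"]],
    ]
}
counts = {".com": 19943, ".top": 3322, ".cn": 1431}  # three rows of Table 3
coverage = rdap_coverage(counts, supported_tlds(bootstrap))
```

Applied to the totals of Table 3, 42,010 of 53,270 domain names yield the coverage of 78.86%
reported in Section 5.1.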
Table 4
Incomplete list of public rate limits of RDAP service providers for un-authenticated RDAP clients

 RDAP host        Type                RDAP query rate limits   Normalized query delay   Source
 centralnic.com   Registry operator   7,200/hour                                 0.5s   https://registrar-console.centralnic.com/pub/whois_guidance
 godaddy.com      Registrar           100/hour                                    36s   https://img1.wsimg.com//Sitecore/3/B/GDR-RDAP-Access-Policy-0.2.pdf
 isnic.is         Registry operator   50/30min                                    36s   https://www.isnic.is/en/rdap
 nic.tatar        Registry operator   30/min                                       2s   https://domain.tatar/users/docs/WhoisTermsOfUse_en.php
 nominet.uk       Registry operator   1000/day and 5/s                          86.4s   https://media.nominet.uk/wp-content/uploads/2019/06/gTLD-Acceptable-Use-Policies-version-1.pdf
 norid.no         Registry operator   300/day and 10/min                         288s   https://teknisk.norid.no/en/integrere-mot-norid/rdap-tjenesten/
 tucows.com       Registrar           1/min                                       60s   https://tucowsdomains.com/rdap/help/
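
The normalized query delay can be read as the smallest constant delay between two queries
that satisfies every published limit of a provider. The following sketch reproduces the
values in Table 4 under that reading; the function name and the parsing of the limit
notation are assumptions made for illustration.

```python
import re

UNIT_SECONDS = {"s": 1, "min": 60, "hour": 3600, "day": 86400}
# Matches limits such as "7,200/hour", "50/30min", or "5/s".
LIMIT = re.compile(r"([\d,]+)/(\d*)(s|min|hour|day)")


def normalized_delay(rate_limits: str) -> float:
    """Smallest delay in seconds between queries that satisfies every limit."""
    delays = []
    for count, multiple, unit in LIMIT.findall(rate_limits):
        period = int(multiple or 1) * UNIT_SECONDS[unit]  # length of the window in seconds
        delays.append(period / int(count.replace(",", "")))
    # The strictest limit determines the delay; raises ValueError if nothing was parsed.
    return max(delays)
```

For example, "1000/day and 5/s" contains two limits with delays of 86.4 and 0.2 seconds, so
the daily limit is the binding one.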

</pre>