=Paper=
{{Paper
|id=Vol-3631/paper6
|storemode=property
|title=Investigating Phishing Attacks using the Registration Data Access Protocol (RDAP)
|pdfUrl=https://ceur-ws.org/Vol-3631/paper6.pdf
|volume=Vol-3631
|authors=Hauke Jan Lübbers
|dblpUrl=https://dblp.org/rec/conf/apwg-eu/Lubbers23
}}
==Investigating Phishing Attacks using the Registration Data Access Protocol (RDAP)==
Hauke Jan Lübbers
CSIS Security Group A/S, Vestergade 2B, 1456 Copenhagen, Denmark
Abstract
The Registration Data Access Protocol (RDAP) is a successor to the WHOIS protocol and enables programmatic access to the
registration data of internet resources. Using RDAP, we investigated phishing attacks observed over four days, focusing on
DNS domain names. In this paper, we present the opportunities and problems identified in the process. We find that low RDAP
adoption among ccTLD registry operators, strict rate limiting, differences in data representation, and the (un-)availability
of data due to privacy regulations continue to be hindrances to the widespread use of RDAP in cybercrime investigations.
While these issues are currently preventing security researchers from solely relying on RDAP for accessing domain name
registration data, we recognize its potential as a valuable data enrichment source for investigating phishing attacks at scale.
Keywords
Registration Data Access Protocol (RDAP), WHOIS, domain names, phishing, cybercrime investigation
1. Introduction
Phishing is a common attack vector employed by threat actors. While the attackers’ motives and sophistication may vary, phishing attacks continue to be a relatively simple method for threat actors to gather credentials for later exploitation [1].
When investigating phishing attacks, a common first step is to look up the domain registration data, or "WHOIS information", of the involved domain names suspected of hosting a phishing website or of sending phishing emails [2]. This information can include details of the registrant, the time of registration and renewals, the registrar, and nameservers used. With it, phishing attacks might be correctly identified as such, classified, attributed to previously observed threat actors based on similar modi operandi, and actively defended against, by sending takedown requests to the given abuse contacts.
The commonly used method of looking up domain registration data is the established WHOIS protocol, first standardized in 1982 [3]. Being a relatively old protocol, it comes with several shortcomings. These were addressed in the standardization of a new protocol: The Registration Data Access Protocol (RDAP) [4] [5].
However, RDAP is a comparatively new standard and presents some challenges: It is not available for all Top Level Domains (TLDs) and, if available, is often provided with restrictive terms of service. Registration data is often redacted due to privacy regulations. In this paper, we investigate how these issues affect the use of RDAP in the context of phishing attack investigations.
APWG.EU Technical Summit and Researchers Sync-Up 2023, Dublin, Ireland, June 21 & 22, 2023
Email: hjl@csis.com (H. J. Lübbers)
ORCID: 0009-0007-3481-7616 (H. J. Lübbers)
© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org), http://ceur-ws.org, ISSN 1613-0073
2. On the Registration Data Access Protocol
RDAP is a Registration Data Directory Service (RDDS) standard that enables programmatic access to information about different internet resources, such as DNS domain names, DNS name servers, IP addresses, and Autonomous System Numbers (ASNs). The protocol was first standardized by the Internet Engineering Task Force in March 2015 and has since been extended [6].
RDAP defines a RESTful API to be provided by TLD registry operators and domain registrars. Accessible via HTTP over TLS, these APIs can be queried for registration details of, for example, DNS domain names, as shown in Listing 1 [7] [5].
Listing 1: Example RDAP API request to query information on example.com
GET https://rdap.registry.example/v1/domain/example.com
The expected response, assuming that the RDAP service has information about the queried object, is given in JSON following a schema defined by RFC 9083 [8]. Thus, the response should be given in a machine-readable format under the MIME type application/rdap+json.
With RFC 9224, the IETF standardized a way to identify the authoritative RDAP service for the internet resource that should hold valid registration information on a domain name, IP address, or ASN. The so-called "bootstrap services" associate TLDs, IP ranges, and ASN ranges with their respective authoritative RDAP services [9]. Bootstrap services are provided by the Internet Assigned Numbers Authority (IANA) [10].
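As a minimal sketch of the bootstrap lookup described above, the following Python snippet resolves a domain name to an RDAP base URL and builds the domain query URL. The `SAMPLE_BOOTSTRAP` data is a hypothetical excerpt that only mimics the structure of the bootstrap file defined in RFC 9224 [9]; the function names are illustrative assumptions and are not part of any standard.

```python
# Hypothetical excerpt in the shape of the RFC 9224 bootstrap file:
# every service is a pair [list of TLDs, list of RDAP base URLs].
SAMPLE_BOOTSTRAP = {
    "version": "1.0",
    "services": [
        [["com", "net"], ["https://rdap.registry.example/v1/"]],
        [["org"], ["https://rdap.other-registry.example/"]],
    ],
}


def find_base_urls(domain, bootstrap):
    """Return the RDAP base URLs for the longest matching suffix of `domain`."""
    labels = domain.lower().rstrip(".").split(".")
    # Try the longest suffix first, then progressively shorter ones.
    for start in range(len(labels)):
        suffix = ".".join(labels[start:])
        for entries, urls in bootstrap["services"]:
            if suffix in entries:
                return list(urls)
    return []  # no authoritative RDAP service is listed for this name


def domain_query_url(domain, bootstrap):
    """Build the RDAP domain lookup URL, or return None without RDAP support."""
    urls = find_base_urls(domain, bootstrap)
    if not urls:
        return None
    return urls[0].rstrip("/") + "/domain/" + domain.lower().rstrip(".")
```

With this sample data, `domain_query_url("example.com", SAMPLE_BOOTSTRAP)` produces the request URL used in Listing 1, while a name under a TLD that is missing from the bootstrap data yields `None`.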
Some registry operators do not hold all of the required domain name registration data for all their registered domain names ("thin" registries). Instead, they direct RDDS queries to the sponsoring registrar through which the domain name was registered [11]. In the case of RDAP, this is done in the "links" section of the RDAP response, in which a link to the RDAP API endpoint of the sponsoring registrar is given with the relation type (rel) "related" and the (MIME-)type "application/rdap+json".
With its properties of machine-readability, transport-encryption, and internationalization, RDAP is a significant improvement over its RDDS predecessor, the WHOIS protocol. WHOIS does not define structured responses and instead returns unstandardized plain text, is not transport-encrypted, and does not standardize handling of encodings other than ASCII, which is a problem for languages with non-ASCII character sets [4].
3. Domain name registration data in the fight against phishing
Domain name registration data, accessed either via WHOIS or RDAP, can be applied in the fight against phishing for multiple purposes: The age of a domain name, as given by its registration date and subsequent renewal dates, is often used as an indicator of its trustworthiness, with older domains being considered more trustworthy than more recently registered ones. This practice has led to some threat actors waiting for a while between registering a domain name and starting to use it for malicious purposes, to "age" their domain names and thereby give them a higher chance of evading detection [12] [2].
If a domain name is confirmed to be involved in a phishing attack, its age can be used to differentiate between domain names that were registered solely for this malicious purpose and benign domain names merely pointing to a host that was compromised by the threat actors [13]. A third option to be considered is the misuse of legitimate file- or web-hosting services for phishing purposes. This classification is important, as the counter-actions taken by security researchers differ based on the type of phishing at hand.
While the registrant data provided by threat actors is often falsified, it can be used to cluster domains that were registered in bulk, and thereby help to detect new phishing domains [14] [15]. This assumes that the registrant data is publicly available, which often is not the case, as discussed in section 5.4.2 "Data Redaction". In the absence of registrant data, other information on a domain name’s registration process, like the sponsoring registrar or reseller used, can help to cluster attacks through similar modi operandi, and potentially attribute them to the same threat actor, often in combination with other data.
Finally, the domain name registration data should contain the abuse contacts of the sponsoring registrar. If a domain name is deemed to be registered with malicious intent, security researchers can report it to the registrar, which should have a procedure in place to react to these reports and suspend the domain name [15].
4. Methodology
To understand whether RDAP can be used to gather domain name registration data to be subsequently employed in the fight against phishing as described in Section 3, we built a software system that analyzes the availability and data quality of domain name registration data and tested it against a realistic workload of domain names suspected to be involved in phishing.
4.1. The "rdapper" software system
The purpose of the "rdapper" software system was to correctly process requests for registration data of domain names following the RDAP standard, temporarily store results in a database for caching purposes, and schedule the requests to RDAP services in order to comply with the providers’ terms of service.
Key operations of the system were configured to send telemetry data to a monitoring and visualization system. This allowed us to oversee the operation of the system and extract metrics after the experiment had concluded.
The RDAP standard, which is heavily based on well-established web standards and technologies, makes it straightforward to implement an RDAP client that identifies the authoritative RDAP service for a given internet resource [9], sends an HTTP request to the RESTful API [7] [5], and parses the JSON response [8]. But a client that does only this would ignore the rate limits that are defined in the "Terms of Service" or "Acceptable Use" policies of the many RDAP services the client might connect to. To adhere to these rate limits, we built a scheduling system for our RDAP queries. It took into account the time of the last lookup to the respective RDAP host and possible back-off requests in the form of HTTP responses with status code 429 ("Too many requests") and their Retry-After response header values, if they were defined by the server. The delay between queries was individually configurable for each RDAP host. When building such a scheduling system, it is necessary to track queries per RDAP host, not per RDAP service, as some RDAP services for different TLDs are hosted on the same server and track RDAP queries across all services.
We chose a default delay of five minutes between queries to the same host based on the longest observed required delay when we began this project. After each run of the experiment, we identified the RDAP services that had accumulated the longest delay between the request of domain registration data and the execution of the RDAP query. We then tried to find stated rate limits in the Terms of Service documents of these registry operators or registrars. If we could not find any public rate limit information, we would reach out to the RDAP service provider and ask them for safe rate limits for their RDAP API. Then, we would adjust the delays for those RDAP hosts accordingly.
The system identified itself to the RDAP service providers by sending request headers with a unique User-Agent and an email address in the From header, in order to give RDAP service providers the ability to contact us should our queries cause issues or breach any terms of service.¹ We did not rotate our clients’ IP addresses or implement any other attempts at evading any potential terms of service enforcement by the RDAP service providers.
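The per-host scheduling described in Section 4.1 could look roughly like the following Python sketch. It is not the "rdapper" implementation: the class and attribute names, the example header values, and the simplification of honoring only numeric Retry-After values (the header may also carry an HTTP date) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit

# Hypothetical identification headers; a real client would use its own values.
IDENTIFYING_HEADERS = {
    "User-Agent": "example-rdap-client/0.1",
    "From": "rdap-research@example.org",
}


@dataclass
class HostScheduler:
    """Tracks the earliest time at which each RDAP host may be queried again."""

    default_delay: float = 300.0  # seconds, i.e. the five-minute default
    host_delays: dict = field(default_factory=dict)   # host -> configured delay
    next_allowed: dict = field(default_factory=dict)  # host -> earliest next query

    @staticmethod
    def host_of(url):
        # Queries are tracked per host, not per service (URL path or TLD).
        return urlsplit(url).hostname

    def wait_time(self, url, now):
        """Seconds to wait before `url` may be queried at time `now`."""
        return max(0.0, self.next_allowed.get(self.host_of(url), 0.0) - now)

    def record_response(self, url, status, retry_after, now):
        """Update the schedule after a response was received at time `now`."""
        host = self.host_of(url)
        delay = self.host_delays.get(host, self.default_delay)
        # Honor a numeric Retry-After value on HTTP 429 if it asks for more time.
        if status == 429 and retry_after is not None and retry_after.strip().isdigit():
            delay = max(delay, float(retry_after.strip()))
        self.next_allowed[host] = now + delay
```

Because the key is the host name, two RDAP services for different TLDs that share a server also share one delay, which mirrors the observation that some providers count queries across all services on a host.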
4.2. Analyzed domain names
As part of this experiment, we analyzed 53,270 unique domain names. All domain names were labeled by CSIS Security Group’s Cyber Intelligence Platform to be potentially involved in phishing attacks. This does not mean that all of the domains were confirmed to host phishing pages: they include (1) false positive reports, (2) domain names from URLs referred to in phishing emails that are not malicious, (3) domain names solely registered for phishing purposes, (4) domain names of compromised benign websites, and (5) domain names of popular file- or web-hosting services misused for phishing attacks.
Data from four individual days² was used to aggregate all domain names suspected to be involved in phishing over the course of the past 24 hours (relative to the individual day). The TLD distribution in the test dataset is shown in Table 3. The domain names were analyzed by the "rdapper" system on the following days, again over the course of 24 hours. This was done to simulate a realistic workload of a system that processes RDAP requests for an anti-phishing system.
¹ We did not receive any emails.
² The data collection days were May 10th 2023, May 22nd 2023, May 31st 2023, and June 5th 2023, with the processing days being the following day respectively.
5. Four days of phishing: The results
We analyzed the RDAP availability and, if available, the results of 53,270 unique domains that were suspected to be involved in phishing attacks over the four selected days. In the following, we highlight and comment on the challenges that we observed related to the use of RDAP to fight phishing.
5.1. RDAP coverage in the domain name ecosystem
gTLD registry operators and registrars are required by ICANN to provide an RDAP service as of the 26th of August 2019 [17]. ccTLD registry operators are not required to implement RDAP, but a number of them do support it, as shown in Figure 1.
Figure 1: Visualization of the 50 most popular TLDs based on approximate numbers of registered domain names [16], grouped by TLD type and official RDAP support as of May 12th 2023.
At the time of writing, 1192 of all 1479 Top Level Domains (80.59%) have an authoritative RDAP service assigned in IANA’s RDAP bootstrap file for Domain Name System registrations [10] [18]. In addition to these officially announced RDAP services, we identified seven RDAP services for ccTLDs which were not published in the IANA bootstrap file. Some of these services are in a testing phase and state this as the reason for their absence from the bootstrap file. These unofficial RDAP services bring the total of RDAP-supporting ccTLDs up to 34, or 11% of all ccTLDs.
Table 1: RDAP coverage for active (assigned) TLDs according to IANA’s RDAP bootstrap file for Domain Name System registrations, by type, as of May 11, 2023.
{| class="wikitable"
! Type of TLD !! Active TLDs !! Official RDAP support !! RDAP support percentage
|-
| Generic & generic-restricted TLDs || 1155 || 1155 || 100%
|-
| Sponsored & infrastructure TLDs || 15 || 10 || 66.67%
|-
| Country code TLDs || 309 || 27 || 8.73%
|-
| Total || 1479 || 1192 || 80.59%
|}
These percentages show how far registry operators have come in adopting RDAP. But for practitioners, these numbers need to be adjusted by the distribution of domain names registered across the TLDs or, even better, by the distribution of domains of interest across TLDs; in our case these are domain names suspected of being connected to phishing attacks.
Using approximate numbers of registered domains for each TLD based on domainnamestat.com data accessed
on May 12th 2023 resulted in a calculated coverage of 67.45% of registered domain names having an official authoritative RDAP service assigned to them [16]. This coverage can be expected when the distribution of TLDs in a workload is similar to the distribution of all registered domain names across TLDs.
For our workload of domain names suspected to be involved in phishing, the distribution across TLDs differs slightly from the general population. In practice, and including the seven unofficial RDAP services, we reached a coverage of 78.86% of our domain name test dataset, the TLD distribution of which can be seen in Table 3.
5.2. RDAP service availability
Depending on the setup as a thin or thick registry, the registration data lookup for one domain name might involve RDAP queries to one or two RDAP services run by a registry operator or a registrar. Generic and generic-restricted TLD registry operators and registrars are required by ICANN to operate RDAP services [17]. But these services are not profit centers and are not "mission-critical" for most paying customers of these organizations.
During our experiment, we saw 399 authoritative RDAP services run by registry operators and 229 RDAP services run by registrars, which we were pointed to by RDAP responses of registry operators.
We observed two RDAP services run by registry operators that continuously returned HTTP 500 ("Internal server error") responses. One of those services also started to serve an expired TLS certificate for around five days, and returned to serving HTTP 500 responses afterwards. Both RDAP services have since been fixed.
Of the 229 RDAP services run by registrars, we observed eight that served an invalid or expired TLS certificate or that were only available via HTTP without TLS on port 80.
25 registrar-run RDAP services were consistently unavailable, meaning that they never served valid RDAP responses. If we only contacted these services on two distinct days or fewer, we manually confirmed the unavailability of the services after the experiment phase had concluded, before categorizing them as consistently unavailable. Through this method, we excluded six registrar-run services from this list.
One registrar-run RDAP service was served behind a bot detection service of a content delivery network provider, making programmatic access impossible.
One registrar had restricted access to the individual domain name lookup endpoint, an example of which is shown in Listing 1. When notified about this, they pointed us to their RDAP endpoint to search for domains instead. This is not a viable solution, as the thin registry RDAP service continues to redirect queries to the RDAP query endpoint for individual domain lookups.
5.3. RDAP service rate limiting
As described in Section 4.1, scheduling RDAP queries to comply with the terms of service of the RDAP services and, in particular, the rate limits is a crucial aspect when developing an RDAP client to operate at a certain scale. We found that our default delay of 300 seconds between requests generally seemed to work, and for the most part, did not result in back-off responses or even permanent IP blocks by RDAP service providers.
We identified rate limits either in public terms of service documents or in other parts of the web presence of eight RDAP service providers. One of those has since removed the document, but the rate limits of the remaining seven RDAP service providers are listed in Table 4.
We contacted eleven RDAP service providers, prioritizing those who had accumulated the longest processing delays during our experiment. Two of those, both gTLD registry operators, refused to share the rate limits they enforce. Four shared their rate limits after being contacted, with the highest rate limit being one query per second and the lowest five queries per minute. In four cases, we never got an answer to our contact attempts via contact forms on the organization’s websites or support email addresses. In one case, we did not manage to reach people with knowledge about the RDAP service of that organization.
The lowest rate limit we observed was determined through experimentation, as the registrar did not react to our contact attempts: Their RDAP service was configured to allow just one query per hour per IP, before responding with an HTTP 429 "Too many requests" response.
Table 2: RDAP service availability. One host may host multiple RDAP APIs for different TLDs.
{| class="wikitable"
! Type of RDAP service !! Observed hosts !! Invalid TLS certificates !! Unavailable or invalid API !! Total !! RDAP queries
|-
| Registry operator || 399 || 0 || 2 || 2 (0.5%) || 46 (0.01% of 42,014)
|-
| Registrar || 229 || 8 || 25 || 33 (11%) || 1101 (4.3% of 25,593)
|-
| Total || 628 || 8 || 27 || 35 (5.6%) || 1147 (1.7% of 67,607)
|}
5.4. RDAP responses
In total, we sent 67,607 queries to RDAP services. This is because 25,873 (48.57%) of all domain names in the test dataset caused us to make two or more queries, the first to the TLD registry operator and the following ones to the sponsoring registrars that we were pointed to in the RDAP response of the registry operator. We had wrongly assumed that only thin WHOIS registries would include links to the RDAP services of sponsoring registrars: some thick WHOIS registries, like CentralNic for the .xyz TLD, include such links, too.
In 13 cases (six .org and seven .info domain names) we found "related" links to a third RDAP service in the response of the second RDAP service. None of the RDAP services behind these links responded to our queries.
Of the 42,010 domain names with RDAP support, 13,206 (31.44%) were not registered, according to the received RDAP responses. This might be an artifact of the test dataset, which includes domain names that were parsed from spam emails and SMS. As the goal of this process is to extract any potential link from the messages, some of the extracted domain names, while being technically valid domain names, might not actually be registered. It might also be caused by the fact that we were querying the registration data of these domains 24 hours after they were reported, and by that time, they might have already been suspended by the registry operator or sponsoring registrar. However, in these cases, the status of the domain name is usually changed to "server hold" if the action was taken by the registry operator, and "client hold" if the action was taken by the registrar [19]. We observed 1,364 domain names with a "server hold" status, 2,447 domain names with a "client hold" status, and 437 domain names with both statuses, in total 7.97%.³
Another explanation could be that some RDAP services might have a significant delay between the registration of a domain name and the update of the registration data database, on which the Registration Data Directory Services rely. We investigate this option in the following section.
Figure 2: Visualization of RDAP availability and results for domains in the test set. Values shown in the figure: Total Domains: 53,270; RDAP Support: 42,010; No RDAP Support: 11,260; RDAP Response: 40,040; API Error: 1,970; Domain Registration Data: 26,834; Not registered: 13,206.
³ This includes variations like "client_hold" and "ServerHold".
5.4.1. Data Freshness
46,034 of the recorded RDAP responses include an event called "last update of RDAP database", which, according to the ICANN RDAP Response Profile specification, must contain "a value equal to the timestamp when the RDAP database was last updated" [20].
We compared this self-reported timestamp to the time of our system storing the RDAP result. This timestamp is recorded just after the HTTP response of the RDAP service has been received. Because of that, and to account for minor time-keeping offsets between our database server and the RDAP servers, we allowed for a 10-second buffer time.
866 RDAP database update timestamps were excluded because they lay significantly in the future without specifying a time zone. 1,220 timestamps were excluded because the RDAP service was likely implemented incorrectly, as the supposed timestamp of the RDAP database update always matched the "last changed" event, stating when the information about the object was last changed, or the "registration" event of the domain name [8].
36,260 RDAP database update timestamps (82.5% of the available and correct timestamps) were within 30 seconds of our request, which indicates that the respective providers configured their RDAP service with a "live" registration data database setup. In this case, the time of the request is taken as the time of the last RDAP database update.
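A freshness check along the lines of Section 5.4.1 could be sketched in Python as follows. The event name and the 10-second and 30-second values come from the text; the function names, the category labels, and the exact way the buffer and the window are combined are assumptions for illustration, not the procedure used in the experiment.

```python
from datetime import datetime, timedelta, timezone

DB_UPDATE_EVENT = "last update of RDAP database"


def database_update_time(rdap_response):
    """Return the parsed 'last update of RDAP database' timestamp, or None."""
    for event in rdap_response.get("events", []):
        if event.get("eventAction") == DB_UPDATE_EVENT:
            # Older Python versions do not accept a trailing "Z" in fromisoformat.
            return datetime.fromisoformat(event["eventDate"].replace("Z", "+00:00"))
    return None


def classify_freshness(rdap_response, received_at,
                       buffer=timedelta(seconds=10),
                       live_window=timedelta(seconds=30)):
    """Classify the self-reported database update time (labels are assumptions)."""
    updated = database_update_time(rdap_response)
    if updated is None:
        return "missing"
    if updated.tzinfo is None:
        return "no-timezone"  # cannot be compared reliably
    if updated > received_at + buffer:
        return "future"       # implausible even with the buffer
    if received_at - updated <= live_window:
        return "live"         # database appears to be updated continuously
    return "delayed"
```

Here `received_at` is expected to be a timezone-aware `datetime`, for example `datetime.now(timezone.utc)` taken right after the HTTP response arrives.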
5.4.2. Data Redaction
The General Data Protection Regulation of the European Union (GDPR), which took effect in 2018, required ICANN to update its policies for gTLD registry operators, in order to enable them to stay compliant both with the law in these jurisdictions and their responsibilities as registry operators [21]. As the GDPR also applies to data controllers not based in the EU who are storing data of EU citizens, this also affected non-EU registry operators and registrars [22]. ICANN agreed on a "Temporary Specification for gTLD Registration Data" specifying which registration data must be redacted by gTLD registry operators and registrars [21].
As it is not always possible to programmatically distinguish between redacted, removed, and unavailable registration data, we check for the existence of markers required by ICANN’s RDAP Response Profile specification [20] to indicate truncation or redaction of entity objects.⁴
17,716 RDAP responses for 16,546 distinct domain names contained at least one entity that was truncated or redacted. This constitutes 61.66% of the domain names for which we got successful RDAP responses. Not all remaining RDAP responses necessarily contain unobfuscated registrant information, as some RDAP service providers do not declare the information as redacted, even if it is obfuscated.
Apart from attempts to cluster domain names registered with malicious intent via the registrant information, another use case for RDAP in the fight against phishing is the identification of abuse contacts of the sponsoring registrar. For 19,250 unique domains, we got a thick RDAP response containing a contact entity with an "abuse" role. This constitutes 71.72% of all domains for which we got successful thick RDAP responses.
5.4.3. Schema adherence
We focused on parsing three sections of the RDAP response: the events, to gain information on the age of the domain name; the related entities, to identify the abuse contact; and the links, to follow potential "related" links to the RDAP service of the sponsoring registrar.
We covered the correctness and plausibility of the "Last update of RDAP database" event in section 5.4.1.
1,503 RDAP responses did not use jCard, a JSON representation [23] for the vCard standard [6], which the RDAP response standard prescribes [8]. These responses were sent by two registrar-run RDAP services. Instead, entity objects were represented using a similar schema, which might more closely represent the underlying data structures of their RDDS backend. An anonymized example of such an entity object representation is shown in Appendix Listing 2.
Another two registrar-run RDAP services returned HTTP responses in which JSON arrays, as used in the RDAP standard to list the events, were instead represented by JSON objects with the array indexes as names and the array elements as the respective values [8]. These two registrars were only responsible for eight domain names in our dataset.
6. Conclusion
Based on the collected data, we conclude that RDAP can be a useful source of registration data of domain names for the fight against phishing, but it cannot currently be the only source for this type of information. This is mainly due to the slow adoption among ccTLD registry operators.
The RDAP standard itself is well-suited to cover the registration data needs of security researchers, and we observed acceptable adherence to the standard among RDAP service providers, also taking into account data freshness.
Restrictive terms of service or acceptable use policies of RDAP service providers present a challenge to the adoption of RDAP by security researchers. These policies were, in many cases, seemingly copied from the WHOIS terms of service and do not reflect the nature of RDAP as a machine-readable API that allows for more fine-grained access control compared to WHOIS. This often results in very aggressive, IP-based rate limits for RDAP queries, which hinders the fight against phishing, as threat actors often register domain names in bulk.
Because of privacy regulations, data on the actual registrant of the domain name is commonly not available to unauthenticated RDAP clients. This is not an RDAP-specific issue and also affects other RDDS standards. But in contrast to WHOIS, the RDAP standard enables RDAP service operators to implement delegated authorization mechanisms, like OAuth 2.0, using a number of centralized identity providers.
These providers could ensure that interested parties have a legitimate use case for accessing domain name registration data. Instead of having to prove the legitimacy of their registration data access request to each individual RDAP service provider, security researchers would just have to prove this to a smaller number of identity providers.
Because of RDAP’s extensive use of established web standards and its resulting extendability, we think that it has the potential to become an important tool in the fight against phishing.
⁴ Required in the specification is a remark of type "object truncated due to authorization", but we also included variations like "object redacted due to authorization" and "object redacted due to privacy laws".
References
[1] Anti-Phishing Working Group, Phishing activity trends report, 4th quarter 2022, 2023. URL: https://docs.apwg.org/reports/apwg_trends_report_q4_2022.pdf.
[2] A. Oest, Y. Safei, A. Doupé, G.-J. Ahn, B. Wardman, G. Warner, Inside a phisher’s mind: Understanding the anti-phishing ecosystem through phishing kit analysis, in: 2018 APWG Symposium on Electronic Crime Research (eCrime), 2018, pp. 1–12. doi:10.1109/ECRIME.2018.8376206.
[3] K. Harrenstien, V. White, NICNAME/WHOIS, RFC 812, 1982. URL: https://www.rfc-editor.org/info/rfc812. doi:10.17487/RFC0812.
[4] L. Daigle, WHOIS Protocol Specification, RFC 3912, 2004. URL: https://www.rfc-editor.org/info/rfc3912. doi:10.17487/RFC3912.
[5] S. Hollenbeck, A. Newton, Registration Data Access Protocol (RDAP) Query Format, RFC 9082, 2021. URL: https://www.rfc-editor.org/info/rfc9082. doi:10.17487/RFC9082.
[6] A. Newton, S. Hollenbeck, JSON Responses for the Registration Data Access Protocol (RDAP), RFC 7483, 2015. URL: https://www.rfc-editor.org/info/rfc7483. doi:10.17487/RFC7483.
[7] A. Newton, B. Ellacott, N. Kong, HTTP Usage in the Registration Data Access Protocol (RDAP), RFC 7480, 2015. URL: https://www.rfc-editor.org/info/rfc7480. doi:10.17487/RFC7480.
[8] S. Hollenbeck, A. Newton, JSON Responses for the Registration Data Access Protocol (RDAP), RFC 9083, 2021. URL: https://www.rfc-editor.org/info/rfc9083. doi:10.17487/RFC9083.
[9] M. Blanchet, Finding the Authoritative Registration Data Access Protocol (RDAP) Service, RFC 9224, 2022. URL: https://www.rfc-editor.org/info/rfc9224. doi:10.17487/RFC9224.
[10] IANA, Bootstrap service registry for domain name space, 2023. URL: https://www.iana.org/assignments/rdap-dns/rdap-dns.xhtml, accessed May 11, 2023.
[11] S. Liu, I. Foster, S. Savage, G. M. Voelker, L. K. Saul, Who is .com? Learning to parse WHOIS records, in: Proceedings of the 2015 Internet Measurement Conference, 2015, pp. 369–380.
[12] G. Aaron, R. Rasmussen, Global phishing survey: Trends and domain name use in 2016, 2016. URL: https://docs.apwg.org/reports/APWG_Global_Phishing_Report_2015-2016.pdf.
[13] S. Maroofi, M. Korczyński, C. Hesselman, B. Ampeau, A. Duda, Comar: Classification of compromised versus maliciously registered domains, in: 2020 IEEE European Symposium on Security and Privacy (EuroS&P), 2020, pp. 607–623. doi:10.1109/EuroSP48549.2020.00045.
[14] T. Vissers, J. Spooren, P. Agten, D. Jumpertz, P. Janssen, M. Van Wesemael, F. Piessens, W. Joosen, L. Desmet, Exploring the ecosystem of malicious domain registrations in the .eu tld, in: M. Dacier, M. Bailey, M. Polychronakis, M. Antonakakis (Eds.), Research in Attacks, Intrusions, and Defenses, Springer International Publishing, Cham, 2017, pp. 472–493.
[15] G. Aaron, L. Chapin, D. Piscitello, C. Strutt, Phishing landscape 2021, 2021. URL: https://interisle.net/PhishingLandscape2021.pdf.
[16] Domain Name Stat, LLC, Domain name registrations, by tld, 2023. URL: https://domainnamestat.com/statistics/tld/others, accessed May 12, 2023.
[17] ICANN, Registration data access protocol timeline, 2018. URL: https://www.icann.org/resources/pages/rdap-background-2018-08-31-en, accessed May 11, 2023.
[18] IANA, Root zone database, 2023. URL: https://www.iana.org/domains/root/db, accessed May 11, 2023.
[19] S. Hollenbeck, Extensible Provisioning Protocol (EPP) Domain Name Mapping, RFC 5731, 2009. URL: https://www.rfc-editor.org/info/rfc5731. doi:10.17487/RFC5731.
[20] ICANN, RDAP Response Profile v2.1, 2019. URL: https://www.icann.org/en/system/files/files/rdap-response-profile-15feb19-en.pdf.
[21] ICANN, Temporary Specification for gTLD Registration Data, 2018. URL: https://www.icann.org/en/system/files/files/gtld-registration-data-temp-spec-17may18-en.pdf.
[22] European Commission, Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46/EC (General Data Protection Regulation), 2016.
[23] P. Kewisch, jCard: The JSON Format for vCard, RFC 7095, 2014. URL: https://www.rfc-editor.org/info/rfc7095. doi:10.17487/RFC7095.
A. Appendix
Listing 2: Shortened and anonymized example of a non-standard RDAP entity object representation
"entities": [
  {
    "objectClassName": "entity",
    "vcardArray": {
      "properties": [
        {
          "name": "FN",
          "value": {
            "stringValue": "Domain Administrator",
            "typeName": "text"
          }
        },
        {
          "name": "ADR",
          "value": {
            "components": [
              {
                "name": "street",
                "value": {
                  "values": [
                    {
                      "stringValue": "1337 Lowland Ave.",
                      "typeName": "text"
                    },
                    {
                      "stringValue": "PMB# 333",
                      "typeName": "text"
                    }
                  ],
                  "typeName": "text"
                }
              },
              {
                "name": "locality",
                "value": {
                  "values": [
                    {
                      "stringValue": "Example City",
                      "typeName": "text"
                    }
                  ],
                  "typeName": "text"
                }
              }
            ],
            "typeName": "text"
          }
        },
        {
          "name": "TEL",
          "parameters": {},
          "value": {
            "stringValue": "tel:+0.13371337",
            "typeName": "uri"
          }
        },
        {
          "name": "EMAIL",
          "value": {
            "stringValue": "example@example.com",
            "typeName": "text"
          }
        }
      ]
    },
    "roles": [
      "REGISTRANT"
    ]
  }
]
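A client that expects the jCard array layout from RFC 7095 cannot read an entity shaped like Listing 2, because `vcardArray` is an object with named `properties` rather than a nested array. The following is a minimal sketch of how such an entity could be flattened into a simple dictionary. The function names `scalar` and `flatten_entity`, the dotted key scheme (`ADR.street`), and the trimmed sample input are assumptions made for illustration; they are not taken from the paper or from any RDAP library.

```python
# Hypothetical, trimmed-down input modelled on Listing 2 (not real data).
SAMPLE_ENTITY = {
    "objectClassName": "entity",
    "vcardArray": {
        "properties": [
            {"name": "FN",
             "value": {"stringValue": "Domain Administrator", "typeName": "text"}},
            {"name": "ADR",
             "value": {"components": [
                 {"name": "street",
                  "value": {"values": [
                      {"stringValue": "1337 Lowland Ave.", "typeName": "text"},
                      {"stringValue": "PMB# 333", "typeName": "text"}],
                      "typeName": "text"}},
                 {"name": "locality",
                  "value": {"values": [
                      {"stringValue": "Example City", "typeName": "text"}],
                      "typeName": "text"}}],
                 "typeName": "text"}},
            {"name": "EMAIL",
             "value": {"stringValue": "example@example.com", "typeName": "text"}},
        ]
    },
    "roles": ["REGISTRANT"],
}


def scalar(value):
    """Return the text of a value node that is a leaf or a list of leaves."""
    if "stringValue" in value:
        return value["stringValue"]
    if "values" in value:
        return ", ".join(v["stringValue"] for v in value["values"])
    return None


def flatten_entity(entity):
    """Flatten the non-standard object layout into {property name: text}."""
    flat = {"roles": [role.lower() for role in entity.get("roles", [])]}
    vcard = entity.get("vcardArray")
    if not isinstance(vcard, dict):
        # A standard jCard is a list and needs a different parser.
        return flat
    for prop in vcard.get("properties", []):
        value = prop.get("value", {})
        if "components" in value:
            # Structured properties such as ADR carry named components.
            for comp in value["components"]:
                flat[f'{prop["name"]}.{comp["name"]}'] = scalar(comp["value"])
        else:
            flat[prop["name"]] = scalar(value)
    return flat


if __name__ == "__main__":
    print(flatten_entity(SAMPLE_ENTITY))
```

Normalizing roles to lower case is a design choice in this sketch: it lets the result be compared directly with entities from servers that use lower-case role values such as `registrant`.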
Table 3
Top 50 TLDs in the test data set and their (in some cases unofficial) RDAP support.
TLD Number of domain names Percentage of dataset RDAP Support
.com 19943 37.44 % RDAP Support
.top 3322 6.24 % RDAP Support
.xyz 2348 4.41 % RDAP Support
.net 1643 3.08 % RDAP Support
.cn 1431 2.69 % No RDAP
.org 1049 1.97 % RDAP Support
.info 1019 1.91 % RDAP Support
.ru 908 1.70 % No RDAP
.tk 844 1.58 % No RDAP
.online 748 1.40 % RDAP Support
.ml 730 1.37 % No RDAP
.br 724 1.36 % RDAP Support
.site 677 1.27 % RDAP Support
.de 603 1.13 % Unofficial RDAP Support
.ga 601 1.13 % No RDAP
.shop 567 1.06 % RDAP Support
.cf 547 1.03 % No RDAP
.pl 536 1.01 % No RDAP
.uk 512 0.96 % RDAP Support
.in 509 0.96 % No RDAP
.live 482 0.90 % RDAP Support
.gq 432 0.81 % No RDAP
.co 421 0.79 % No RDAP
.au 382 0.72 % No RDAP
.stream 363 0.68 % RDAP Support
.cc 360 0.68 % RDAP Support
.us 348 0.65 % Unofficial RDAP Support
.fr 347 0.65 % RDAP Support
.icu 338 0.63 % RDAP Support
.club 325 0.61 % RDAP Support
.cyou 303 0.57 % RDAP Support
.it 268 0.50 % No RDAP
.eu 267 0.50 % No RDAP
.id 246 0.46 % RDAP Support
.bid 239 0.45 % RDAP Support
.nl 233 0.44 % No RDAP
.buzz 221 0.41 % RDAP Support
.me 218 0.41 % No RDAP
.click 212 0.40 % RDAP Support
.space 206 0.39 % RDAP Support
.za 184 0.35 % No RDAP
.win 178 0.33 % RDAP Support
.ca 177 0.33 % RDAP Support
.io 173 0.32 % No RDAP
.store 171 0.32 % RDAP Support
.asia 164 0.31 % RDAP Support
.cloud 163 0.31 % RDAP Support
.pw 163 0.31 % RDAP Support
.biz 159 0.30 % RDAP Support
.cl 157 0.29 % No RDAP
277 other TLDs 3690 6.93 % RDAP Support
142 other TLDs 2419 4.54 % No RDAP
Total RDAP Support 42010 78.86 % RDAP Support
Total No RDAP Support 11260 21.14 % No RDAP
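One plausible way to decide whether a TLD has official RDAP support is a lookup in the IANA bootstrap registry [10], whose JSON layout is defined in RFC 9224 [9]. The sketch below assumes that approach; it is not confirmed to be how the "RDAP Support" column of Table 3 was produced. The `BOOTSTRAP` excerpt and its URLs are placeholders, `rdap_base_urls` is a hypothetical helper, and the lookup matches only the top-level label, which is a simplification of the longest-match rule in RFC 9224.

```python
# Hypothetical excerpt in the RFC 9224 bootstrap layout; URLs are placeholders.
BOOTSTRAP = {
    "version": "1.0",
    "services": [
        [["com", "net"], ["https://rdap.example.net/"]],
        [["org"], ["https://rdap.example.org/"]],
    ],
}


def rdap_base_urls(domain, bootstrap):
    """Return the RDAP base URLs registered for the domain's TLD, or []."""
    tld = domain.rstrip(".").rsplit(".", 1)[-1].lower()
    # Each service entry is a pair: [list of TLDs, list of base URLs].
    for tlds, urls in bootstrap["services"]:
        if tld in tlds:
            return urls
    return []


if __name__ == "__main__":
    for name in ("phish.example.com", "example.cn"):
        urls = rdap_base_urls(name, BOOTSTRAP)
        print(name, "->", urls if urls else "No RDAP")
```

TLDs listed as "Unofficial RDAP Support" in Table 3 would presumably not be found this way and would need a separately maintained mapping.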
Table 4
Incomplete list of public rate limits of RDAP service providers for unauthenticated RDAP clients.
RDAP host Type RDAP query rate limits Normalized query delay Source
centralnic.com Registry operator 7,200/hour 0.5s https://registrar-console.centralnic.com/pub/whois_guidance
godaddy.com Registrar 100/hour 36s https://img1.wsimg.com//Sitecore/3/B/GDR-RDAP-Access-Policy-0.2.pdf
isnic.is Registry operator 50/30min 36s https://www.isnic.is/en/rdap
nic.tatar Registry operator 30/min 2s https://domain.tatar/users/docs/WhoisTermsOfUse_en.php
nominet.uk Registry operator 1000/day and 5/s 86.4s https://media.nominet.uk/wp-content/uploads/2019/06/gTLD-Acceptable-Use-Policies-version-1.pdf
norid.no Registry operator 300/day and 10/min 288s https://teknisk.norid.no/en/integrere-mot-norid/rdap-tjenesten/
tucows.com Registrar 1/min 60s https://tucowsdomains.com/rdap/help/
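The "Normalized query delay" values in Table 4 are consistent with dividing each limit's time window by its query allowance and, where a host publishes several limits, taking the largest result. The sketch below reproduces that arithmetic under this assumption. The function name `normalized_delay` and the `(max_queries, window_seconds)` tuple format are illustrative choices, not part of the paper.

```python
HOUR = 3600
DAY = 86400


def normalized_delay(limits):
    """Seconds to wait between queries so that every limit is respected.

    limits is a list of (max_queries, window_seconds) tuples.
    """
    # Spreading queries evenly, each limit allows one query per
    # window/count seconds; the strictest limit sets the delay.
    return max(window / count for count, window in limits)


if __name__ == "__main__":
    table4 = {
        "centralnic.com": [(7200, HOUR)],
        "isnic.is": [(50, 30 * 60)],
        "nominet.uk": [(1000, DAY), (5, 1)],
        "norid.no": [(300, DAY), (10, 60)],
    }
    for host, limits in table4.items():
        print(f"{host}: {normalized_delay(limits):g}s")
```

A fixed delay of this kind is conservative: it never exceeds the published limits but also does not make use of the short bursts that some of them permit.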