Estimating Parameters of Target's Detection Methods

Martin Drašar, Jana Medková
Institute of Computer Science, Masaryk University, Brno, Czech Republic
{drasar,medkova}@ics.muni.cz

Abstract: Dictionary attacks are a prevalent phenomenon, which was lately amplified with the onset of insecure SOHO and IoT devices. Some of these devices use their own protection against dictionary attacks and some offload their security to mechanisms deployed in the infrastructure, such as flow-based IPS systems. These mechanisms often operate with the notion of a typical attacker, who represents common attacks against the infrastructure and lacks sophistication. However, the emergence of stealthy and distributed dictionary attacks indicates that detection mechanisms should take sophisticated attackers into account. In this paper we explore the capability of a sophisticated attacker, who can estimate parameters of detection methods deployed on a target and can then craft an attack which could go undetected for an arbitrarily long time. For this, we propose a new model of attacker-target interaction during dictionary attacks, which is based on the attacker's perspective. Using this model we then postulate and experimentally evaluate an algorithm for estimating parameters of the target's detection method, illustrating that an attacker requires only a handful of attack attempts to correctly guess these parameters.

I. INTRODUCTION

Despite the introduction of various authentication methods, one-factor password-based authentication is still prevalent. Although home computers and servers are usually reasonably protected by local monitoring, there exist two groups of devices which have gained notoriety for their lack of security and susceptibility to dictionary attacks: SOHO devices, such as routers, and IoT devices, as is clearly demonstrated by the existence of tools like Shodan [1]. Their widespread deployment, tendency to use default credentials and subsequent abuse in various botnets have returned the issue of dictionary attacks to the forefront. Although many of these devices are, especially in a corporate environment, monitored by IDS/IPS solutions to offset their lack of security, these solutions expect a certain class of attackers with relatively visible behavior. However, as the emergence of stealthy and distributed attacks and of advanced persistent threats revealed, there are attacks that are predominantly stealthy and which can go unnoticed for a long time [2]. In our networks we have detected such long-running low-profile attacks, which convinces us that they are commonly used and remain unchecked by many of today's systems.

In this paper we analyze a method for estimating parameters of the target's defense and for tailoring dictionary attacks which can go unnoticed with maximum efficiency. We hope that our research will motivate further research in detection of dictionary attacks and that the provided data will be used for better and more realistic evaluation of current and future detection methods.

This paper contributes to the state of the art in four ways:
• We present a simple model of interaction between an attacker and host- or network-based systems for detection of dictionary attacks. This model covers most currently available and deployed systems.
• We analyze available detection methods and derive a set of three classes of detection methods which are recognizable by an attacker and useful for parameter estimation.
• We present an algorithm for estimation of detection methods' parameters, which can be used for crafting attacks undetectable by the target's detection mechanism.
• We provide a tool for generation of datasets of detectors' behavior. This tool can be used to create training and testing datasets and also to evaluate the efficiency of different attack strategies.

This paper is divided into six sections. Section II surveys the state of the art in the area of host- and network-based detection of dictionary attacks against common protocols and derives three classes of detection mechanisms. Section III derives a model of interaction between an attacker and host- and network-based detectors based on six distinct features. Section IV presents a tool for generation of detectors' datasets and for evaluation of attack strategies. Section V introduces an algorithm for estimating parameters of detection methods, and presents its evaluation. Section VI concludes the paper and outlines further research directions.

II. STATE OF THE ART

The majority of dictionary attacks target the following protocols: HTTP(S), SSH and RDP. The measures for preventing dictionary attacks can be divided into two groups: industrial and academic.

The industrial group is represented by host-based tools such as SSHGuard [3], fail2ban [4], denyhosts [5], or sshblack [6], which, after a predefined number of login attempts is reached, block the attacker by means of firewall configuration. Despite some of them having SSH in the name, they are useful for a number of protocols susceptible to dictionary attacks, such as IMAP, POP3, FTP, or HTTP. For securing web applications, there are many options to prevent or delay an attacker, ranging from custom scripts to large commercial solutions like Sucuri firewall [7] or Wordfence [8]. There are also network-based industrial solutions, such as Flowmon ADS [9].

The academic group focuses mostly on network-based attack detection to alleviate the need for end machine management and to enable security of otherwise unsecurable devices. The SSH protocol is the most common use case. The approaches range from heuristics [10] and statistics [11], through detecting abnormal behavior using flat traffic detection [12], [13] or hidden Markov models [14], to abnormalities in DNS traffic [15]. Recently, most work has focused on correctly distinguishing attack attempts in the network from legitimate traffic, e.g., [2], which detects stealthy dictionary attacks by measuring transition points of the SSH protocol, or [16], [17], which use machine learning for traffic classification. Additionally, [14] presented an algorithm for detecting an actual compromise of an SSH service. From the point of view of a defender, these academic approaches all strive to make the detection more precise, with fewer false positives and less noise generated. From the point of view of an attacker, however, these enhancements make very little difference, unless the attacker is actively trying to be stealthy.

Despite the large number of methods and tools, detection mechanisms can be divided into three types, depending on how they would react to the attack:
• immediate detection [e.g., SSHGuard, fail2ban]: After the attacker tries a predefined threshold of attempts, they are blocked.
• delayed flow-based detection [e.g., SSHCure, Flowmon ADS]: If the attacker tries a certain number of attempts in a given measured time window, they are blocked. Such evaluation is done in batches of typically five minutes.
• rate limiting [e.g., custom delay scripts]: The attacker faces delays in allowed attempts.

These types can be combined [e.g., fail2ban on the host and a flow-based solution on the network level] and each of these groups has to be coupled with a blocking mechanism to have any impact on the attacker.

III. MODEL OF HOST- AND NETWORK-BASED DETECTION MECHANISMS

In this section we present a model of detectors based on a number of assumptions, which are discussed in the text. As the aim of this paper is an estimation of detection methods' parameters, the model is built from the perspective of an attacker who has a number of unique attacking machines at hand. It is expected that every detection method is accompanied by a mechanism for blocking an attacker, regardless of whether this block happens on the host or on the network level. Because this model is considered from the perspective of an attacker, a detection without action would be inconsequential for the attacker. It is also expected that a detector can differentiate each authentication attempt and that the attacker can discern the target's response correctly. This does not preclude modelling of detection methods which use technical means to fool automated attackers, but it leaves out the part that is largely implementation specific.

In this model, every detection method is described by the following six parameters:
• Processing mode: Either continuous or batch. The processing mode largely differentiates flow-level network-based methods from host-based ones. The flow-based methods usually process data in batches of a few minutes (processing window).
• Response delay: The delay between an attempt and any possible action. In the batch processing mode this represents the length of a processing window. In the continuous processing mode this parameter is seldom used and can represent a processing delay caused by technical means (e.g., time required to update firewalls).
• Time window: A time span in which attacks are evaluated.
• Blocking threshold: The number of unsuccessful login attempts in a time window before a block is issued.
• Inter-attempt delay: A delay between particular login attempts. The delay does not have to be a fixed number; it can, e.g., grow exponentially with each unsuccessful attempt.
• Block duration: The time required for an attacker to be able to attempt another login on one target after being blocked.

An attacker and a target work in a request-response fashion. An attacker initiates a login attempt and the target responds with one of the following:
• Success: A login attempt succeeded in trying an authentication. This does not mean that an attacker guessed the correct credentials, as this is an implementation detail depending on the attacker's dictionary and the target's authentication policy.
• Delay: A login attempt did not lead to authentication, because of being delayed.
• Block: A login attempt led to a blocking of the attacker.

Despite a target and its associated detector being two separate entities, as is the case with network-based detection, they act as one target with blocking capability.

In our model, an attacker interacts with a target only through authentication attempts and acquires all knowledge only from the target's responses. The attacker is thus able to collect the following statistics, which are later used for classification of detection methods and for estimating detection parameters:
• Attempted intensity: The number of login attempts per given timeframe which the attacker tried.
• Achieved intensity: The number of login attempts per given timeframe which the attacker managed (i.e., this number is lowered by delayed attempts).
• Successful attempt count: The number of successfully tried authentication attempts.
• Block time: Time elapsed before the attacker was blocked.

A. Limitations

The model as described has several limitations, which were considered and deemed not to considerably limit the model's descriptive power.

The model cannot fully describe detection methods based on trend analyses, similarity search, etc. However, from the attacker's point of view, the specifics of these methods can be treated as an implementation detail, because on smaller time scales, even these detection methods can be expressed by this model.

The model currently considers only a 1:1 relation between an attacker and a target. However, distributed attacks against an infrastructure with network-based detection indicate the need for an N:M relation. This will very likely require an additive change to the model, but is unlikely to considerably impact the proposed detection parameters estimation. It is, however, one of the future research directions.

The model does not consider attacks with large variations in intensity. However, given the applicability of constant/flat traffic characteristics as a basis for detection [13], [12], this limitation is not that important.

B. Examples

To give an example of an application of the model, we represent three detection methods: fail2ban [4], SSHCure [18], and a delaying web authentication script. The parameters are in Table I. Fail2ban was configured to block after 5 attempts in a day and to block for a day. SSHCure processes data in five-minute windows and blocks if there are more than 20 attempts a minute in a time window. Because we are considering only a simple attacker, who attacks with constant intensity, this translates to 100 attempts in the entire time window. The script allowed 10 attempts with exponential delays between them before blocking. Similitude between the model and real-life implementations was evaluated by comparing the results of the detection evaluation tool from Section IV and the respective detection tools.

TABLE I
MODEL PARAMETERS FOR DIFFERENT DETECTION METHODS

Method       | Processing mode | Response delay | Time window | Blocking threshold | Inter-attempt delay | Block duration
Fail2ban     | Continuous      | 0 sec.         | 86400 sec.  | 5                  | 0 sec.              | 86400 sec.
SSHCure      | Batch           | 300 sec.       | 300 sec.    | 100                | 0 sec.              | 86400 sec.
Delay script | Continuous      | 0 sec.         | 86400 sec.  | 10                 | Exponential         | 86400 sec.

IV. DETECTION EVALUATION TOOL

We have developed a tool which is able to build detection methods according to our model for evaluation. The tool is available at [19]. The tool consists of two types of objects: detectors and attackers. Detectors are created by supplying model parameters; attackers have to be implemented from scratch. However, we readily provide two types of attackers: a simple attacker, which takes a number of attempts and a time window and then performs these attempts against a selected detector, and a sophisticated attacker, which estimates detection parameters of the selected detector and attempts to maximize the successful attempt count.

The tool can be used for two tasks: generating a detection dataset for a detector and direct evaluation of an attacker and its strategy.

To generate our datasets, we ran an attacker with increasing intensity against particular detectors and gathered the statistics with attempted and achieved intensities, the successful attempt count, and the time to block. The scripts for generating these datasets are also a part of the repository.

To directly evaluate an attacker's strategy, each attacker was given a number of lives, which represented the number of proxies the attacker can use. The attacker then made attempts against a detector until it was blocked. After the block, if the attacker had some lives left, it subtracted a life, changed the intensity and attacked again. The evaluation criteria were losing as few lives as possible and maximizing the number of attempts the attacker could make in a span of fourteen days.

V. ESTIMATION OF DETECTION PARAMETERS

In order to optimize the attack, we need to know the exact type and parameters of the detector protecting the target. The estimates must be derived solely from the reactions of the detector. In this section we describe an algorithm which determines the type and parameters of the detection by modifying the intensity of the attack.

A. Detection Method Type

The estimation of the detection method type is rather simple:
• if the achieved intensity is lower than the attempted intensity, the detector exerts rate limiting;
• if the successful attempt count is independent of the intensity and low, the detector exerts immediate detection;
• otherwise the detector uses delayed flow-based detection.

B. Detection Method Parameters

We will now discuss how the parameters can be derived for each detection method type.

1) Rate Limiting: The parameters of the rate limiting method can be easily derived from the timing of the actual authentication attempts.

2) Immediate detection: The estimation of the detection parameter for an immediate detector is very simple. The only parameter is the successful attempt count, which remains constant through attacks.

3) Delayed flow-based detection: The estimation of delayed flow-based detection parameters is slightly more complicated. Three parameters of the model play a role:
• response delay: how often the detection is run (e.g., every 5 minutes),
• time window: the length of the interval over which the number of attempts is counted and compared to a detection threshold,
• blocking threshold: the number of attempts in a detection window which is considered to be an indication of an attack and a signal to block.

From the attacker's point of view, the most important quantity is the ratio between the blocking threshold and the time window (detection intensity), because it specifies the maximal intensity of an undetected attack.

The only way to estimate the detection intensity is by trial and error. The attacker attacks the target with a given intensity, waits for the detector's reaction, improves their estimate of the detection intensity and selects an intensity for the next attack until they are certain about the detection intensity. The goal is to choose the attacks' intensity so that the attacker minimizes the number of iterations needed to achieve a precise estimation of the detection intensity, as well as the number of times the attacker is detected during the iterations. We propose an algorithm that efficiently selects the attack intensity for the next iteration based on the history of the detector's reactions to previous attacks, so that both the number of required attacks and the number of detections are minimized.

Assume the detector has a time window of length $W$, a blocking threshold $R$ and a detection intensity $I = R/W$. The algorithm bounds the area containing the actual combination $(W, R)$ of time window and blocking threshold (Figure 1) and refines the bounds with each attack. The boundary consists of a lower and an upper bound on the detection intensity ($i_0$, $i_1$), an upper bound on the blocking threshold ($r_0$) and a lower bound on the time window ($w_0$). In each step, the intensity of the next attempt $i'$ is selected so that the bounded area is halved by the line $i'w$. The algorithm stops once the span between $i_0$ and $i_1$ is lower than an error level established in advance.

[Fig. 1. Area Bounds. Axes: time window $w$ (horizontal), blocking threshold $r$ (vertical); lines $i_1 w$, $i'w$ and $i_0 w$; bounds $r_0$ and $w_0$.]

The algorithm improves the bounds as follows. Assume we made an attack with a given intensity of $i$ attempts per second. If the attack was not blocked, then the detection intensity must be greater than the intensity of the attack, therefore we can update the lower bound on the detection intensity:

$$i_0^{k+1} = \max(i_0^k, i).$$

On the other hand, if the attack was blocked after $t$ seconds, following the same logic, we can update the upper bound on the detection intensity:

$$i_1^{k+1} = \min(i_1^k, i).$$

Moreover, the total number of attempts in the attack, $i \cdot t$, is definitely higher than the threshold (it can actually be much higher) and the time the detector needed to block the attack is not higher than twice the detection window. Therefore we can update the upper bound on the blocking threshold and the lower bound on the time window as follows:

$$r_0^{k+1} = \min(r_0^k, i \cdot t),$$
$$w_0^{k+1} = \max\left(w_0^k, \frac{t}{2}\right).$$

To evaluate the efficiency of our method, we compared it with the bisection method. The bisection method selects the intensity of the next attack as the average of the upper and lower bound on the detection intensity and updates the upper or lower bound when the attack is or is not blocked, respectively.

We tested both methods on 180 combinations of blocking thresholds and time windows. The thresholds ranged from 10 to 40 with a step of 2. The time windows ranged from 5 seconds to 60 seconds with a step of 5 seconds.

First, we compared the average number of times the attacker was detected by the detector. The overall results are shown in Figure 2. The average number of detections for each step is shown in Figure 3. Our method was detected fewer times during the process of estimating the detection method parameters than the bisection method. Since the attacker needs either a new proxy server or a new attacking machine each time they are blocked by the detector, our method puts less strain on resources.

[Fig. 2. Number of Detections. Detection count for the bisection and optimized methods.]

[Fig. 3. Detections. Detections per step for the optimized and bisection methods.]

Another important property is how fast a method converges to the detection intensity. Figure 4 summarizes how many attacks with different intensity the attacker needs to correctly estimate the detection intensity. Figure 5 displays the speed of the convergence to the result. For each step the average value of the span between the lower and upper bound is shown.

[Fig. 4. Number of Attempts. Attack count for the bisection and optimized methods.]

[Fig. 5. Intensity Interval Span. Span per step for the optimized and bisection methods.]

C. Combination of detection method types

If the detector employs more than one detection method type, the algorithm proceeds as follows:
• rate limiting + immediate detection: From the attacker's point of view, the immediate method is dominant. The rate limiting usually does not influence the achieved intensity for a low number of attempts and the number of attempts is limited by the immediate method. In the rare case that rate limiting is triggered by a few attempts, the rate limiting parameters can be estimated from attempt timing.
• rate limiting + delayed flow-based detection: First, the rate limiting parameters are estimated from the timing of the attempts. For the purpose of estimating the delayed flow-based detection parameters, we can use the same algorithm with a minor modification: the achieved intensity, not the attempted intensity, must be considered for limiting the area.
• immediate detection + delayed flow-based detection: Similarly to the combination of rate limiting and immediate detection, the delayed flow-based detection usually does not have any effect when combined with immediate detection, because the immediate detection is always triggered before the delayed flow-based detection.
• rate limiting + immediate detection + delayed flow-based detection: See rate limiting + immediate detection and immediate detection + delayed flow-based detection.

VI. CONCLUSION

The increase in attackers' sophistication contests the typical notion of an attacker, which is prevalent in currently deployed and researched detection methods. In this paper we take on the role of a moderately sophisticated attacker and demonstrate how such an attacker can craft dictionary attacks undetectable by the target's detection mechanisms. To achieve this, we first present a short survey of currently used detection methods and demonstrate how, from the perspective of an attacker, all methods can be classified into one of three categories which dictate an attack strategy. We then present a model of attacker-target interaction during an attack, which is then used to derive an algorithm for precise estimation of parameters of the target's detection mechanisms. With this information at hand, crafting an undetectable attack against an arbitrary detection method is easy. To test the capabilities of detection methods and to evaluate different attack strategies, we provide open access to a simulation tool which builds upon the presented model.

A. Future work

Despite the presented model's ability to capture current detection methods, it is still quite simple and not suitable for modeling more complicated attack scenarios and strategies. These include multiple attackers and multiple targets, and attacks with variable timing. In our future work, we want to provide such an extension to this model and to evaluate it against deployed detection methods.

REFERENCES

[1] J. Matherly, “Shodan.”
[2] A. Satoh, Y. Nakamura, and T. Ikenaga, “A flow-based detection method for stealthy dictionary attacks against secure shell,” Journal of Information Security and Applications, vol. 21, pp. 31–41, 2015.
[3] “Sshguard.” https://www.sshguard.net/. Accessed: 2017-07-01.
[4] “Fail2ban.” https://www.fail2ban.org/. Accessed: 2017-07-01.
[5] “denyhosts.” https://github.com/denyhosts/denyhosts. Accessed: 2017-07-01.
[6] “sshblack.” http://www.pettingers.org/code/sshblack.html. Accessed: 2017-07-01.
[7] “Sucuri website application firewall.” https://sucuri.net/website-firewall/. Accessed: 2017-07-01.
[8] “Wordfence.” https://www.wordfence.com/. Accessed: 2017-07-01.
[9] “Flowmon ads.” https://www.flowmon.com/en/products/flowmon/anomaly-detection-system. Accessed: 2017-07-01.
[10] J. Vykopal, A Flow-Level Taxonomy and Prevalence of Brute Force Attacks, pp. 666–675. Berlin, Heidelberg: Springer Berlin Heidelberg, 2011.
[11] C. Callegari, S. Giordano, and M. Pagano, “New statistical approaches for anomaly detection,” Security and Communication Networks, vol. 2, no. 6, pp. 611–634, 2009.
[12] M. Drašar, “Protocol-independent detection of dictionary attacks,” in Meeting of the European Network of Universities and Companies in Information and Communication Engineering, pp. 304–309, Springer, Berlin, Heidelberg, 2013.
[13] M. Jonker, R. Hofstede, A. Sperotto, and A. Pras, “Unveiling flat traffic on the internet: An ssh attack case study,” in 2015 IFIP/IEEE International Symposium on Integrated Network Management (IM), pp. 270–278, May 2015.
[14] R. Hofstede, L. Hendriks, A. Sperotto, and A. Pras, “Ssh compromise detection using netflow/ipfix,” SIGCOMM Comput. Commun. Rev., vol. 44, pp. 20–26, Oct. 2014.
[15] K. Takemori, D. Romaa, S. Kubota, K. Sugitani, and Y. Musashi, “Detection of ns resource record dns resolution traffic, host search, and ssh dictionary attack activities,” International Journal of Intelligent Engineering and Systems, vol. 2, pp. 35–42, 12 2009.
[16] G. K. Sadasivam, C. Hota, and B. Anand, “Classification of ssh attacks using machine learning algorithms,” in 2016 6th International Conference on IT Convergence and Security (ICITCS), pp. 1–6, Sept 2016.
[17] M. M. Najafabadi, T. M. Khoshgoftaar, C. Kemp, N. Seliya, and R. Zuech, “Machine learning for detecting brute force attacks at the network level,” in 2014 IEEE International Conference on Bioinformatics and Bioengineering, pp. 379–385, Nov 2014.
[18] L. Hellemons, L. Hendriks, R. Hofstede, A. Sperotto, R. Sadre, and A. Pras, SSHCure: A Flow-Based SSH Intrusion Detection System, pp. 86–97. Berlin, Heidelberg: Springer Berlin Heidelberg, 2012.
[19] “Tool for modeling attacker-target interaction.” https://github.com/CSIRT-MU/Interactor3000. Accessed: 2017-07-01.
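
As an illustration of the type-estimation rules in Section V-A, the following Python sketch classifies a detector from the statistics an attacker collects. The names `Observation` and `classify`, the `low_count` cut-off for what counts as a "low" successful attempt count, and the numeric tolerance `tol` are assumptions made for this example only; they are not taken from the tool in [19], and the paper does not quantify "low".

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Observation:
    """Statistics gathered by the attacker during one attack (hypothetical type)."""
    attempted_intensity: float  # attempts per second the attacker tried
    achieved_intensity: float   # attempts per second the attacker managed
    successful_attempts: int    # attempts tried before being blocked


def classify(observations: List[Observation],
             low_count: int = 20,     # assumed cut-off for a "low" count
             tol: float = 1e-9) -> str:
    """Apply the three rules of Section V-A in order."""
    # Rule 1: achieved intensity lower than attempted -> rate limiting.
    if any(o.achieved_intensity < o.attempted_intensity - tol
           for o in observations):
        return "rate limiting"
    # Rule 2: the successful attempt count does not change with the
    # intensity and is low -> immediate detection. With a single
    # observation the independence check is trivially true.
    counts = {o.successful_attempts for o in observations}
    if len(counts) == 1 and max(counts) <= low_count:
        return "immediate detection"
    # Rule 3: everything else.
    return "delayed flow-based detection"
```

For example, two attacks at different intensities that are both blocked after five attempts would be classified as immediate detection, whereas attempt counts that grow with the intensity point to delayed flow-based detection.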
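
The bound updates of Section V-B can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation from [19]: the next intensity is chosen by plain bisection (the baseline method of Section V-B) rather than by the area-halving rule of the proposed algorithm, and `make_detector` is a hypothetical, idealized delayed flow-based detector that blocks after exactly one time window whenever the number of attempts in that window exceeds the threshold. All names (`Bounds`, `update`, `estimate`, `make_detector`) are illustrative.

```python
import math
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass(frozen=True)
class Bounds:
    i0: float                # lower bound on the detection intensity
    i1: float                # upper bound on the detection intensity
    r0: float = math.inf     # upper bound on the blocking threshold
    w0: float = 0.0          # lower bound on the time window


def update(b: Bounds, i: float, t: Optional[float]) -> Bounds:
    """Refine the bounds after an attack with intensity i.

    t is the time to block in seconds, or None if the attack was not blocked.
    """
    if t is None:
        # Not blocked: the detection intensity is at least i.
        return Bounds(max(b.i0, i), b.i1, b.r0, b.w0)
    # Blocked after t seconds: i bounds the intensity from above, i * t
    # bounds the threshold from above, t / 2 bounds the window from below.
    return Bounds(b.i0, min(b.i1, i), min(b.r0, i * t), max(b.w0, t / 2))


def estimate(attack: Callable[[float], Optional[float]],
             i0: float, i1: float, eps: float) -> Tuple[Bounds, int, int]:
    """Bisection baseline: attack at the midpoint until i1 - i0 < eps."""
    b = Bounds(i0, i1)
    attacks = detections = 0
    while b.i1 - b.i0 >= eps:
        i = (b.i0 + b.i1) / 2
        t = attack(i)
        attacks += 1
        detections += t is not None
        b = update(b, i, t)
    return b, attacks, detections


def make_detector(window: float, threshold: float) -> Callable[[float], Optional[float]]:
    """Idealized detector (assumption): blocks after one full window."""
    def attack(i: float) -> Optional[float]:
        return window if i * window > threshold else None
    return attack
```

With a simulated detector using a 30-second window and a threshold of 20 attempts, `estimate(make_detector(30.0, 20.0), 0.0, 4.0, 0.01)` narrows the detection intensity to an interval around 20/30 attempts per second. Swapping the midpoint selection for a rule that halves the bounded area in Figure 1 would turn this baseline into the optimized variant the paper evaluates.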