Integrating Collaborative Cognitive Assistants into Cybersecurity Operations Centers

Steven Meckl, Gheorghe Tecuci, Dorin Marcu, Mihai Boicu
Learning Agents Center, Volgenau School of Engineering, George Mason University, Fairfax, VA 22030, USA
smeckl@masonlive.gmu.edu, tecuci@gmu.edu, dmarcu@gmu.edu, mboicu@gmu.edu

Abstract

This paper presents current work in integrating cognitive agents, trained to detect advanced persistent threats (APTs), into cybersecurity operations centers (CSOCs). After introducing APTs, the cognitive APT detection model, and the training of the agents, it overviews how the collection agents required to gather evidence for cognitive agents are selected, how abductive triggers are generated using collected data and threat intelligence, how the Collection Manager software is used to integrate cognitive agents with selected collection agents, and how the results of searches are added to the knowledge base as evidence. These concepts are illustrated with an example of how the system uses these components to search for evidence required to detect an APT attack.

Copyright © 2018 by the paper's authors. Copying permitted for private and academic purposes. In: Joseph Collins, Prithviraj Dasgupta, Ranjeev Mittu (eds.): Proceedings of the AAAI Fall 2018 Symposium on Adversary-Aware Learning Techniques and Trends in Cybersecurity, Arlington, VA, USA, 18-19 October, 2018, published at http://ceur-ws.org

Introduction

The science and art of intrusion detection and prevention has evolved over the last decade, largely due to the shift from cyber vandals and pranksters to multi-billion-dollar criminal enterprises and state-sponsored APT intrusion methodology (Zimmerman 2014). What was once the realm of criminals with a small collection of easily discovered automated tools is now ruled by well-funded and highly sophisticated sets of hackers carefully orchestrating intrusions as a means to advance their criminal enterprise or intelligence collection mission.

State-sponsored intrusion sets such as the People's Republic of China's (PRC) APT1 have demonstrated that organization, funding, and lack of consequences can be more effective than the use of sophisticated intrusion tools (Mandiant 2013). An APT is an adversary that leverages superior resources, knowledge, and tactics to achieve its goals through computer network exploitation, and it is notorious for its persistence and ability to adapt to the efforts of network defenders in order to gain persistent access to victim networks (Mandiant 2013). APT groups have become a major area of security research over the past several years as threat intelligence companies began tracking their specific tools, techniques, and procedures (TTPs), attributing those TTPs to threat actors, and publishing reports on the groups. FireEye/Mandiant has published reports on 30 APT groups since 2013, naming them simply APT1 through APT30 (FireEye 2015). APT1, the most infamous of the APT groups, was attributed to Unit 61398 of China's People's Liberation Army (Mandiant 2013).

The responsibility of a CSOC's analysts is to monitor alerts and log information from available sources, each having differing levels of credibility, and use them to make decisions about the presence or absence of intrusion activity. However, modern detection technologies are error-prone, because log information can be ambiguous. As a result, each alert must be carefully examined and investigated by a human analyst (Zimmerman 2014). In a large enterprise, thousands of alerts can be reported daily, and most organizations report they are able to investigate fewer than 50 in a typical work week (Ponemon Institute 2017), leaving most alerts uninvestigated and increasing risk to the enterprise. Improving the efficiency and accuracy of analysis and threat intelligence tasks could drastically lower the cost of running a CSOC and reduce risk to the organization.

One approach to increasing CSOC efficiency is to employ collections of agile cognitive assistants, able to capture and automatically apply the expertise of cybersecurity analysts, to automate detection of APTs and other sophisticated threats (Meckl et al. 2017). These cognitive agents use abductive, deductive, and inductive reasoning to learn from cybersecurity experts how to autonomously respond to CSOC alerts, capture digital evidence related to the alert, and analyze it to detect threats.

A significant architectural challenge to this approach is the integration of the agents into real-world CSOCs, which have a wide variety of security infrastructure. There are thousands of commercial and open source security products available for CSOC managers to choose from. For this approach to be successful, cognitive agents must be able to seamlessly interact with the security sensors in place in the CSOC with minimal re-architecture of the system. This paper builds on previous research to present current results on automating CSOCs with agile cognitive assistants. Specifically, we focus on how our agents interact with real-world security infrastructure to detect attacks.

We start with an overview of our approach to teaching a learning agent shell how to detect APTs. Then, we discuss how the search/collection agents required to gather evidence for cognitive agents are selected, how abductive triggers are generated using collected data and threat intelligence, how the Collection Manager software is used to integrate cognitive agents with selected search/collection agents, and how their results are added to the knowledge base as evidence.
APT Detection: Teaching and Learning

For many years we have researched the Disciple theory, methodology, and tools for the development of knowledge-based cognitive assistants that: (1) learn complex problem-solving expertise directly from subject matter experts; (2) support experts and non-experts in problem solving; and (3) teach their problem-solving expertise to students (Tecuci 1988; 1998; Boicu et al. 2000; Tecuci et al. 2005; 2016a).

In the following we summarize how the Disciple-EBR learning agent shell is taught to detect APT intrusions. In essence, an expert cybersecurity analyst teaches Disciple-EBR in a way that is similar to how the expert would teach a student or an apprentice, by explaining APT detection examples to it. The agent learns general rules from these examples and applies them to new situations. Then the expert critiques the attempted detections by the agent, and the agent improves its learned rules and its ontology, to more accurately simulate the detection activity of the expert. Fig.1 is an abstract illustration of this process.

First, the expert specifies an event of interest, or trigger (T1 in Fig.1), that should alert the agent of a potential intrusion, for example, an alert generated by the BRO IDS (Paxson 1999) in the case of an intrusion by the Auriga malware of APT1. The problem is that such an alert may also be generated by BRO in cases when there is no intrusion, called false positives. Thus, once such an event is generated by BRO in a real situation, the expert's question is: What hypotheses would explain it?

Therefore, the next step in the teaching process is to specify the abductive steps of generating alternative hypotheses that may explain this alert, as abstractly shown in the left part of Fig.1. Some of these hypotheses are APT1 intrusion hypotheses (e.g., H1), but others are false positive hypotheses (e.g., Hq). From these abductive reasoning steps, the agent will learn trigger, indicator, and question rules. These rules will enable the agent to automatically generate alternative hypotheses that could explain similar triggers, as will be discussed later and illustrated in Fig.6.

Once the hypotheses that can explain the trigger are generated, the expert would need to assess which of them is most likely, and thus to determine whether there is an intrusion or not. For this, however, the expert would need more evidence. The expert will put each hypothesis to work to guide him/her in the process of collecting this evidence, as abstractly illustrated in the right-hand side of Fig.1. In particular, the expert will decompose H1 into simpler and simpler hypotheses, down to the level of hypotheses that show very clearly what evidence is needed to prove them. For example, H1 would be true if its sub-hypotheses H1,1 and … and H1,n were all true. Then, to determine whether H1,n is true, one would need to invoke a search procedure S1,n that may return evidence E1,n, if present. If E1,n is found, its credibility and relevance determine the probability of H1,n (Tecuci et al. 2016b). Once the probabilities of the sub-hypotheses are assessed, the probabilities of the upper-level hypotheses are determined, and one may conclude whether there is an intrusion or not.
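To make this assessment step concrete, the following minimal Python sketch represents a decomposition tree whose leaves invoke search procedures and whose probabilities are assessed bottom-up. The tree representation, the use of the weaker of credibility and relevance at a leaf, and the AND-style combination of sub-hypothesis probabilities are simplifying assumptions made only for illustration; they are not the actual Disciple-EBR inference mechanism (Tecuci et al. 2016b).

```python
# Minimal sketch of hypothesis-driven evidence assessment (illustrative only).
# The min-based combination rules below are assumptions, not the Disciple-EBR algorithm.

from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Evidence:
    description: str
    credibility: float   # 0.0 .. 1.0
    relevance: float     # 0.0 .. 1.0


@dataclass
class Hypothesis:
    name: str
    sub_hypotheses: List["Hypothesis"] = field(default_factory=list)
    search: Optional[Callable[[], Optional[Evidence]]] = None  # e.g. S1,n

    def probability(self) -> float:
        """Assess this hypothesis bottom-up."""
        if self.search is not None:                 # leaf: look for evidence E1,n
            evidence = self.search()
            if evidence is None:
                return 0.0
            # Assumed rule: a leaf is no better supported than the weaker of
            # the evidence's credibility and relevance.
            return min(evidence.credibility, evidence.relevance)
        # Assumed AND-combination: H1 holds only if all sub-hypotheses hold.
        return min(h.probability() for h in self.sub_hypotheses)


# Toy usage: H1 = "APT1 intrusion", with one leaf backed by a search procedure.
h1 = Hypothesis(
    name="H1: connection is part of an APT1 intrusion",
    sub_hypotheses=[
        Hypothesis(
            name="H1,n: domain was active at connection time",
            search=lambda: Evidence("passive DNS shows an active mapping", 0.9, 1.0),
        ),
    ],
)
print(h1.name, "->", h1.probability())
```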
From a decomposition tree like that in Fig.1, the agent will learn both hypothesis analysis rules and collection rules. These rules will enable the agent to automatically decompose similar hypotheses and search for evidence, as will be discussed later and illustrated in Fig.7.

Each of the rules mentioned above is initially partially learned as an ontology-based generalization of one example and its explanation. The rules are then used in reasoning to discover additional positive and negative examples and are further incrementally refined based on these new examples and their explanations. The Disciple approach is based on methods for integrating machine learning with knowledge acquisition (Tecuci and Kodratoff 1995), and on multi-strategy learning (Tecuci 1988; 1993; Tecuci et al. 2016a).

Figure 1: Agent teaching and learning.

Agent teaching and learning is a continuous process resulting in the customization of Disciple-EBR into an agent that not only has reasoning modules for all the phases of the APT1 intrusion detection process, but also a knowledge base (KB) with a developed ontology and reasoning rules for all these phases. This agent is used to generate several autonomous agents, each specialized for a specific phase of APT intrusion detection in a CSOC, as discussed next.

Cognitive Assistants for APT Detection

Fig.2 shows an overview of the architecture of the Cognitive Assistants for APT Detection (CAAPT).

The Trigger Agent receives alerts from a variety of sources, such as network IDSs or endpoint protection systems, uses a matching trigger rule to represent the alert in ontological form in a KB, and places that KB into the hypothesis generation queue, from which KBs are extracted by the Hypothesis Generation Agent.

The Hypothesis Generation Agent uses indicator and question rules to generate alternative hypotheses that could explain the trigger and places the KB into the hypothesis analysis queue, from which KBs are extracted by the Automatic Analysis Agents.

The Automatic Analysis Agents use hypothesis analysis rules to decompose the hypotheses from such a KB, as much as possible, down to the level of evidence collection requests. Then they place the KB into the evidence collection queue, from where each KB is extracted by the Collection Manager.

The Collection Manager uses collection rules to generate search requests and invokes specialized collection agents to search for evidence on the network in response to these requests. Then it uses matching collection rules to represent the retrieved evidence in the corresponding KB and places the KB back into the hypothesis analysis queue for further analysis (if needed), until the hypothesis with the highest probability is determined.

Figure 2: CAAPT architecture overview.
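The hand-off between these agents can be pictured as a set of workers connected by queues. The sketch below is an illustrative approximation of that flow; the function names, queue names, toy KB structure, and threading model are assumptions and do not reflect the actual CAAPT implementation.

```python
# Illustrative sketch of the queue-based hand-off between CAAPT agents.
# Names, the toy KB dictionaries, and the threading model are assumptions.

import queue
import threading

hypothesis_generation_queue = queue.Queue()
hypothesis_analysis_queue = queue.Queue()
evidence_collection_queue = queue.Queue()


def trigger_agent(alert: dict) -> None:
    """Represent an alert as a (toy) KB and queue it for hypothesis generation."""
    kb = {"trigger": alert, "hypotheses": []}
    hypothesis_generation_queue.put(kb)


def hypothesis_generation_worker() -> None:
    while True:
        kb = hypothesis_generation_queue.get()
        # Indicator and question rules would run here to add competing hypotheses.
        kb["hypotheses"].append("APT1 intrusion")
        kb["hypotheses"].append("false positive: trusted application")
        hypothesis_analysis_queue.put(kb)
        hypothesis_generation_queue.task_done()


def automatic_analysis_worker() -> None:
    while True:
        kb = hypothesis_analysis_queue.get()
        # Hypothesis analysis rules would decompose hypotheses down to
        # evidence collection requests before queuing for the Collection Manager.
        kb["collection_requests"] = ["GetDomainIPResolution"]
        evidence_collection_queue.put(kb)
        hypothesis_analysis_queue.task_done()


threading.Thread(target=hypothesis_generation_worker, daemon=True).start()
threading.Thread(target=automatic_analysis_worker, daemon=True).start()

trigger_agent({"source": "BRO", "domain": "a-jsm.infobusinessus.org"})
hypothesis_generation_queue.join()
hypothesis_analysis_queue.join()
print(evidence_collection_queue.get())
```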
Selection and Integration of Collection Agents

Abstract searches requested by the analysis agents require evidence from multiple types of data sources available on a typical network. There are hundreds of security appliance, log source, and data store combinations in real-world networks. Therefore, a comprehensive set of corresponding collection agents is required. Industry research has determined that the most critical security technologies are a Security Incident/Event Management (SIEM) system, a network detection/collection solution, and a host detection and query solution (Chuvakin 2018), so our selection of agents focused on those areas. We have chosen to use those critical technologies, as well as others as needed, broken down into the following categories from the ontology in Fig.3:

- Passive network monitors, which are responsible for passively watching data as it moves across the network and either recording it in full or recording metadata in the form of logs.
- Passive host monitors, which watch operating system and application activity on a host computer and record metadata as logs.
- On-demand host agents, which allow for collection and analysis of raw forensic artifacts, including disk and memory data, from workstations and servers. They can also be used to retrieve log data generated by passive host monitors in an on-demand fashion.

Figure 3: Ontology of search and collection agents.

For CAAPT, collection agents were chosen based primarily on their ability to collect and query data required for detecting sophisticated attacks. Based on the requirements for modeling detection for APT1 malware, we chose a collection of agents for netflow (network connection) data, full packet capture, DNS logs, volatile memory artifacts, Windows Registry keys and values, file-based artifacts, endpoint logs, domain controller logs, EDR logs, and passive DNS data. Next, free or open source solutions were prioritized to eliminate cost as a barrier to adoption. Lastly, we chose tools supporting a RESTful API (MuleSoft 2016) for uniformity of integration.
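As a rough summary of this selection, the mapping below pairs each required data type with the category of collection agent that serves it. The labels are informal shorthands rather than the actual terms of the ontology in Fig.3, and the grouping is an approximation of the selection described above.

```python
# Informal summary of which category of collection agent serves each data type.
# Labels are shorthands; the actual ontology in Fig.3 is richer than this mapping.
collection_agent_categories = {
    "netflow (network connection) data": "passive network monitor",
    "full packet capture":               "passive network monitor",
    "DNS logs":                          "passive network monitor",
    "endpoint logs":                     "passive host monitor",
    "domain controller logs":            "passive host monitor",
    "EDR logs":                          "passive host monitor",
    "volatile memory artifacts":         "on-demand host agent",
    "Windows Registry keys and values":  "on-demand host agent",
    "file-based artifacts":              "on-demand host agent",
    "passive DNS data":                  "external passive DNS service (see below)",
}

def agent_category_for(data_type: str) -> str:
    """Return the collection agent category responsible for a given data type."""
    return collection_agent_categories.get(data_type, "no agent configured")

print(agent_category_for("volatile memory artifacts"))
```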
GRR is used as our sole on-demand host agent. It consists of an agent, which must be run on each host computer on the network, and a server responsible for managing search requests. GRR's collection functions are managed using its RESTful API.

The output of the passive collection agents is log data, which must be stored in a SIEM based on tools such as Elasticsearch or Splunk. For CAAPT, we use Elasticsearch as the SIEM. As shown in Fig.4, all passive collectors used by the system send log data, in the form of JSON documents, to Elasticsearch for storage and indexing.

For collection agents having a non-JSON log format (such as SYSMON and BRO), Elasticsearch Beats (Beats 2017) are used to convert the logs to JSON before sending them to Elasticsearch. This includes Filebeat for collecting BRO logs, Winlogbeat for collecting SYSMON and Windows system logs, and Packetbeat for collecting raw network data.

Figure 4: Passive collector integration architecture.

VirusTotal provides a free passive DNS database which is used for historical domain/IP resolution queries, and Rekall (2017) is used by GRR to retrieve raw memory artifacts from hosts.

Generating Abductive Triggers from Threat Intelligence

The cognitive assistants in CAAPT respond to and investigate security alerts generated by a CSOC's security infrastructure, where an analyst is typically required to conduct follow-on analysis to determine whether a threat was accurately identified. The first step in this process is to use available detection technologies to identify potential threats based on threat intelligence and use the resulting security alerts to trigger the abductive reasoning process. This section describes the process CAAPT uses to generate abductive triggers from threat intelligence.

At their core, security alerts are created by applying threat intelligence to data collected by security sensors, which include endpoint security such as anti-virus software, or network-based firewalls and intrusion detection systems. Threat intelligence data is distributed in the form of indicators of compromise (IOCs), which include file hashes, anti-virus signatures, and the fully-qualified domain names (FQDNs) or IP addresses of known malicious servers. Security alerts come in a variety of formats but are normalized and sent to the SIEM.

Fig.5 shows an overview of the process by which a BRO alert log entry becomes an alert message sent to the CAAPT Trigger Agent. BRO was chosen as the IDS because of its ability to easily consume network threat intelligence and efficiently apply it to identify threats; it also generates logs of network metadata for use in follow-on analysis.

The first step in the process is to convert the logs from CSV format to a JSON message and transport the log entry to our Elasticsearch database. This is done using Filebeat. Next, a process is required to search Elasticsearch for new alerts and send them to the Trigger Agent. We developed a custom Windows program called the Alert Generation Service to perform this task. This service polls Elasticsearch on a specified interval, looking for alert log entries generated by BRO. A new Trigger Agent message is created for each new alert, using relevant information from the BRO alert, and sent to the Trigger Agent to start the analytic process.

In the example in Fig.5, an alert was generated by BRO because a computer it was monitoring made a DNS request to resolve a domain known, via threat intelligence, to be associated with APT1. The message sent to Elasticsearch contains several extraneous data elements added by Filebeat. For the purposes of threat detection, the system is primarily concerned with the information contained in the message data element. The Alert Generation Service parses that field, converting the relevant data fields into the data elements required for the JSON message on the right.

Figure 5: How security alerts become abductive triggers.
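A minimal sketch of such a polling loop is given below, using Elasticsearch's REST search endpoint directly. The index name, query fields, assumed column layout of the BRO message field, and the Trigger Agent endpoint are illustrative assumptions; the real Alert Generation Service is a custom Windows program, not this script.

```python
# Minimal sketch of an Alert Generation Service polling loop (illustrative only).
# Index name, field names, message column layout, and URLs are assumptions.

import time
import requests

ELASTICSEARCH_SEARCH_URL = "http://localhost:9200/bro-alerts/_search"  # assumed index
TRIGGER_AGENT_URL = "http://localhost:8080/trigger"                    # assumed endpoint
POLL_INTERVAL_SECONDS = 30

last_seen_timestamp = "1970-01-01T00:00:00Z"

def poll_once(since: str) -> list:
    """Return BRO alert documents newer than the given timestamp."""
    query = {
        "query": {"range": {"@timestamp": {"gt": since}}},
        "sort": [{"@timestamp": "asc"}],
        "size": 100,
    }
    response = requests.post(ELASTICSEARCH_SEARCH_URL, json=query, timeout=10)
    response.raise_for_status()
    return [hit["_source"] for hit in response.json()["hits"]["hits"]]

def send_trigger(alert_doc: dict) -> None:
    """Parse the BRO log line in the 'message' field and build a trigger message."""
    # The relevant data is in the tab-separated 'message' element added by Filebeat.
    # The column layout below (timestamp, source IP/port, destination IP/port,
    # queried domain) is an assumption for illustration, not BRO's actual schema.
    fields = alert_doc["message"].split("\t")
    message = {
        "timeStamp": fields[0],
        "sourceIP": fields[1],
        "sourcePort": fields[2],
        "destinationIP": fields[3],
        "destinationPort": fields[4],
        "domainName": fields[5],   # e.g. a-jsm.infobusinessus.org
    }
    requests.post(TRIGGER_AGENT_URL, json=message, timeout=10)

while True:
    for alert in poll_once(last_seen_timestamp):
        send_trigger(alert)
        last_seen_timestamp = alert["@timestamp"]
    time.sleep(POLL_INTERVAL_SECONDS)
```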
When the message is received by the Trigger Agent, it is added to the knowledge base in the form of an ontology fragment, as shown on the far right of Fig.5. Each field in the JSON message is ingested as an instance of a fact in the ontology. In this example, knowledge of the connection (source and destination IPs and ports) is captured directly from the BRO alert. The Trigger Agent can further specify, based on a learned rule, that the destination port is for DNS because it uses the standard port 53. The domain a-jsm.infobusinessus.org is associated with the connection. Because the knowledge base includes modeled knowledge of adversary TTPs, the domain's association with APT1 is automatically recognized. The alert information is added to the knowledge base using the ontology actions associated with the learned trigger rule that matched the BRO alert.

When the process described in Fig.5 is complete, the Trigger Agent places a new knowledge base to be used for further analysis into the hypothesis generation queue. The Hypothesis Generation Agent then uses the new knowledge base for the abductive reasoning process, using learned rules to create a set of multiple competing hypotheses which could explain why the alert was generated. First, an indicator rule is matched, which generates from the suspicious connection the hypothesis that there is an active APT1 intrusion on the network. Then a question rule is matched, to generate a question which the previously generated hypothesis could answer. From the question rule, multiple competing plausible hypotheses are generated, which also could answer the question. Generation of the set of multiple competing hypotheses is called abductive reasoning and completes the first phase of the theoretical model of APT detection.

Fig.6 shows an example of the abductive reasoning process. Using an indicator rule, the agent generates the hypothesis that the connection is part of an APT1 intrusion. However, there are multiple hypotheses which could explain the connection, including both those generated based on modeled adversary knowledge and those that would indicate a false positive. In this example, we offer two plausible false positive hypotheses. The first is that the connection was generated as part of security intelligence gathering; security operations or research personnel often accidentally trigger security alerts while performing their duties. The second hypothesis is that a trusted application made the connection; security tools often perform DNS lookups as part of their data enrichment features, and unless the IDS is configured to exclude applications performing this type of activity, false positive alerts can be generated. It should be noted that both of these types of false positives were accidentally triggered during the course of this research.

Figure 6: Hypotheses generation from abductive trigger.
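The sketch below illustrates the shape of this abductive step, with an indicator rule producing the intrusion hypothesis and a question rule producing the question and the competing explanations of Fig.6. The rule representation is a deliberately simplified stand-in, not the learned trigger, indicator, and question rules actually used by the Hypothesis Generation Agent.

```python
# Simplified illustration of abductive hypothesis generation from a trigger.
# The rule representation is a toy, not the learned rules used by CAAPT.

def indicator_rule(trigger: dict) -> str:
    """Generate the hypothesis suggested by the suspicious connection."""
    return (f"There is an active APT1 intrusion involving "
            f"{trigger['domainName']}")

def question_rule(trigger: dict) -> tuple:
    """Generate the question the hypothesis answers, plus competing answers."""
    question = f"Why did a host make a DNS request for {trigger['domainName']}?"
    competing_hypotheses = [
        indicator_rule(trigger),
        "The connection was made as part of security intelligence gathering",
        "A trusted application made the connection (e.g., DNS data enrichment)",
    ]
    return question, competing_hypotheses

trigger = {"domainName": "a-jsm.infobusinessus.org", "destinationPort": 53}
question, hypotheses = question_rule(trigger)
print(question)
for h in hypotheses:
    print(" -", h)
```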
Hypothesis-driven Search for Evidence

Once a set of hypotheses is generated, the next phase is the deductive reasoning process, where each top-level hypothesis is decomposed into one or more sub-hypotheses. The process, executed by an Automatic Analysis Agent, continues until a set of leaf hypotheses is generated, each requiring one or more searches for evidence. This overall process is called hypothesis-driven search.

Fig.7 shows an example of the initial hypothesis decomposition tree for the detection of an APT1 intrusion, based on modeled adversary knowledge. At the top level, we decompose the hypothesis stating that the network connection causing the BRO alert is the result of an APT1 intrusion into two sub-hypotheses. The sub-hypothesis on the left states that the connection involves an active C2 server. This hypothesis is further broken down into its two components: the domain a-jsm.infobusinessus.org was active at the time of the connection, and it was registered using a dynamic DNS provider. These are typically true when there is an active APT1 attack. The sub-hypothesis on the right states that the program used in the attack is APT1 malware.

The leaf nodes of the decomposition tree result in three different searches for evidence. All three searches will eventually lead to evidence being added to the KB for this security alert investigation. The search for the program that made the network connection will result in that branch of the decomposition tree being further decomposed, asking more detailed questions about the behavior of the malware.

Figure 7: Hypothesis-driven search for evidence.

Using Search Results as Evidence

The abstract searches from Fig.7 must be turned into concrete searches for real evidence on the network. The Collection Manager is responsible for that process. Let's consider, for example, the left-most search from Fig.7, which is looking for the IP address to which a domain was mapped at a specific point in time. Using the JSON request template associated with the collection rule that generated that leaf, the JSON-formatted search message at the top center of Fig.8 is created and sent to the Collection Manager.
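The construction of that concrete request can be sketched as follows. The requestID and timeStamp fields are taken from the description of Fig.8 later in this section; the remaining field names, the request template, and the Collection Manager endpoint are illustrative assumptions rather than the actual message format.

```python
# Illustrative construction of a concrete search request from an abstract leaf
# search (cf. Fig.8). Field names beyond requestID/timeStamp and the URL are assumed.

import uuid
import requests

COLLECTION_MANAGER_URL = "http://localhost:9000/search"   # assumed endpoint

def build_domain_ip_request(domain: str, time_stamp: str) -> dict:
    """Fill the JSON request template for the left-most search in Fig.7."""
    return {
        "requestID": str(uuid.uuid4()),           # used to match the response later
        "searchFunction": "GetDomainIPResolution",
        "domainName": domain,
        "timeStamp": time_stamp,
    }

request_message = build_domain_ip_request(
    "a-jsm.infobusinessus.org", "12/23/2017 12:18:07 PM")
receipt = requests.post(COLLECTION_MANAGER_URL, json=request_message, timeout=10)
print(receipt.status_code, request_message["requestID"])
```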
When the Collection Manager receives this message, it will call the function GetDomainIPResolution, which is one of the programmed search functions supported by the Collection Manager, mapping data elements from the search into function parameters. GetDomainIPResolution uses a passive DNS database, such as the one maintained by VirusTotal, to determine the IP address mapped to the APT1 domain a-jsm.infobusinessus.org at the time specified in the timeStamp field of the JSON request.

Modeling Search Results as Evidence

When the Collection Manager completes the GetDomainIPResolution search, it will respond to the calling agent with the JSON-formatted response from the upper right part of Fig.8.

The response message contains the input parameters from the search request, a requestID used by the Collection Manager and the calling agent to map response messages to the corresponding request messages, and the output parameters. In this example, it was determined, based on passive DNS data, that the domain a-jsm.infobusinessus.org was mapped to the IP address 69.195.129.72 at time 12/23/2017 12:18:07 PM, because that IP address was assigned to the domain at the time in mappingStartTime (01/15/2017 11:11 GMT) and it is still assigned (mappingEndTime is set to "present").

The response message also includes a human-readable description of what the found evidence means (evidenceDescription), and an evidenceCredibility value, which is on a linear probability scale ranging from L0 (Lack of Support) to L11 (Certain). In this case, the Collection Manager is certain the returned evidence is credible.

The response is matched with the JSON response template associated with the corresponding collection rule, and the ontology actions associated with the same rule are used to generate the ontology fragments corresponding to the data elements in the response message and store them in the knowledge base as evidence, as shown in Fig.8.

This evidence can now be used by the automatic analysis agents to decide on the likelihood of an actual APT1 attack being present on the network, or to request additional evidence, if needed.

Figure 8: Automatic evidence collection and use.
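The response handling can be sketched as below: a passive DNS lookup result is packaged together with the echoed input parameters, the requestID, an evidenceDescription, and an evidenceCredibility value on the L0-L11 scale. The passive DNS lookup itself is stubbed out, since querying a service such as VirusTotal requires credentials and endpoint details not covered here, and the exact field names are approximations of the Fig.8 message.

```python
# Illustrative packaging of a GetDomainIPResolution result as evidence (cf. Fig.8).
# The passive DNS lookup is stubbed; a real implementation would query a service
# such as VirusTotal's passive DNS database.

def lookup_passive_dns(domain: str, time_stamp: str) -> dict:
    """Stub: return the IP mapping active for the domain at the given time."""
    return {
        "ipAddress": "69.195.129.72",
        "mappingStartTime": "01/15/2017 11:11 GMT",
        "mappingEndTime": "present",
    }

def get_domain_ip_resolution(request: dict) -> dict:
    """Handle a GetDomainIPResolution search request and return evidence."""
    mapping = lookup_passive_dns(request["domainName"], request["timeStamp"])
    return {
        # Echo the input parameters and requestID so the caller can match
        # this response to its original request.
        "requestID": request["requestID"],
        "domainName": request["domainName"],
        "timeStamp": request["timeStamp"],
        # Output parameters from the passive DNS data.
        **mapping,
        "evidenceDescription": (
            f"{request['domainName']} resolved to {mapping['ipAddress']} at "
            f"{request['timeStamp']} (mapping active since "
            f"{mapping['mappingStartTime']})."),
        "evidenceCredibility": "L11",  # Certain, on the L0 (Lack of Support)..L11 scale
    }

response = get_domain_ip_resolution({
    "requestID": "42", "domainName": "a-jsm.infobusinessus.org",
    "timeStamp": "12/23/2017 12:18:07 PM"})
print(response["evidenceDescription"], response["evidenceCredibility"])
```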
Collection Manager Architecture

The Collection Manager is the main integration point between the cognitive agents and the CSOC infrastructure. The analysis agents know what information is needed to expand their analyses, but the search requests are in abstract form. The primary function of the Collection Manager is translating high-level (abstract) search instructions into specific API calls to host and network agents, determining which such agent to send search requests to, and wrapping calls to specific search agents with a RESTful API. Results returned from search agents are then converted into evidence and added to the knowledge bases of the analysis agents.

Fig.9 is an overview of the Collection Manager process. When the analysis agents analyze competing hypotheses, many of the searches generated by the hypothesis-driven search process (such as those in Fig.7) are sent to the Collection Manager and added to the request queue. Requests are then dispatched for processing and a receipt message is sent back to the caller. The receipt includes the requestID so the response can be matched to the search. To minimize redundant searching and increase performance, the Collection Manager performs caching of search results. Each search request is hashed, and the hash value is used as a key to the cache table. A duplicate search will have a matching hash value, and its result can be used instead of re-executing the search. When a search is conducted, its results are added to the cache with the appropriate key and a time to live (TTL) value. Once the TTL has expired, the search results are considered invalid and are purged from the cache.

Abstract searches requested by the analysis agents require evidence from multiple types of data sources available to the CSOC. In order for the analysis agents to integrate with real networks, the Collection Manager uses a plugin architecture for search agent wrappers, allowing it to easily translate abstract search requests into requests for information from real search agents.

Depending on the amount of time required to complete a search, requests to a search agent can be either synchronous or asynchronous. From the perspective of the analysis agents, all requests are asynchronous, but internally, the Collection Manager supports both call models.

In the synchronous call model, a single thread is responsible for dequeuing abstract search requests from the Request Queue, formatting a concrete search for a specific search agent, and dispatching the request to the appropriate search agent. It then waits on the TCP connection for a synchronous response. When the response is received, the thread is responsible for parsing it to extract digital artifacts, formatting them as evidence, and sending the result to the caller.

In the asynchronous call model, the call happens in two threads. In one thread, the abstract search request is received from the Request Queue. The request is parsed, and a concrete search request is prepared and sent to the intended search target. A second thread is responsible for polling the search target on an interval for the response. When it is available, the response data is parsed, artifacts are extracted, and a response is prepared and sent to the Response Queue.

The Collection Manager's flexible architecture and support of multiple call models allow it to easily integrate with a wide variety of security infrastructure, making the process of porting CAAPT to a new CSOC straightforward.

Figure 9: Collection Manager Overview.
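The result cache described above can be sketched as follows: each request is canonicalized and hashed to form the cache key, and entries expire after a TTL. The hashing scheme, the exclusion of the requestID from the key, and the TTL value are illustrative assumptions rather than the actual Collection Manager implementation.

```python
# Minimal sketch of the Collection Manager's result cache: search requests are
# hashed to a key, and cached results expire after a time-to-live (TTL).

import hashlib
import json
import time

CACHE_TTL_SECONDS = 600                 # assumed TTL
_cache = {}                             # key -> (expiry time, result)

def request_key(request: dict) -> str:
    """Hash the canonicalized request (minus requestID) so duplicates collide."""
    canonical = json.dumps(
        {k: v for k, v in request.items() if k != "requestID"}, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def cached_search(request: dict, execute_search) -> dict:
    """Return a cached result if present and fresh; otherwise run the search."""
    key = request_key(request)
    entry = _cache.get(key)
    if entry is not None:
        expires_at, result = entry
        if time.time() < expires_at:
            return result            # duplicate search: reuse the cached result
        del _cache[key]              # TTL expired: purge and re-execute
    result = execute_search(request)
    _cache[key] = (time.time() + CACHE_TTL_SECONDS, result)
    return result
```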
Status and Future Research

A first prototype of the presented system was completed in January 2018. A final prototype is currently under development, with a focus on improving agent teaching, experimentation within a simulated CSOC, and experimental integration into a real CSOC. The CAAPT system will be evaluated using a testing methodology designed to demonstrate its ability to detect and adapt to APT attacks. Additionally, the system's performance will be compared against data collected from a large CSOC, to compare it to manual analysis processes in real-world situations. We expect the final prototype will significantly increase the probability of detecting intrusion activity while drastically reducing the workload of the operators.

Acknowledgements

This research was sponsored by the Air Force Research Laboratory (AFRL) under contract number FA8750-17-C-0002, and by George Mason University. The views and conclusions contained in this document are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government.

References

Beats, 2017. Beats, https://www.elastic.co/products/beats

Boicu, M., Tecuci, G., Marcu, D., Bowman, M., Shyr, P., Ciucu, F., and Levcovici, C., 2000. Disciple-COA: From Agent Programming to Agent Teaching, in Proceedings of the 17th International Conference on Machine Learning (ICML), Stanford, California: Morgan Kaufmann.

Chuvakin, A., 2018. The Best Starting Technology for Detection? https://blogs.gartner.com/anton-chuvakin/2018/03/06/the-best-starting-technology-for-detection/

Elasticsearch, 2015. Elasticsearch: RESTful, Distributed Search & Analytics, https://www.elastic.co/products/elasticsearch

EnCase, 2017. EnCase Endpoint Investigator - Remote Digital Investigation Solution, https://www.guidancesoftware.com/encase-endpoint-investigator

FireEye, 2015. APT30 and the Mechanics of a Long-Running Cyber Espionage Operation, https://www.fireeye.com/blog/threat-research/2015/04/apt_30_and_the_mecha.html

GRR, 2013. GRR Rapid Response: Remote Live Forensics for Incident Response, https://github.com/google/grr

Mandiant, 2013. APT1 - Exposing One of China's Cyber Espionage Units, https://www.fireeye.com/content/dam/fireeye-www/services/pdfs/mandiant-apt1-report.pdf

Meckl, S., Tecuci, G., Marcu, D., Boicu, M., and Bin Zaman, A., 2017. Collaborative Cognitive Assistants for Advanced Persistent Threat Detection, in Proceedings of the 2017 AAAI Fall Symposium "Cognitive Assistance in Government and Public Sector Applications," 171-178, AAAI Technical Report FS-17-02, Arlington, VA: AAAI Press, Palo Alto, CA.

MuleSoft, 2016. What Is REST API Design? https://www.mulesoft.com/resources/api/what-is-rest-api-design

Paxson, V., 1999. Bro: A System for Detecting Network Intruders in Real-Time, Computer Networks 31, 23-24, https://doi.org/10.1016/S1389-1286(99)00112-7

Ponemon Institute, 2017. The Cost of Insecure Endpoints, https://datasecurity.dell.com/wp-content/uploads/2017/09/ponemon-cost-of-insecure-endpoints.pdf

Rekall, 2017. Rekall Memory Forensic Framework, http://www.rekall-forensic.com/

Splunk, 2015. Operational Intelligence, Log Management, Application Management, Enterprise Security and Compliance, http://www.splunk.com/

Tecuci, G., 1988. Disciple: A Theory, Methodology and System for Learning Expert Knowledge, Thèse de Docteur en Science, University of Paris South.

Tecuci, G., 1993. Plausible Justification Trees: A Framework for Deep and Dynamic Integration of Learning Strategies, Machine Learning, 11(2-3), pp. 237-261.

Tecuci, G., and Kodratoff, Y. (eds.), 1995. Machine Learning and Knowledge Acquisition: Integrated Approaches, Academic Press.

Tecuci, G., 1998. Building Intelligent Agents: An Apprenticeship Multistrategy Learning Theory, Methodology, Tool and Case Studies, San Diego: Academic Press.

Tecuci, G., Boicu, M., Boicu, C., Marcu, D., Stanescu, B., and Barbulescu, M., 2005. The Disciple-RKF Learning and Reasoning Agent, Computational Intelligence, Vol. 21, No. 4, pp. 462-479.

Tecuci, G., Boicu, M., Marcu, D., Boicu, C., and Barbulescu, M., 2008. Disciple-LTA: Learning, Tutoring and Analytic Assistance, Journal of Intelligence Community Research and Development.

Tecuci, G., Marcu, D., Boicu, M., and Schum, D.A., 2016a. Knowledge Engineering: Building Cognitive Assistants for Evidence-based Reasoning, Cambridge University Press.

Tecuci, G., Schum, D.A., Marcu, D., and Boicu, M., 2016b. Intelligence Analysis as Discovery of Evidence, Hypotheses, and Arguments: Connecting the Dots, Cambridge University Press.

Tecuci, G., Marcu, D., Meckl, S., and Boicu, M., 2018. Evidence-based Detection of Advanced Persistent Threats, Computing in Science and Engineering, November.

Volatility, 2015. Volatility, GitHub, https://github.com/volatilityfoundation/volatility

Zimmerman, C., 2014. Ten Strategies of a World-Class Cybersecurity Operations Center, MITRE Press.