<?xml version="1.0" encoding="UTF-8"?>
<TEI xml:space="preserve" xmlns="http://www.tei-c.org/ns/1.0" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:schemaLocation="http://www.tei-c.org/ns/1.0 https://raw.githubusercontent.com/kermitt2/grobid/master/grobid-home/schemas/xsd/Grobid.xsd"
 xmlns:xlink="http://www.w3.org/1999/xlink">
	<teiHeader xml:lang="en">
		<fileDesc>
			<titleStmt>
				<title level="a" type="main">AISecKG: Knowledge Graph Dataset for Cybersecurity Education</title>
			</titleStmt>
			<publicationStmt>
				<publisher/>
				<availability status="unknown"><licence/></availability>
			</publicationStmt>
			<sourceDesc>
				<biblStruct>
					<analytic>
						<author>
							<persName><forename type="first">Garima</forename><surname>Agrawal</surname></persName>
							<email>garima.agrawal@asu.edu</email>
							<affiliation key="aff0">
								<orgName type="department">School of Computing and Augmented Intelligence</orgName>
								<orgName type="institution">Arizona State University</orgName>
								<address>
									<settlement>Tempe</settlement>
									<region>AZ</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Kuntal</forename><surname>Pal</surname></persName>
							<email>kkpal@asu.edu</email>
							<affiliation key="aff0">
								<orgName type="department">School of Computing and Augmented Intelligence</orgName>
								<orgName type="institution">Arizona State University</orgName>
								<address>
									<settlement>Tempe</settlement>
									<region>AZ</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Yuli</forename><surname>Deng</surname></persName>
							<affiliation key="aff0">
								<orgName type="department">School of Computing and Augmented Intelligence</orgName>
								<orgName type="institution">Arizona State University</orgName>
								<address>
									<settlement>Tempe</settlement>
									<region>AZ</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Huan</forename><surname>Liu</surname></persName>
							<email>huanliu@asu.edu</email>
							<affiliation key="aff0">
								<orgName type="department">School of Computing and Augmented Intelligence</orgName>
								<orgName type="institution">Arizona State University</orgName>
								<address>
									<settlement>Tempe</settlement>
									<region>AZ</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<author>
							<persName><forename type="first">Chitta</forename><surname>Baral</surname></persName>
							<email>chitta@asu.edu</email>
							<affiliation key="aff0">
								<orgName type="department">School of Computing and Augmented Intelligence</orgName>
								<orgName type="institution">Arizona State University</orgName>
								<address>
									<settlement>Tempe</settlement>
									<region>AZ</region>
									<country key="US">USA</country>
								</address>
							</affiliation>
						</author>
						<title level="a" type="main">AISecKG: Knowledge Graph Dataset for Cybersecurity Education</title>
					</analytic>
					<monogr>
						<idno type="ISSN">1613-0073</idno>
					</monogr>
					<idno type="MD5">B2559BED94FD50823AA684EC6C5C1CBE</idno>
				</biblStruct>
			</sourceDesc>
		</fileDesc>
		<encodingDesc>
			<appInfo>
				<application version="0.7.2" ident="GROBID" when="2023-12-29T08:27+0000">
					<desc>GROBID - A machine learning software for extracting information from scholarly documents</desc>
					<ref target="https://github.com/kermitt2/grobid"/>
				</application>
			</appInfo>
		</encodingDesc>
		<profileDesc>
			<textClass>
				<keywords>
					<term>Knowledge Graph</term>
					<term>Cybersecurity Education</term>
					<term>Ontology</term>
					<term>Knowledge Base</term>
					<term>KG Dataset</term>
					<term>Language Model</term>
				</keywords>
			</textClass>
			<abstract>
<div xmlns="http://www.tei-c.org/ns/1.0"><p>Cybersecurity education is exceptionally challenging as it involves learning complex attacks and tools and developing critical problem-solving skills to defend systems. For a student or novice researcher in the cybersecurity domain, there is a need to design an adaptive learning strategy that can break complex tasks and concepts into simple representations. An AI-enabled automated cybersecurity education system can improve cognitive engagement and active learning. Knowledge graphs (KG) provide a visual graph representation that supports reasoning over and interpretation of the underlying data, making them suitable for use in education and interactive learning. However, there are no publicly available datasets for the cybersecurity education domain to build such systems. The data is present as unstructured educational course material, Wiki pages, capture the flag (CTF) writeups, etc. Creating knowledge graphs from unstructured text is challenging without an ontology or annotated dataset, and data annotation for cybersecurity needs domain experts. To address these gaps, we make three contributions in this paper. First, we propose an ontology for the cybersecurity education domain for students and novice learners. Second, we develop AISecKG, a triple dataset with cybersecurity-related entities and relations as defined by the ontology. This dataset can be used to construct knowledge graphs to teach cybersecurity and promote cognitive learning. It can also be used to build downstream applications like recommendation systems or self-learning question-answering systems for students. The dataset would also help identify malicious named entities and their probable impact. Third, using this dataset, we show a downstream application to extract custom named entities from texts and educational material on cybersecurity.</p></div>
			</abstract>
		</profileDesc>
	</teiHeader>
	<text xml:lang="en">
		<body>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="1.">Introduction</head><p>Learning cybersecurity requires mastering the academic content and developing critical thinking and problem-solving skills based on cyber attacks and defense scenarios. We can achieve this interactive and active learning by creating an AI-powered education system where students can control their learning process <ref type="bibr">[1,</ref><ref type="bibr" target="#b3">2,</ref><ref type="bibr" target="#b4">3]</ref>. Knowledge graphs have been effectively used in education and improving the learning experience <ref type="bibr">[4]</ref>.</p><p>A knowledge graph combines two things, a graph with domain-specific data and an explicit representation of knowledge. The graph can capture the domain-related key concepts and their interactions with each other. It allows the user to analyze and understand the connections or relationships between different entities. Using explicit knowledge or metadata provides the relevant background and important information about the domain. This metadata allows the system to establish a common vocabulary and use shared references. The knowledge graphs are thus an integrated tool that can use the underlying data and knowledge for concept visualization and contextual reasoning <ref type="bibr" target="#b6">[5]</ref>. Their ability to translate data into usable knowledge makes them suitable for education. They can promote cognitive engagement in the problem-based learning environment.</p><p>However, to build such systems, there is a need for annotated datasets. There are no public datasets in the cybersecurity education domain. The education material includes unstructured texts in lecture notes, lab manuals, Wiki pages, capture the flag (CTF) writeups, and others. Scraping unstructured text and creating domain-specific knowledge graphs is challenging, especially without standard ontology and annotated datasets. 
The task of annotating data is expensive and time-consuming because only cybersecurity domain experts can do it accurately. The increasing demand for cybersecurity professionals requires preparing an effective cybersecurity specialist workforce and equipping them with intelligent learning tools. The lack of a comprehensive dataset with cybersecurity-related named entities is a significant bottleneck in this area.</p><p>In this paper, we address this issue by making three main contributions. First, using domain knowledge, we propose an ontology for self-paced cybersecurity learning for novice users. Second, we create an annotated named entity dataset. Using this dataset, we show one downstream application to extract named entities from texts and educational material on cybersecurity. Third, we present a triple dataset, AISecKG, for cybersecurity education as defined by our ontology. Using the triples data, we show one downstream task to construct a concept flow graph for a cybersecurity tool. It is possible to write graph queries to generate sub-graphs focusing on the specific learning needs of a user. Also, by combining the AISecKG ontology schema and the cybersecurity named entities, knowledge graphs can be created from any unstructured texts on cybersecurity. Our ontology and labeled dataset can also be used to build applications like question-answering and recommendation systems for students.</p><p>The paper is organized as follows. In the next section, we discuss the related work. Section 3 describes the method and ontology. Section 4 presents our results and two applications. Finally, we conclude the paper in Section 5.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="2.">Related Work</head><p>Security plays an integral role in software development and has become more critical with the Internet of Things (IoT). Various studies to formalize security and develop security ontologies and knowledge models address different security aspects. Souag et al. <ref type="bibr" target="#b7">[6]</ref> gave a security ontology to elicit security requirements. The Unified Cybersecurity Ontology (UCO) <ref type="bibr" target="#b8">[7]</ref> focused on identifying the vulnerabilities and threat levels to assess the system security. Doynikova <ref type="bibr" target="#b9">[8]</ref> proposed an ontology on security metrics for cybersecurity assessment to determine the attack goal. An extensive study on formalizing information security focuses on security concepts, and threat mitigation and control process <ref type="bibr" target="#b10">[9]</ref>. A cybersecurity ontology was proposed to build and monitor the security in the cloud for IoT environment <ref type="bibr" target="#b11">[10]</ref>. Iannacone et al. <ref type="bibr" target="#b12">[11]</ref> developed an ontology for managing cybersecurity knowledge database from different data sources to propose a search mechanism for blacklisted systems. MALOnt <ref type="bibr" target="#b13">[12]</ref> gives the ontology and knowledge graphs for malware threat intelligence. Martins et al. <ref type="bibr" target="#b14">[13]</ref> presented a conceptual characterization of available cybersecurity ontologies based on their application. In this work, we introduce an ontology AISecKG which covers a broader spectrum of the fundamental concepts, tools, techniques, and applications used in the cybersecurity ecosystem. 
Essentially, this ontology is helpful for any first-time user or new learner in the cybersecurity domain.</p><p>Many datasets have also been proposed in the cybersecurity domain, but most are based on network flow data and are used to train machine learning algorithms to build intrusion detection systems. Alshaibi et al. <ref type="bibr" target="#b15">[14]</ref> gave a comparative study of these datasets. Recently, more efforts have been made to create cybersecurity named-entity datasets. A new dataset for event detection in cybersecurity texts <ref type="bibr" target="#b16">[15]</ref> annotated 30 types of critical events in cybersecurity to train machine learning models. A named-entity recognition (NER) Python library called CyNer <ref type="bibr" target="#b17">[16]</ref>, based on the MALOnt ontology <ref type="bibr" target="#b13">[12]</ref>, was developed to extract malware and threat indicators. Language models were also built for cybersecurity <ref type="bibr" target="#b18">[17,</ref><ref type="bibr" target="#b19">18,</ref><ref type="bibr" target="#b20">19]</ref> using the open source CVE <ref type="bibr" target="#b21">[20]</ref> and NVD Mitre <ref type="bibr" target="#b22">[21]</ref> datasets on vulnerabilities and attacks. Dasgupta et al. <ref type="bibr" target="#b23">[22]</ref> gave a comparative study on NER algorithms based on these datasets for cybersecurity. In our current work, we develop a labeled named-entity dataset for cybersecurity based on the AISecKG ontology, which is used to build knowledge graphs from any unstructured text on cybersecurity.</p><p>Most of the available cybersecurity ontologies and datasets are used to build intrusion detection systems or perform vulnerability analysis and threat detection. Table <ref type="table">1</ref> compares our ontology and cybersecurity dataset with existing works. There is limited research on educating novice learners in cybersecurity concepts, tools, and techniques. Deng et al. 
<ref type="bibr" target="#b24">[23]</ref> proposed using knowledge graphs as lab project guidance to teach cybersecurity. They focused on finding similar concepts on the web using similarity measures <ref type="bibr" target="#b25">[24]</ref> and word embeddings <ref type="bibr" target="#b26">[25]</ref>. In our paper <ref type="bibr" target="#b27">[26]</ref>, we proposed a semi-automated approach to build knowledge graphs from the unstructured cybersecurity course material and conducted a survey and interview with students to assess the perception of students on using knowledge graphs as a problem-solving education tool aid. The students found the knowledge graphs very useful, which motivated us to propose AISecKG, a comprehensive ontology and a labeled dataset that can be used to build AI systems to learn about cybersecurity. In this work, we give a detailed ontology to understand the cybersecurity ecosystem from different views and present an annotated named-entity recognition dataset to extract cybersecurity-related entities.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.">Method for Development of AISecKG</head><p>Cybersecurity is the application of state-of-the-art technologies, control processes, policies, tools, and procedures for protecting or recovering systems and information from malicious attacks <ref type="bibr" target="#b28">[27]</ref>. As a novice learner, one must know the cybersecurity ecosystem comprising fundamental concepts, tools, and techniques and how to use and deploy them to assess and detect vulnerabilities and attacks. This section presents our ontology design and dataset for cybersecurity education called AISecKG.</p>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table"><head>Table 1</head><label>1</label><figDesc>Comparative analysis of existing ontologies and datasets on cybersecurity.</figDesc><table><row><cell>Cybersecurity Model</cell><cell>Purpose</cell><cell>Ontology</cell><cell>Dataset</cell></row><row><cell>Souag et al. <ref type="bibr" target="#b7">[6]</ref></cell><cell>Security Requirement Elicitation</cell><cell>✓</cell><cell>X</cell></row><row><cell>UCO <ref type="bibr" target="#b8">[7]</ref></cell><cell>Vulnerability Assessment</cell><cell>✓</cell><cell>X</cell></row><row><cell>Doynikova et al. <ref type="bibr" target="#b9">[8]</ref></cell><cell>Security Metrics</cell><cell>✓</cell><cell>X</cell></row><row><cell>Fenz et al. <ref type="bibr" target="#b10">[9]</ref></cell><cell>Threat Mitigation and control</cell><cell>✓</cell><cell>X</cell></row><row><cell>Mozzaquatro et al. <ref type="bibr" target="#b11">[10]</ref></cell><cell>IoT Security Monitoring</cell><cell>✓</cell><cell>X</cell></row><row><cell>Iannacone et al. <ref type="bibr" target="#b12">[11]</ref></cell><cell>Search cybersecurity knowledge base</cell><cell>✓</cell><cell>X</cell></row><row><cell>Alshaibi et al. <ref type="bibr" target="#b15">[14]</ref></cell><cell>Intrusion Detection Models</cell><cell>X</cell><cell>Network Flow datasets</cell></row><row><cell>Tikhomirov et al. <ref type="bibr" target="#b19">[18]</ref></cell><cell>Vulnerability/Attack detection</cell><cell>X</cell><cell>Open Source (CVE/NVD)</cell></row><row><cell>Ma et al. <ref type="bibr" target="#b20">[19]</ref></cell><cell>Vulnerability/Attack detection</cell><cell>X</cell><cell>Open Source (CVE/NVD)</cell></row><row><cell>Gao et al. <ref type="bibr" target="#b18">[17]</ref></cell><cell>Vulnerability/Attack detection</cell><cell>X</cell><cell>Open Source (CVE/NVD)</cell></row><row><cell>Trong et al. <ref type="bibr" target="#b16">[15]</ref></cell><cell>Event Detection</cell><cell>X</cell><cell>✓ (Annotated dataset)</cell></row><row><cell>MALOnt <ref type="bibr" target="#b13">[12]</ref></cell><cell>Malware Threat Intelligence KG</cell><cell>✓</cell><cell>✓ (Annotated dataset)</cell></row><row><cell>CyNer <ref type="bibr" target="#b17">[16]</ref></cell><cell>Malware Threat Entity Extraction</cell><cell>X (using MALOnt)</cell><cell>X</cell></row><row><cell>AISecKG</cell><cell>Cybersecurity education KG</cell><cell>✓</cell><cell>✓ (Annotated Triples KG Dataset)</cell></row></table></figure>
<p>We propose a comprehensive view of concepts, applications, and roles involved in the cybersecurity ecosystem. 
Since the objective is to build a self-paced learning tool for cybersecurity students and novice learners, we use the graduate-level course material and hands-on lab instruction manuals to teach graduate students majoring in cybersecurity as the data source. Our ontology, AISecKG, is built using domain knowledge and motivated by the lab guides. The dataset is created by annotating the lab documents using AISecKG ontology. We then use this annotated dataset to develop two applications. The first application is to train a language model on our dataset to extract named entities related to cybersecurity. The second application is to create a triple dataset with entity-relation-entity pairs to construct knowledge graphs. Both these applications are described in the next section.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.1.">Data Source</head><p>We collected data from laboratory instruction manuals. The manuals are for projects in advanced cybersecurity courses for graduate students. These courses cover topics such as using tools like Nmap and Snort <ref type="bibr" target="#b29">[28]</ref> to build intrusion detection systems, employing honeypot techniques in the Metasploit framework to deceive attackers, setting up Kali Linux systems, and monitoring system activities and attack events using Syslog. The manuals are in standard English and explain the concepts and instructions for implementing laboratory tasks. Each manual is 15-20 pages long. For annotation, we used six such lab manuals with a total of 100 pages, approximately 26,886 words, and 110,953 characters.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.">Ontology</head><p>Ontology is a formal and explicit schematic representation of a system using a well-defined taxonomy. It allows semantic modeling of the domain knowledge and thus can be used as the skeleton to build any AI application for that system <ref type="bibr" target="#b30">[29]</ref>. Ontology also defines the rules and constraints of the system and facilitates the validation of semantic relationships and conclusions or inferences from known facts. For a knowledge-based system, ontology serves as a backbone of the system and should be meticulously developed. Some deep learning-based methods rely on automatically building the AI application from the data without using an ontology, but they fail to capture the comprehensive view of the domain, and the quality of applications becomes questionable <ref type="bibr" target="#b31">[30]</ref>. On the other hand, if a well-defined structured dataset is unavailable and most of the knowledge is present in unstructured texts, it is overwhelming for a domain expert to scrape long texts and create a domain-specific ontology. Also, it is costly to find domain experts in cybersecurity.</p><p>The domain experts should have practiced or significantly demonstrated sufficient knowledge and experience. In this work, the first and second authors are graduate researchers in cybersecurity, and the third author is a cybersecurity expert and instructor. He teaches graduate-level cybersecurity courses at his university.</p><p>To develop AISecKG ontology, we used the bottom-up approach given in the paper <ref type="bibr" target="#b27">[26]</ref>. We used the lab documents as a reference and then used domain knowledge to design the ontology. 
First, we extracted the generic entities and relations from the lab documents using part-of-speech tagging and dependency parsing from spaCy-based named-entity recognition (NER) <ref type="bibr" target="#b32">[31]</ref> natural language processing (NLP) methods. The entities extracted using NER are the subject-object pairs, and the relations are the predicates in the sentences. These entity-relation-entity triples are not specific to cybersecurity, but they help break down the long texts into simple graph-like structures and create a preliminary visual representation of information. They serve as a good reference point for domain experts. This step semi-automated the ontology construction process and significantly reduced time and effort. It helped in discovering the schematic and semantic relationships of core entities.</p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.1.">Key Entities</head><p>The cybersecurity ecosystem essentially has three foundational pillars, namely, concept, application, and role. We can classify concepts into features, functions, data, attacks, vulnerabilities, and techniques. In addition to defensive and attack methods, the techniques here include security policies and management processes. The application denotes the tools, systems, and apps. The user, attacker, and securityTeam are the three roles. Thus, in our ontology, we defined these three categories with 12 types of entities.</p><p>Figure <ref type="figure" target="#fig_0">1</ref> depicts the cybersecurity education ecosystem with each category and entity type. The attributes or metadata considered for the entities are entityID, entityName, entityType, and entityCategory. Examples of each entity type within the respective category are shown in Table <ref type="table" target="#tab_0">2</ref>.</p></div>
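<div xmlns="http://www.tei-c.org/ns/1.0"><p>As an illustration only, the three categories, the 12 entity types, and the four entity attributes could be held in a small Python structure like the one below. The names ONTOLOGY, TYPE_TO_CATEGORY, Entity, and make_entity are hypothetical and are not taken from the released code, and the entity ID used in the example is made up.</p><p>
```python
from dataclasses import dataclass

# Category -> entity types, as listed in the ontology (3 categories, 12 types).
ONTOLOGY = {
    "Concept": ["feature", "function", "data", "attack", "vulnerability", "technique"],
    "Application": ["tool", "system", "app"],
    "Role": ["user", "attacker", "securityTeam"],
}

# Reverse lookup: entity type -> category.
TYPE_TO_CATEGORY = {t: c for c, types in ONTOLOGY.items() for t in types}


@dataclass(frozen=True)
class Entity:
    """One entity record with the four attributes named in the text."""
    entityID: int
    entityName: str
    entityType: str
    entityCategory: str


def make_entity(entity_id: int, name: str, entity_type: str) -> Entity:
    """Build an Entity, deriving its category from its type."""
    if entity_type not in TYPE_TO_CATEGORY:
        raise ValueError(f"unknown entity type: {entity_type}")
    return Entity(entity_id, name, entity_type, TYPE_TO_CATEGORY[entity_type])


# Hypothetical ID; the name and type follow the examples in Table 2.
snort = make_entity(1, "snort", "tool")
print(snort)
```
</p><p>Deriving the category from the type keeps the two attributes consistent, since each entity type belongs to exactly one category in the ontology.</p></div>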
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.2.">Relations</head><p>We used the nine most common and appropriate relations to represent the real-world interactions between cybersecurity entities. Table <ref type="table" target="#tab_1">3</ref> shows the relations along with examples from our dataset as entity-relation-entity triples.  </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.2.3.">Cybersecurity Schema</head><p>We now present the schema design for learning cybersecurity. We illustrate the interactions between different components from the perspective of the roles. Figure <ref type="figure" target="#fig_1">2</ref> shows the user's view. It depicts how users use the data, applications, and systems routinely. The system and apps, in turn, use different tools for their usual operations and defensive techniques to monitor and analyze the environment. The icons in the diagram represent the respective entities, and the labeled edges show the relationship between different entities. Figure <ref type="figure" target="#fig_2">3</ref> gives the attacker view. When the applications expose vulnerabilities and the attacker can exploit them using various tools and attack techniques, the attacker and attacks can harm the data and applications.</p><p>The third view is the security view. Figure <ref type="figure" target="#fig_3">4</ref> shows how the security team uses tools and   </p></div>
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="3.3.">AISecKG Dataset Annotation</head><p>Using the AISecKG ontology, we identified 964 unique cybersecurity-related entities from the course materials. There are 12 entity types in three categories in the ontology. We labeled the attributes entity ID, entity type, and entity category for each entity and created an entity info list. To train the model to predict custom cybersecurity-related entities, we created the annotated dataset using the BIO (Beginning-Inside-Outside) sequence tagging scheme <ref type="bibr" target="#b33">[32]</ref>. The entity boundary is defined by the tags 'B' and 'I', which mark the beginning and the inside of a labeled entity. All words that are not part of an entity are labeled 'O'. The lab documents were first split into sentences using a simple Python script. There were 593 sentences, and 2354 entities were annotated in these sentences. The code and commands were discarded from the text. The annotation was done by the first and second authors and was validated by the third author. from public texts related to the education cybersecurity ecosystem.</p></div>
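<div xmlns="http://www.tei-c.org/ns/1.0"><p>A minimal sketch of BIO tagging is given below, assuming that a sentence is already tokenized and that entity spans are given as token index ranges. The function name bio_tags and the example sentence are hypothetical; the entity names and types follow the examples in Table 2.</p><p>
```python
def bio_tags(tokens, spans):
    """Return one BIO tag per token.

    spans is a list of (start, end, entity_type) tuples with end exclusive,
    indexing into tokens. Spans are assumed not to overlap.
    """
    tags = ["O"] * len(tokens)
    for start, end, entity_type in spans:
        tags[start] = f"B-{entity_type}"      # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{entity_type}"      # remaining tokens of the entity
    return tags


# Hypothetical sentence for illustration.
tokens = ["The", "attack", "host", "runs", "wireshark", "."]
spans = [(1, 3, "attacker"), (4, 5, "tool")]
tags = bio_tags(tokens, spans)
print(list(zip(tokens, tags)))
```
</p><p>Here the two-token entity "attack host" receives the tags B-attacker and I-attacker, "wireshark" receives B-tool, and every other token receives O.</p></div>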
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.1.">Dataset Preparation:</head><p>We split the AISecKG annotated dataset into train, dev, and test keeping 3, 1, and 2 documents, respectively. The train, dev, and test splits contain 5772, 3591, and 195 entities in 372, 214, and 13 sentences, respectively. We keep an empty line as a separator for each cybersecurity sentence. This dataset is provided as input to each model.</p></div>
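<div xmlns="http://www.tei-c.org/ns/1.0"><p>The sketch below shows one way to read such a file back into sentences. It assumes a CoNLL-style layout with one token and its tag per line, separated by whitespace; only the empty-line separator is stated in the text, so the per-line layout, the function name read_sentences, and the sample data are assumptions.</p><p>
```python
def read_sentences(text):
    """Split CoNLL-style text into sentences of (token, tag) pairs."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:                  # an empty line closes the current sentence
            if current:
                sentences.append(current)
                current = []
            continue
        token, tag = line.rsplit(None, 1)   # the tag is the last whitespace-separated field
        current.append((token, tag))
    if current:                       # the last sentence may lack a trailing empty line
        sentences.append(current)
    return sentences


# Hypothetical sample with two sentences.
sample = "Snort B-tool\ndetects O\nattacks B-attack\n\nUsers B-user\nread O\nlogs B-data\n"
sentences = read_sentences(sample)
print(len(sentences))
```
</p></div>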
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.2.">Models</head><p>We experiment with six variations of two transformer-based language models: BERT <ref type="bibr" target="#b34">[33]</ref> and RoBERTa <ref type="bibr" target="#b35">[34]</ref>. For BERT, we use cased and uncased versions of the base (110M parameters) and large (340M parameters) variations, and for RoBERTa, we use both the base (125M parameters) and large (355M parameters) models. The BERT-base and RoBERTa-base architectures have 12 layers, 12 attention heads, and 768 hidden dimensions, whereas both BERT-large and RoBERTa-large have 24 layers, 16 attention heads, and 1024 hidden dimensions.</p><p>First, the model tokenizes the input sentence and generates embeddings of the tokens. Then we consider the sequence labeling approach of the language models, that is, classifying each token of a given sentence into one of the 25 classes (12 entity types with B and I tags, along with O representing other). We aggregate the classified contiguous beginning and inside tokens into entities. In this approach, we not only extract the entities but also identify the type of these entities.</p><p>Table <ref type="table">4</ref> Performance of BERT and RoBERTa on the AISecKG dataset. Bold represents the best performance; a higher value is better for each metric.</p></div>
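<div xmlns="http://www.tei-c.org/ns/1.0"><p>The 25-class label set and the aggregation of B and I tokens into typed entities can be sketched as follows. This is an illustrative sketch, not the released implementation: the names ENTITY_TYPES, LABELS, and aggregate are hypothetical, and treating a stray I tag as the start of a new entity is a design assumption.</p><p>
```python
ENTITY_TYPES = [
    "feature", "function", "data", "attack", "vulnerability", "technique",
    "tool", "system", "app", "user", "attacker", "securityTeam",
]

# 12 entity types with B and I tags, plus O: 25 classes in total.
LABELS = ["O"] + [f"{prefix}-{t}" for t in ENTITY_TYPES for prefix in ("B", "I")]


def aggregate(tokens, tags):
    """Merge contiguous B/I tokens into (entity text, entity type) pairs."""
    entities = []
    current_tokens, current_type = [], None

    def flush():
        # Store the entity collected so far, if any.
        if current_tokens:
            entities.append((" ".join(current_tokens), current_type))

    for token, tag in zip(tokens, tags):
        prefix, _, entity_type = tag.partition("-")
        if prefix == "B" or (prefix == "I" and entity_type != current_type):
            # A B tag, or an I tag that does not continue the open entity,
            # starts a new entity.
            flush()
            current_tokens, current_type = [token], entity_type
        elif prefix == "I":
            current_tokens.append(token)
        else:                           # an O tag closes any open entity
            flush()
            current_tokens, current_type = [], None
    flush()
    return entities


# Hypothetical predictions for illustration.
tokens = ["The", "attack", "host", "runs", "wireshark", "."]
tags = ["O", "B-attacker", "I-attacker", "O", "B-tool", "O"]
print(aggregate(tokens, tags))
```
</p><p>For the example above, the result is the two typed entities ("attack host", "attacker") and ("wireshark", "tool").</p></div>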
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.1.3.">NER Results</head><p>We train each model for 30 epochs with a maximum sequence length of 128 and a per-GPU batch size of 32. Table <ref type="table">4</ref> shows the performance of our sequence classification on the test set. It can be seen from the table that case-sensitive BERT performs best on all the metrics. This shows that case sensitivity positively impacts the cybersecurity NER model. As expected, the smaller versions of the models perform worse than their larger counterparts because they have fewer parameters. Our accuracy in predicting the entities is over 80% across all the models. Our precision, recall, and F1 scores are reasonably good, given the small number of training samples and the many diverse class categories. This shows the effectiveness of our model in identifying entities involved in our ontology from cybersecurity texts.</p></div>
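<div xmlns="http://www.tei-c.org/ns/1.0"><p>For readers unfamiliar with the reported metrics, the sketch below computes precision, recall, and F1 from gold and predicted entities. It assumes entity-level exact matching on (sentence index, entity text, entity type); the text does not state how the scores were computed, so this matching rule, the function name prf1, and the sample data are assumptions.</p><p>
```python
def prf1(gold, predicted):
    """Return (precision, recall, F1) under exact matching of entity tuples."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold.intersection(predicted))
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold and predicted entities: (sentence index, text, type).
gold = [(0, "attack host", "attacker"), (0, "wireshark", "tool"), (1, "logs", "data")]
predicted = [(0, "attack host", "attacker"), (0, "wireshark", "system")]
precision, recall, f1 = prf1(gold, predicted)
print(precision, recall, f1)
```
</p><p>In this example, one of two predictions is correct and one of three gold entities is found, so precision is 1/2, recall is 1/3, and F1 is 0.4. A prediction with the right text but the wrong type counts as an error under this rule.</p></div>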
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="4.2.">Triples for Knowledge Graph</head><p>The second application uses the annotated dataset to create triples for the knowledge graphs.</p><p>Triples store graph data in the form 'entity-relation-entity', where the entities represent the nodes and the relation represents the labeled edge. The triples data can be used to construct knowledge graphs to provide a visual representation. For this work, since the focus is to provide learning aids to students, we build visual concept graphs from the lab documents using these triples. The conceptual graphs help break down complex information and allow the students to visually analyze the underlying concepts and the interconnections between different concepts. The constraints and rules for creating edges were defined based on the schema definition of the AISecKG ontology. There are 68 schema edges between the 12 entity types as per the schema given in Section 3. We use the annotated sentences from the lab documents. The relations between labeled entities in each sentence were extracted automatically by matching with the tuples in the ontology. We manually validated the triples and removed the redundant and ambiguous ones. Around 812 triples were auto-generated, which were reduced to 730 triples after validation in the final dataset. Table <ref type="table">5</ref> gives a list of sample triples from the dataset for each tuple in our ontology.</p><p>Figure <ref type="figure" target="#fig_4">5</ref> shows one of the sub-graphs generated using a subset of the triples. This graph shows the knowledge graph for the Nmap tool. Visual graphs related to a specific entity can be created by writing graph queries. Any graph store, such as an RDF triple store or Neo4j, can be used, and query languages like GraphQL, SPARQL, or Cypher can query and generate the graphs <ref type="bibr" target="#b36">[35]</ref>. We have used the NetworkX library in Python to generate the graph. 
We store the triples in a CSV file to make them publicly available. The triple dataset, annotated data, and implementation code for both applications are available in our GitHub repository. Thus, the ontology and labeled entities in AISecKG can be used to create knowledge graphs from any unstructured texts on cybersecurity by extracting the cybersecurity-related named entities using the model and the relations per the ontology.</p></div>
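<div xmlns="http://www.tei-c.org/ns/1.0"><p>The schema-based matching and the entity-centered sub-graph can be sketched as follows. This is a simplified illustration under stated assumptions: SCHEMA holds a hypothetical subset of three edges (the 68 schema edges are not reproduced here), the function names candidate_triples and subgraph are hypothetical, and plain Python lists stand in for the NetworkX graph so that the sketch has no dependencies.</p><p>
```python
from itertools import permutations

# Hypothetical subset of schema edges: (head type, relation, tail type).
SCHEMA = {
    ("tool", "can_detect", "attack"),
    ("attacker", "can_exploit", "vulnerability"),
    ("securityTeam", "uses", "tool"),
}


def candidate_triples(entities, schema=SCHEMA):
    """Propose triples for one sentence.

    entities is a list of (name, type) pairs labeled in the sentence. Every
    ordered pair whose types match a schema edge yields a candidate triple,
    which would still need manual validation.
    """
    triples = []
    for (head, head_type), (tail, tail_type) in permutations(entities, 2):
        for schema_head, relation, schema_tail in sorted(schema):
            if (head_type, tail_type) == (schema_head, schema_tail):
                triples.append((head, relation, tail))
    return triples


def subgraph(triples, entity):
    """Keep only the triples that touch the given entity (its direct neighbors)."""
    return [t for t in triples if entity in (t[0], t[2])]


# Hypothetical labeled sentence for illustration.
entities = [("security engineer", "securityTeam"), ("snort", "tool"), ("sql injection", "attack")]
triples = candidate_triples(entities)
print(triples)
print(subgraph(triples, "sql injection"))
```
</p><p>For this sentence, the sketch proposes "security engineer uses snort" and "snort can_detect sql injection", and the sub-graph for "sql injection" keeps only the second triple. The same triples could be loaded into a graph library or a graph database for visualization.</p></div>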
<div xmlns="http://www.tei-c.org/ns/1.0"><head n="5.">Conclusion and Future Works</head><p>In this work, we present a novel ontology on cybersecurity education, AISecKG, and show that this ontology is vital to building self-paced AI-based learning tools for cybersecurity learners. More research must be done in this direction, as these tools can be crucial for preparing the cybersecurity specialist workforce. Additionally, we introduce a manually annotated named entity dataset based on this ontology. We also show how AISecKG can be used in downstream tasks. First, we present how language models can be trained with our annotated dataset to extract cybersecurity-related named entities from cybersecurity documents. Little prior work <ref type="bibr" target="#b37">[36]</ref> addresses extracting such information from public forum cybersecurity learning materials written by professionals for novice vulnerability researchers. We want to extend this work beyond lab manuals to cybersecurity educational texts in public forums. Second, we show the process of creating triples by automatically extracting the relations based on the schema definition given by the ontology. We present one application, the construction of knowledge graphs from triple data for concept visualization. Other downstream applications like question-answering systems and learning recommendation systems can be built using the triple dataset.</p></div><figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_0"><head>Figure 1 :</head><label>1</label><figDesc>Figure 1: Cybersecurity Education Ecosystem: Concepts, Roles, Applications</figDesc><graphic coords="6,169.14,84.19,257.00,166.50" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_1"><head>Figure 2 :</head><label>2</label><figDesc>Figure 2: User View shows the interaction of users with apps, system and data which in turn use different tools and techniques.</figDesc><graphic coords="7,118.24,260.01,358.80,246.60" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_2"><head>Figure 3 :</head><label>3</label><figDesc>Figure 3: Attacker View shows the interactions between different entities when a system is exposed to attacks.</figDesc><graphic coords="8,120.34,84.19,354.60,282.60" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_3"><head>Figure 4 :</head><label>4</label><figDesc>Figure 4: Security View shows the vulnerability and attack analysis performed by the security team using different tools and techniques.</figDesc><graphic coords="9,123.34,84.19,348.60,294.60" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" xml:id="fig_4"><head>Figure 5 :</head><label>5</label><figDesc>Figure 5: A sub-graph on Nmap showing the knowledge graph generated from the triples in the dataset.</figDesc><graphic coords="12,89.29,84.19,416.69,416.69" type="bitmap" /></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_0"><head>Table 2</head><label>2</label><figDesc>The table shows the category and types of key entities in the ontology with examples for each.</figDesc><table><row><cell>Category</cell><cell>Type</cell><cell>Examples of Entity Names</cell></row><row><cell>Concept</cell><cell>feature</cell><cell>session ID, cookies, protocol</cell></row><row><cell></cell><cell>function</cell><cell>tcpdump, snort rules, hash, XOR</cell></row><row><cell></cell><cell>attack</cell><cell>smurf attack, sql injection, spyware</cell></row><row><cell></cell><cell>vulnerability</cell><cell>bad config, weak password</cell></row><row><cell></cell><cell>technique</cell><cell>honeypot, security policy, risk assessment</cell></row><row><cell></cell><cell>data</cell><cell>files, logs, message, packet</cell></row><row><cell>Application</cell><cell>tool</cell><cell>burp, wireshark, snort, sniffer</cell></row><row><cell></cell><cell>system</cell><cell>linux, server, client, host</cell></row><row><cell></cell><cell>app</cell><cell>browser, webapp, service</cell></row><row><cell>Role</cell><cell>attacker</cell><cell>black hat, attack host</cell></row><row><cell></cell><cell>securityTeam</cell><cell>security engineer, white hat</cell></row><row><cell></cell><cell>user</cell><cell>employee, user</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_1"><head>Table 3</head><label>3</label><figDesc>The table shows sample triples from the dataset in entity-relation-entity form as per the schema.</figDesc><table><row><cell>Relation</cell><cell>Sample Triples</cell></row><row><cell>has_a</cell><cell>Nmap has_a network mapper</cell></row><row><cell>can_analyze</cell><cell>Packet Decoder can_analyze header anomaly</cell></row><row><cell>can_expose</cell><cell>Intel CPU can_expose CVE-2017-5754</cell></row><row><cell>can_exploit</cell><cell>Attack host can_exploit TCP syn packet</cell></row><row><cell>implements</cell><cell>Network administrators implements map</cell></row><row><cell>uses</cell><cell>Team defense uses firewall</cell></row><row><cell>can_harm</cell><cell>Attack can_harm target host</cell></row><row><cell>can_detect</cell><cell>Full scan can_detect Trojan horses</cell></row><row><cell>is_part_of</cell><cell>Metasploit Framework is_part_of Kali Linux</cell></row></table></figure>
<figure xmlns="http://www.tei-c.org/ns/1.0" type="table" xml:id="tab_2"><head></head><label></label><figDesc></figDesc><table><row><cell>Metric</cell><cell>BERT-base-uncased</cell><cell>BERT-large-uncased</cell><cell>BERT-base-cased</cell><cell>BERT-large-cased</cell><cell>RoBERTa-base</cell><cell>RoBERTa-large</cell></row><row><cell>Accuracy (%) ↑</cell><cell>81.91</cell><cell>82.17</cell><cell>81.43</cell><cell>83.30</cell><cell>80.63</cell><cell>82.71</cell></row><row><cell>Precision ↑</cell><cell>45.69</cell><cell>45.49</cell><cell>47.32</cell><cell>48.73</cell><cell>44.20</cell><cell>47.97</cell></row><row><cell>Recall ↑</cell><cell>51.58</cell><cell>53.81</cell><cell>51.70</cell><cell>56.04</cell><cell>48.65</cell><cell>51.23</cell></row><row><cell>F1-score ↑</cell><cell>48.46</cell><cell>49.30</cell><cell>49.41</cell><cell>52.13</cell><cell>46.32</cell><cell>49.55</cell></row></table></figure>
		</body>
		<back>

			<div type="acknowledgement">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Acknowledgments</head><p>We thank the National Science Foundation for supporting this research under Grant No. 2114789. We would also like to acknowledge Dijiang Huang for his vision and guidance.</p></div>
			</div>

			<div type="annex">
<div xmlns="http://www.tei-c.org/ns/1.0"><head>Table 5</head><p>The table shows the triples generated from the labeled text. The relations were extracted using the ontology defined in Section 3.</p></div>			</div>
			<div type="references">

				<listBibl>


<biblStruct xml:id="b2">
	<analytic>
		<title level="a" type="main">Development of an instrument designed to investigate elements of science students&apos; metacognition, self-efficacy and learning processes: The SEMLI-S</title>
		<author>
			<persName><forename type="first">G</forename><surname>Thomas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Anderson</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Nashon</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">International Journal of Science Education</title>
		<imprint>
			<biblScope unit="volume">30</biblScope>
			<biblScope unit="page" from="1701" to="1724" />
			<date type="published" when="2008">2008</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b3">
	<monogr>
		<author>
			<orgName>National Research Council</orgName>
		</author>
		<title level="m">A framework for K-12 science education: Practices, crosscutting concepts, and core ideas</title>
				<imprint>
			<date type="published" when="2012">2012</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b4">
	<analytic>
		<title level="a" type="main">JEDAI: A system for skill-aligned explainable robot planning</title>
		<author>
			<persName><forename type="first">N</forename><surname>Shah</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Verma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Angle</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Srivastava</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS &apos;22)</title>
				<meeting>the 21st International Conference on Autonomous Agents and Multiagent Systems (AAMAS &apos;22)</meeting>
		<imprint>
			<publisher>International Foundation for Autonomous Agents and Multiagent Systems</publisher>
			<pubPlace>Richland, SC</pubPlace>
			<date type="published" when="2022">2022</date>
			<biblScope unit="page" from="1917" to="1919" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b5">
	<analytic>
		<title level="a" type="main">Knowledge graphs in education and employability: A survey on applications and techniques</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Fettach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ghogho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Benatallah</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">IEEE Access</title>
		<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b6">
	<analytic>
		<title level="a" type="main">Knowledge graphs</title>
		<author>
			<persName><forename type="first">A</forename><surname>Hogan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Blomqvist</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Cochez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Amato</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><forename type="middle">D</forename><surname>Melo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Gutierrez</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Kirrane</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">E L</forename><surname>Gayo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Navigli</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Neumaier</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys (CSUR)</title>
		<imprint>
			<biblScope unit="volume">54</biblScope>
			<biblScope unit="page" from="1" to="37" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b7">
	<analytic>
		<title level="a" type="main">A security ontology for security requirements elicitation</title>
		<author>
			<persName><forename type="first">A</forename><surname>Souag</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Salinesi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Mazo</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Comyn-Wattiau</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">ESSoS</title>
		<imprint>
			<date type="published" when="2015">2015</date>
			<publisher>Springer</publisher>
			<biblScope unit="page" from="157" to="177" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b8">
	<monogr>
		<author>
			<persName><forename type="first">Z</forename><surname>Syed</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Padia</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Finin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Mathews</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Joshi</surname></persName>
		</author>
		<title level="m">UCO: A unified cybersecurity ontology</title>
				<imprint>
			<publisher>UMBC Student Collection</publisher>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b9">
	<analytic>
		<title level="a" type="main">Ontology of metrics for cyber security assessment</title>
		<author>
			<persName><forename type="first">E</forename><surname>Doynikova</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Fedorchenko</surname></persName>
		</author>
		<author>
			<persName><forename type="first">I</forename><surname>Kotenko</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 14th International Conference on Availability, Reliability and Security</title>
				<meeting>the 14th International Conference on Availability, Reliability and Security</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="1" to="8" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b10">
	<analytic>
		<title level="a" type="main">Formalizing information security knowledge</title>
		<author>
			<persName><forename type="first">S</forename><surname>Fenz</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Ekelhart</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 4th international Symposium on information, Computer, and Communications Security</title>
				<meeting>the 4th international Symposium on information, Computer, and Communications Security</meeting>
		<imprint>
			<date type="published" when="2009">2009</date>
			<biblScope unit="page" from="183" to="194" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b11">
	<analytic>
		<title level="a" type="main">An ontology-based cybersecurity framework for the internet of things</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">A</forename><surname>Mozzaquatro</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Agostinho</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Goncalves</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Martins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Jardim-Goncalves</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Sensors</title>
		<imprint>
			<biblScope unit="volume">18</biblScope>
			<biblScope unit="page">3053</biblScope>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b12">
	<analytic>
		<title level="a" type="main">Developing an ontology for cyber security knowledge graphs</title>
		<author>
			<persName><forename type="first">M</forename><surname>Iannacone</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Bohn</surname></persName>
		</author>
		<author>
			<persName><forename type="first">G</forename><surname>Nakamura</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Gerth</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Huffer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Bridges</surname></persName>
		</author>
		<author>
			<persName><forename type="first">E</forename><surname>Ferragut</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Goodall</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 10th Annual Cyber and Information Security Research Conference</title>
				<meeting>the 10th Annual Cyber and Information Security Research Conference</meeting>
		<imprint>
			<date type="published" when="2015">2015</date>
			<biblScope unit="page" from="1" to="4" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b13">
	<analytic>
		<title level="a" type="main">MALOnt: An ontology for malware threat intelligence</title>
		<author>
			<persName><forename type="first">N</forename><surname>Rastogi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Dutta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">J</forename><surname>Zaki</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Gittens</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Aggarwal</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Deployable Machine Learning for Security Defense: First International Workshop, MLHat 2020</title>
				<meeting><address><addrLine>San Diego, CA, USA</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020-08-24">August 24, 2020</date>
			<biblScope unit="page" from="28" to="44" />
		</imprint>
	</monogr>
	<note>Proceedings 1</note>
</biblStruct>

<biblStruct xml:id="b14">
	<analytic>
		<title level="a" type="main">Conceptual characterization of cybersecurity ontologies</title>
		<author>
			<persName><forename type="first">B</forename><forename type="middle">F</forename><surname>Martins</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Serrano</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">F</forename><surname>Reyes</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><forename type="middle">I</forename><surname>Panach</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Pastor</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Rochwerger</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">The Practice of Enterprise Modeling: 13th IFIP Working Conference, PoEM 2020</title>
				<meeting><address><addrLine>Riga, Latvia</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020">November 25-27, 2020</date>
			<biblScope unit="page" from="323" to="338" />
		</imprint>
	</monogr>
	<note>Proceedings 13</note>
</biblStruct>

<biblStruct xml:id="b15">
	<analytic>
		<title level="a" type="main">The comparison of cybersecurity datasets</title>
		<author>
			<persName><forename type="first">A</forename><surname>Alshaibi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Al-Ani</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Al-Azzawi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Konev</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Shelupanov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Data</title>
		<imprint>
			<biblScope unit="volume">7</biblScope>
			<biblScope unit="page">22</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b16">
	<analytic>
		<title level="a" type="main">Introducing a new dataset for event detection in cybersecurity texts</title>
		<author>
			<persName><forename type="first">H</forename><forename type="middle">M D</forename><surname>Trong</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D.-T</forename><surname>Le</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><forename type="middle">P B</forename><surname>Veyseh</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><surname>Nguyen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">T</forename><forename type="middle">H</forename><surname>Nguyen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)</title>
				<meeting>the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)</meeting>
		<imprint>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="5381" to="5390" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b17">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><forename type="middle">T</forename><surname>Alam</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Bhusal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Park</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Rastogi</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2204.05754</idno>
		<title level="m">CyNER: A Python library for cybersecurity named entity recognition</title>
				<imprint>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b18">
	<analytic>
		<title level="a" type="main">Data and knowledge-driven named entity recognition for cyber security</title>
		<author>
			<persName><forename type="first">C</forename><surname>Gao</surname></persName>
		</author>
		<author>
			<persName><forename type="first">X</forename><surname>Zhang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Cybersecurity</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<biblScope unit="page" from="1" to="13" />
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b19">
	<analytic>
		<title level="a" type="main">Using BERT and augmentation in named entity recognition for cybersecurity domain</title>
		<author>
			<persName><forename type="first">M</forename><surname>Tikhomirov</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Loukachevitch</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Sirotina</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Dobrov</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Natural Language Processing and Information Systems: 25th International Conference on Applications of Natural Language to Information Systems, NLDB 2020</title>
				<meeting><address><addrLine>Saarbrücken, Germany</addrLine></address></meeting>
		<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2020">June 24-26, 2020</date>
			<biblScope unit="page" from="16" to="24" />
		</imprint>
	</monogr>
	<note>Proceedings 25</note>
</biblStruct>

<biblStruct xml:id="b20">
	<analytic>
		<title level="a" type="main">Cybersecurity named entity recognition using bidirectional long short-term memory with conditional random fields</title>
		<author>
			<persName><forename type="first">P</forename><surname>Ma</surname></persName>
		</author>
		<author>
			<persName><forename type="first">B</forename><surname>Jiang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Li</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Jiang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Tsinghua Science and Technology</title>
		<imprint>
			<biblScope unit="volume">26</biblScope>
			<biblScope unit="page" from="259" to="265" />
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b21">
	<monogr>
		<title level="m">Common Vulnerabilities and Exposures (CVE)</title>
		<ptr target="http://cve.mitre.org" />
		<imprint>
			<date type="published" when="2014-01">January 2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b22">
	<monogr>
		<author>
			<persName><forename type="first">C</forename><surname>Mitre</surname></persName>
		</author>
		<ptr target="https://nvd.nist.gov/" />
		<title level="m">National Vulnerability Database (NVD)</title>
				<imprint>
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b23">
	<analytic>
		<title level="a" type="main">A comparative study of deep learning based named entity recognition algorithms for cybersecurity</title>
		<author>
			<persName><forename type="first">S</forename><surname>Dasgupta</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Piplai</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Kotal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Joshi</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">2020 IEEE International Conference on Big Data (Big Data)</title>
				<imprint>
			<publisher>IEEE</publisher>
			<date type="published" when="2020">2020</date>
			<biblScope unit="page" from="2596" to="2604" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b24">
	<analytic>
		<title level="a" type="main">Knowledge graph based learning guidance for cybersecurity hands-on labs</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Lu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Huang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C.-J</forename><surname>Chung</surname></persName>
		</author>
		<author>
			<persName><forename type="first">F</forename><surname>Lin</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the ACM conference on global computing education</title>
				<meeting>the ACM conference on global computing education</meeting>
		<imprint>
			<date type="published" when="2019">2019</date>
			<biblScope unit="page" from="194" to="200" />
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b25">
	<analytic>
		<title level="a" type="main">Problem-based cybersecurity lab with knowledge graph as guidance</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Jha</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Huang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Journal of Artificial Intelligence and Technology</title>
		<imprint>
			<biblScope unit="volume">2</biblScope>
			<biblScope unit="page" from="55" to="61" />
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b26">
	<analytic>
		<title level="a" type="main">NeoCyberKG: Enhancing cybersecurity laboratories with a machine learning-enabled knowledge graph</title>
		<author>
			<persName><forename type="first">Y</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Z</forename><surname>Zeng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Huang</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="m">Proceedings of the 26th ACM Conference on Innovation and Technology in Computer Science Education V. 1</title>
				<meeting>the 26th ACM Conference on Innovation and Technology in Computer Science Education V. 1</meeting>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b27">
	<analytic>
		<title level="a" type="main">Building knowledge graphs from unstructured texts: Applications and impact analyses in cybersecurity education</title>
		<author>
			<persName><forename type="first">G</forename><surname>Agrawal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y</forename><surname>Deng</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Park</surname></persName>
		</author>
		<author>
			<persName><forename type="first">H</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">Y.-C</forename><surname>Chen</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Information</title>
		<imprint>
			<biblScope unit="volume">13</biblScope>
			<biblScope unit="page">526</biblScope>
			<date type="published" when="2022">2022</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b28">
	<analytic>
		<title level="a" type="main">Defining cybersecurity</title>
		<author>
			<persName><forename type="first">D</forename><surname>Craigen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Diakun-Thibault</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Purse</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">Technology Innovation Management Review</title>
		<imprint>
			<biblScope unit="volume">4</biblScope>
			<date type="published" when="2014">2014</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b29">
	<analytic>
		<title level="a" type="main">Snort: Lightweight intrusion detection for networks</title>
		<author>
			<persName><forename type="first">M</forename><surname>Roesch</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">LISA</title>
		<imprint>
			<biblScope unit="volume">99</biblScope>
			<biblScope unit="page" from="229" to="238" />
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b30">
	<analytic>
		<title level="a" type="main">Towards a definition of knowledge graphs</title>
		<author>
			<persName><forename type="first">L</forename><surname>Ehrlinger</surname></persName>
		</author>
		<author>
			<persName><forename type="first">W</forename><surname>Wöß</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">SEMANTiCS (Posters, Demos, SuCCESS)</title>
		<imprint>
			<biblScope unit="volume">48</biblScope>
			<biblScope unit="page">2</biblScope>
			<date type="published" when="2016">2016</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b31">
	<monogr>
		<author>
			<persName><forename type="first">M</forename><surname>Kejriwal</surname></persName>
		</author>
		<title level="m">Domain-specific knowledge graph construction</title>
				<imprint>
			<publisher>Springer</publisher>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b32">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Vasiliev</surname></persName>
		</author>
		<title level="m">Natural language processing with Python and spaCy: A practical introduction</title>
				<imprint>
			<publisher>No Starch Press</publisher>
			<date type="published" when="2020">2020</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b33">
	<monogr>
		<author>
			<persName><forename type="first">E</forename><forename type="middle">F</forename><surname>Sang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Veenstra</surname></persName>
		</author>
		<idno type="arXiv">arXiv:cs/9907006</idno>
		<title level="m">Representing text chunks</title>
				<imprint>
			<date type="published" when="1999">1999</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b34">
	<monogr>
		<author>
			<persName><forename type="first">J</forename><surname>Devlin</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M.-W</forename><surname>Chang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Lee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Toutanova</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1810.04805</idno>
		<title level="m">BERT: Pre-training of deep bidirectional transformers for language understanding</title>
		<imprint>
			<date type="published" when="2018">2018</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b35">
	<monogr>
		<author>
			<persName><forename type="first">Y</forename><surname>Liu</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Ott</surname></persName>
		</author>
		<author>
			<persName><forename type="first">N</forename><surname>Goyal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Du</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Joshi</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Chen</surname></persName>
		</author>
		<author>
			<persName><forename type="first">O</forename><surname>Levy</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Lewis</surname></persName>
		</author>
		<author>
			<persName><forename type="first">L</forename><surname>Zettlemoyer</surname></persName>
		</author>
		<author>
			<persName><forename type="first">V</forename><surname>Stoyanov</surname></persName>
		</author>
		<idno type="arXiv">arXiv:1907.11692</idno>
		<title level="m">RoBERTa: A robustly optimized BERT pretraining approach</title>
		<imprint>
			<date type="published" when="2019">2019</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

<biblStruct xml:id="b36">
	<analytic>
		<title level="a" type="main">Foundations of modern query languages for graph databases</title>
		<author>
			<persName><forename type="first">R</forename><surname>Angles</surname></persName>
		</author>
		<author>
			<persName><forename type="first">M</forename><surname>Arenas</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Barceló</surname></persName>
		</author>
		<author>
			<persName><forename type="first">A</forename><surname>Hogan</surname></persName>
		</author>
		<author>
			<persName><forename type="first">J</forename><surname>Reutter</surname></persName>
		</author>
		<author>
			<persName><forename type="first">D</forename><surname>Vrgoč</surname></persName>
		</author>
	</analytic>
	<monogr>
		<title level="j">ACM Computing Surveys (CSUR)</title>
		<imprint>
			<biblScope unit="volume">50</biblScope>
			<biblScope unit="page" from="1" to="40" />
			<date type="published" when="2017">2017</date>
		</imprint>
	</monogr>
</biblStruct>

<biblStruct xml:id="b37">
	<monogr>
		<author>
			<persName><forename type="first">K</forename><forename type="middle">K</forename><surname>Pal</surname></persName>
		</author>
		<author>
			<persName><forename type="first">K</forename><surname>Kashihara</surname></persName>
		</author>
		<author>
			<persName><forename type="first">P</forename><surname>Banerjee</surname></persName>
		</author>
		<author>
			<persName><forename type="first">S</forename><surname>Mishra</surname></persName>
		</author>
		<author>
			<persName><forename type="first">R</forename><surname>Wang</surname></persName>
		</author>
		<author>
			<persName><forename type="first">C</forename><surname>Baral</surname></persName>
		</author>
		<idno type="arXiv">arXiv:2105.14357</idno>
		<title level="m">Constructing flow graphs from procedural cybersecurity texts</title>
		<imprint>
			<date type="published" when="2021">2021</date>
		</imprint>
	</monogr>
	<note type="report_type">arXiv preprint</note>
</biblStruct>

				</listBibl>
			</div>
		</back>
	</text>
</TEI>
