Annotated Question and Answer Dataset for Security Export Control

Akihiko Obayashi¹, Rafal Rzepka²*
¹ Center for Innovation and Business Promotion, Hokkaido University, Kita-ku, North 21, West 10, Sapporo, Japan
² Faculty of Information Science and Technology, Hokkaido University, Kita-ku, North 14, West 8, Sapporo, Japan
obayashi@mcip.hokudai.ac.jp, rzepka@ist.hokudai.ac.jp
* Contact Author / Equal Contribution

Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Abstract

This paper introduces a set of questions and answers in Japanese on topics related to security export control. Unlike most available question answering datasets, our set comprises very detailed expert knowledge in both queries and replies. This knowledge is not widely shared, which makes it difficult to simply apply current neural approaches, and the small size of the set limits fine-tuning. By introducing this data we count on an increasing number of researchers extending contemporary NLP methods to make them applicable to very precise expert systems. As the queries may often require additional questions for clarification, this dataset can also be used for testing sophisticated task-oriented dialog systems.

1 Introduction

1.1 About Security Export Control

Security export control regulates the transfer of technologies and the export of goods in order to preserve the peace and security of the international community. It works to prevent technologies and goods that could be diverted to weapons or military use from reaching nations of concern or terrorists who could threaten that peace and security. The only existing support system for determining whether a transfer of technology or an export of goods is export-controlled is the online system of Stanford University. However, this system requires users to have specialized knowledge of export control, which makes it difficult for researchers to use without extensive training. The goal of our research is to build a dialog system [Obayashi and Rzepka, 2019] in which experts in export control and artificial intelligence collaborate to develop a novel, user-friendly support tool for non-experts. The system under development converts the text of export control laws and regulations into a computationally processable format (an ontology), automatically makes inferences about the items to be judged, and is planned to obtain missing information through dialogue. It is meant to let those without knowledge of export control determine easily and reliably whether technical information, such as a potential publication, is relevant or not. This should prevent the outflow of sensitive technology and thus contribute to national security. The system developed in this research will be provided free of charge and can be freely extended as open source.

1.2 Motivation for Sharing Data

As a small team working on an ambitious project that requires combining various approaches, the authors hope that other researchers can use the data to test their algorithms and propose new methods. Nowadays new textual resources are prepared and widely shared, but they emphasize size rather than quality, which is a natural consequence of current systems requiring big data. Moreover, most of the widely shared datasets are in English, which leads many NLP researchers from countries like Japan to work with English instead of their native language.

2 Related Works

A plethora of textual datasets has been made available for both question answering and dialog processing. Recently open-domain question answering has gained popularity and many benchmarks have been developed [Zhu et al., 2021], and combining information retrieval with deep neural networks has become a popular research area [Abbasiantaeb and Momtazi, 2021]. However, most of the datasets originate in knowledge bases like Wikipedia and concentrate on factoids with simple answers. When it comes to queries requiring explanations, ranking existing replies is one popular method, and Frequently Asked Questions have been utilized for this purpose for more than a decade [Wang et al., 2009; Lin et al., 2020]. The dataset we describe in this paper is similar to an FAQ. Another approach is to retrieve answers from linked data [Dimitrakis et al., 2020]; however, legal documents like ours are just raw text, so the related methods cannot be used in a straightforward manner. If the data could be translated into graphs, a wide range of methods would be available [Zheng et al., 2018; Yasunaga et al., 2021], but as semantic web research shows, automatic ontology generation from text is not an easy task [Elnagar et al., 2020].

Longer texts have received comparatively less attention, although recent work on long-form question answering has brought many new attempts to answer longer questions or generate longer answers when necessary. Still, the available datasets are dominated by short texts [Cambazoglu et al., 2020], and even the ones dedicated to long-form question answering are based on Wikipedia and provided mainly for English [Kwiatkowski et al., 2019]. For Japanese, there is a Wikipedia-based quiz question and answer dataset [Suzuki et al., 2020], and a machine-translated SQuAD set [Rajpurkar et al., 2016] has been made publicly available¹. [Takahashi et al., 2019] have developed a QA dataset focusing on the driving domain, created from Japanese blogs. They constructed two wide-coverage datasets in the form of QA using crowdsourcing: a predicate-argument structure one and a reading comprehension one. In the case of security export control, the domain is very narrow, so there is no easy access to expert knowledge and traditional crowdsourcing is not possible.

3 Data Description

Our dataset is compiled from the guidance files prepared by the Center for Information on Security Trade Control (CISTEC)². Their target audience is those involved in security export control at companies, universities, etc., and their aim is to provide the information needed to determine whether a shipment is relevant or not. It is therefore assumed that the readers have some basic knowledge of security export control and understand information on cargo in their field (e.g. chemical preparations). The FAQ data is provided within guidance PDF files³, so we retrieved the questions and answers manually. We omitted pairs in which images were used in the question or in the reply, and pairs that refer to other documents without giving a direct answer. In total we have extracted 548 question-answer pairs, which are available upon request. Some long questions have short answers and, vice versa, some short questions require detailed replies. Due to length constraints, here we analyze rather short examples (longer ones are given in the Appendix). The first one is about carrying a certain amount of a compound abroad:

Q: I am planning to take a very small amount of hydrogen fluoride (e.g., about 10g) to a foreign country. In this case, do I need to apply for an export license?

A: For substances that are used as raw materials for chemical preparations for the military, such as hydrogen fluoride, an export license application is required when taking them out of Japan and into foreign countries, even in very small quantities.

In the example above the question directly specifies its type (“do I need a license?”), for which a yes/no answer can be determined (although an explanation should be added), but this is not always the case. For example, let us consider the following Q&A:

Q: We are planning to export a storage container with teflon (tetrafluoroethylene resin) -coated wetted parts to China. In this case, how should we determine whether it is relevant or not?

A: Fluoropolymers listed in Article 2, Paragraph 2, Item 2 (c) of the Ministerial Order in Paragraph 3 (2) of Appended Table 1 of the Export Trade Control Order include tetrafluoroethylene, so those with a sealed structure are applicable.

In this case the answer is labeled as “controlled” (meaning “requiring a license”), but this alone does not mean that the item is strictly applicable. The reason is that the regulations also include a standard for capacity: for example, if the capacity does not exceed 0.1 cubic meter, the item is not considered “controlled”. This is because the Q&A only takes into account the factors that are important when a human makes the judgment. An expert knows that fluoropolymers include tetrafluoroethylene resin, which is an effective piece of knowledge that allows a human to focus on the appropriate category. Similar knowledge would be necessary for a machine to perform the judgment, which makes this dataset a useful testbed for multihop question answering on long-form texts.

4 Data Annotation

All question-answer pairs are annotated a) by the first author, who is a security export control expert, and b) by a program matching the article numbers and glossary terms included in the pairs.

4.1 Expert Annotation

We have decided to label the answers with four categories: “controlled”, “not controlled”, “requiring confirmation” and “other”. Example pairs for each of the four categories are given in the Appendix. The expert read all questions and the related answers, then classified the type of each answer with the labels above. There are 95 answers labelled as “Controlled”, 138 as “Not Controlled”, 139 as “Requiring Confirmation” and 176 as “Other”. Some of the answers were difficult to classify: for example, the licence requirement was obvious, but there was also a chance of some other problem occurring and the answer provided advice as well. Such cases make the labels “Requiring Confirmation” and “Other” simultaneously relevant. The high number of “Other” labels suggests that the granularity of the miscellaneous category could be increased in the future. Many examples suggest that “defining”, “confirming” and “clarifying” could be used as labels, but the dataset is still not very large, so we decided to keep four labels.

4.2 Automatic Annotation

The Japan Machinery Center for Trade and Investment⁴ publishes a 360-page booklet containing security trade control terms linked to related article numbers. By courtesy of the Center we acquired an electronic version of this material and used the human-made terms for automatic keyword annotation. It must be noted that, apart from specialized words such as chemical compound names or names of viruses, everyday words like “confirm” (kakunin) or “necessary” (hitsuyō) are also included. Besides keywords, we also used article numbers from an XML file with related regulations⁵. It has to be noted that nearly 20% of the questions contained at most one keyword or article reference (none: 8.2%, one: 10.9%)⁶. Because the average length of a question in the QA dataset is 110.16 ideograms (73.5 morphological tokens) and the average number of nouns per question is 35.09, it can again be hypothesized that properly processing the data we prepared will require reaching beyond simple approaches.

5 Conclusions and Future Work

In this paper we have introduced a unique expert question answering dataset for the domain of security export control. Although relatively small (548 pairs), the developed data fills a gap in Japanese QA datasets by providing long-form questions and answers in which both inquirer and answerer are experts, albeit with different levels of expertise. We have already performed a set of experiments [Rzepka et al., 2021] to see whether classic (LDA [Blei et al., 2003]) and neural (BERT [Devlin et al., 2019]) approaches can correctly rank existing answers and find them in regulatory texts, but the results show that they perform worse than simple keyword matching. Because the set is small and the range of topics is very wide, fine-tuning is limited. However, there are more sophisticated methods for ranking, retrieving and generating answers still to be tested. By providing this dataset we hope that the Japanese NLP community will use it to expand current methods so that they can deal with this unorthodox set of questions and answers annotated with answer intents, glossary words and article numbers.

6 Acknowledgements

This work was supported by JSPS KAKENHI Grant Number 20K12556.

7 Appendix

Examples of annotated questions and answers are given in Tables 1 and 2. Label abbreviations are: Con = “Controlled” (applicable), N-Con = “Not Controlled” (not applicable), Conf = “Requiring Confirmation”, Other = “Miscellaneous”.

¹ https://www.ai-shift.co.jp/techblog/1224
² https://www.cistec.or.jp/english/export/faq.html
³ https://www.cistec.or.jp/members/f guidance/index.html (registration is required to access the guidance files)
⁴ http://www.jmcti.org/jmchomepage/english/
⁵ They are specified in the “Appended Table of the Foreign Exchange Order” (FEO) and the “Appended Table 1 of the Export Trade Control Order” (ETCO) provided by the Japanese government.
⁶ One of the remaining problems is that article numbers can be written in formal and informal, full and partial forms, so the full formal numbers we used cover only a small part of the data. We need to address this problem with a manual approach or sophisticated regular expressions.

References

[Abbasiantaeb and Momtazi, 2021] Zahra Abbasiantaeb and Saeedeh Momtazi. Text-based question answering from information retrieval and deep neural network perspectives: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, page e1412, 2021.

[Blei et al., 2003] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, March 2003.

[Cambazoglu et al., 2020] B. Barla Cambazoglu, Mark Sanderson, Falk Scholer, and Bruce Croft. A review of public datasets in question answering research. In ACM SIGIR Forum, 2020.

[Devlin et al., 2019] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4171–4186, Minneapolis, Minnesota, June 2019. Association for Computational Linguistics.

[Dimitrakis et al., 2020] Eleftherios Dimitrakis, Konstantinos Sgontzos, and Yannis Tzitzikas. A survey on question answering systems over linked data and documents. Journal of Intelligent Information Systems, 55(2):233–259, 2020.

[Elnagar et al., 2020] Samaa Elnagar, Victoria Y. Yoon, and Manoj A. Thomas. An automatic ontology generation framework with an organizational perspective. In 53rd Hawaii International Conference on System Sciences, HICSS 2020, Maui, Hawaii, USA, January 7-10, 2020, pages 1–10. ScholarSpace, 2020.

[Kwiatkowski et al., 2019] Tom Kwiatkowski, Jennimaria Palomaki, Olivia Redfield, Michael Collins, Ankur Parikh, Chris Alberti, Danielle Epstein, Illia Polosukhin, Matthew Kelcey, Jacob Devlin, Kenton Lee, Kristina N. Toutanova, Llion Jones, Ming-Wei Chang, Andrew Dai, Jakob Uszkoreit, Quoc Le, and Slav Petrov. Natural questions: a benchmark for question answering research. Transactions of the Association for Computational Linguistics, 2019.

[Lin et al., 2020] Jimmy Lin, Rodrigo Nogueira, and Andrew Yates. Pretrained transformers for text ranking: BERT and beyond. arXiv preprint arXiv:2010.06467, 2020.

[Obayashi and Rzepka, 2019] Akihiko Obayashi and Rafal Rzepka. Towards interactive advisory system for security export control. In Proceedings of IJCAI Workshop on Language Sense on Computer, Macau, 2019.

[Rajpurkar et al., 2016] Pranav Rajpurkar, Jian Zhang, Konstantin Lopyrev, and Percy Liang. SQuAD: 100,000+ questions for machine comprehension of text. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 2383–2392, Austin, Texas, November 2016. Association for Computational Linguistics.

[Rzepka et al., 2021] Rafal Rzepka, Daiki Shirafuji, and Akihiko Obayashi. Limits and challenges of embedding-based question answering in export control expert system. In Proceedings of the 25th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (to appear), Szczecin, Poland, September 2021. Springer.

[Suzuki et al., 2020] Masatoshi Suzuki, Jun Suzuki, Koji Matsuda, Kyosuke Nishida, and Naoya Inoue. JAQKET: Constructing a Japanese question answering dataset from trivia questions (in Japanese). In Proceedings of the 26th Annual Meeting of the Association for Natural Language Processing (NLP 2020), 2020.

[Takahashi et al., 2019] Norio Takahashi, Tomohide Shibata, Daisuke Kawahara, and Sadao Kurohashi. Machine comprehension improves domain-specific Japanese predicate-argument structure analysis.
In Proceedings of the 2nd Workshop on Machine Reading for Question Answering, pages 98–104, Hong Kong, China, November 2019. Association for Computational Linguistics.

[Wang et al., 2009] Xin-Jing Wang, Xudong Tu, Dan Feng, and Lei Zhang. Ranking community answers by modeling question-answer relationships via analogical reasoning. In Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval, pages 179–186, 2009.

[Yasunaga et al., 2021] Michihiro Yasunaga, Hongyu Ren, Antoine Bosselut, Percy Liang, and Jure Leskovec. QA-GNN: Reasoning with language models and knowledge graphs for question answering. In North American Chapter of the Association for Computational Linguistics (NAACL), 2021.

[Zheng et al., 2018] Weiguo Zheng, Jeffrey Xu Yu, Lei Zou, and Hong Cheng. Question answering over knowledge graphs: question understanding via template decomposition. Proceedings of the VLDB Endowment, 11(11):1373–1386, 2018.

[Zhu et al., 2021] Fengbin Zhu, Wenqiang Lei, C. Wang, Jianming Zheng, Soujanya Poria, and Tat-Seng Chua. Retrieving and reading: A comprehensive survey on open-domain question answering. ArXiv, abs/2101.00774, 2021.

Table 1: Examples of annotated questions and their answers for categories “Controlled” (Con) and “Not Controlled” (N-Con)

Question: In order to eliminate multipath signals from radio and TV broadcasts, etc., a receiver (frequency range 1.5 MHz to 87.5 MHz) has been designed using adaptive interference signal suppression technology. The design is capable of suppressing multipath signals beyond 15 decibels. Is this design technology regulated by the technology specified in Article 21, Paragraph 2, Item 3-2(d)(3) of the Ministerial Ordinance?
Answer: Article 21, paragraph (2), item (iii)-(d)(3) of the Ministerial Ordinance regulates the technology required for the design or manufacture of radio transmitters and radio receivers (frequency range 1.5 MHz to 87.5 MHz) designed to suppress disturbance signals beyond 15 decibels using adaptive interference signal suppression technology. (2) A multipath signal is not an interference signal because it is a signal that is reflected or diffracted from the desired signal by obstacles such as buildings and terrain. Therefore, if a cargo design technology uses adaptive interference signal suppression technology that is effective only for multipath signals, it is “not applicable” to Article 21, Paragraph 2, Item 3-2(d)(3) of the Ministerial Ordinance. However, if the adaptive interference signal suppression technology is equally effective against disturbance signals and the cargo is capable of suppressing disturbance signals in excess of 15 decibels, the technology required for its design or manufacture is “applicable”.
Label: Con

Question: We are planning to set up a subsidiary in a foreign country to manufacture cross-flow filtration systems and components (membrane modules, etc.) in order to be cost competitive. This subsidiary will be operated solely with the capital of a Japanese company, but do we need a permit?
Answer: Cross-flow filtration devices and components (e.g., membrane modules) are list-controlled goods, and their design, manufacture, and use technology are subject to regulation. In addition, even joint ventures and proprietary subsidiaries are considered non-residents, and the provision of technology to non-residents is subject to regulation. The residency of a corporation, etc. shall be determined based on whether or not it has its principal office in Japan, and the residency of a branch office, sub-branch office or other office of a corporation, etc. shall be determined as follows: (1) Branches, branch offices and other offices of Japanese corporations, etc. located in foreign countries shall be treated as nonresidents. From “Interpretation and Application of Foreign Exchange Laws and Regulations”.
Label: Con

Question: We have provided a numerical control program built into a machine tool that falls under the list regulations using the special exception in Article 9, Paragraph 2, Item 14 (c) of the Ministerial Ordinance on Foreign Trade. However, we found out that the program has a bug. We would like to provide a corrected program, but since it is not built into the cargo this time, we cannot use this special exception. In this case, do we need to obtain a new service transaction permit?
Answer: According to the results of public comments at the time of the revision of the Ministerial Ordinance in August 2012, the view was expressed as “Programs provided with a service transaction license” in Article 9, Paragraph 2, Item 14 (d) of the Foreign Trade Ordinance includes programs provided by applying the special provisions that do not require a license. Since this case falls under this category, the special exception in Article 9, Paragraph 2, Item 14 (d) of the Foreign Trade Ordinance can be applied and no new service transaction license is required.
Label: N-Con

Question: One of our products is a lubricant oil. This product is mainly composed of ingredients regulated in Section 5 (12), but it is designed and manufactured only for use as a lubricant, and we have not confirmed its actual use as a refrigerant in electronic equipment. Is it correct to assume that such products cannot be used as refrigerants, and that they do not fall under item 5(12)?
Answer: If it is judged that the product cannot be used for the regulated application based on the specifications at the time of manufacture and actual examples of use, it may be judged as not applicable. We recommend that you keep the information that forms the basis for your judgment in some form, such as by storing it in writing together with the documents used to determine whether the product is applicable or not.
Label: N-Con

Table 2: Examples of annotated questions and their answers for categories “Requiring Confirmation” (Conf) and “Miscellaneous” (Other)

Question: What should I do if I don’t know how much of the relevant chemical substance is contained in the cargo I intend to export?
Answer: In the case of a trading company, the information should be based on the information exchanged in normal business practices (MSDS, catalog, etc.). If the chemical substance does not fall under any of items 1(3), (13), 2(3), or 4(6), and does not exceed 10% of the price of other goods in the mixed destination, it is not considered to be a relevant chemical substance. Normally, the price of a small amount of a substance that is not listed on the MSDS, etc. is not considered to exceed 10% of the total price of the shipment. However, if there is little information on the chemical substances contained in the product and you do not know what the main substances are, it is necessary to confirm the applicability by contacting the manufacturer.
Label: Conf

Question: How should diaphragm type vacuum pumps and bellows-type vacuum pumps be judged as to whether they are applicable or not? Are they classified as “vacuum pumps” or “seal-less pumps”?
Answer: Diaphragm type vacuum pumps and bellows type vacuum pumps can usually only handle gas and cannot handle liquid. To that extent, none of the pumps fall into the category of normal seal-less pumps that can handle liquids. Determine the applicability of the pump as a vacuum pump. However, if the pump can also handle liquid, it must also be judged as a seal-less pump. (Canned pumps, magnet pumps, bellows pumps, and diaphragm pumps are classified as “seal-less pumps”.)
Label: Conf

Question: Does the provision of “single or composite oxides of zirconium and composite oxides of silicon or aluminum” in Paragraph 5(3) of the Appended Table of the Foreign Exchange and Foreign Trade Ordinance and Article 17, Paragraph 3, Item 1(a)(1) of the Ministerial Ordinance on Freight, etc. regulate four types of oxides: 1) single oxide of zirconium, 2) composite oxide of zirconium, 3) silicon, and 4) composite oxide of aluminum?
Answer: No, it doesn’t. The same regulation regulates four types of oxides: 1) single oxides of zirconium, 2) complex oxides of zirconium, 3) complex oxides of silicon, and 4) complex oxides of aluminum. The English text of the Wassenaar Arrangement states “Single or complex oxides of zirconium and complex oxides of silicon or aluminium”.
Label: Other

Question: If a model for which the declared value was accepted before November 19, 2009, and has not been produced for a considerable period of time in the past, how should the declared value be handled?
Answer: In accordance with the provisions of 12) Others (3) of “Declared Values for Linear Axis Positioning Accuracy” (Export Notes 21, No. 49, 2009/11/13 Trade Bureau No. 3, dated September 27, 2013), please submit the following documents. Documents proving that production has been discontinued (if the possibility of resuming production in the future cannot be denied, please include the following: a copy of the checklist acceptance sheet (1 copy) and the original of the previous declaration value acceptance sheet.
Label: Other
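As an illustration of the automatic annotation described in Section 4.2 (matching glossary terms and article numbers in each question-answer pair), the following minimal Python sketch may be helpful. It is not the authors' program: the glossary excerpt, the regular expression and the function name annotate are hypothetical, and it works on the English renderings used in this paper, whereas the dataset itself is in Japanese.

```python
import re

# Hypothetical glossary excerpt; the real term list comes from the
# booklet mentioned in Section 4.2 and is not reproduced here.
GLOSSARY = ["hydrogen fluoride", "export license", "confirm", "necessary"]

# Hypothetical pattern for full, formal English-style references such as
# "Article 2, Paragraph 2, Item 2 (c)". Informal or partial forms are
# matched only up to the point where they stop following this shape.
ARTICLE_RE = re.compile(
    r"Article\s+\d+"
    r"(?:,\s*Paragraph\s+\d+)?"
    r"(?:,\s*Item\s+\d+(?:\s*\([a-z]\))?)?",
    re.IGNORECASE,
)


def annotate(text, glossary=GLOSSARY):
    """Return the glossary terms and article references found in text."""
    lowered = text.lower()
    keywords = [term for term in glossary if term.lower() in lowered]
    articles = [m.group(0) for m in ARTICLE_RE.finditer(text)]
    return {"keywords": keywords, "articles": articles}


if __name__ == "__main__":
    answer = (
        "Fluoropolymers listed in Article 2, Paragraph 2, Item 2 (c) of the "
        "Ministerial Order include tetrafluoroethylene, so an export license "
        "is required."
    )
    print(annotate(answer))
```

On a reference written in another style, such as "Article 21, paragraph (2), item (iii)", this pattern captures only "Article 21", which illustrates the coverage problem with formal and informal article numbers noted in footnote 6.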