<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>ITGInsight - Discovering and Visualizing Science, Technology and Innovation Information for Generating Competitive Technological Intelligence</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xuefeng Wang</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Shuo Zhang</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yuqin Liu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>School of Journalism and Publication, Beijing Institute of Graphic Communication</institution>
          ,
          <addr-line>Beijing, 102600</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>School of Management and Economics, Beijing Institute of Technology</institution>
          ,
          <addr-line>Beijing, 100081</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <fpage>3</fpage>
      <lpage>20</lpage>
      <abstract>
        <p>Most organizations today face the challenge of tracking the latest technological developments and identifying technological opportunities and threats in their competitive environment. In this context, intelligence analysis methods have been widely adopted, and many technology intelligence techniques have been embedded into general-purpose tools to support the extraction of valuable information from textual data. However, no single tool has been powerful and flexible enough to incorporate all the key elements of the analysis process (data retrieval, preprocessing, normalization, analysis, visualization, and interpretation); existing tools offer them only in limited form. Obtaining such intelligence, especially from textual data, therefore remains difficult. To address the needs of competitive technological intelligence and remedy these shortcomings, we developed ITGInsight. It offers four key features that distinguish it from other software tools: (a) a powerful data preprocessing module; (b) a flexible user-defined analysis module; (c) a gorgeous data visualization module; and (d) a professional automatic interpretation module. Finally, an empirical study of synthetic biology is used to describe ITGInsight in more detail. The experimental results show that it is a powerful tool for generating effective competitive technological intelligence, such as profiling</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>science &amp; technology domains, mapping research front relationships, and discerning overall trends, providing more insightful information to intelligence consumers, especially in environments with little or no expert support.</p>
    </sec>
    <sec id="sec-2">
      <title>Introduction</title>
      <p>Organizations today face the challenge of keeping pace with the latest technological developments and identifying technological opportunities and threats in their competitive environment, especially given the exponential growth of accessible information. Amid this wave of rapid and intense technological change, timely and relevant information on new technologies and their industrialization, in both internal and external environments, not only has a direct impact on the competitiveness and development of organizations (Liao, Sun, and Wang, 2003), but has also become an important force for boosting economic growth and a main indicator of the overall competitiveness of a country or territory. As intelligence analysis methods have been applied in industrial and commercial settings, the technical competitiveness of industries and organizations and their ability to respond rapidly to information have improved. However, many organizations still struggle to interpret the information they acquire, which can be an effective source of intelligence if analyzed well. Detailed guidelines on how to analyze data and generate different types of intelligence that meet the specific needs of intelligence consumers are therefore urgently needed.</p>
      <p>According to Porter and Cunningham, competitive technological intelligence (CTI) has tremendous innovation potential and involves a wide range of Science, Technology and Innovation (ST&amp;I) information, which addresses real-world concerns for diverse targets and emphases, e.g., profiling technology domains, mapping research front relationships, discerning overall trends, and detecting “who is doing what, when and where” (Porter and Cunningham, 2005). The “what” question is far more challenging, but it can be enriched by summarizing the experience and knowledge of domain experts or by extracting topical content, especially noun phrases and domain terms, from related text (Newman, Porter, Newman, Trumbach, and Bolan, 2014). In intelligence analysis, apart from the typical qualitative methods (e.g., Delphi, expert interviews, scenario planning), quantitative methods offer an appealing alternative to expert opinion. However, some quantitative methods (e.g., bibliometrics, patent analysis) use only simple bibliometric indicators (Milanez et al., 2014), which cannot reflect technology changes in granular detail. Therefore, more researchers are making efforts to adopt advanced
quantitative methods to solve the problem, such as morphological analysis (Lee et al.,
2007; Yoon &amp; Park, 2005; Yoon et al., 2008), TRIZ (Yoon &amp; Kim, 2011; Zhang, Zhou,
Porter, &amp; Gomila, 2014), conjoint analysis (Xin et al., 2010; Yoon &amp; Park, 2007),
technology roadmapping (Choi et al., 2013; Huang et al., 2014; Lee et al., 2008; Lee et al.,
2009; Zhang et al., 2013; Zhang, Zhou, Porter, Gomila, et al., 2014) and text mining,
which could enhance the capability of CTI but still rely heavily on expert opinion. With the rapid growth of information and the fragmentation of technology domains, domain experts may become less reliable (Shibata et al., 2008). In some cases, experts' biases and insufficient knowledge may even introduce inaccurate information into the results. In this context, tech mining emerged, which derives information from textual data by combining bibliometrics and text mining techniques. Many scholars have built on this approach. Using text-mining tools and bibliometric indices, Zhu and Porter (2002) showed how to exploit huge volumes of available information quickly and how to use partially automated processes to produce informative representations, generating helpful knowledge from text quickly and graphically. These analytical findings can be tailored to the needs of particular technology managers. Joung and Kim (2017) proposed a technical keyword-based analysis of patents to monitor emerging technologies and applied a keyword-based model to content-based patent analysis. They used a case study of electron transfer mechanisms in electrochemical glucose biosensors to demonstrate how the proposed method can monitor emerging technologies. These results show that tech mining can be used to good effect. It can help us to understand, to some extent, the detailed contents and knowledge flows in technology information (Huang et al., 2016).</p>
      <p>Therefore, to address the needs of CTI and remedy the shortcomings mentioned above, we developed ITGInsight. It is compatible with multiple data sources (e.g., Web of Science, Derwent Innovation, Google, Twitter) and incorporates various techniques, algorithms, and measures for all the steps in intelligence analysis, from data preprocessing through discovery to visualization. Most importantly, it provides an automatic interpretation function for analytical results, which helps intelligence consumers write reports for different technology domains in environments with little or no expert support. Users then only need to add details to the report according to their own requirements. To some extent, it can mitigate the drawbacks of individual methods and make full use of their strengths to generate effective CTI, e.g., discovering significant clues about technology fronts and evolution trends. To the best of our knowledge, although related work has been booming in academia, ITGInsight, and especially its automatic interpretation function, makes a substantial contribution to CTI discovery.</p>
      <p>The remainder of this paper consists of four sections. We first make a brief
literature review to introduce some related works. Then, we apply our software to synthetic
biology, providing an overview of ITGInsight’s functionality for generating and
interpreting CTI, and elaborate on the technical implementation of specific parts of the
program. Finally, we present remarks and directions for further study.
</p>
    </sec>
    <sec id="sec-3">
      <title>Related Work</title>
      <p>Many technology intelligence methods have been embedded into general-purpose software to support the extraction of valuable information from the literature and other textual data. For some organizations, however, it remains difficult to select appropriate data sources and software: most tools were developed to meet the needs of a specific domain, and much of the science and technology that is performed is not documented, so their applications are limited. Researchers sometimes even have to use more than one software tool to perform a deeper analysis. Furthermore, most technology intelligence software lacks systematic analysis functions, and little related research gives instructions on the methods and techniques embedded in these tools. Given these problems, detailed guidelines on how to analyze data and generate different types of intelligence that meet the specific needs of intelligence consumers are urgently needed. Lee, Mortara, Kerr, Phaal, and Probert (2012) and Liu, Wang, and Lei (2015) investigated the possibilities of applying various data-mining techniques to the literature, and Lee et al. (2012) also conducted exploratory analyses such as trend analysis, portfolio analysis, and if-then analysis. They divided technology intelligence software into four categories: search-only tools, search and summarisation tools, advanced text analysis tools, and advanced literature management tools. Cobo, Lopez-Herrera, Herrera-Viedma, and Herrera (2011) presented an analysis of the features, advantages, and drawbacks of these tools. They concluded that no single tool was powerful and flexible enough to incorporate all the key elements of the analysis process (data retrieval, preprocessing, normalization, analysis, visualization, and interpretation) (Cobo et al., 2011). That is why we designed ITGInsight, which presents four key features that other software tools either do not have or have only in limited form:
• Powerful data preprocessing module: ITGInsight is compatible with multiple data sources and implements a wide range of data preprocessing tools, such as natural language processing, duplicate record detection, entity disambiguation (normalization), and network extraction.
• Flexible user-defined analysis module: ITGInsight incorporates various methods, algorithms, and measures for all the steps in intelligence analysis and supports their combined use. In this way, users can carry out the analysis according to their specific needs.
• Gorgeous data visualization module: ITGInsight can easily process tens of thousands of records and display maps that contain more than 20,000 items. It also offers zooming, scrolling, searching, and user-defined graph types, which facilitate the detailed examination of large maps.
• Professional automatic interpretation module: ITGInsight can interpret analytical results automatically for different technology domains on the basis of a widely used framework. It can thus provide general solution patterns and a standardized decision support process to intelligence consumers, especially in environments with little or no expert support.</p>
    </sec>
    <sec id="sec-4">
      <title>ITGInsight</title>
      <p>In this section, we choose synthetic biology as our target technology field and provide an overview of ITGInsight’s functionality for generating CTI (see note 1), covering the whole process from data preprocessing through visualization to interpretation of the results.</p>
      <p>Note 1. For a more extensive discussion of the functionality of ITGInsight, we refer to the ITGInsight manual, which is available at http://cn.itginsight.com/Files/download/itginsight_manual.pdf.</p>
      <p>We applied the consolidated search strategy proposed by Shapira, Kwon, and Youtie (2017) to publications recorded in the Web of Science Science Citation Index Expanded for the period 2000-2020. The total worldwide number of records obtained by applying this search strategy (on 9 May 2020) was 12,725. ITGInsight was then used for record cleaning. After removing duplicate records, including papers published with the same title, abstract, and authors as other articles, our synthetic biology publication dataset comprises 12,525 publication records.</p>
      <sec id="sec-4-1">
        <title>Natural Language Processing</title>
        <p>For intelligence researchers, data preprocessing is essential to the accuracy and quality of intelligence analysis, because the data they acquire are often unstructured or semi-structured and contain errors that may affect further analysis. ITGInsight provides functions such as automatic duplicate record deletion and entity disambiguation (normalization). In addition, the natural language processing techniques embedded in ITGInsight can extract a set of subject words from any structured or unstructured text (e.g., the titles and abstracts of publications, or Web text data) to generate more intelligence information. In most cases, the set of subject words retrieved in this way is large and “noisy”, which makes it difficult to interpret. Therefore, ITGInsight also provides further processing functions that generate better domain terms for CTI. The process includes four main steps, as described in Table 1:</p>
        <sec id="sec-4-1-1">
          <title>Table 1. Word segmentation process</title>
          <p>Raw dataset of 12,525 publications: apply a Hidden Markov Model for word segmentation.</p>
          <p>Data cleaning: remove common or meaningless words (e.g., a/an, the, what, detailed description, some time, method) and extreme words (e.g., words that occur in only one record, or words with a length of less than 2).</p>
          <p>Fuzzy matching: combine words with similar structures based on pattern commonality, such as stemming and text similarity.</p>
          <p>Sequencing: sort the words according to their C-value, forming the preliminary results.</p>
          <p>Note 2. ITGInsight supports various user-defined dictionaries, which can be used to intervene in the natural language processing without directly modifying the original data.</p>
          <p>Table 2 (excerpt), word segmentation output: gene expression; metabolic engineering; artificial cell; mammalian cell; escherichia coli; essential gene; synthetic gene; synthetic promoter; degree c.</p>
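          <p>The data cleaning rules listed in Table 1 can be illustrated with a minimal Python sketch. It is not ITGInsight's implementation: the stop list, the function name clean_terms, and the thresholds are assumptions chosen to mirror the examples given in the table (common words, words that occur in only one record, words shorter than two characters).</p>
          <preformat>
```python
from collections import Counter

# Hypothetical stop list built from the examples in Table 1; the list
# actually used by ITGInsight is not given in the paper.
STOP_WORDS = {"a", "an", "the", "what", "detailed description", "some time", "method"}


def clean_terms(records, stop_words=STOP_WORDS, min_length=2, min_records=2):
    """Return the candidate terms that survive the cleaning rules.

    records: one list of candidate terms per publication record.
    """
    # Count in how many records each (lower-cased) term appears.
    doc_freq = Counter()
    for terms in records:
        doc_freq.update({term.lower() for term in terms})

    kept = set()
    for term, n_records in doc_freq.items():
        if term in stop_words:
            continue  # common or meaningless word
        # Drop "extreme" words: too rare or too short.
        if n_records >= min_records and len(term) >= min_length:
            kept.add(term)
    return kept
```
          </preformat>
          <p>With these assumed defaults, a term is kept only if it appears in at least two records and has at least two characters.</p>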
        </sec>
        <sec id="sec-4-1-2">
          <title>Domain terms</title>
          <p>Table 2 (excerpt), domain terms: gene expression; metabolic engineering; artificial gene synthesis; synthetic cell; synthetic promoter; essential gene; escherichia coli; gene circuit.</p>
          <p>Table 1 (continued). Consolidation and modification (preliminary results): words that have the same meaning (in particular, words that refer to the same technology) are merged to improve the level of integration, and some weakly correlated words are removed after consulting domain experts, forming the “Thesaurus Dictionary” (see note 2).</p>
          <p>Domain terms extraction. Raw dataset of 12,525 publications: apply the Hidden Markov Model and the “Thesaurus Dictionary” for domain terms extraction.</p>
          <p>Screening: obtain the top 30 domain terms according to the C-value (PC-value, TF-IDF, or frequency).</p>
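          <p>The paper does not define the C-value, so the following Python sketch assumes the standard definition from the term extraction literature: the frequency of a candidate term, reduced by the average frequency of the longer candidates that contain it, and weighted by the base-2 logarithm of its length in words. The function names and the input format are illustrative assumptions, not part of ITGInsight.</p>
          <preformat>
```python
import math


def contains(longer, shorter):
    """True if the word sequence `shorter` occurs contiguously in `longer`."""
    n, m = len(longer), len(shorter)
    return any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_values(freq):
    """Compute C-values for candidate terms.

    freq maps each term (words separated by spaces) to its corpus frequency.
    """
    words = {term: tuple(term.split()) for term in freq}
    scores = {}
    for term, f_a in freq.items():
        a = words[term]
        # Frequencies of longer candidates in which `a` is nested.
        nests = [freq[t] for t, b in words.items()
                 if len(b) > len(a) and contains(b, a)]
        if nests:
            f_a = f_a - sum(nests) / len(nests)
        scores[term] = math.log2(len(a)) * f_a
    return scores


def top_terms(freq, k=30):
    """Return the k terms with the highest C-value."""
    scores = c_values(freq)
    return sorted(scores, key=scores.get, reverse=True)[:k]
```
          </preformat>
          <p>Under this definition single-word candidates score zero, which matches the multi-word domain terms reported in Table 2.</p>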
          <p>In Table 2, we list the top 30 word segmentation outputs and domain terms, ranked by C-value.</p>
          <p>Table 2 (continued), terms from the word segmentation output and domain term columns: synthetic gene network; biological system; nucleic acid; system biology; synthetic genetic network; synthetic gene circuit; natural product; genetic circuit; c. glutamicum; expression level; genetic code; negative auto regulation; biosynthetic pathway; system metabolic engineering; membrane protein; dna sequence.</p>
          <p>Note. “Synthetic biology” is not included in Table 2.</p>
          <p>Table 2 shows some differences between the results for “Word segmentation” and “Domain term” (marked in grey). For example, the word segmentation outputs contain some meaningless words (e.g., expression level, degree c). In addition, for the terms “artificial cell”, “synthetic gene network”, “genetic circuit” and “synthetic biology approach”, we perform entity disambiguation (normalization) in “Domain terms extraction” by using the “Thesaurus Dictionary”.</p>
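          <p>Normalization with a thesaurus dictionary can be pictured as a lookup that maps each variant to a preferred term and merges their counts. The sketch below is only an illustration: the two mappings are hypothetical examples, since the paper does not list the preferred form chosen for each variant, and normalize_counts is not an ITGInsight function.</p>
          <preformat>
```python
from collections import Counter

# Hypothetical thesaurus entries (variant -> preferred domain term).
THESAURUS = {
    "artificial cell": "synthetic cell",
    "genetic circuit": "gene circuit",
}


def normalize_counts(term_counts, thesaurus):
    """Merge the counts of variant terms into their preferred term."""
    merged = Counter()
    for term, count in term_counts.items():
        # Terms without a thesaurus entry are kept unchanged.
        merged[thesaurus.get(term, term)] += count
    return merged
```
          </preformat>
          <p>Because the mapping is applied to the counts and not to the records, the original data stay unmodified, in line with the role of user-defined dictionaries described in note 2.</p>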
        </sec>
      </sec>
      <sec id="sec-4-2">
        <title>Finding Research Fronts</title>
        <p>Identifying the research fronts in a specific field not only helps to understand its current development status and future trends, but also provides CTI to intelligence consumers. It also helps governments to formulate relevant technology policies. We take “bibliographic coupling” analysis as an example for our experiment. We use the LinLog layout algorithm (see note 3) for clustering analysis and obtain six prominent clusters, as shown in Figure 1.</p>
        <p>Note 3. ITGInsight provides various clustering algorithms, such as LinLog, VOSmapping, and TSNE.</p>
        <p>Fig. 1. A 396-node network of bibliographic coupling on synthetic biology (2000-2020). Note. The number at the corner of each circle is the number of articles found in the cluster.</p>
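        <p>Bibliographic coupling links two documents that cite at least one common reference, and the number of shared references is the usual coupling strength. The following Python sketch assumes a simple input format (a dictionary from document identifiers to cited reference identifiers); it shows the general technique, not ITGInsight's internal data model.</p>
        <preformat>
```python
from itertools import combinations


def coupling_strengths(references):
    """Return the bibliographic coupling strength of each document pair.

    references: dict mapping a document id to the ids of the references
    it cites. Pairs that share no reference are omitted.
    """
    ref_sets = {doc: set(refs) for doc, refs in references.items()}
    strengths = {}
    for a, b in combinations(sorted(ref_sets), 2):
        shared = len(ref_sets[a].intersection(ref_sets[b]))
        if shared > 0:
            strengths[(a, b)] = shared
    return strengths
```
        </preformat>
        <p>The resulting weighted pairs form the edge list of a coupling network such as the one in Figure 1, which a layout or clustering algorithm can then process.</p>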
        <p>We take Cluster 1 as an example and describe it in more detail. Cluster 1: Synthetic mammalian gene circuits. This cluster contains 43 core documents, and its high-frequency domain terms include mammalian cell [6], gene circuit [5], synthetic biology method [3], gene expression [3], mathematical model [3], synthetic gene circuit [3], phase separation [3], live cell [3], and logic gate [3]. An important aspect is to associate domain terms with research fronts, which helps researchers to identify and trace the key innovations and relations that significantly affect a technology’s development and complements the initial results of bibliographic coupling clustering. Specifically, ITGInsight can display the analysis results for research fronts in seven different ways and interpret them automatically. We take the topographic map as an example, in which the density and color depth of domain terms give more insight into the main advances in a specific technology (its technical implementation is discussed in the Methods section). In Figure 2, we find that “crispr system” has a strong relationship with “novel drug target”, which has been confirmed in published research. Fiona et al. (2019) performed genome-scale CRISPR-Cas9 screens in 324 human cancer cell lines from 30 cancer types and developed a data-driven framework to prioritize candidates for cancer therapeutics. Their analysis provides a resource of cancer dependencies, generates a framework to prioritize cancer drug targets, and suggests specific new targets. The principles described in their study can inform the initial stages of drug development by contributing to a new, diverse, and more effective portfolio of cancer drug targets.</p>
        <p>To begin assembling a picture of how interest in these topics has emerged and evolved, we first divided the corpus by year. We then selected the 30 most frequently mentioned terms for each year and conducted a co-word analysis using text mining techniques. Figure 3 illustrates how topics have gained or lost importance, merged, or split over the period of study. Each topic has a different color, and the thickness of a connection represents the strength of the relationship between topics (the technical implementation of the topic evolution model is discussed in the Methods section). This analysis reveals insights on two levels: first, some broad trends in the field and, second, a host of micro-level transitions. All of these results can be produced by the automatic function of ITGInsight. There are too many specific evolutions to discuss individually. Hence, the following paragraphs discuss the main overarching trends at the macro level and add some micro-level examples to confirm them.</p>
        <p>At the macro level, scholars pay increasing attention to the research topics “gene expression” (corresponding to Cluster 1), “metabolic engineering” (corresponding to Cluster 2; see note 4), “gene circuit” (corresponding to Cluster 1), “natural product” (corresponding to Cluster 6; see note 5), “synthetic cell” (corresponding to Cluster 2), “escherichia coli”, and “saccharomyce cerevisiae”, because these domain terms gain importance in the later stage.</p>
        <p>At the micro level, we highlight two key points in the development of synthetic biology. In 2008, a strong connection between molecular noise and genetic network emerged. Molecular noise in gene networks comes from intrinsic fluctuations, noise transmitted from upstream genes, and global noise affecting all genes. Knowledge of molecular noise filtering in gene networks is crucial for understanding signal processing in gene networks and for designing noise-tolerant gene circuits for synthetic biology (Chen, Chang, and Wang, 2008). Further, Chang and Chen found that biochemical regulatory networks suffer from process delays, internal parametric perturbations, and external disturbances due to the context of host cells, and they estimated the ability of time-delay biochemical regulatory networks to attenuate additive external disturbances (Chang and Chen, 2010). That is why external disturbance, synthetic genetic network, and host cell appear together in 2009. In 2011, a strong connection between “negative auto regulation” and “Escherichia coli” emerged as an evolution of these previous findings. Dianel et al. (2011) found that gene regulation networks are made of recurring regulatory patterns, called network motifs. One of the most common network motifs is negative auto-regulation, in which a transcription factor represses its own production. Negative auto-regulation has several potential functions: it can shorten the response time (the time to reach halfway to the steady state), stabilize expression against noise, and linearize the gene’s input-output response curve. This last function of negative auto-regulation, which increases the range of input signals over which downstream genes respond, has been studied by theory and in synthetic gene circuits. To address this, they studied the negative auto-regulation motif in the arabinose utilization system of Escherichia coli, in which negative auto-regulation is part of a complex regulatory network.</p>
        <p>Note 4. Cluster 2: Cell-free protein synthesis.</p>
        <p>Note 5. Cluster 6: Genome mining.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Methods</title>
      <sec id="sec-5-1">
        <title>Topic clustering process analysis</title>
        <p>The technology theme map is one of the most effective methods for technical analysis and technology route tracking. The topographic map, a new form of technology theme map, can help researchers find research hotspots and research fronts (Liu et al., 2017). The specific steps are outlined below.</p>
      </sec>
      <sec id="sec-5-2">
        <title>Step 1. Construction of co-occurrence matrix</title>
        <p>According to the relationship strength between domain terms, we construct the co-occurrence matrix as follows:
<disp-formula><tex-math>R = (r_{ij})_{n \times n} \qquad (1)</tex-math></disp-formula></p>
        <p>In this research, the relationship strength r_ij is the number of co-occurrences of domain terms i and j, and n is the number of domain terms. At the end of this step, we obtain a set of nodes V (the domain terms) and a set of edges E (the relationships between domain terms).</p>
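        <p>Step 1 can be sketched in a few lines of Python. The sketch assumes that each document has already been reduced to its list of domain terms and counts one co-occurrence per document in which two terms appear together; the function name and the return format are illustrative choices.</p>
        <preformat>
```python
from itertools import combinations


def cooccurrence_matrix(documents):
    """Build a symmetric co-occurrence matrix of domain terms.

    documents: one list of domain terms per document.
    Returns (terms, matrix), where matrix[i][j] is the number of
    documents in which terms[i] and terms[j] appear together.
    """
    terms = sorted({term for doc in documents for term in doc})
    index = {term: i for i, term in enumerate(terms)}
    n = len(terms)
    matrix = [[0] * n for _ in range(n)]
    for doc in documents:
        # Each unordered pair of distinct terms counts once per document.
        for a, b in combinations(sorted(set(doc)), 2):
            i, j = index[a], index[b]
            matrix[i][j] += 1
            matrix[j][i] += 1
    return terms, matrix
```
        </preformat>
        <p>The node set V corresponds to terms, and every non-zero entry above the diagonal corresponds to one weighted edge in E.</p>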
      </sec>
      <sec id="sec-5-3">
        <title>Step 2. Coordinate calculation</title>
        <p>To visualize domain terms, it is necessary to determine their position coordinates in the plane. In this research, we take the LinLog algorithm as an example to describe the computation in detail. LinLog was proposed by Noack (2004); in the variant used here, repulsion acts between edges rather than between nodes, and the energy is defined as
<disp-formula><tex-math>U_{\mathrm{LinLog}}(p) = \sum_{\{u,v\} \in E} \lVert p_u - p_v \rVert - \sum_{\{u,v\} \in V^{(2)}} \deg(u)\,\deg(v)\,\ln \lVert p_u - p_v \rVert \qquad (2)</tex-math></disp-formula>
where ||p_u - p_v|| is the distance between node u and node v, p is the vector of node positions (p_u and p_v are the positions of nodes u and v), V^(2) is the set of all subsets of V that have exactly two elements, and deg(u) and deg(v) are the node degrees. The first term of the subtraction is the attraction between adjacent nodes. The second term is the repulsion between edges. In this way, each node has an influence on the drawing that is proportional to the number of edges connected to it (its degree). This property can be reflected in the drawing by making the size of each node proportional to its degree. At the end of this step, we obtain the position coordinates of each node (domain term).</p>
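        <p>To make the two terms of the energy concrete, the Python sketch below evaluates the edge-repulsion LinLog energy for a given layout, assuming the form published by Noack, in which the repulsion term uses the natural logarithm of the distance. It only evaluates the energy; a layout algorithm would additionally search for positions that minimize it, and the function name and input format are assumptions made for the example.</p>
        <preformat>
```python
import math
from itertools import combinations


def linlog_energy(positions, edges):
    """Evaluate the edge-repulsion LinLog energy of a 2D layout.

    positions: dict mapping each node to its (x, y) coordinates;
               all positions are assumed to be distinct.
    edges: iterable of (u, v) node pairs.
    """
    degree = {node: 0 for node in positions}
    attraction = 0.0
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        # Attraction: plain distance between adjacent nodes.
        attraction += math.dist(positions[u], positions[v])

    repulsion = 0.0
    for u, v in combinations(positions, 2):
        # Repulsion: log-distance weighted by both node degrees.
        distance = math.dist(positions[u], positions[v])
        repulsion += degree[u] * degree[v] * math.log(distance)
    return attraction - repulsion
```
        </preformat>
        <p>Because each pair is weighted by the product of the degrees, nodes with many edges push other nodes away more strongly, which is the degree-proportional influence described above.</p>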
      </sec>
      <sec id="sec-5-4">
        <title>Step 3. Plane pixel density function</title>
        <p>After obtaining the position coordinates of each node, we use the method proposed by Liu et al. (2017) to determine their colors. The density function has the form
<disp-formula><tex-math>D(x, y) = \sum_{i=1}^{N} f(\mathrm{Number}_i)\, K\!\left( (x - x_i)^2 + (y - y_i)^2 \right), \quad x > 0,\ y > 0 \qquad (3)</tex-math></disp-formula></p>
        <p>where (x_i, y_i), i = 1, ..., N, are the position coordinates of the nodes, K is the distance kernel of Liu et al. (2017), whose scale depends on the average two-dimensional Euclidean distance between two different nodes, and f(Number_i) is the standardized value for the number of nodes. Most importantly, (x, y) represents a pixel on the computer screen; after standardization, the density value can be mapped to an RGB color code. By default, ITGInsight uses a blue-yellow-red color scheme, in which blue corresponds to the highest item density and red to the lowest. In addition, Circle Layout (CR), Evolution Layout (EV), Reference Layout (RF), Weight Spring Layout (SP), No Weight Spring Layout (UP), Kamada Kawai Layout (KK), Fruchterman Reingold Layout (FR), VOSmapping Layout (VS), and TSNE-Layout (TS) are also embedded in ITGInsight for computing node coordinates, and ITGInsight supports the simultaneous use of two or more layout algorithms to provide information from different perspectives.</p>
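        <p>The idea of a pixel density map can be sketched as follows. The exact kernel of Liu et al. (2017) is not reproduced here: the sketch assumes a Gaussian kernel with a free bandwidth parameter, and the color ramp simply follows the blue-yellow-red scheme described in the text, with blue for the highest density. The names density and to_rgb are hypothetical.</p>
        <preformat>
```python
import math


def density(x, y, nodes, bandwidth=1.0):
    """Density at pixel (x, y) under an assumed Gaussian kernel.

    nodes: list of (xi, yi, weight) tuples, where weight plays the
    role of the standardized node value f(Number_i).
    """
    total = 0.0
    for xi, yi, weight in nodes:
        sq_dist = (x - xi) ** 2 + (y - yi) ** 2
        total += weight * math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total


def to_rgb(value, max_value):
    """Map a density to the blue-yellow-red ramp (blue = highest)."""
    t = min(max(value / max_value, 0.0), 1.0) if max_value > 0 else 0.0
    if t >= 0.5:
        s = (t - 0.5) / 0.5  # 0 at yellow, 1 at blue
        return (round(255 * (1 - s)), round(255 * (1 - s)), round(255 * s))
    s = t / 0.5  # 0 at red, 1 at yellow
    return (255, round(255 * s), 0)
```
        </preformat>
        <p>Evaluating density for every pixel and passing the result through to_rgb yields the colored background of a topographic map; the bandwidth controls how far the influence of each node spreads.</p>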
      </sec>
      <sec id="sec-5-5">
        <title>Topic evolution process analysis</title>
        <p>Text mining techniques can reveal rich technical detail and provide a picture of how topics emerge as subjects of interest and how they develop over time. The topic evolution analysis function of ITGInsight helps researchers to trace emerging and evolving topics. The method is outlined below. To avoid interference from meaningless terms, researchers can take the Thesaurus Dictionary generated in the natural language processing step and perform further word segmentation and part-of-speech tagging on the data set. The document collection is then divided into time periods, Dt = {dt1, dt2, dt3, ..., dtm} (d: document; m: number of documents; t: time period). Each document corresponds to many terms, d = {w1, w2, w3, ..., wz} (w: term), and each term represents a single topic. The next step is to construct a co-word network for each time period based on term (topic) co-occurrence. More specifically, if Topic a and Topic b appear in the same document, they are deemed to have a co-occurrence relationship, and the number of times a and b co-occur is taken as the strength (weight) of their connection. The differences between the co-word networks of successive time periods reveal how topics have evolved. For instance, some topics emerge in a time period and some disappear; some gain or lose importance; others merge or split. This paper identifies the following evolutionary relationships:
• Type 1, Gain/Lose Importance: the number of occurrences of the same topic increases or decreases steadily over the next several periods.
• Type 2, Emergence: a new topic appears suddenly within a certain period; Death: a topic does not appear again during the period of study.
• Type 3, Topic Split: a new topic is generated from an existing topic (in period T-1, the two topics appear in the same document; in period T, they appear in different documents).
• Type 4, Topic Fusion: an existing topic fuses with another existing topic (in period T-1, the two topics appear in different documents; in period T, they appear in the same document).</p>
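        <p>The four relationship types can be illustrated by comparing the co-word networks of two consecutive periods. The Python sketch below is a simplified reading of the definitions above, not ITGInsight's algorithm: a topic missing from the later period is reported only as absent, because death can be confirmed only over the whole period of study, and the function names are assumptions.</p>
        <preformat>
```python
from collections import Counter
from itertools import combinations


def coword_network(documents):
    """Count, for each pair of topics, the documents in which both appear."""
    edges = Counter()
    for doc in documents:
        for a, b in combinations(sorted(set(doc)), 2):
            edges[(a, b)] += 1
    return edges


def compare_periods(prev_docs, curr_docs):
    """Classify topic changes between period T-1 and period T."""
    prev_topics = {topic for doc in prev_docs for topic in doc}
    curr_topics = {topic for doc in curr_docs for topic in doc}
    prev_edges = coword_network(prev_docs)
    curr_edges = coword_network(curr_docs)
    return {
        "emerged": curr_topics - prev_topics,
        "absent": prev_topics - curr_topics,
        # Split: co-occurred in T-1, both still present in T but apart.
        "split": {pair for pair in prev_edges
                  if pair not in curr_edges and set(pair).issubset(curr_topics)},
        # Fusion: both present but apart in T-1, co-occur in T.
        "fused": {pair for pair in curr_edges
                  if pair not in prev_edges and set(pair).issubset(prev_topics)},
    }


def trend(counts):
    """Label a topic's occurrence counts over successive periods."""
    steps = list(zip(counts, counts[1:]))
    if steps and all(later > earlier for earlier, later in steps):
        return "gain"
    if steps and all(earlier > later for earlier, later in steps):
        return "lose"
    return "mixed"
```
        </preformat>
        <p>Applying compare_periods to each pair of adjacent periods and trend to each topic's yearly counts gives the raw material for an evolution map such as Figure 3.</p>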
      </sec>
    </sec>
    <sec id="sec-6">
      <title>Conclusions</title>
      <p>ITGInsight is an advanced text mining and visualization tool, mainly for generating competitive technological intelligence. It supports analyses ranging from basic statistics to more complicated analyses of the selected data and can be used by a wide range of users. It incorporates various techniques, methods, algorithms, and measures for all the steps in intelligence analysis, from data preprocessing through discovery to visualization. Most importantly, it provides automatic interpretation, which helps intelligence consumers write reports for different technology domains on the basis of a widely used framework. Users then only need to add details to the report according to their own requirements. To some extent, it can mitigate the drawbacks of individual methods and make full use of their strengths to generate effective CTI. The experimental results show that it is a powerful tool for generating effective CTI, such as profiling science &amp; technology domains, mapping research front relationships, and discerning overall trends, and that it provides more insightful information to intelligence consumers, especially in environments with little or no expert support.</p>
    </sec>
    <sec id="sec-7">
      <title>Acknowledgments</title>
      <p>We are grateful to the many scholars and software enthusiasts who provided valuable opinions and suggestions during the design and development of ITGInsight. Users can download and install the latest version of ITGInsight from http://en.itginsight.com/download/.</p>
      <p>2. Boyack, K. W., &amp; Klavans, R. (2010). Co-citation analysis, bibliographic coupling, and direct citation: which citation approach represents the research front most accurately? Journal of the American Society for Information Science &amp; Technology, 61(12), 2389-2404. https://doi.org/10.1002/asi.v61:12
3. Chang, Y. T., &amp; Chen, B. S. (2010). A fuzzy approach for robust reference tracking control of nonlinear distributed parameter time-delayed systems and its biological application. IEEE International Conference on Systems Man &amp; Cybernetics.</p>
      <p>
        https://doi.org/10.1109/ICSMC.2010.5642363
4. Chen, B. S., Chang, Y. T., &amp; Wang, Y. C. (2008). Robust -stabilization design in gene
networks under stochastic molecular noises: fuzzy-interpolation approach. IEEE Transaction
on Cybernetics, 38(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ), 25-42. https://doi.org/10.1109/TSMCB.2007.906975
5. Choi, S., Kim, H., Yoon, J., Kim, K., Lee, Y.J. (2013). An SAO-based text-mining
approach for technology roadmapping using patent information. R&amp;D Management, 43(
        <xref ref-type="bibr" rid="ref1">1</xref>
        ),
52-74. https://doi.org/10.1111/j.1467-9310.2012.00702.x
6. Cobo, M. J., Lopez-Herrera, A. G., Herrera-Viedma, E., &amp; Herrera, F. (2011). Science
mapping software tools: review, analysis, and cooperative study among tools. Journal of
the Association for Information Science &amp; Technology, 62(7), 1382-1402.
https://doi.org/10.1002/asi.21525</p>
      <p>7. Fiona, M. B. et al. (2019). Prioritization of cancer therapeutic targets using CRISPR-Cas9
screens. Nature, 568(7753), 511-516. https://doi.org/10.1038/s41586-019-1103-9</p>
      <p>8. Gaber, R., Lebar, T., Majerle, A., Ster, B., Dobnikar, A., Bencina, M., &amp; Jerala, R.
(2014). Designable DNA-binding domains enable construction of logic circuits in
mammalian cells. Nature Chemical Biology, 10(3), 203-208.
https://doi.org/10.1038/nchembio.1433</p>
      <p>9. Huang, Y., Zhang, Y., Ma, J., Porter, A. L., &amp; Guo, Y. (2016). Generating Competitive
Technical Intelligence Using Topical Analysis, Patent Citation Analysis, and Term
Clumping Analysis. In Anticipating Future Innovation Pathways Through Large Data Analysis
(pp. 153-172). https://doi.org/10.1007/978-3-319-39056-7_9</p>
      <p>10. Joung, J., &amp; Kim, K. (2017). Monitoring emerging technologies for technology planning
using technical keyword based analysis from patent data. Technological Forecasting &amp;
Social Change, 114(1), 281-292. https://doi.org/10.1016/j.techfore.2016.08.020</p>
      <p>11. Kerr, C. I. V., Mortara, L., Phaal, R., &amp; Probert, D. R. (2006). A conceptual model for
technology intelligence. International Journal of Technology Intelligence and Planning,
2(1), 73-93. https://doi.org/10.1504/IJTIP.2006.010511</p>
      <p>
12. Kempton, H. R., Goudy, L. E., Love, K. S., &amp; Qi, L. S. (2020). Multiple input sensing and
signal integration using a split Cas12a system. Molecular Cell, 78(1).
https://doi.org/10.1016/j.molcel.2020.01.016</p>
      <p>13. Kim, H., Bojar, D., &amp; Fussenegger, M. (2019). A CRISPR/Cas9-based central processing unit
to program complex logic computation in human cells. Proceedings of the National
Academy of Sciences. https://doi.org/10.1073/pnas.1821740116</p>
      <p>14. Liu, Y. Q., Wang, X. F., &amp; Lei, X. P. (2015). Design and Implementation of Academic
Relation and Visualization System. Library and Information Service, 59(8), 118-125.
https://doi.org/10.13266/j.issn.0252-3116.2015.08.015</p>
      <p>15. Lee, C., Seol, H., &amp; Park, Y. (2007). Identifying New IT-Based Service Concepts Based on
the Technological Strength: A Text Mining and Morphology Analysis Approach. Fuzzy
Systems and Knowledge Discovery (FSKD 2007).
https://doi.org/10.1016/j.techfore.2010.11.010</p>
      <p>
16. Lee, S., Mortara, L., Kerr, C., Phaal, R., &amp; Probert, D. (2012). Analysis of
document-mining techniques and tools for technology intelligence: discovering knowledge from
technical documents. International Journal of Technology Management, 60(1/2), 130-156.
https://doi.org/10.1504/IJTM.2012.049102</p>
      <p>17. Liao, S. H., Sun, B. L., &amp; Wang, R. Y. (2003). A knowledge-based architecture for
planning military intelligence, surveillance, and reconnaissance. Space Policy, 19(3), 191-202.
https://doi.org/10.1016/S0265-9646(03)00020-1</p>
      <p>18. Martin, V. J. J., Pitera, D. J., Withers, S. T., Newman, J. D., &amp; Keasling, J. D. (2003).
Engineering a mevalonate pathway in Escherichia coli for production of terpenoids. Nature
Biotechnology, 21(7), 795-802. https://doi.org/10.1016/j.jnoncrysol.2004.08.013</p>
      <p>19. Milanez, D. H., De Faria, L. I. L., Do Amaral, R. M., Leiva, D. R., &amp; Gregolin, J. A. R.
(2014). Patents in nanotechnology: an analysis using macro-indicators and
forecasting curves. Scientometrics, 101(2), 1097-1112.
https://doi.org/10.1007/s11192-014-1244-4</p>
      <p>20. Newman, N. C., Porter, A. L., Newman, D., Trumbach, C. C., &amp; Bolan, S. D. (2014).
Comparing methods to extract technical content for technological intelligence. Journal of
Engineering and Technology Management, 32(4-6), 97-109.
https://doi.org/10.1016/j.jengtecman.2013.09.001</p>
      <p>21. Noack, A. (2004). Visual Clustering of Graphs with Nonuniform Degrees. Proceedings of
the 13th International Symposium on Graph Drawing (GD 2005, Limerick, Ireland, Sep.
12-14).</p>
      <p>22. Porter, A. L., &amp; Cunningham, S. W. (2005). Tech mining: exploiting new technologies for
competitive advantage. Information Processing &amp; Management, 41(5), 1305-1306.
https://doi.org/10.1002/0471698466.index</p>
      <p>
23. Shapira, P., Kwon, S., &amp; Youtie, J. (2017). Tracking the emergence of synthetic biology.
Scientometrics, 112(3), 1439-1469. https://doi.org/10.1007/s11192-017-2452-5</p>
      <p>24. Shibata, N., Kajikawa, Y., Takeda, Y., et al. (2009). Comparative study on methods of
detecting research fronts using different types of citation. Journal of the American Society for
Information Science &amp; Technology, 60(3), 571-580. https://doi.org/10.1002/asi.20994</p>
      <p>25. Vincent, B., &amp; Bernadette. (2013). Discipline-building in synthetic biology. Studies in
History &amp; Philosophy of Biological &amp; Biomedical Sciences, 44(2), 122-129.
https://doi.org/10.1016/j.shpsc.2013.03.007</p>
      <p>26. Yoon, J., &amp; Kim, K. (2011). An automated method for identifying TRIZ evolution trends
from patents. Expert Systems with Applications, 38(12), 15540-15548.
https://doi.org/10.1016/j.eswa.2011.06.005</p>
      <p>27. Yoon, B., Phaal, R., &amp; Probert, D. (2008). Morphology analysis for technology
roadmapping: application of text mining. R&amp;D Management, 38(1), 51-68.
https://doi.org/10.1111/j.1467-9310.2007.00493.x</p>
      <p>28. Zhang, Y., Zhou, X., Porter, A. L., &amp; Gomila, J. M. V. (2014). How to combine term
clumping and technology roadmapping for newly emerging science &amp; technology competitive
intelligence: "problem &amp; solution" pattern based semantic TRIZ tool and case study.
Scientometrics, 101(2), 1375-1389. https://doi.org/10.1007/s11192-014-1262-2</p>
      <p>29. Zhang, Y., Zhou, X., Porter, A. L., Gomila, J. M. V., &amp; Yan, A. (2014). Triple helix
innovation in China's dye-sensitized solar cell industry: hybrid methods with semantic TRIZ and
technology roadmapping. Scientometrics, 99(1), 55-75.
https://doi.org/10.1007/s11192-013-1090-9</p>
      <p>30. Zhu, D., &amp; Porter, A. L. (2002). Automated extraction and visualization of information for
technological intelligence. Technological Forecasting and Social Change, 69(5), 495-506.
https://doi.org/10.1016/S0040-1625(01)00157-3</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Bacchus</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lang</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>El-Baba</surname>
            ,
            <given-names>M. D.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Weber</surname>
            ,
            <given-names>W.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Stelling</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Fussenegger</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>Synthetic two-way communication between mammalian cells</article-title>
          .
          <source>Nature Biotechnology</source>
          ,
          <volume>30</volume>
          (
          <issue>10</issue>
          ),
          <fpage>991</fpage>
          . https://doi.org/10.1038/nbt.2351
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>