<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Research on Network Prediction of Hidden Danger in Logistics Enterprises</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Xiying Chen</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yufeng Zhuang</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Lingyi Lu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Beijing University of Posts and Telecommunications</institution>
          ,
          <addr-line>Beijing</addr-line>
          ,
          <country country="CN">China</country>
        </aff>
      </contrib-group>
      <fpage>182</fpage>
      <lpage>187</lpage>
      <abstract>
        <p>The logistics industry has accumulated a large amount of manually recorded safety inspection data, yet hidden dangers are still overlooked because inspectors lack experience, among other reasons. To address this, this paper constructs a knowledge graph that represents the structure of hidden dangers in logistics enterprises as graph data. Starting from the original unstructured data, it analyses the unsafe factors specific to logistics enterprises, builds a relationship network between enterprises and hidden danger factors, and analyses the relationships among the hidden dangers themselves. Time sequence information is also integrated: the graph data of consecutive time slices are combined and their time-varying patterns are analysed, which enables dynamic prediction of the hidden danger network of logistics enterprises. Learning from historical data improves prediction accuracy, so that the method can give hidden danger inspections a more targeted and specific focus and serve as an auxiliary tool for manual inspection.</p>
      </abstract>
      <kwd-group>
        <kwd>hidden danger prediction</kwd>
        <kwd>knowledge graph</kwd>
        <kwd>feature of time sequence</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>With the development of express delivery in the logistics industry and the growth of logistics
enterprises, safety risks are also increasing, and the production safety of logistics enterprises faces
severe challenges. Failure to inspect and prevent hidden risks in time often causes accidents and huge
losses. In October 2018, 15 people died and 46 people were injured in a major accident on the Shenhai
Expressway. According to the investigation, it was caused by the logistics enterprise's failure to fulfil
its main responsibility for production safety, its long-term neglect of affiliated vehicles, and substandard
vehicle quality inspection. In November 2019, Shanghai Xinde Logistics Co., Ltd. caught fire when welding
operations ignited combustible materials on the ground. In the final analysis, the fire was caused by the
company's inadequate implementation of housing safety management responsibilities. In July 2021, a logistics
warehouse in Changchun also caught fire due to improper storage of combustible materials, resulting
in 15 deaths and 25 injuries. In recent years, the state has also attached great importance to the
investigation and inspection of enterprise safety risks. It has vigorously enforced enterprises'
responsibility for production safety, arranged law enforcement inspections, and shifted from inspection
after the fact to prevention and investigation in advance, so as to reduce accidents at the source.</p>
      <p>
        To address the hidden dangers that lead to accidents, knowledge graphs have been applied in
the field of emergency safety. Pingle et al. [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ] proposed a system that creates semantic triples from
network security texts and extracts possible relationships using deep learning methods, so that security
analysts can base decisions about cyber attacks on the knowledge graph. Celebi et al. [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] used a
knowledge graph built from a drug library to predict unknown drug interactions based on RDF2Vec. Using a
BiLSTM-CRF information extraction model, Sutphin et al. [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] identified adverse reactions and their causes
from FDA drug labels. Elluri et al. [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] trained a custom named entity recognition (NER) model and
constructed a network security knowledge graph (CKG) to infer the subjective association between
network security texts and users and to generate the relevant features of the text. Wang Yibao et al.
[
        <xref ref-type="bibr" rid="ref5">5</xref>
        ] drew a diversified, time-sharing and dynamic scientific knowledge map of urban security research
from the core documents on urban security in the CNKI database from 1993 to 2018.
      </p>
      <p>The research status of domain knowledge graphs and their application in emergency security shows
that no work has yet constructed a knowledge graph from the relationships between hidden dangers in
logistics enterprises; most applications remain in fields such as network security and traffic management.
In addition, current knowledge graphs are usually built from existing data in the form of triples, which
yields a static knowledge graph that ignores how the hidden danger data of logistics enterprises change
over time and fails to make full use of time information. Therefore, based on existing hidden danger
inspection data and combined with time series characteristics, this paper constructs a hidden danger
knowledge graph of logistics enterprises. In the form of graph data, the co-occurrence relationships
between hidden dangers and the connections between enterprises and hidden dangers are analysed, and
information on people, objects, environment and management is extracted to form a more comprehensive
and relevant analysis of hidden dangers.</p>
    </sec>
    <sec>
      <title>2. Hidden Danger Prediction of Logistics Enterprise Based on Knowledge Graph</title>
    </sec>
    <sec id="sec-2">
      <title>2.1. Overview of hidden danger prediction methods for logistics enterprises</title>
      <p>
        This paper consists of two parts. The first builds a knowledge graph of hidden dangers in the
field of logistics, so that the hidden dangers of logistics enterprises can be studied in a targeted way. The
second carries out dynamic, time-sequenced prediction of the hidden danger network of logistics enterprises,
which reduces the risk of accidents and shifts the emphasis from repair after the fact to prevention in
advance. Predicting the hidden dangers that may occur in logistics enterprises in the next time slice
guides the enterprises in inspecting hidden dangers in production safety. The temporal knowledge graph is
constructed in a top-down way: the pattern layer is designed first, and then the fact representation of the
data layer is realized. Ontology design is used to standardize the actual data, which ensures the rationality
of the knowledge graph. For prediction, entities and time are embedded with the Diachronic Embedding
(DE) [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ] model, and the model is trained with the scoring function of the ComplEx [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]
model. The design of the knowledge graph of logistics enterprises is shown in Figure 1.
      </p>
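      <p>[Figure 1: Design of the knowledge graph of logistics enterprises. The original database consists of safety check data (enterprise name, inspection time, hidden danger content, inspection place) and enterprise basic data (enterprise name, enterprise category, enterprise feature, enterprise address). From this structured and unstructured data, entities, attributes, relationships and time are extracted.]</p>
      <p>Entities, attributes, relations and time are extracted from the data, and quadruples are constructed
in the forms "entity-relationship-entity-time" and "entity-attribute-attribute value-time". According to the
time information, the data is divided into multiple time windows, and the triples in each time window form
a static knowledge graph <inline-formula><tex-math>G_t</tex-math></inline-formula>
(<inline-formula><tex-math>t</tex-math></inline-formula> denotes the
<inline-formula><tex-math>t</tex-math></inline-formula>-th time window). The temporal knowledge graph is
the set of static knowledge graphs under the different time windows, defined as
<inline-formula><tex-math>G = \{G_1, G_2, \ldots, G_T\}</tex-math></inline-formula>. Within a time window,
<inline-formula><tex-math>E_t</tex-math></inline-formula> is the set of all entities,
<inline-formula><tex-math>R_t</tex-math></inline-formula> is the set of all relations, and
<inline-formula><tex-math>Q_t</tex-math></inline-formula> is the set of all quadruples
<inline-formula><tex-math>(h, r, o, t)</tex-math></inline-formula> with head entity
<inline-formula><tex-math>h</tex-math></inline-formula>, relation
<inline-formula><tex-math>r</tex-math></inline-formula>, tail entity
<inline-formula><tex-math>o</tex-math></inline-formula> and time
<inline-formula><tex-math>t</tex-math></inline-formula>. The hidden danger network prediction of logistics
enterprises is shown in Figure 2.</p>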
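      <p>The division of quadruples into time windows can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the sample records are hypothetical, and each quadruple is assumed to be a plain (head, relation, tail, time) tuple whose time value is already the index of its window.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def build_temporal_graph(quadruples):
    """Group (head, relation, tail, time) facts into one static graph per time window."""
    windows = defaultdict(set)
    for head, relation, tail, t in quadruples:
        windows[t].add((head, relation, tail))
    # Sort by time so that consecutive windows can be compared later.
    return dict(sorted(windows.items()))


def entities_and_relations(triples):
    """Return the entity set and the relation set of one static graph."""
    entities = {h for h, _, _ in triples} | {o for _, _, o in triples}
    relations = {r for _, r, _ in triples}
    return entities, relations


# Hypothetical records; the labels only imitate the kind of data described.
records = [
    ("warehousing", "happen", "safety signs", 17),
    ("safety signs", "appear", "warehouse", 17),
    ("warehousing", "happen", "operating environment", 18),
]
graphs = build_temporal_graph(records)
for t, triples in graphs.items():
    entities, relations = entities_and_relations(triples)
    print(t, len(triples), sorted(entities), sorted(relations))
```
]]></preformat>
      <p>Storing each window as a set of triples keeps duplicate inspection records from counting as separate facts and makes later comparisons between consecutive windows a simple set operation.</p>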
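      <p>[Figure 2: Hidden danger network prediction of logistics enterprises. Quadruples (head, relation, tail, time) from the consecutive time windows <inline-formula><tex-math>G_{t-1}</tex-math></inline-formula> and <inline-formula><tex-math>G_t</tex-math></inline-formula> are split into true facts (positive samples) and deleted facts (negative samples) to form the training set Qtrain. DE (Diachronic Embedding) performs knowledge representation learning, the ComplEx scoring function scores the quadruples, and the model is then evaluated.]</p>
      <sec id="sec-2-3">
        <title>DE (Diachronic Embedding)</title>
        <p>DE represents each entity by a vector that depends on time:</p>
        <disp-formula><tex-math><![CDATA[
z_v^t[n] =
\begin{cases}
a_v[n]\,\sigma\big(w_v[n]\,t + b_v[n]\big), & 1 \le n \le \gamma d \\
a_v[n], & \gamma d < n \le d
\end{cases}
]]></tex-math></disp-formula>
        <p>Here <inline-formula><tex-math>z_v^t</tex-math></inline-formula> is the output vector of entity
<inline-formula><tex-math>v</tex-math></inline-formula> embedded at time
<inline-formula><tex-math>t</tex-math></inline-formula>, and
<inline-formula><tex-math>z_v^t[n]</tex-math></inline-formula> is the n-th element of the vector, where
<inline-formula><tex-math>v \in E</tex-math></inline-formula> and
<inline-formula><tex-math>t</tex-math></inline-formula> is the corresponding time.
<inline-formula><tex-math>a_v \in \mathbb{R}^{d}</tex-math></inline-formula> and
<inline-formula><tex-math>w_v, b_v \in \mathbb{R}^{\gamma d}</tex-math></inline-formula> are vectors of
learnable parameters associated with a particular entity or relation, and
<inline-formula><tex-math>\sigma(\ast)</tex-math></inline-formula> is the activation function.
The formula shows that the first <inline-formula><tex-math>\gamma d</tex-math></inline-formula> elements of
the vector capture features that change with time, while the remaining
<inline-formula><tex-math>(1 - \gamma) d</tex-math></inline-formula> elements capture static features.
<inline-formula><tex-math>\gamma</tex-math></inline-formula> lies between 0 and 1 and is the hyperparameter
that controls the percentage of time sequence features.</p>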
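        <p>The ComplEx model introduces complex space and uses complex vectors of the form
<inline-formula><tex-math>a + bi</tex-math></inline-formula> to represent the entities and relations of the
static knowledge graph, which allows it to model asymmetric relations better. Its scoring function is</p>
        <disp-formula><tex-math><![CDATA[
\begin{aligned}
\phi(h, r, o) &= \mathrm{Re}\big(\langle w_r, e_h, \bar{e}_o \rangle\big) \\
&= \langle \mathrm{Re}(w_r), \mathrm{Re}(e_h), \mathrm{Re}(e_o) \rangle
 + \langle \mathrm{Re}(w_r), \mathrm{Im}(e_h), \mathrm{Im}(e_o) \rangle \\
&\quad + \langle \mathrm{Im}(w_r), \mathrm{Re}(e_h), \mathrm{Im}(e_o) \rangle
 - \langle \mathrm{Im}(w_r), \mathrm{Im}(e_h), \mathrm{Re}(e_o) \rangle
\end{aligned}
]]></tex-math></disp-formula>
        <p>where <inline-formula><tex-math>e_h, e_o \in \mathbb{C}^{d}</tex-math></inline-formula> are the complex
vectors of the head and tail entities, <inline-formula><tex-math>w_r \in \mathbb{C}^{d}</tex-math></inline-formula>
is the relation vector, the bar denotes the conjugate vector,
<inline-formula><tex-math>\mathrm{Re}(\cdot)</tex-math></inline-formula> takes the real part,
<inline-formula><tex-math>\mathrm{Im}(\cdot)</tex-math></inline-formula> takes the imaginary part, and
<inline-formula><tex-math>\langle \cdot, \cdot, \cdot \rangle</tex-math></inline-formula> is the sum of the
element-wise products of three vectors. The function is antisymmetric when
<inline-formula><tex-math>w_r</tex-math></inline-formula> has only an imaginary part, and symmetric when
<inline-formula><tex-math>w_r</tex-math></inline-formula> has only a real part.</p>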
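        <p>The combination of a time-dependent entity vector with the ComplEx score can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' code: sine is assumed as the activation function, each entity is assumed to hold two DE parameter sets (one for the real part and one for the imaginary part of its complex embedding), and the names diachronic_embedding, complex_score and temporal_score as well as all parameter values are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import math


def diachronic_embedding(a, w, b, t, gamma):
    """DE vector of one entity at time t.

    a has d entries; w and b have round(gamma * d) entries.
    The first round(gamma * d) outputs depend on t, the rest are static.
    """
    d = len(a)
    k = round(gamma * d)
    # Assumption: sine is used as the activation function.
    temporal = [a[n] * math.sin(w[n] * t + b[n]) for n in range(k)]
    static = list(a[k:])
    return temporal + static


def complex_score(w_r, e_h, e_o):
    """ComplEx score Re(<w_r, e_h, conj(e_o)>) for complex-valued vectors."""
    total = sum(wr * eh * eo.conjugate() for wr, eh, eo in zip(w_r, e_h, e_o))
    return total.real


def temporal_score(w_r, head, tail, t, gamma):
    """Score a quadruple (head, relation, tail, t).

    Assumption: every entity is a dict with two DE parameter sets (a, w, b),
    stored under "re" and "im" for the real and imaginary parts.
    """
    def embed(entity):
        re = diachronic_embedding(*entity["re"], t, gamma)
        im = diachronic_embedding(*entity["im"], t, gamma)
        return [complex(x, y) for x, y in zip(re, im)]

    return complex_score(w_r, embed(head), embed(tail))


# Hypothetical parameters for two entities with d = 2 and gamma = 0.5.
head = {"re": ([1.0, 0.5], [0.2], [0.0]), "im": ([0.3, -0.4], [0.1], [0.5])}
tail = {"re": ([-0.7, 0.9], [0.4], [0.1]), "im": ([0.2, 0.6], [0.3], [0.0])}
w_r = [complex(0.6, -0.2), complex(0.1, 0.9)]
for t in (17, 18, 19):
    print(t, temporal_score(w_r, head, tail, t, gamma=0.5))
```
]]></preformat>
        <p>Because only the first part of each entity vector depends on time, the score of the same head, relation and tail can differ from one time window to the next while the static part keeps time-independent information.</p>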
        <p>During model training, deleted facts are also used to construct negative samples, and positive
and negative samples are combined. A deleted fact is a fact that holds in the previous time window but no
longer exists in the current time window. On the one hand, this improves the prediction effect and
stability of the model. On the other hand, it improves the prediction accuracy for facts that were
correct in the past but are wrong now (i.e. deleted).</p>
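        <p>The use of deleted facts as negative samples can be sketched as a set difference between two consecutive time windows. This is a minimal illustration with a hypothetical function name and toy data; it assumes that each window is available as a set of (head, relation, tail) triples and that labels 1 and 0 mark positive and negative samples.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def label_window(previous_triples, current_triples, t):
    """Label the facts of window t.

    Facts of the current window are positive samples (label 1); facts that
    existed in the previous window but were deleted are negative samples (label 0).
    """
    current = set(current_triples)
    deleted = set(previous_triples) - current
    positives = [(h, r, o, t, 1) for (h, r, o) in sorted(current)]
    negatives = [(h, r, o, t, 0) for (h, r, o) in sorted(deleted)]
    return positives + negatives


# Toy example: numbers stand for entities, letters for relations.
previous = {(1, "A", 2), (2, "B", 3)}
current = {(1, "A", 2), (3, "C", 7)}
for sample in label_window(previous, current, t=18):
    print(sample)
```
]]></preformat>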
      </sec>
    </sec>
    <sec id="sec-3">
      <title>2.2. Evaluation indicators of predictive effectiveness</title>
      <p>The evaluation indexes are the same as those used for static knowledge graphs. For each quadruple
in the test set, two types of candidate quadruples are constructed by replacing the head entity or the tail
entity, and the candidate entities are output in order of their scores.</p>
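      <p>1. MRR (Mean Reciprocal Rank) is the mean of the reciprocal ranks of the correct quadruples in the
prediction results. The larger the MRR value, the higher the correctly predicted quadruples are ranked
and the better the prediction effect of the model.</p>
      <disp-formula><tex-math><![CDATA[
\mathrm{MRR} = \frac{1}{|S|} \sum_{i=1}^{|S|} \frac{1}{rank_i}
= \frac{1}{|S|} \left( \frac{1}{rank_1} + \frac{1}{rank_2} + \cdots + \frac{1}{rank_{|S|}} \right)
]]></tex-math></disp-formula>
      <p>2. HITS@n is the proportion of quadruples whose rank in link prediction is no greater than n.</p>
      <disp-formula><tex-math><![CDATA[
\mathrm{HITS@}n = \frac{1}{|S|} \sum_{i=1}^{|S|} \mathbb{I}(rank_i \le n)
]]></tex-math></disp-formula>
      <p><inline-formula><tex-math>|S|</tex-math></inline-formula> is the number of quadruples, and
<inline-formula><tex-math>rank_i</tex-math></inline-formula> is the link prediction rank of the i-th quadruple.
<inline-formula><tex-math>\mathbb{I}(\ast)</tex-math></inline-formula> is the indicator function, whose value
is 1 if the condition is true and 0 otherwise. In general, n is equal to 1, 3, or 10. The larger HITS@n is,
the higher the probability that the correct result ranks within the top n, and the better the link
prediction effect is.</p>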
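      <p>Both indicators can be computed directly from a list of ranks, as the following sketch shows. The function names and the example ranks are hypothetical, and the rank of the correct entity is assumed to be one plus the number of candidates that score strictly higher.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def rank_of(true_score, candidate_scores):
    """Rank of the correct entity: 1 plus the number of candidates scoring strictly higher."""
    return 1 + sum(1 for s in candidate_scores if s > true_score)


def mrr(ranks):
    """Mean reciprocal rank over all test quadruples."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at(ranks, n):
    """Share of test quadruples whose rank is no greater than n."""
    return sum(1 for r in ranks if r <= n) / len(ranks)


# Hypothetical ranks of the correct entity for four test quadruples.
ranks = [1, 2, 4, 10]
print("MRR:", mrr(ranks))
for n in (1, 3, 10):
    print("HITS@%d:" % n, hits_at(ranks, n))
```
]]></preformat>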
    </sec>
    <sec id="sec-4">
      <title>3. Construction and Prediction of Knowledge Graph</title>
    </sec>
    <sec id="sec-5">
      <title>3.1. Basic information of data</title>
      <p>This paper takes the inspection records of security risks in a city in a certain year as the core data,
with a total of 512,840 records containing 169 fields such as the enterprise, time, place and specific
content related to the inspection of potential risks. After the fields with more than 80% missing values
are deleted, the remaining 68 fields are divided into 19 indicators related to the company,
27 indicators related to hidden dangers, and 22 indicators of other categories.</p>
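      <p>The field filter described above can be sketched in a few lines. This is an illustration only: the function name and the toy records are hypothetical, records are assumed to be dictionaries, and a value is assumed to count as missing when it is absent, None or an empty string.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def drop_sparse_fields(records, max_missing=0.8):
    """Drop every field whose share of missing values exceeds max_missing."""
    fields = sorted({key for rec in records for key in rec})
    kept = []
    for field in fields:
        missing = sum(1 for rec in records if rec.get(field) in (None, ""))
        if missing / len(records) <= max_missing:
            kept.append(field)
    return [{field: rec.get(field) for field in kept} for rec in records]


# Toy records: "remark" is always missing, "place" is missing in 4 of 5 records.
records = [
    {"enterprise": "E1", "place": "warehouse", "remark": None},
    {"enterprise": "E2", "place": "", "remark": None},
    {"enterprise": "E3", "place": None, "remark": ""},
    {"enterprise": "E4", "remark": None},
    {"enterprise": "E5", "place": "", "remark": None},
]
cleaned = drop_sparse_fields(records)
print(sorted(cleaned[0]))
```
]]></preformat>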
      <p>According to the statistics, the top ten feature words of hidden dangers in logistics enterprises are
security, fire extinguisher, logo, clutter, production, warehouse, use, channel, time and line. The top ten
feature words of all enterprises in the city are security, fire extinguisher, logo, clutter, distribution box,
exit, cover, record, training and use. As the statistics show, logistics enterprises share common problems
with all enterprises but also differ from them considerably. If the analysis were based on the data of
all enterprises, the considerations of operational safety, workplace and problem urgency that are specific
to logistics enterprises would inevitably be ignored, which would reduce the reliability and value of
real-world applications.</p>
    </sec>
    <sec id="sec-6">
      <title>3.2. Hidden danger network construction and sample setting</title>
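      <p>Using fuzzy matching and enterprise classification, 14,831 records of logistics enterprises were
selected, involving 4,042 logistics enterprises. According to the top-level structure design of the
knowledge graph, 11 fields are extracted and six types of quadruple pattern are designed, as shown in
Table 1.</p>
      <table-wrap>
        <label>Table 1</label>
        <table>
          <thead>
            <tr><th>Head entity</th><th>Relation</th><th>Tail entity</th><th>Time</th></tr>
          </thead>
          <tbody>
            <tr><td>industry classification</td><td>happen</td><td>hidden danger category</td><td>t</td></tr>
            <tr><td>hidden danger category</td><td>appear</td><td>address</td><td>t</td></tr>
            <tr><td>hidden danger category</td><td>appear</td><td>place</td><td>t</td></tr>
            <tr><td>hidden danger category</td><td>co-occurrence</td><td>hidden danger category</td><td>t</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>According to the quadruple patterns of the concept layer, the knowledge graph under each time
window can be constructed. As shown in Figure 3, the changes of the corresponding facts over time in
the knowledge graph can be used to construct positive and negative samples.</p>
      <p>[Figure 3: Construction of positive and negative samples from the changes of facts between time windows. Nodes are numbered entities (1, 2, 3, 6, 7) and edges are lettered relations (A, B, C, D).]</p>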
      <p>Here, the numbers represent entities and the letters represent relations. Green indicates a fact
that has been deleted in the current time window relative to the previous one, that is, a negative
sample; red indicates a true fact in the current time window, that is, a positive sample.</p>
    </sec>
    <sec id="sec-7">
      <title>3.3. Prediction process and result analysis</title>
      <p>In this paper, DE is used as the embedding method for the temporal knowledge graph: time
information is embedded into the representation of entities and relations, and the model is trained
with the ComplEx scoring function. 70% of the data is used as the training set and 30% as the
validation set. The learning rate of the model is set to 10<sup>-3</sup>, and the embedding size of
representation learning is set to 128. When positive and negative samples are used for model training,
the total number of samples in each training batch is kept below 1024. After performance was tested and
compared on the validation set, a negative sampling rate of 100 was adopted, that is, 100 negative
samples are generated for each fact to be predicted, 50 by replacing the head entity and 50 by replacing
the tail entity. In the experiments, this ratio gave an appropriate trade-off between task performance
and training time.</p>
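      <p>Negative sampling by entity replacement can be sketched as follows. The function name, the entity labels and the fixed random seed are hypothetical. Skipping replacements that produce a known true fact is a common practice that is assumed here; the text does not state whether the authors filter in this way.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import random


def corrupt(fact, entities, known_facts, n_head=50, n_tail=50, seed=0):
    """Build negative samples for one quadruple by replacing its head or its tail entity."""
    rng = random.Random(seed)
    h, r, o, t = fact
    # Assumption: replacements that reproduce a known true fact are skipped.
    head_pool = [e for e in sorted(entities) if e != h and (e, r, o, t) not in known_facts]
    tail_pool = [e for e in sorted(entities) if e != o and (h, r, e, t) not in known_facts]
    heads = rng.sample(head_pool, min(n_head, len(head_pool)))
    tails = rng.sample(tail_pool, min(n_tail, len(tail_pool)))
    return [(e, r, o, t) for e in heads] + [(h, r, e, t) for e in tails]


# Hypothetical entity vocabulary and facts.
entities = {"e%d" % i for i in range(200)}
fact = ("e0", "happen", "e1", 17)
known = {fact, ("e2", "happen", "e1", 17)}
negatives = corrupt(fact, entities, known)
print(len(negatives))
```
]]></preformat>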
      <p>When the loss of the model no longer decreases, the average rank of the true entity is
between 2 and 3. The evaluation results on the test set are shown in Table 2.
0.87
0.59
0.33</p>
      <p>MRR</p>
      <p>It can be seen that HITS@10 of this model achieves a good result: for most correct facts the
link prediction rank is within 10, and for more than half of the correct facts it is within 3, which
indicates that the ranking agrees well with the facts.</p>
      <p>The importance of entities under different time windows is analysed with the centrality measures
of complex networks. Taking degree centrality as an example, the five most important hidden danger
categories under three consecutive time windows are shown in Table 3.</p>
      <table-wrap>
        <label>Table 3</label>
        <caption>
          <p>Entity importance under different time windows</p>
        </caption>
        <table>
          <thead>
            <tr><th>t=17</th><th>t=18</th><th>t=19</th></tr>
          </thead>
          <tbody>
            <tr><td>safety equipment and facilities</td><td>emergency rescue plan and implementation</td><td>safety in production</td></tr>
            <tr><td>operating environment</td><td>auxiliary system equipment and facilities</td><td>other security management</td></tr>
            <tr><td>auxiliary system equipment and facilities</td><td>material</td><td>safety equipment and facilities</td></tr>
            <tr><td>safety signs and identifiers</td><td>operation behavior of the practitioner</td><td>education and training of practitioners</td></tr>
            <tr><td>other equipment and facilities</td><td>safety equipment and facilities</td><td>operating environment</td></tr>
          </tbody>
        </table>
      </table-wrap>
      <p>The importance of hidden danger entities under different time windows shows that the key hidden
dangers differ between time windows, and that the attention paid to the same type of hidden danger
changes over time. For example, safety equipment and facilities is among the top five hidden danger
categories in all three time windows, which indicates its universality, but its importance differs
from window to window. The operating environment category is in the top five at t=17, improves and
drops out of the top five at t=18, and attracts attention again at t=19.</p>
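      <p>Degree centrality for one time window can be sketched as follows. This is a minimal illustration with hypothetical function names and toy triples; the graph is assumed to be treated as undirected, and the degree of a node is normalised by the number of other nodes.</p>
      <preformat preformat-type="code"><![CDATA[
```python
from collections import defaultdict


def degree_centrality(triples):
    """Number of distinct neighbours of each node divided by (number of nodes - 1)."""
    neighbours = defaultdict(set)
    for h, _, o in triples:
        neighbours[h].add(o)
        neighbours[o].add(h)
    for node in neighbours:
        neighbours[node].discard(node)  # ignore self-loops
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}


def top_k(centrality, k=5):
    """The k nodes with the highest centrality; ties are broken by name."""
    return sorted(centrality, key=lambda node: (-centrality[node], node))[:k]


# Toy window with four entities.
window = [("a", "r", "b"), ("a", "r", "c"), ("b", "r", "c"), ("a", "r", "d")]
scores = degree_centrality(window)
print(top_k(scores, k=2))
```
]]></preformat>
      <p>Running the same function on each time window and comparing the resulting top-five lists gives the kind of window-by-window comparison reported in Table 3.</p>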
    </sec>
    <sec id="sec-8">
      <title>4. Summary</title>
      <p>The experimental analysis shows that a temporal knowledge graph can be used to predict the hidden
dangers of logistics enterprises. It combines various kinds of information as graph-structured data, and
the relationships between the data improve the ability to predict hidden dangers. The method in
this paper provides data support for the daily inspection and prevention of hidden dangers in logistics
enterprises, makes hidden danger investigation more comprehensive, and helps reduce the probability
that hidden dangers occur.</p>
    </sec>
    <sec id="sec-9">
      <title>5. References</title>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          [1]
          <string-name>
            <given-names>Aditya</given-names>
            <surname>Pingle</surname>
          </string-name>
          , Aritran Piplai, Sudip Mittal, et al.
          <article-title>RelExt: Relation extraction using deep learning approaches for cybersecurity knowledge graph improvement[M]</article-title>
          . Vancouver, BC, Canada: Association for Computing Machinery, Inc,
          <year>2019</year>
          :
          <fpage>879</fpage>
          -
          <lpage>886</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          [2]
          <string-name>
            <surname>Celebi</surname>
            <given-names>R</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Uyar</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yasar</surname>
            <given-names>E</given-names>
          </string-name>
          , et al.
          <article-title>Evaluation of knowledge graph embedding approaches for drug-drug interaction prediction in realistic settings[J]</article-title>
          .
          <source>BMC Bioinformatics</source>
          ,
          <year>2019</year>
          ,
          <volume>20</volume>
          (
          <issue>1</issue>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          [3]
          <string-name>
            <surname>Sutphin</surname>
            <given-names>C</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Lee</surname>
            <given-names>K</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Yepes</surname>
            <given-names>A J</given-names>
          </string-name>
          , et al.
          <article-title>Adverse drug event detection using reason assignments in FDA drug labels</article-title>
          [J].
          <source>Journal of biomedical informatics</source>
          ,
          <year>2020</year>
          ,
          <volume>110</volume>
          :
          <fpage>103552</fpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          [4]
          <string-name>
            <surname>Elluri</surname>
            <given-names>L</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Nagar</surname>
            <given-names>A</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Joshi</surname>
            <given-names>K P</given-names>
          </string-name>
          .
          <article-title>An Integrated Knowledge Graph to Automate GDPR and PCI DSS Compliance[M]</article-title>
          .
          <string-name>
            <surname>Abe</surname>
            <given-names>N</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Liu</surname>
            <given-names>H</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Pu</surname>
            <given-names>C</given-names>
          </string-name>
          , et al.
          <source>IEEE International Conference on Big Data</source>
          .
          <year>2018</year>
          :
          <fpage>1266</fpage>
          -
          <lpage>1271</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          [5]
          <string-name>
            <surname>Wang</surname>
            <given-names>Yibao</given-names>
          </string-name>
          ,
          <string-name>
            <given-names>Yang</given-names>
            <surname>Tinghui</surname>
          </string-name>
          .
          <article-title>Visual analysis of Knowledge graph of Urban security Research</article-title>
          [J].
          <source>Urban Development Studies</source>
          ,
          <year>2019</year>
          ,
          <volume>26</volume>
          (
          <issue>03</issue>
          ):
          <fpage>116</fpage>
          -
          <lpage>124</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          [6]
          <string-name>
            <given-names>Rishab</given-names>
            <surname>Goel</surname>
          </string-name>
          , Seyed Mehran Kazemi, Marcus Brubaker,
          <string-name>
            <given-names>Pascal</given-names>
            <surname>Poupart</surname>
          </string-name>
          .
          <article-title>Diachronic Embedding for Temporal Knowledge Graph Completion[J]</article-title>
          .
          <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>
          ,
          <year>2020</year>
          ,
          <volume>34</volume>
          (
          <issue>04</issue>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          [7]
          <string-name>
            <surname>Trouillon</surname>
            <given-names>T</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Welbl</surname>
            <given-names>J</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Riedel</surname>
            <given-names>S</given-names>
          </string-name>
          , et al.
          <article-title>Complex Embeddings for Simple Link Prediction[J]</article-title>
          .
          <source>JMLR.org</source>
          ,
          <year>2016</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>