=Paper= {{Paper |id=Vol-2819/session1paper3 |storemode=property |title=Using Lexical Link Analysis as a Tool to Improve Sustainment |pdfUrl=https://ceur-ws.org/Vol-2819/session1paper3.pdf |volume=Vol-2819 |authors=Edwin Stevens,Ying Zhao }} ==Using Lexical Link Analysis as a Tool to Improve Sustainment== https://ceur-ws.org/Vol-2819/session1paper3.pdf
                  Using Lexical Link Analysis as a Tool to Improve Sustainment


                                                       Edwin Stevens, Ying Zhao ∗
                                              Naval Postgraduate School, Monterey, CA, USA




   ∗ This will certify that all author(s) of the above article/paper are employees of the U.S.
Government and performed this work as part of their employment, and that the article/paper is
therefore not subject to U.S. copyright protection. No copyright. Use permitted under Creative
Commons License Attribution 4.0 International (CC BY 4.0). In: Proceedings of AAAI Symposium on
the 2nd Workshop on Deep Models and Artificial Intelligence for Defense Applications: Potentials,
Theories, Practices, Tools, and Risks, November 11-12, 2020, Virtual, published at
http://ceur-ws.org

                              Abstract

A major challenge in the complex enterprise of US Navy global materiel distribution is that when
a new operational condition occurs, the probability of failure or demand model of a Naval ship
part or item needs to be modified to adapt to the new condition. Meanwhile, historical supply
databases include demand patterns and associations that are critical when the new condition
enters the system as a perturbation or disruption, which can propagate through the item
association network. In this paper, we first show how the two types of item demand changes
interact and can be integrated to calculate the total demand change (TDC). We then show a use
case on how to apply lexical link analysis (LLA) to discover the item association network that
propagates the TDC.

                          Introduction

There are many challenges in the complex enterprise of US Navy global materiel distribution.
Forward deployed US Navy ships, particularly in high operating tempo (OPTEMPO) areas such as
the Seventh and Fifth Fleet, face challenges in receiving logistical support when part failures
occur. These failures manifest as either a demand on the supply system, a casualty report
(CASREP), or a request for technical assistance. The toughest challenges arise when a high
impact part fails and is not immediately available. This can cause a “redline,” or a failure
that stops the unit from being able to complete its mission until the problem can be resolved.
The goal of any operational commander is 100% operational availability (AO), meaning their unit
is always ready to be tasked for any situation that arises. Failures in contentious
environments will stop the mission, and could have great effects on the international and
political situation. The goal of a Navy logistician is to “not let the logistical tail wag the
operation dog”; in other words, a good logistician does not want to be the reason that the
mission can’t go on. Limited manpower, funding, storage space, and resources for repair are all
in high demand. A good system needs to be in place to determine the most efficient and
effective method of stocking, forward staging, or contracting for the materials that have the
highest likelihood of demand, balanced with the potential impact of failure. Since even one
ship has hundreds of thousands of parts that can fail, many of which could cause a “redline,”
it is of critical importance to consider the activities of all the parts as a complex system
and predict the demand as a whole, so that the supply system is as intelligently designed as
possible in order to quickly handle part failures.

              Uncertainty, Perturbation and Association

The probability of failure of a part can be affected by many factors. We need to consider the
uncertainty, disruption, and perturbation that can impact the logistics plans as a whole. For
example, uncertainty factors related to environment and events in wide geographic areas, such
as a weather change, a mission change from peacetime to conflict, or a sudden event, can cause
a perturbation and disruption for previous logistics and supply plans. Parts that previously
had high impact but low failure rates may suddenly be in high demand.

The probability of failure is also embedded in the historical supply and maintenance data.
Repair of a failed part is considered before a new one is ordered. A part's order frequency in
the historical supply data reflects its demand only if the part cannot be repaired within a
certain period of time. The demand data in the supply data therefore reflects only part of the
probability of failure.

The complexity of predicting the total probability of failure for a large list of items calls
for the integration of methods in data fusion, data mining, causal learning, and optimization
for all the elements of a logistics system when facing a particular uncertainty and
perturbation. The goal of this paper is to demonstrate techniques such as data mining and
lexical link analysis (LLA) to recalculate the probability of failure for the previously high
impact and low failure parts or items when the whole system faces a perturbation, uncertainty,
disruption, or a “redline” failure.
                  Lexical Link Analysis (LLA)

A data mining tool used for this research is Lexical Link Analysis (Zhao, MacKinnon, and Gallup
2015). LLA is an unsupervised machine learning method that describes the characteristics of a
complex system using a list of attributes or features, or specific vocabularies or lexical
terms. Because of the potentially vast number of lexical terms from big data, the model can be
viewed as a deep model for big data. For example, we can describe a system using word pairs or
bi-grams as lexical terms extracted from text data. LLA automatically discovers word pairs and
displays them as word pair networks. Figure 1 shows an example of such a word network
discovered from data. “Clean energy” and “renewable energy” are two bi-gram word pairs. For a
text document, words are represented as nodes and word pairs as the links between nodes. A word
center (e.g., “energy” in Figure 1) is formed around a word node connected with a list of other
words to form more word pairs with the center word “energy.”

Figure 1: An example of lexical link analysis

            Discovering Item Associations Using LLA

Bi-grams allow LLA to be extended to numerical or categorical data. For example, using
structured data, such as attributes from supply chain databases, we discretize numeric
attributes and categorize their values into word-like features. The word pair model can further
be extended to a context-concept-cluster model (Zhao and Zhou 2014). A context can represent a
location, a time point, or an object shared across data sources. For example, the quarters in a
year can be one of the contexts for item supply data. Items (parts) are the concepts.

In this paper, we use LLA for the structured data of supply databases. We want to show that the
bi-grams generated by LLA can also serve as a way to discover associations among item demands
in a Navy supply database.

The common consensus is that data-driven analysis or data mining can discover initial
statistical correlations and associations from big data.

Figure 2 shows conceptually how the associations and correlations are discovered by LLA. We
anticipate that the demand change (DC) of an item $i$ might come from two types of sources:
Type 1): a collection of outside perturbations such as the change of missions or new
operational conditions; and Type 2): item associations with other items, where the associations
could be due to physical linkages or linked demand based on past business practices. If an item
$i$ is ordered, item $j$ is also likely to be ordered based on the historical data. Type 2) DCs
can be mined from historical, potentially big, data; Type 1) DCs may come from expert and
engineering knowledge and simulations.

Figure 2: Total demand change (TDC) caused by new conditions and associations

In Figure 2, $Assoc_{ij}$ measures how strongly items $i$ and $j$ are demanded together.
Probability and lift are the two measures defined in Equation (1) and Equation (2) in LLA to
measure the strength of an association.

\[ prob_{ij} = \frac{\text{demand of items } i, j \text{ together}}{\text{demand of item } j} \tag{1} \]

\[ lift_{ij} = \frac{\text{demand of items } i, j \text{ together out of demand of item } j}{\text{demand of item } i \text{ out of all demands}} \tag{2} \]

In LLA, we first use $lift_{ij}$ to filter out the associations that are not strong enough,
then apply $prob_{ij}$ to compute the total demand change (TDC) for item $i$ as in Equation (3):

\[ TDC_i = \sum_{m=1}^{M} DC_i|C_m + \sum_{j=1}^{N} prob_{ij} * TDC_j \tag{3} \]

In this paper, we show that LLA can be used to compute the association network, $prob_{ij}$,
and $lift_{ij}$ from historical demand data. When a perturbation such as a new operational
condition $C_m$ occurs and generates a $DC_j|C_m$ for item $j$, it causes a $TDC_j$ for item
$j$; meanwhile, $TDC_j$ propagates through the association network discovered by LLA to affect
the whole demand system and forward predictions, as shown in Equation (3).

          Data Description and Initial Analysis Results

Currently, a part is reviewed for stocking if it has more than two reorders in one year. This
simple system is effective overall, but does not consider the reasons for failure, the reason
the part is being reordered, or the effect that the failure has on the ship. There is a small
number of parts, called “maintenance assist modules,” that are carried onboard every ship due
to engineering specifications calling for immediate availability if needed, but that is not
enough to prevent “redline” failures. To show the feasibility of our methodology, we compiled a
large selection of demand data over the last nine years, containing over 1,000,000 individual
demands. This data was then compiled by Item Mission Essentiality Code (IMEC, the impact code),
quarter in which the demand occurred, and number of demands logged. Next, LLA
was applied to the data to help discover historical associations among the failures. The
associations reflect the items that are ordered in the same contexts (e.g., the same quarter or
same ship) historically. Associated parts might be stockpiled in the same manner should one
fail suddenly in a new and disrupted condition. On a sample run, 50 connections were found
across 65,000 demands, as illustrated in Figure 3. We only considered the associations among
the high impact items (impact code 4) with high (> 51) or low (= 1) quarterly demand. For
example, items “lwm048749” and “lwm048745” both have high impact 4, while “lwm048749” had high
demand in some quarters when “lwm048745” had low demand. When drilling down using LLA as shown
in Figure 4, “lwm048749” had high demand in two quarters (10 and 18) when “lwm048745” had low
demand. “lwm048749” had high demand in two quarters out of the total 20 quarters. The
probability for the association of the two items is 100% and the lift is 10 (by Equation (2),
100% divided by the base rate of 2/20). Should demand for “lwm048745” increase under a new
operational condition, demand for associated parts such as “lwm048749” may increase even more.
LLA calculates the lift measure, which is similar to the counterfactual reasoning in causal
learning (Mackenzie and Pearl 2018; Pearl 2018; Zhao, MacKinnon, and Jones 2019), i.e., it
indicates that there is indeed a causal relationship between the two demands.

Figure 3: Total demand change (TDC) caused by new conditions and associations

Figure 4: LLA allows a drill-down to see how many times (quarters) the two items are associated

                 Conclusion and Future Work

In this paper, we showed the feasibility of applying LLA to improve demand change predictions
for a complex Navy supply database.

In future research, we will consider setting the association contexts to ship type, unit
identification code, IMEC, or time periods shorter than quarters, and then apply LLA to search
for causal associations at higher or lower resolutions, or with stricter or looser
requirements. In comparison, there is a current tool in place called Predictive Risk Sparing
Matrix (PRiSM), which has been able to identify parts in various C4I systems that have had real
world demands and would not have been identified under the standard system. PRiSM uses
mathematical algorithms from inventory sparing models to determine potential failures, and
these algorithms could possibly be used in coordination with simulation and LLA to better
determine future needs. We will also leverage the liaisons from NAVSUP and DLA at the Fifth and
Seventh Fleet naval bases, whose job is to track demand and then to work with the DoD logistics
organizations to improve operational availability. The LLA tool could be tested and then given
to these liaisons to help them improve the overall operational availability (AO) of forward
deployed ships and improve sustainment.

                     ACKNOWLEDGMENTS

The authors would like to thank the Office of Naval Research (ONR)’s Naval Enterprise
Partnership Teaming with Universities for National Excellence (NEPTUNE 2.0) program. The views
and conclusions contained in this document are those of the authors and should not be
interpreted as representing the official policies, either expressed or implied, of the U.S.
Government.

                        References

Mackenzie, D., and Pearl, J. 2018. The Book of Why: The New Science of Cause and Effect.
Penguin.

Zhao, Y.; MacKinnon, D. J.; and Gallup, S. P. 2015. Big data and deep learning for
understanding DoD data. Journal of Defense Software Engineering, Special Issue: Data Mining and
Metrics, July/August 2015, 4-10. Lumin Publishing ISSN 2160-1577.

Pearl, J. 2018. The Seven Pillars of Causal Reasoning with Reflections on Machine Learning.
Retrieved from http://ftp.cs.ucla.edu/pub/stat_ser/r481.pdf

Zhao, Y., and Zhou, C. 2014. System and method for knowledge pattern search from networked
agents. US Patent 8,903,756.

Zhao, Y.; MacKinnon, D.; and Jones, J. 2019. Causal Learning Using Pair-wise Associations to
Discover Supply Chain Vulnerability. Proceedings of the 11th International Conference on
Knowledge Discovery and Information Retrieval (KDIR 2019), September 17-19, 2019, Vienna,
Austria.
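
As an illustration of Equations (1) to (3), the following is a minimal Python sketch and not the LLA implementation used in this work. The function names `association_measures` and `total_demand_change`, the feature labels, the `MIN_LIFT` threshold, and the choice to count contexts (quarters) rather than order quantities are all assumptions made for the example. The toy data places the two features together only in quarters 10 and 18 out of 20, which is consistent with the probability of 100% and lift of 10 reported in the initial analysis. Equation (3) is solved here by fixed-point iteration, which is also an assumption and converges only when the retained associations are weak enough.

```python
from itertools import permutations


def association_measures(contexts):
    """Return {(i, j): (prob_ij, lift_ij)} from a list of contexts.

    Each context (for example, one quarter) is the set of features
    demanded in it. Counting contexts instead of order quantities is
    an assumption of this sketch.
    """
    n_contexts = len(contexts)
    single = {}  # number of contexts containing each feature
    joint = {}   # number of contexts containing both i and j
    for features in contexts:
        for f in features:
            single[f] = single.get(f, 0) + 1
        for i, j in permutations(features, 2):
            joint[(i, j)] = joint.get((i, j), 0) + 1
    measures = {}
    for (i, j), together in joint.items():
        prob = together / single[j]             # Equation (1)
        lift = prob / (single[i] / n_contexts)  # Equation (2)
        measures[(i, j)] = (prob, lift)
    return measures


def total_demand_change(direct_dc, prob, max_rounds=1000, tol=1e-9):
    """Iterate Equation (3) until the TDC values stop changing.

    direct_dc[i] is the summed Type 1 change for item i, and
    prob[(i, j)] is prob_ij for the associations kept after lift
    filtering. Every item named in prob must also be in direct_dc.
    """
    tdc = dict(direct_dc)
    for _ in range(max_rounds):
        updated = {
            i: direct_dc[i]
            + sum(p * tdc[j] for (a, j), p in prob.items() if a == i)
            for i in direct_dc
        }
        if max(abs(updated[i] - tdc[i]) for i in updated) < tol:
            return updated
        tdc = updated
    raise ValueError("TDC iteration did not converge")


# Toy data (assumed): 20 quarters, features co-occur only in quarters 10 and 18.
HIGH = "lwm048749=high"
LOW = "lwm048745=low"
quarters = [{HIGH, LOW} if q in (10, 18) else set() for q in range(1, 21)]
measures = association_measures(quarters)
prob_ij, lift_ij = measures[(HIGH, LOW)]

# Keep only associations whose lift passes a hypothetical threshold.
MIN_LIFT = 2.0
strong = {pair: p for pair, (p, l) in measures.items() if l >= MIN_LIFT}

# Hypothetical propagation: item B has a direct change of 10, and item A
# is associated with B with prob_AB = 0.5.
tdc = total_demand_change({"A": 0.0, "B": 10.0}, {("A", "B"): 0.5})
```

In the last call, item A has no direct change of its own, so its total demand change comes entirely from half of item B's change propagating through the association.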