Firm Business Networks

           Chien-Hung Chien, Armin Haller and Anton H. Westveld

                        Australian National University
         {chien-hung.chien,armin.haller,anton.westveld}@anu.edu.au
                           http://www.anu.edu.au


      Abstract. This paper describes an ontology-based approach to inte-
      grate datasets from Intellectual Property Australia and the Australian
      Securities Exchange to study firms in business networks. We combine
      indicator variables with SPARQL queries to understand the
      characteristics of firms in multiple business networks. We use
      exponential random graph models to describe factors that help firms
      form business networks. In doing so, we find evidence of homophily
      for large firms in patents and trademarks business networks: they are
      more likely to form business networks than small & medium firms. For
      firms in patents and shared directors business networks and in
      trademarks and shared directors business networks, firm size is not
      an important factor in the formation of business networks and there
      is limited evidence of homophily.

      Keywords: semantic web, business networks, exponential random graph
      model


1   Introduction
At a time when governments face budget constraints, it is important for them
and their statistical agencies to make better use of available resources.
Governments around the world have realised the advantages of integrating their
datasets and using them for purposes beyond those for which they were
collected. The Australian
Government’s open data agenda aims to integrate multiple data sources and pro-
vide information to encourage evidence-based policy development [1]. The 2017
Productivity Commission inquiry into Data Availability and Use highlighted the
need to create integrated and linked national interest datasets to inform policy
development [2].
    The need to integrate a large number of datasets from multiple sources has
created a big data challenge for statistical offices, including the Australian
Bureau of Statistics (ABS). The statistical challenges associated with creating
and analysing data from diverse sources have been discussed extensively in [3].
A recent paper [4] presented several ABS case studies on using semantic
web technologies to visualise and analyse integrated datasets. This preliminary
research builds on [5]. Using open and purchased data sources, we focus on com-
bining semantic web and statistical methods to develop a better understanding
of firms in multiple business networks.
2      Semantic web and firm business networks

    Firms seek partners with complementary assets to leverage each other’s
strengths and find competitive advantages to ensure market success. Business
networks play a vital role in finding new market opportunities and obtaining the
necessary resources to achieve growth [6]. There are different types of business
networks ranging from more structured (business groups or franchising) to less
structured (R&D consortium, trade association and shared directors). These
business networks facilitate different degrees of knowledge transfer and create
social capital to enhance business performance [7].

    Business networks play a particularly important role in ensuring the economic
success of small firms. Firms in business networks depend on one another to
ensure each other’s success. Business networks can also improve resource
allocation and reduce operational risks through cooperative arrangements. This
is particularly important in sectors with fast technological advancement and
short product life cycles. This is evident in the success of high-tech start-ups
in Taiwan, where business networks play an important role in integrating the
operation of a large number of specialised small firms in subcontracting and
outsourcing industries [8, p.2-4].

    There is empirical evidence to support firm R&D collaboration as an im-
portant source of innovation to improve firm performance [9,10]. R&D collabo-
ration enables knowledge transfer between firms to share new managerial ideas
and technology. Firms look for different competitive advantages in the mar-
ket through business networks [11,12]. Firms can also form business networks
through shared directors. [13] argued that boards of directors can enhance firm
performance through effective monitoring and provision of resources. [14] also
found that directors serve as an important asset to form business networks,
particularly for young high-tech firms.

    Governments use a range of initiatives to encourage business networks to
achieve economic growth. The Australian Government, through the Department
of Industry, Innovation and Science, actively supports firm collaboration. It has
established Cooperative Research Centres to encourage collaborative research
partnerships between research institutes and industry partners [15]. Another
example is Innovation Connections, which provides funding to help small and
medium sized firms find expert technology advice and collaborate with research
centres on developing new ideas with commercial potential [16].

    It is useful for policy makers to have an understanding of the factors that
could lead to the formation of business networks. Are the factors contributing
to firms forming business networks different when we compare firms in multiple
business networks? Are firms with similar characteristics more likely to form
business networks? What kind of impact did the global financial crisis (GFC)
have on firms in multiple business networks? This study uses open data and
purchased data to address these questions.

2   Semantic web
We use an ontology-based data model to integrate three datasets - patents,
trademarks, and publicly listed companies - to answer our research questions.
The patents and trademarks datasets are from the 2017 Intellectual Property
Government Open Data (IPGOD). The data on publicly listed companies and their
shared directors from the Australian Securities Exchange (ASX) are purchased
from MorningStar DatAnalysis Premium. Examples of data visualisation can be
found in Appendix B.
    The semantic web approach is well suited to integrate data from multiple
sources and to extract information on firms in multiple business networks. We
assign a unique Uniform Resource Identifier (URI) for each firm using a unique
firm identifier e.g. Australian Business Numbers (ABNs) or Australian Com-
pany Numbers (ACNs). We attach different firm attributes e.g. Industry, Profit
and Loss etc. from different data sources to each firm. Our datasets include Aus-
tralian firms in three different types of business network relationships: firms that
collaborate in R&D related activities (patents networks), firms that collaborate
for commercial interests (trademarks networks) and firms that share directors
(shared directors networks). We compare firms belonging to multiple networks,
i.e. patents and shared directors networks, patents and trademarks networks,
and trademarks and shared directors networks. In addition, we separate
our sample into three periods: before the global financial crisis (GFC) from
2003 to 2006, during the GFC from 2007 to 2009 and after the GFC from 2010 to
2013.
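
The identifier and period scheme described above can be sketched in a few lines of Python. This is only an illustration: the base IRI `http://example.org/firm/` and the function names `firm_uri` and `gfc_period` are hypothetical, since the text does not state the actual URI pattern.

```python
from typing import Optional

BASE_IRI = "http://example.org/firm/"  # hypothetical namespace, not from the paper


def firm_uri(abn: Optional[str] = None, acn: Optional[str] = None) -> str:
    """Mint one URI per firm from its ABN, falling back to its ACN."""
    if abn:
        return f"{BASE_IRI}abn/{abn.replace(' ', '')}"
    if acn:
        return f"{BASE_IRI}acn/{acn.replace(' ', '')}"
    raise ValueError("a firm needs an ABN or an ACN")


def gfc_period(year: int) -> str:
    """Map a calendar year to one of the three sample periods."""
    if 2003 <= year <= 2006:
        return "before GFC"
    if 2007 <= year <= 2009:
        return "during GFC"
    if 2010 <= year <= 2013:
        return "after GFC"
    raise ValueError("year outside the 2003 to 2013 sample")


uri = firm_uri(abn="12 345 678 901")  # made-up ABN for illustration
period = gfc_period(2008)
```

Because the URI depends only on the firm identifier, the same firm receives the same URI in every named graph, which is what lets later queries join across sources.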
    For our analysis, it is important to know the data provenance to correctly
compare firms with and without multiple business networks. We use named
graphs in the knowledge graph to distinguish different data sources. These named
graphs are then used in the SPARQL queries to retrieve the correct subgraph.
Figure 1 shows the ontology for our data model. We use Ontodia, an OWL and
RDF diagramming tool, to visualise our data model. For example, firm alpha
is connected to firm beta within a patent business network, while it is also
connected with firm gamma in a trademark business network. In comparison,
firm beta is in a patent business network with firm alpha and shares a
director with firm delta.
    We have 518,870 triples of firms in the patent named graph http://patents,
5,638,915 triples of firms in the trademark named graph http://trademarks
and 272,289 triples in the directors named graph http://directors, with a total
of 6,430,074 triples in the integrated knowledge graph. We use legal entities to
represent firms. The IPGOD and ASX datasets contain unique firm identification
numbers (ABNs or ACNs) for firms. We use patent and trademark applications
(application number), and directors (director id) to identify firms in different
business networks. We use unique ABNs and ACNs to form the URI for the
legal entities. These URIs serve as unique linking keys to correctly retrieve firm
information from different sources using SPARQL queries. The bottom right
panel of Figure 1 shows the basic business network relationships. The Business
Network node qualifies how firms can be connected through joint patent
or trademark applications or shared directors.

Fig. 1: Ontology (basic relationships panel). Legal Entity -hasBusinessNetwork->
Business Network; Business Network -hasApplication-> Patents; Business Network
-hasApplication-> Trademarks; Business Network -hasDirector-> Director.

    An example below shows how
we construct a SPARQL query to retrieve firms belonging to both patent and
trademark networks in the period before the GFC from 2003 to 2006.

                    Listing 1.1: intersection SPARQL query
PREFIX pat:  <http://patents>
PREFIX tmk:  <http://trademarks>
# The fnet namespace IRI is not given in the text; this one is a placeholder.
PREFIX fnet: <http://example.org/fnet#>
SELECT ?ABN
FROM NAMED pat:
FROM NAMED tmk:
WHERE {
  # Business network years before the GFC; ?BN is not joined to the patterns below.
  VALUES (?BN) { ("2003_BN") ("2004_BN") ("2005_BN") ("2006_BN") }
  # Legal entities with a business network in the patents graph ...
  { GRAPH pat: {
      ?LegalEntity fnet:hasAustralianBusinessNumber ?ABN ;
                   fnet:hasBusinessNetwork ?businessNetwork . } }
  # ... for which the same bindings also match in the trademarks graph.
  FILTER EXISTS
  { GRAPH tmk: {
      ?LegalEntity fnet:hasAustralianBusinessNumber ?ABN ;
                   fnet:hasBusinessNetwork ?businessNetwork . } }
}
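
For readers less familiar with SPARQL, the intent of Listing 1.1 can be mimicked with plain Python sets. This is a simplified analogue under stated assumptions, not the query engine's semantics: each named graph is reduced to a set of hypothetical (ABN, network year) pairs, the year restriction is applied explicitly, and the function name `firms_in_both` and the toy ABNs are invented for illustration.

```python
from typing import Iterable, Set, Tuple

Membership = Tuple[str, int]  # (ABN, year of the business network)


def firms_in_both(patents: Iterable[Membership],
                  trademarks: Iterable[Membership],
                  years: Iterable[int]) -> Set[str]:
    """Return ABNs with a business network in both graphs within `years`."""
    wanted = set(years)
    pat_abns = {abn for abn, year in patents if year in wanted}
    tmk_abns = {abn for abn, year in trademarks if year in wanted}
    # Set intersection plays the role of the GRAPH pattern plus FILTER EXISTS.
    return pat_abns & tmk_abns


# Toy data standing in for the two named graphs.
patent_graph = {("111", 2003), ("222", 2004), ("333", 2008)}
trademark_graph = {("111", 2005), ("333", 2009), ("444", 2004)}
before_gfc = firms_in_both(patent_graph, trademark_graph, range(2003, 2007))
```

Keeping one set per source mirrors the role of named graphs: provenance stays explicit, so the same firm can be tested for membership in each network separately.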


3     Data sources

3.1   Intellectual Property Government Open Data

The patents and trademarks datasets used in this study are from the 2017
Intellectual Property Government Open Data (IPGOD). IPGOD contains
administrative information on patents and trademarks [17]. Patent and trademark
applications can be filed by one applicant or multiple applicants. Over the
sample period between 2003 and 2013, there are 329,809 applicant-patent
application combinations with 290,568 unique patent applications. In
comparison, there are 1,350,134 applicant-trademark application combinations
with 639,958 unique trademark applications. Counting unique ABNs and unique
ACNs, 15,617 firms filed patent applications and 152,539 firms filed trademark
applications over the sample period.
    We do not observe directly whether firms are in business networks. However,
we observe whether a firm files a patent or trademark application by itself or
with one or more other firms in IPGOD. Therefore, we define a firm as being in
a business network in year t when it shares a patent and/or a trademark
application with at least one other firm. We create indicator variables equal
to one if a firm has a patent and/or trademark application with at least one
other applicant of a given type (small & medium or large enterprises) in year
t. The indicator takes the value zero if a firm files an application by itself.
One would expect that the business networks that generate joint patent and/or
trademark applications could have existed before year t and could have lasted
beyond year t. Consider a scenario in which firm A had a joint application with
firm B in 2003 and the last available observation for firm A is in 2005. The
network indicator will show that firm A was in
a business network from 2003 to 2005. We have made this assumption because
a standard patent can last up to 25 years and a trademark registration lasts
10 years [18][19].
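
The indicator construction described above can be sketched as follows. The sketch assumes the input is a list of (firm, application number, year) records, which is a hypothetical simplification of the IPGOD applicant tables, and it ignores the split by applicant type. The indicator switches on in the year of a firm's first joint application and stays on until the firm's last available observation, as in the firm A scenario.

```python
from collections import defaultdict


def network_indicator(records):
    """records: iterable of (firm, application_number, year) tuples.

    Returns {(firm, year): 0 or 1} for every year between a firm's first
    and last observation.
    """
    records = list(records)
    applicants = defaultdict(set)          # application -> firms that filed it
    for firm, app, _ in records:
        applicants[app].add(firm)

    observed = defaultdict(set)            # firm -> years with any application
    joint_years = defaultdict(set)         # firm -> years with a joint application
    for firm, app, year in records:
        observed[firm].add(year)
        if len(applicants[app]) > 1:
            joint_years[firm].add(year)

    indicator = {}
    for firm, years in observed.items():
        start = min(joint_years[firm], default=None)
        for year in range(min(years), max(years) + 1):
            indicator[(firm, year)] = int(start is not None and year >= start)
    return indicator


# Firm A files jointly with B in 2003 and is last observed in 2005.
toy = [("A", "P1", 2003), ("B", "P1", 2003), ("A", "P2", 2005), ("C", "P3", 2004)]
result = network_indicator(toy)
```

In this toy input firm C only ever files alone, so its indicator stays at zero.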


3.2   Purchased publicly listed company data


The data on publicly listed companies and their directors is purchased from
MorningStar DatAnalysis Premium data service. This service contains detailed
reports for all current and formerly listed companies on the Australian
Securities Exchange (ASX). There are 1,722 listed companies and 9,892 directors
in the sample reference period between 2003 and 2013. We use the following
data items in our analysis: unique director identification number (DirectorID),
ABNs, ACNs, Global Industry Classification Standard (GICS) industry sectors,
GICS industry groups, director appointed dates and director resigned dates.
    We create an indicator variable equal to one if a firm shares a director
with at least one other firm during the sample period. If a director’s
appointed date is before 01-01-2003, we use 01-01-2003 as the appointed date.
We exclude directors who resigned before 01-01-2003. The duration of the shared
director network is derived from the director’s earliest appointed and latest
resigned dates. For example, if director 001 worked in firm A between 2003 and
2004 and in firm B between 2004 and 2005, then firms A and B are connected in
the director network between 2003 and 2005. There is a higher proportion of
firms with network connections in the ASX data than in IPGOD.
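
The date rules above can be illustrated with a short Python sketch. The tuple layout (director id, firm, appointed date, resigned date) is a hypothetical stand-in for the purchased data. Two further choices are assumptions made only for this example: an open-ended appointment (resigned date `None`) is capped at the end of the sample, and when several directors link the same pair of firms their spans are merged.

```python
from collections import defaultdict
from datetime import date
from itertools import combinations

SAMPLE_START = date(2003, 1, 1)
SAMPLE_END = date(2013, 12, 31)   # assumption: cap open spells at the sample end


def director_ties(spells):
    """spells: iterable of (director_id, firm, appointed, resigned) tuples.

    Returns {(firm_a, firm_b): (start, end)} for firms sharing a director.
    """
    by_director = defaultdict(list)
    for director, firm, appointed, resigned in spells:
        resigned = resigned or SAMPLE_END
        if resigned < SAMPLE_START:
            continue                                  # resigned before 01-01-2003
        appointed = max(appointed, SAMPLE_START)      # truncate early appointments
        by_director[director].append((firm, appointed, min(resigned, SAMPLE_END)))

    ties = {}
    for rows in by_director.values():
        firms = sorted({firm for firm, _, _ in rows})
        if len(firms) < 2:
            continue
        start = min(appointed for _, appointed, _ in rows)   # earliest appointment
        end = max(resigned for _, _, resigned in rows)       # latest resignation
        for pair in combinations(firms, 2):
            if pair in ties:
                start, end = min(ties[pair][0], start), max(ties[pair][1], end)
            ties[pair] = (start, end)
    return ties


toy = [
    ("001", "A", date(2002, 6, 1), date(2004, 6, 30)),
    ("001", "B", date(2004, 7, 1), date(2005, 12, 31)),
    ("002", "C", date(2000, 1, 1), date(2002, 12, 31)),
]
shared = director_ties(toy)
```

Here director 001 links firms A and B from the start of 2003 to the end of 2005, while director 002 is dropped because the resignation predates the sample.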

3.3           Multiple business networks


The scope of the analysis includes firms in three types of multiple business
networks: firms that file a joint patent and a joint trademark application with
at least one other firm, firms that file a joint patent application and share a
director with at least one other firm, and firms that file a joint trademark
application and share a director with at least one other firm. Figure 2
compares the proportion of large and small & medium enterprises in these
multiple business networks. The number of business networks has grown over time
in the sample. Overall, there are no significant differences between the
proportions of large and small & medium enterprises over different periods.
However, we observe a larger increase in the proportion of small & medium
enterprises in multiple networks when we compare the period before the GFC
(2003 to 2006) with the period during the GFC (2007 to 2009).


Fig. 2: Proportion of applicant types between 2003 and 2013 in business
networks. [Stacked bar chart of firm counts (y-axis, 0 to 3000) by period
(2003 to 2006, 2007 to 2009, 2010 to 2013) in three panels: patents and
directors, patents and trademarks, trademarks and directors. Legend: large
enterprises, small & medium enterprises. Printed segment proportions
(upper/lower segment) by period: patents and directors 0.43/0.57, 0.53/0.47,
0.56/0.44; patents and trademarks 0.38/0.62, 0.41/0.59, 0.42/0.58; trademarks
and directors 0.27/0.73, 0.43/0.57, 0.46/0.54.]

4   Statistical model

Our research goal is to describe the factors that contribute to the formation of
business networks. There are many interdependent social processes that drive
the formation of business networks. The formation of business networks can
be influenced by the presence (or absence) of other ties in the network. The
complexity is shown by the business network formed between Verizon Wireless
and Google in 2009. Verizon Wireless, one of the key wireless telecommunica-
tions providers in the United States, wanted to become less reliant on Apple
and iPhone to deliver its service to high-paying customers [20]. This was mainly
because AT&T, one of Verizon Wireless’s main competitors, had already
established a close working relationship with Apple [21]. The successful
partnership between Verizon Wireless and Google has led to the formation of
business networks with other Android phone manufacturers such as Samsung [22].
    Business networks are inherently relational so the occurrence of a particu-
lar relationship, or tie, could depend on the occurrence of other ties [23]. The
above example demonstrates these endogenous network structural effects.
Firms consider factors beyond a simple evaluation of the characteristics
of prospective partners. The decision by Verizon Wireless involved a strategic
response to compete against AT&T by forming other ties with Google and
Samsung. An observed business network can result from a combination of
simultaneous processes with interdependent endogenous factors [24].
    We use exponential random graph models (ERGMs), which take into account
the underlying network structure, the characteristics of firms and the
characteristics of the dyad in the inference. ERGMs have two main functions:
(1) to describe whether a given network structure, e.g. an edge or transitivity
observed in a network, occurs more often than expected by chance; (2) to
determine whether there is an association between network links and firm
characteristics, between network links and dyad characteristics, or both [25].
Table 1 shows selected commonly observed business network structures [26].


                 Table 1: Common business network structures
    Network structure   Graphic configuration   Network statistics
    edge                [diagram]               $\sum_{i,j \in N} X_{ij} X_{ji}$ for dissimilar nodes
    homophily           [diagram]               $\sum_{i,j \in N} X_{ij} X_{ji}$ for similar nodes
    transitivity        [diagram]               $\sum_{i,j,k \in N} X_{ij} X_{jk} X_{ki}$

   Note that the dependency structure arises because a tie included in the
homophily structure can also be contained in the transitivity structure.
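
To make the structures in Table 1 concrete, the sketch below counts edges, same-size (homophilous) ties and triangles in a small network. It assumes an undirected network without self-loops, so each tie is counted once, and it uses a hypothetical `size_class` attribute for the large versus small & medium distinction; neither detail is taken from the paper's estimation code.

```python
from itertools import combinations


def network_statistics(edges, size_class):
    """Count edges, same-class ties and triangles in an undirected network.

    edges: iterable of (firm, firm) pairs; size_class: {firm: "large" | "SME"}.
    """
    ties = {frozenset(edge) for edge in edges}
    nodes = sorted({firm for tie in ties for firm in tie})
    # A tie is homophilous when both end points share the same size class.
    homophily = sum(1 for tie in ties
                    if len({size_class[firm] for firm in tie}) == 1)
    # A triangle needs all three ties among a triple of firms.
    triangles = sum(
        1 for a, b, c in combinations(nodes, 3)
        if {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))} <= ties
    )
    return {"edges": len(ties), "homophily": homophily, "triangles": triangles}


toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
toy_sizes = {"A": "large", "B": "large", "C": "SME", "D": "SME"}
stats = network_statistics(toy_edges, toy_sizes)
```

In the toy network the tie between A and B is both a homophilous tie and part of the triangle A, B, C, which illustrates why the statistics are not independent of one another.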

   We build on the work of [27] and [28] and use ERGMs to study Australian
business networks. We use formula (2) in Appendix C to define ERGMs as
$$\Pr(Y = y \mid \theta, X) = \frac{1}{k(\theta)} \exp\left[\theta^{\top} g(y, X)\right], \qquad (1)$$
    where y is the observed business network, whose entry for firms i and j
takes the value 1 if there is a tie between them and 0 otherwise. The symbol θ
represents the unknown parameters of interest and determines the effects of the
network statistics. We use g(y, X) to represent the endogenous network
statistics in the model. We follow [28] and include edges and transitivity
structures, which are common network structures observed in business network
data. We use X to denote exogenous explanatory variables for the network
statistics. There are three firm-specific characteristics. The variable
Products is the number of patent or trademark applications registered by a
firm. Our measures of homophily are LargeFirm and SME. LargeFirm indicates a
tie formed between two large enterprises. SME indicates a tie between two small
& medium enterprises. The reference group is a tie formed between a LargeFirm
and an SME. The dyad-specific characteristic, Industry, takes the value one if
a network contains at least two firms in the same industry and zero otherwise.
The term k(θ) is the normalising constant. A discussion of the key ERGM
assumptions and basic concepts can be found in Appendix C.
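
For intuition, equation (1) can be evaluated exactly on a toy network by enumerating every possible graph, which makes the role of the normalising constant k(θ) explicit. The sketch below is illustrative only: it uses just two endogenous statistics (edges and triangles), omits the covariates X, assumes an undirected network, and the θ values are hypothetical. Enumeration is feasible only for very small networks.

```python
from itertools import combinations, product
from math import exp


def g(ties, nodes):
    """Network statistics g(y): (number of edges, number of triangles)."""
    triangles = sum(
        1 for a, b, c in combinations(nodes, 3)
        if {(a, b), (a, c), (b, c)} <= ties
    )
    return (len(ties), triangles)


def ergm_distribution(n_nodes, theta):
    """Return {graph: probability} over all undirected graphs on n_nodes."""
    nodes = range(n_nodes)
    dyads = list(combinations(nodes, 2))
    weights = {}
    for included in product((0, 1), repeat=len(dyads)):
        ties = frozenset(d for d, x in zip(dyads, included) if x)
        # exp of the inner product between theta and the statistics g(y)
        weights[ties] = exp(sum(t * s for t, s in zip(theta, g(ties, nodes))))
    k = sum(weights.values())          # normalising constant k(theta)
    return {ties: w / k for ties, w in weights.items()}


# Hypothetical parameters: sparse network with a tendency towards triangles.
dist = ergm_distribution(4, theta=(-1.0, 0.5))
total = sum(dist.values())
```

With four firms there are six dyads and therefore 64 possible networks. The sum in k(θ) grows as 2 to the power of the number of dyads, which is why ERGM software relies on simulation rather than enumeration for networks of realistic size.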

4.1   Results
We use the R ergm package from the Statnet Project to estimate our models; see
[29] for more details. Some useful references and tutorials can be found in
[30,31]. Table 2 shows the statistical model results. The results are in
log-odds as discussed in Appendix C. Firms with higher numbers of patent or
trademark applications have a slightly higher probability of forming business
networks. We have some evidence of homophily, i.e. firms with similar
characteristics are more likely to form business networks with each other.
Table 2 shows that the coefficients for LargeFirm and SME are significant. The
probabilities are generally higher for LargeFirm (0.92) than for SME (0.73), in
particular for firms in patent and trademark business networks in the period
before the GFC. We do not observe homophily for firms in trademark and shared
director networks, as the coefficients for both LargeFirm and SME are
insignificant in the three periods. The results for the same-industry indicator
variable are mixed. Firms operating in the same industries appear to have a
higher probability of forming trademark and shared director business networks
than patent and shared director networks. See model results in Appendix A.
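
Because the coefficients are reported in log-odds, a set of coefficients can be converted into a conditional tie probability with the logistic function. The helper below is a generic sketch: the function name `tie_probability` is hypothetical, and which terms are combined, and with which change statistics, is an assumption made for illustration rather than the calculation behind the probabilities quoted above.

```python
from math import exp


def tie_probability(theta, change_stats):
    """Conditional probability of a tie given its change statistics.

    theta: coefficients in log-odds; change_stats: how much each network
    statistic changes when the tie is added, holding the rest of the
    network fixed.
    """
    log_odds = sum(t * d for t, d in zip(theta, change_stats))
    return 1.0 / (1.0 + exp(-log_odds))


# Illustration with the edges and Largefirm coefficients from the first
# column of Table 2 (patent and shared director networks, before the GFC).
baseline = tie_probability((-4.194,), (1,))               # tie in the reference group
large_pair = tie_probability((-4.194, 0.899), (1, 1))     # tie between two large firms
```

A positive coefficient raises the conditional probability relative to the baseline, so `large_pair` exceeds `baseline` here.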


5     Conclusion and future direction
This preliminary analysis has shown the benefits of using the semantic web to
integrate datasets to study firms in business networks. We have found that
large firms are more likely to form business networks when they are in patents
and trademarks business networks. The homophily results are mixed for firms in
patents and shared directors and in trademarks and shared directors business
networks. Our research could be extended in several areas. One possibility is
to combine the open and purchased datasets with ABS business data to obtain
more firm characteristics, such as better industry classifications or firm
productivity, to improve the statistical models. Another possibility is to
compare the model results with a latent class model approach to better
understand business network effects.


Acknowledgements

We gratefully acknowledge Laurent Lefort and Chris Conran, who both provided
technical advice for this paper, and Professor Alan Welsh for his comments on
the paper.


References

 1. Turnbull, M.: Speech to the Locate 15 conference: The power of open data (12
    2015) accessed on 2017-06-03.
 2. Productivity Commission: Data availability and use. Inquiry Report 82 (2017)
    accessed on 2017-08-11.
 3. Tam, S.M., Clarke, F.: Big data, official statistics and some initiatives by the
    Australian Bureau of Statistics. International Statistical Review 83(3) (2015)
    436–448
 4. Clarke, F., Chien, C.H.: Visualising big data for official statistics: The ABS
    experience. In: Data Visualization and Statistical Literacy for Open and Big
    Data. IGI Global (2017) 224–252
 5. Clarke, F., Chien, C.H.: Connectedness and Meaning: New Analytical Directions
    for Official. In: Proceedings of the 3rd International Workshop on Semantic
    Statistics, Bethlehem, U.S. (October 2015) accessed on 2018-04-01.
 6. Lee, C., Lee, K., Pennings, J.M.: Internal capabilities, external networks, and per-
    formance: A study on technology-based ventures. Strategic Management Journal
    22(6/7) (2001) 615–640
 7. Inkpen, A.C., Tsang, E.W.K.: Social capital, networks, and knowledge transfer.
    The Academy of Management Review 30(1) (2005) 146–165 accessed on
    2017-08-03.
 8. Perry, M., Knapp, A.B., Piggott, V.C.: Small Firms and Network Economies.
    Routledge (2002)
 9. Ahuja, G.: Collaboration networks, structural holes, and innovation: A longitudinal
    study. Administrative Science Quarterly 45(3) (2000) 425–455
10. Belderbos, R., Carree, M., Lokshin, B.: Cooperative R&D and firm performance.
    Research Policy 33(10) (2004) 1477–1492
11. Miotti, L., Sachwald, F.: Co-operative R&D: why and with whom? Research Policy
    32(8) (2003) 1481–1499
12. Chuluun, T., Prevost, A., Upadhyay, A.: Firm network structure and innovation.
    Journal of Corporate Finance 44 (2017) 193–214
13. Hillman, A.J., Dalziel, T.: Boards of Directors and Firm Performance: Integrating
    Agency and Resource Dependence Perspectives. The Academy of Management
    Review 28(3) (2003) 383–396
14. Collins, C.J., Clark, K.D.: Strategic Human Resource Practices, Top Manage-
    ment Team Social Networks, and Firm Performance: The Role of Human Resource
    Practices in Creating Organizational Competitive Advantage. The Academy of
    Management Journal 46(6) (2003) 740–751
15. Department of Industry, Innovation and Science: Cooperative research centres
    programme (2017) accessed on 2018-04-01.
16. Department of Industry, Innovation and Science: Innovation connections (2018)
    accessed on 2018-06-24.
17. IP Australia: Intellectual property government open data 2017 (2017) accessed
    on 2017-08-03.
18. IP Australia: Patent basics (2018) accessed on 2018-04-01.
19. IP Australia: Trademark basics (2018) accessed on 2018-05-16.
20. Svensson, P.: Verizon, Google in Android partnership (6 2009) accessed on
    2018-12-03.
21. Cohan, P.: Project Vogue: Inside Apple’s iPhone Deal With AT&T (9 2013)
    accessed on 2018-12-03.
22. Tibken, S.: Samsung, Verizon will partner on 5G smartphone in first half of 2019
    (12 2018) accessed on 2018-12-03.
23. Koskinen, J., Daraganova, G.: Exponential random graph model fundamentals. In
    Lusher, D., Koskinen, J., Robins, G., eds.: Exponential random graph models for
    social networks: Theory, methods, and applications. Cambridge University Press
    (2013) 49–76
24. Kim, J.Y., Howard, M., Cox Pahnke, E., Boeker, W.: Understanding network
    formation in strategy research: Exponential random graph models. Strategic man-
    agement journal 37(1) (2016) 22–44
25. Valente, T.W.: Social networks and health: Models, methods, and applications.
    Volume 1. Oxford University Press New York (2010)
26. Robins, G., Lusher, D.: Simplified account of an exponential random graph model
    as a statistical model. In Lusher, D., Koskinen, J., Robins, G., eds.:
    Exponential random graph models for social networks: Theory, methods, and
    applications. Cambridge University Press (2013) 29–36
27. Desmarais, B., Cranmer, S.: Statistical mechanics of networks: Estimation and
    uncertainty. Physica A: Statistical Mechanics and its Applications 391(4) (2012)
    1865–1876
28. Balland, P.A., Belso-Martínez, J.A., Morrison, A.: The dynamics of technical
    and business knowledge networks in industrial clusters: Embeddedness, status, or
    proximity? Economic Geography 92(1) (2016) 35–60
29. Handcock, M.S., Hunter, D.R., Butts, C.T., Goodreau, S.M., Krivitsky, P.N., Mor-
    ris, M.: ergm: Fit, Simulate and Diagnose Exponential-Family Models for Net-
    works. The Statnet Project (http://www.statnet.org). (2018) R package version
    3.9.4.
30. Hunter, D.R.: Curved exponential family models for social networks. Social net-
    works 29(2) (2007) 216–230
31. Levy, M.: ERGM tutorial. http://michaellevy.name/blog/ERGM-tutorial/ (2016)
    accessed on 2018-09-22.
32. Lybbert, T.J., Zolas, N.J.: Getting patents and economic data to speak to each
    other: an ‘algorithmic links with probabilities’ approach for joint analyses of patent-
    ing and economic activity. Research Policy 43(3) (2014) 530–542
33. Cranmer, S.J., Desmarais, B.A.: Inferential network analysis with exponential
    random graph models. Political Analysis 19(1) (2011) 66–86
34. Cranmer, S.J., Leifeld, P., McClurg, S.D., Rolfe, M.: Navigating the range of
    statistical tools for inferential network analysis. American Journal of Political
    Science 61(1) (2017) 237–251
35. Morris, M., Handcock, M.S., Butts, C.T., Hunter, D.R., Goodreau, S.M., de Moll,
    S.B., Krivitsky, P.N.: Exponential random graph models (ERGMs) using statnet
    tutorial (2016) accessed on 2019-01-03.
36. Goodreau, S.M., Kitts, J.A., Morris, M.: Birds of a feather, or friend of a friend?
    using exponential random graph models to investigate adolescent social networks.
    Demography 46(1) (2009) 103–125
37. Morris, M., Goodreau, S.M., Jenness, S.M.: Network modeling for epidemics
    (2018) accessed on 2019-01-03.


Appendix A            Model results

Appendix B            Visualisation
Fig. 3 shows an example of firms in shared directors business networks using
information from MorningStar DatAnalysis Premium. The firms are classified
into different industries using the Global Industry Classification Standard
(GICS). GICS Classification provides details of the classification. Each
node represents a firm which shares a director with at least one other firm.
The colour of each node represents its industry classification.
Table 2: Model comparison

Dependent variable:
                      Networks before GFC, 2003-2006         Networks during GFC, 2007-2009         Networks after GFC, 2010-2013
                      PAT and ASX  PAT and TMK  TMK and ASX  PAT and ASX  PAT and TMK  TMK and ASX  PAT and ASX  PAT and TMK  TMK and ASX
edges                 −4.194∗∗∗    −10.546∗∗∗   −6.181∗∗∗    −4.571∗∗∗    −11.191∗∗∗   −6.214∗∗∗    −5.014∗∗∗    −11.741∗∗∗   −6.290∗∗∗
                      (0.571)      (0.329)      (0.497)      (0.571)      (0.287)      (0.414)      (0.518)      (0.257)      (0.317)
triangles             0.939∗∗∗     4.469∗∗∗     1.570∗∗∗     1.017∗∗∗     5.811∗∗∗     1.539∗∗∗     0.983∗∗∗     8.501∗∗∗     1.882∗∗∗
                      (0.279)      (0.612)      (0.228)      (0.234)      (0.387)      (0.186)      (0.206)      (0.503)      (0.193)
Products              0.022        0.011        0.029∗∗∗     0.028        0.032∗∗∗     0.030∗∗∗     0.035∗       0.051∗∗∗     0.033∗∗∗
                      (0.026)      (0.013)      (0.008)      (0.020)      (0.009)      (0.007)      (0.018)      (0.011)      (0.006)
Industry              −0.076                    0.754∗∗∗     0.286                     0.718∗∗∗     0.406                     0.574∗∗∗
                      (0.270)                   (0.241)      (0.284)                   (0.203)      (0.264)                   (0.154)
Largefirm             0.899∗∗      2.025∗∗∗     0.278        0.371        1.536∗∗∗     0.191        0.569∗∗      2.038∗∗∗     0.088
                      (0.406)      (0.438)      (0.192)      (0.301)      (0.399)      (0.169)      (0.275)      (0.331)      (0.140)
SME                   1.162∗∗∗     −1.091∗∗     0.553        0.541∗       −1.092∗∗∗    0.183        0.532∗       −1.206∗∗∗    −0.088
                      (0.408)      (0.432)      (0.568)      (0.321)      (0.340)      (0.460)      (0.306)      (0.302)      (0.463)
Akaike Inf. Crit.     221.305      1,151.229    712.345      345.917      3,167.493    1,048.500    441.311      4,756.693    1,259.173
Bayesian Inf. Crit.   247.979      1,210.979    746.731      375.303      3,233.346    1,086.537    473.130      4,827.291    1,298.919

Note: standard errors in parentheses. ∗ p<0.1; ∗∗ p<0.05; ∗∗∗ p<0.01

Fig. 3: Firm networks through shared directors
[Figure: network graph with nodes coloured by industry. Legend labels as
extracted: Materials, Real, Energy, Media, Technology, Health,
Pharmaceuticals, Food, Software, Utilities, Retailing, Banks, Commercial,
Consumer, Capital, Insurance, Diversified, Transportation, Automobiles,
Semiconductors]

    Fig. 4 in the Appendix shows the innovation hotspots across Australia. We
combine the available latitude and longitude coordinates with the product type
information from the 2017 Intellectual Property Government Open Data. Each
node represents the location of an applicant. Latitude and longitude
information is missing for some firms. However, these firms have postcode
information. We use a combination of information from the Post Office, the
Google Maps API and a research dataset to impute latitude and longitude for
these firms. Using postcodes is an improvement over using the capital cities
to impute missing latitude and longitude information.
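The postcode-based imputation can be sketched as a simple lookup. The snippet below is an illustration under stated assumptions rather than the procedure actually used: the function name `impute_coordinates`, the record layout and the postcode centroid table are all hypothetical, and the coordinates are rough illustrative values.

```python
def impute_coordinates(applicants, postcode_centroids):
    """Fill missing (lat, lon) values from a postcode -> (lat, lon) lookup.

    Records that already have coordinates are left unchanged. Records whose
    postcode is not in the lookup keep their missing values.
    """
    completed = []
    for record in applicants:
        record = dict(record)  # copy so the input is not modified
        record["imputed"] = False
        if record.get("lat") is None or record.get("lon") is None:
            centroid = postcode_centroids.get(record.get("postcode"))
            if centroid is not None:
                record["lat"], record["lon"] = centroid
                record["imputed"] = True
        completed.append(record)
    return completed


# Hypothetical lookup table with rough illustrative coordinates.
postcode_centroids = {
    "2600": (-35.3, 149.1),
    "3000": (-37.8, 145.0),
}

# Hypothetical applicant records.
applicants = [
    {"name": "FirmA", "postcode": "2600", "lat": None, "lon": None},
    {"name": "FirmB", "postcode": "3000", "lat": -37.81, "lon": 144.96},
    {"name": "FirmC", "postcode": "9999", "lat": None, "lon": None},
]

result = impute_coordinates(applicants, postcode_centroids)
for row in result:
    print(row["name"], row["lat"], row["lon"], row["imputed"])
```

Keeping an `imputed` flag on each record makes it possible to check later whether the hotspots in the map are driven by observed or imputed locations.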
    The colour of each node represents the product type of the patent or
trademark application. Patent and trademark applications use different
classification schemes, which makes it difficult to compare innovations across
the two. [32] proposed an algorithmic links with probabilities (ALP) approach,
which analyses text descriptions to concord the International Patent
Classification (IPC) for patents with the Nice Agreement (NICE) Classification
for trademarks. We apply their approach to concord trademark applications to
the IPC.


Fig. 4: Innovation hot spots
[Figure: map of applicant locations. Horizontal axis: lon (110 to 160);
vertical axis: lat (−40 to 0); node colour: product_code (A to H).]
A = Human Necessities, B = Performing Operations, Transporting, C = Chemistry,
Metallurgy, D = Textiles and Paper, E = Fixed Constructions, F = Mechanical
Engineering, G = Physics, H = Electricity.

Appendix C          ERGMs assumptions and basic concept
Exponential random graph models (ERGMs) estimate parameters by maximising the
probability of the observed network over all networks with the same number of
vertices that could have been observed. The approach allows statistical
inference without independence assumptions because it allows for endogenous
dependencies arising from the network itself. The model also includes a set of
network statistics, which can incorporate exogenous variables describing the
characteristics of vertices or edges [33,34].
    We follow [35] and specify the general form for an ERGM as:

    \[ P(Y = y) = \frac{1}{k(\theta)} \exp\left[\theta^{\top} g(y, X)\right], \tag{2} \]

    where Y is the random variable for the state of the network, y is the
observed network and g(y, X) is the vector of network statistics. We use X to
denote the observed firm characteristics. The symbol θ represents the unknown
parameters of interest and determines the effects of the network statistics.
The symbol k(θ) is the normalising constant: it is the quantity in the
numerator summed over all possible networks (typically constrained to be all
networks with the same node set as y). Formula (2) can be re-expressed in
terms of the conditional log-odds of a single tie between two actors as

    \[ \operatorname{logit}\left[\Pr(Y_{ij} = 1 \mid y_{ij}^{c})\right] = \theta^{\top} \delta(y_{ij}), \tag{3} \]

    where Y_ij is the random variable for the state of the firm pair i, j
(with realisation y_ij). We use y_ij^c to denote the complement of y_ij, i.e.
all connections in the network except y_ij. The vector δ(y_ij) contains the
change statistic for each model term. The change statistic records how the
corresponding term of g(y, X) changes when y_ij is toggled from 0 to 1 [36].
So

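    \begin{align}
    \operatorname{logit}\left[\Pr(Y_{ij} = 1 \mid y_{ij}^{c})\right]
      &= \log\left[\frac{\Pr(Y_{ij} = 1 \mid y_{ij}^{c})}{\Pr(Y_{ij} = 0 \mid y_{ij}^{c})}\right] \tag{4} \\
      &= \theta^{\top} \delta(y_{ij}) \tag{5}
    \end{align}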

    This means that each coefficient in θ is interpreted as the change in the
log-odds of an individual tie, conditional on all other ties, for a unit
change in the corresponding statistic [37]. This is the major departure from
the logit or probit model. The conditioning is necessary because Pr(Y_ij)
depends on the outcomes of all other dyads [23].
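The relationship between formula (2) and the conditional log-odds form can be checked numerically on a very small network. The Python sketch below is illustrative only: it assumes a four-node undirected network, uses only edge and triangle counts as the statistics g(y) with no covariates X, and sets hypothetical values for θ rather than the estimates in Table 2. The normalising constant k(θ) is computed by brute-force enumeration, which is feasible only for toy networks.

```python
import math
from itertools import combinations, product

N = 4  # number of nodes; small enough to enumerate all 2**6 = 64 networks
DYADS = list(combinations(range(N), 2))


def stats(edge_set):
    """Return g(y) = (number of edges, number of triangles)."""
    n_triangles = sum(
        1
        for a, b, c in combinations(range(N), 3)
        if {(a, b), (a, c), (b, c)} <= edge_set
    )
    return (len(edge_set), n_triangles)


def weight(edge_set, theta):
    """Unnormalised weight exp(theta' g(y)), the numerator of formula (2)."""
    return math.exp(sum(t * g for t, g in zip(theta, stats(edge_set))))


def all_networks():
    """Yield every undirected network on the N nodes as a frozenset of dyads."""
    for bits in product([0, 1], repeat=len(DYADS)):
        yield frozenset(d for d, bit in zip(DYADS, bits) if bit)


def probability(edge_set, theta):
    """P(Y = y) from formula (2), with k(theta) summed over all networks."""
    k = sum(weight(net, theta) for net in all_networks())
    return weight(edge_set, theta) / k


def change_statistic(edge_set, dyad):
    """Change in g(y) when `dyad` is toggled from 0 to 1, all else fixed."""
    with_tie = stats(edge_set | {dyad})
    without_tie = stats(edge_set - {dyad})
    return tuple(a - b for a, b in zip(with_tie, without_tie))


theta = (-1.0, 0.5)              # hypothetical (edges, triangles) coefficients
y = frozenset({(0, 1), (1, 2)})  # rest of the network
dyad = (0, 2)                    # tie under consideration

# Conditional log-odds computed directly from the joint distribution (2).
p_tie = probability(y | {dyad}, theta)
p_no_tie = probability(y - {dyad}, theta)
log_odds_joint = math.log(p_tie / p_no_tie)

# Conditional log-odds computed from the change statistics, as in (5).
delta = change_statistic(y, dyad)
log_odds_change = sum(t * d for t, d in zip(theta, delta))

print(delta, log_odds_joint, log_odds_change)
```

Adding the tie between nodes 0 and 2 creates one extra edge and closes one triangle, so both change statistics equal 1 and the two log-odds calculations agree. The normalising constant cancels in the ratio, which is why the conditional form does not require k(θ).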