<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Modeling Agents, Roles, and Positions in Machine Learning Project Organizations</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Rohith Sothilingam</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Eric Yu</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Department of Computer Science, University of Toronto</institution>
          ,
          <addr-line>Toronto</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Faculty of Information, University of Toronto</institution>
          ,
          <addr-line>Toronto</addr-line>
          ,
          <country country="CA">Canada</country>
        </aff>
      </contrib-group>
      <fpage>61</fpage>
      <lpage>66</lpage>
      <abstract>
        <p>As Machine Learning (ML) continues its emergence across numerous industries, software teams and organizations face new challenges beyond those found in conventional software projects. The design of data science teams in ML software projects can vary substantially based on the organization's maturity, personnel availability, and their relationship with customers. In an empirical case study of three ML software project organizations, we examined variations in project team designs using i* models. We consider the usefulness of the concepts of Agents, Roles, and Positions defined in the original i* framework to support the analysis of complex organizational relationships. We illustrate how the Position concept helps distinguish the different ways in which each ML software project organizes its team to meet specific needs.</p>
      </abstract>
      <kwd-group>
        <kwd>Organization modeling</kwd>
        <kwd>data science</kwd>
        <kwd>project organization</kwd>
        <kwd>roles</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>
        Most ML projects and organizations are at the early stages of the maturity curve,
compared to other types of software projects [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ]. The software development lifecycle
for ML resembles agile software engineering processes: workflows are cyclical and
involve collaboration among people with diverse backgrounds and
skillsets [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ].
      </p>
      <p>
        Organizations assign sets of related responsibilities to people based on their
capabilities and project needs. This is known as role mapping. There are many ways to allocate
roles. To allocate roles effectively, roles should be assigned to people with skillsets
matching those required of the role. Without accurate role mapping, business outcomes
will be hindered due to a mismatch between project needs and the qualifications and
abilities of team members [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ].
      </p>
      <p>
        Unlike in more mature areas of software development, roles in ML projects are still
evolving and ill-defined. Furthermore, there is a shortage of trained personnel available
to ML software organizations [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. As revealed in recent empirical literature, technical
ML engineering roles are not enough to cover the diverse types of expertise needed in
the ML lifecycle [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. Organizations often improvise in ML projects to allocate
responsibilities to available personnel who do not entirely have the right types of expertise
[
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Copyright © 2020 for this paper by its authors. Use permitted under
Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>In an empirical case study, we identified a number of key issues regarding team
design at three ML project organizations. In this paper, we consider how i* modeling
might help analyze such issues. For example:</p>
      <list list-type="bullet">
        <list-item>
          <p>Can goal evaluation in an i* model help identify the underlying factors behind
unsuccessful ML projects?</p>
        </list-item>
        <list-item>
          <p>Can people issues such as dissatisfaction among team members be analyzed
and addressed with the help of i* modeling?</p>
        </list-item>
        <list-item>
          <p>Is the i* concept of Position useful for designing ML project teams in which
team members cover multiple roles?</p>
        </list-item>
      </list>
    </sec>
    <sec id="sec-2">
      <title>Agents, Roles, and Positions in i*</title>
      <p>
        The concepts of Agent, Role, and Position were introduced in i* for modeling
complex organizational relationships [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ]. A Role is an abstract characterization of a social
actor. An Agent, which can play (one or more) Role(s), represents a physical entity,
such as a person. The Position concept mediates between Agents and Roles so as to
provide an abstraction for a bundle of roles that is typically allocated to a single Agent.
The Agent is said to occupy the Position, while the Position covers the set of Roles.
These terms are capitalized in this paper to refer to the i* concepts, to be distinguished
from their general English usage. In everyday usage, it is common to speak of a person
being hired into a role, such as a Project Manager or a Data Scientist. In i*, these would
be treated as Positions, if each of them encompasses multiple Roles, such as Assigning
Tasks, Monitoring Progress, and Evaluating Performance [
        <xref ref-type="bibr" rid="ref8">8</xref>
        ].
      </p>
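      <p>The relationships among the three concepts can be sketched as a small data model. The following is a minimal illustration, not part of the i* framework or of any i* tool: the class names, the <monospace>all_roles</monospace> helper, and the Agent named "Alice" are assumptions introduced here. The sketch also assumes that an Agent occupying a Position is expected to play every Role that the Position covers.</p>
      <preformat>
```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Role:
    """Abstract characterization of a social actor."""
    name: str


@dataclass
class Position:
    """A bundle of Roles typically allocated to a single Agent."""
    name: str
    covers: set[Role] = field(default_factory=set)


@dataclass
class Agent:
    """A physical entity, such as a person."""
    name: str
    occupies: list[Position] = field(default_factory=list)
    plays: set[Role] = field(default_factory=set)  # Roles played directly

    def all_roles(self) -> set[Role]:
        """Roles played directly plus Roles covered by occupied Positions."""
        roles = set(self.plays)
        for position in self.occupies:
            roles |= position.covers
        return roles


# The Project Manager example from the text, treated as a Position.
project_manager = Position(
    "Project Manager",
    {
        Role("Assigning Tasks"),
        Role("Monitoring Progress"),
        Role("Evaluating Performance"),
    },
)

# "Alice" is a made-up Agent used only for illustration.
alice = Agent("Alice", occupies=[project_manager])
role_names = sorted(role.name for role in alice.all_roles())
print(role_names)
```
</preformat>
      <p>Keeping the covers link on the Position and the occupies link on the Agent means that a change to the bundle of Roles is made in one place and applies to every Agent occupying that Position.</p>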
      <p>
        In ML projects, job roles often require a diversity of skillsets and knowledge areas
[
        <xref ref-type="bibr" rid="ref4">4</xref>
        ] [
        <xref ref-type="bibr" rid="ref6">6</xref>
        ], including application domain knowledge [
        <xref ref-type="bibr" rid="ref7">7</xref>
        ]. Practitioners are required to
occupy multiple Roles, which involve different skillsets and expertise [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ]. An Agent hired
to occupy a Position should possess the competencies and skills required to fulfill all
the Roles covered by the Position. Previous work has noted how job titles at different
organizations can vary based on responsibilities and expertise required [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ] [
        <xref ref-type="bibr" rid="ref9">9</xref>
        ].
      </p>
      <p>Model-based analysis of ML project team design</p>
      <p>We conducted an empirical study of three ML project
organizations. The organizations differ in size, level of maturity in ML projects, and the types
of products and services offered. Organization A is a large international financial
organization which develops e-commerce payment systems globally. Organization B
builds advanced ML systems for customers, drawing on research in deep learning.
Organization C builds AI systems to help organizations screen candidates for hiring.</p>
      <p>To compare the team structure design of these organizations, we use a simplified i*
model showing Agents, Roles, and Positions and how they are associated with each
other through plays, occupies, and covers links, while omitting strategic dependencies
and rationales (Fig. 1). We arrived at the configuration of i* Agents, Roles, and
Positions at each organization from interview data. We assigned each job title to be an i*
Position. We named i* Roles for each set of responsibilities expected for each Position.
</p>
      <p>[Fig. 1: Simplified i* models of Agents, Roles, and Positions at Organizations A, B, and C. Labels recoverable from the figure: Principal Data Scientist, Research Scientist, ML Engineer, Experienced ML Engineer, Psychologist; Business analysis, Requirements gathering, Project management, Model design, Model development.]</p>
      <p>We note that there is much commonality among the Roles found in the three
organizations: End user, Business analysis, Project management, Model development, Model
experimentation, Model design, Model testing, and Model deployment. However, there is
considerable variance in how the Roles are grouped into Positions. In Organization A, the Roles
of Model Design, Model Development, and Model Deployment are covered by different
Positions and are played by different classes of Agents. In Organization B, the first two
Roles are covered by the same Position (Research Scientist), whereas in Organization C, all three
Roles are covered by the ML Engineer Position, which is occupied by an ML Engineer
Agent, i.e., someone with the competencies and skills of an ML engineer.</p>
      <p>In Organization B, the Research Scientist Position, occupied by an ML Engineer Agent,
covers several technical Roles, such as Model Deployment. At Organization
C, the Business Analysis Role is covered by an I-O Psychologist Position occupied by a
Psychologist Agent. This analysis shows that Role assignment can differ substantially
based on how Positions are defined in each organization. The design of the Positions
may be constrained by the skills and competencies already available and the kinds of
talent the organization can attract, recruit, and retain.</p>
      <p>The expertise required by the Roles that a Position covers determines what
expertise is required of an Agent occupying the Position. In Fig. 1, we can see that
Organization C has introduced a Psychologist Agent who is a specialization of Business
Domain Expert. Using modeling, Organization C can evaluate its project team design
and analyze how well this Agent can satisfy the I-O Psychologist Position based on the
Roles it covers. The Psychologist Agent is a business domain expert and therefore has
the right expertise to perform the Business Analysis Role.</p>
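      <p>One way to make this check concrete is to compare the expertise demanded by a Position's Roles with the expertise an Agent holds. The sketch below is illustrative only and is not a feature of i*: the <monospace>ROLE_EXPERTISE</monospace> table and both helper functions are assumptions introduced here, and "ML engineering expertise" is a placeholder label. Only the business domain expertise required by the Business Analysis Role comes from the case described above.</p>
      <preformat>
```python
# Expertise each Role requires. Only "business domain expertise" comes from
# the text; "ML engineering expertise" is an illustrative placeholder.
ROLE_EXPERTISE = {
    "Business Analysis": {"business domain expertise"},
    "Model Development": {"ML engineering expertise"},
}


def required_expertise(covered_roles):
    """Union of the expertise required by every Role a Position covers."""
    required = set()
    for role in covered_roles:
        required |= ROLE_EXPERTISE[role]
    return required


def missing_expertise(covered_roles, agent_expertise):
    """Expertise the Position demands that the occupying Agent lacks."""
    return required_expertise(covered_roles) - set(agent_expertise)


# The I-O Psychologist Position covers Business Analysis and is occupied
# by a Psychologist Agent who is a business domain expert.
gap_psychologist = missing_expertise(
    ["Business Analysis"],
    {"business domain expertise"},
)

# Hypothetical contrast: the same Role assigned to an Agent who holds
# only ML engineering expertise.
gap_ml_only = missing_expertise(
    ["Business Analysis"],
    {"ML engineering expertise"},
)

print(gap_psychologist)
print(gap_ml_only)
```
</preformat>
      <p>An empty result means the Agent holds all the expertise that the Position's Roles require. A non-empty result names the specific gap to address, whether by training, by hiring, or by moving the Role to another Position.</p>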
      <p>The validity of the empirical findings is limited by our interview data, which were
obtained from one individual at each organization: the Principal Data Scientist,
the CTO, and the founder and CTO at Organizations A, B, and C, respectively.</p>
    </sec>
    <sec id="sec-3">
      <title>Analyzing project team design</title>
      <p>In this section, we use i* Strategic Rationale modeling and goal evaluation to analyze
a past issue in Organization C (Fig. 2) and how it was subsequently addressed (Fig. 3).
The detailed i* modeling allows us to identify the specific factors behind
the dissatisfaction of this organization's customers in its early history. We draw
tasks typically seen in ML projects with slightly heavier borders.
This convention shows how typical ML activities appear in i* models and helps us
analyze their dependency relationships.</p>
      <p>In the past (Fig. 2), the Business Analysis Role was assigned to the ML Engineer
Position, which was occupied by an ML Engineer Agent who lacked the Business domain
expertise (Resource element) required for the Business Analysis Role. Using goal
evaluation, we can see that the End user Role’s goal of Business objectives satisfied could not be
fully achieved due to the insufficient Business domain expertise of the ML Engineer Agent.</p>
      <p>[Fig. 2 and Fig. 3: i* Strategic Rationale models of Organization C before and after the introduction of the I-O Psychologist Position. Labels recoverable from the figures: Customer, End User, Business Analysis, Model Development, ML Engineer.]</p>
      <p>Using a detailed i* model, we can trace the problem to inadequate role
mapping by following the two paths of propagation (elements highlighted by red
circles) that start from the partially denied Resource element Business domain expertise. The
End User Role must satisfy the goal Business objectives satisfied, which has goal
dependency relationships with the Business Analysis and Model Deployment Roles. Along the first path,
the End User Role’s goal depends on the soft-goal Successful business goals, which
is partially denied because the Business domain expertise resource is not satisfied. Along
the other path, the Model Development Role’s task of Train model is only partially satisfied
because it depends on the Business Analysis Role’s Resource element Business
domain expertise. As a result, the End User Role’s goal of Business objectives
satisfied is only partially satisfied because it depends on the goal of Accurate application,
which is only partially satisfied.</p>
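      <p>The path-following step can be sketched as a search over a directed graph whose edges point from an element to the elements that depend on it. This is a simplified illustration, not the i* goal evaluation procedure: it enumerates only which elements a root cause can reach and does not compute satisfaction labels. The <monospace>AFFECTS</monospace> table and the <monospace>propagation_paths</monospace> function are assumptions introduced here, and the edge from Train model to Accurate application is inferred from the description above, not stated in it.</p>
      <preformat>
```python
# Edges point from an element to the elements that depend on it.
AFFECTS = {
    "Business domain expertise": ["Successful business goals", "Train model"],
    "Successful business goals": ["Business objectives satisfied"],
    "Train model": ["Accurate application"],  # assumed link, see lead-in
    "Accurate application": ["Business objectives satisfied"],
}


def propagation_paths(source, target, path=()):
    """Yield every acyclic path from source to target along AFFECTS edges."""
    path = path + (source,)
    if source == target:
        yield path
        return
    for dependent in AFFECTS.get(source, []):
        if dependent not in path:  # guard against cycles
            yield from propagation_paths(dependent, target, path)


paths = list(
    propagation_paths("Business domain expertise", "Business objectives satisfied")
)
for p in paths:
    print(" -> ".join(p))
```
</preformat>
      <p>With these edges the search returns two paths, one through Successful business goals and one through Train model and Accurate application, matching the two paths of propagation described above.</p>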
      <p>To address this challenge, Organization C introduced the I-O Psychologist Position
(Fig. 3). The I-O (industrial-organizational) Psychologist Agent occupying this Position
is a Business domain expert who has the Business domain expertise (Resource element).
Through the same paths of goal dependencies, the End User Role’s goal of Business
objectives satisfied is now satisfied. The organization was able to redesign its team
to better satisfy the goals of the Customer.</p>
    </sec>
    <sec id="sec-4">
      <title>Conclusions and future work</title>
      <p>In this paper, we used modeling to show how ML or data science teams can vary
substantially in their team design. We demonstrated the use of the i* concept of
Position, not included in the iStar 2.0 Core, for modeling complex organizational
relationships, and as a step toward addressing the challenge of mapping roles in ML projects
to the right people based on expertise. Using modeling, we were able to identify the
lack of business domain expertise as the underlying factor behind the customer
satisfaction challenges that an ML project organization was facing. By
identifying specifically where failure occurs, organizations can detect problems early
and diagnose challenges in team design in greater detail. In future work,
we plan to consider expertise and domain knowledge of Agents in an i* extension to
help improve the analysis of Role mapping.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Akkiraju</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Sinha</surname>
            ,
            <given-names>V.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Xu</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          , et al. (
          <year>2018</year>
          ).
          <article-title>Characterizing machine learning process: A maturity framework</article-title>
          . arXiv preprint arXiv:1811.04871.
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Amershi</surname>
            ,
            <given-names>S.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Cakmak</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Knox</surname>
            ,
            <given-names>W. B.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Kulesza</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          (
          <year>2014</year>
          ).
          <article-title>Power to the people: The role of humans in interactive machine learning</article-title>
          .
          <source>AI Magazine</source>
          ,
          <volume>35</volume>
          (
          <issue>4</issue>
          ),
          <fpage>105</fpage>
          -
          <lpage>120</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>De Mauro</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Greco</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Grimaldi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Ritala</surname>
            ,
            <given-names>P.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Human resources for Big Data professions: A systematic classification of job roles and required skill sets</article-title>
          .
          <source>Information Processing &amp; Management</source>
          ,
          <volume>54</volume>
          (
          <issue>5</issue>
          ),
          <fpage>807</fpage>
          -
          <lpage>817</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>de Sá Sousa</surname>
            ,
            <given-names>H. P.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>do Prado Leite</surname>
            ,
            <given-names>J. C. S.</given-names>
          </string-name>
          (
          <year>2017</year>
          ).
          <article-title>Requirement patterns for organizational modeling</article-title>
          .
          <source>In 2017 IEEE 25th International Requirements Engineering Conference Workshops (REW)</source>
          (pp.
          <fpage>252</fpage>
          -
          <lpage>259</lpage>
          ). IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Fazel-Zarandi</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Fox</surname>
            ,
            <given-names>M. S.</given-names>
          </string-name>
          (
          <year>2012</year>
          ).
          <article-title>An Ontology for Skill and Competency Management</article-title>
          .
          <source>In Proceedings of the 4th Conf. on Formal Ontologies in Information Systems (FOIS)</source>
          (pp.
          <fpage>89</fpage>
          -
          <lpage>102</lpage>
          ).
        </mixed-citation>
      </ref>
      <ref id="ref6">
        <mixed-citation>
          6.
          <string-name>
            <surname>Linden</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Idoine</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Hare</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Brethenoux</surname>
            ,
            <given-names>E.</given-names>
          </string-name>
          (
          <year>2018</year>
          ).
          <article-title>Staffing Data Science Teams: Map Capabilities to Key Roles</article-title>
          .
          Retrieved October 15, 2019, from https://www.gartner.com/document/3888468?ref=TypeAheadSearch&amp;qid=d8a9c1b58ba a53d2ab0$q=Staffing Data Science Teams: Map Capabilities to Key Roles
          .
        </mixed-citation>
      </ref>
      <ref id="ref7">
        <mixed-citation>
          7.
          <string-name>
            <surname>Saltz</surname>
            ,
            <given-names>J. S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Grady</surname>
            ,
            <given-names>N. W.</given-names>
          </string-name>
          (
          <year>2017</year>
          , December).
          <article-title>The ambiguity of data science team roles and the need for a data science workforce framework</article-title>
          .
          <source>In 2017 IEEE International Conference on Big Data (Big Data)</source>
          (pp.
          <fpage>2355</fpage>
          -
          <lpage>2361</lpage>
          ). IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref8">
        <mixed-citation>
          8.
          <string-name>
            <surname>Yu</surname>
            ,
            <given-names>E. S.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Mylopoulos</surname>
            ,
            <given-names>J.</given-names>
          </string-name>
          (
          <year>1994</year>
          ).
          <article-title>Understanding “why” in software process modelling, analysis, and design</article-title>
          .
          <source>In Proceedings of 16th international conference on software engineering</source>
          (pp.
          <fpage>159</fpage>
          -
          <lpage>168</lpage>
          ). IEEE.
        </mixed-citation>
      </ref>
      <ref id="ref9">
        <mixed-citation>
          9.
          <string-name>
            <surname>Zhang</surname>
            ,
            <given-names>A. X.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Muller</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          , &amp;
          <string-name>
            <surname>Wang</surname>
            ,
            <given-names>D.</given-names>
          </string-name>
          (
          <year>2020</year>
          ).
          <article-title>How do Data Science Workers Collaborate? Roles, Workflows, and Tools</article-title>
          . arXiv preprint arXiv:2001.06684.
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>