Introduction

Supervisors for The Cloud Haskell Application Platform. distributed-process-supervisor. Accessed:

Analysing the hierarchical structure of Erlang applications

0 Bence Janos Szabo 1 ELTE Eotvos Lorand University Faculty of Informatics Budapest , Hungary

2016

201 6 06

Understanding and maintaining legacy software is a well-known problem. Static analysis tools aims to support developers in these tasks. We present a static analysis method for extracting the supervisor hierarchy of Erlang programs, which can support program comprehension and maintenance. Beside the algorithm we introduce two di erent views of the supervisor graph, a compact supervisor graph with only the essential information, and a more detailed full supervisor graph. We demonstrate the method on an example Erlang application and present the two di erent views of the supervisor hierarchy. The method has been implemented as an extension of the RefactorErl analyser framework.

Introduction

The paper is structured as follows. Section 2 describes the necessary background information for understanding the approach. Section 3 describes the formal extension of the used source code representation (SP G) and the algorithm of the analysis. Section 4 introduces two graphs/views with di erent granularity level. Section 5 demonstrates the presented algorithm and views on an example. Section 6 surveys the supervisor concept used in other programming languages. Finally, Section 7 summarises our results and presents our conclusions. 2

Background

2.1

Erlang supervisors Before discussing the method in details, it is necessary to survey the Erlang supervisor and the RefactorErl framework used as a basis for the analysis.

The supervisor behaviour is a library module that contains generic code for process monitoring and managing. The speci c code is implemented in the callback module. The only mandatory function in the callback module is the function init/1. This function de nes the restart strategy, the maximum restart intensity and the child speci cations of the supervisor process. The child speci cation describes the identi er of the child, the function to start the process, the restart option, the timeout of the shutdown, the type of child process (worker or supervisor) and its callback module.

The supervisor process can be started with the supervisor:start link/2,3 functions. These functions require the name of the supervisor (optional), the callback module and startup arguments. When the function is applied the init/1 function is invoked from the provided callback module with the speci ed arguments.

We must note here that besides the static initialising arguments, there are several supervisor manipulating functions such as adding, restarting, deleting a child process, or querying and checking child speci cations, etc in runtime. 2.2

RefactorErl RefactorErl stores the analysed data in a special directed graph called semantic program graph (SP G). This graph includes all the information that can be statically extracted from the source code with di erent analyses. The SP G consists of three layers. The rst is the lexical layer, the second is the syntactic layer and the third is the semantic layer.

The semantic layer is constructed by several analyses which can be divided into two groups. The rst is the group of pre analyses which are executed when an Erlang source le is being loaded into RefactorErl, and the second is the group of post analyses which need to be run manually after the pre analysis phase.

The SP G is a directed graph that can be described as an ordered hextuple:

SP G = (V; AN ; AV ; l; E; e) The components of the hextuple are the following:

V = (Vlex [ Vsyn [ Vsem [ Vr): The nodes of the SP G are the union of the lexical, syntactic and semantic nodes, extended with a special singleton set of the root node.

AN : The set of attribute names of the nodes.

AV : The set of possible attribute values. l : V

AN ! AV : The node labelling partial function.

E: The set of edge labels. e : V

N ! V : The partial function that describes labelled, ordered edges between the graph nodes.

The SP G has several type of nodes with di erent attributes describing the properties of the nodes. The nodes of the SP G and their attributes that are important from the aspect of the supervisor analysis are as follows: Root: A special singleton set that includes the root node of the SP G. Symbol: Vr File: Set of syntactic nodes representing the les in the SP G. Symbol: Vfi

Vsyn

Expr: Set of syntactic nodes representing the expressions in the SP G. The expression nodes have a type attribute which describes the type of the expression. The supervisor analysis uses expression nodes with type application, which means that the expression is a function invocation. Symbol: Ve Vsyn Module: Set of semantic nodes representing an Erlang module in the SP G. The module nodes have a name attribute which describes the name of the module. Symbol: Vm Vsem Behaviour: Set of semantic nodes representing the behaviours in the SP G. A behaviour node is present in the SP G if the analysed module implements a behaviour. The type attribute of the node describes the name of the implemented behaviour. Symbol: Vb Vsem Func: Set of semantic nodes representing the functions in the SP G. The functions are identi ed by the triple of the name of the de ning module, and the name and arity of the function ((M,F,A)). Symbol: Vf Vsem 3

Supervisor analysis

The analysis de ned in this paper extends the SPG with the supervisor semantic node and new labelled links between semantic and syntactic nodes of the graph according to the semantics of the supervisor behaviour. 3.1

SP G extension We formally specify how do we extend the previously described set of semantic nodes with a set of Supervisor nodes (Vsup) and the directed edges between the nodes of the graph.

Attributes of the Supervisor node: name: The name of the supervisor. Type: atom() type: The type of the supervisor. Type: [supervisor,worker] registered name: This attribute contains the possible registered names of the supervisor. Type: supRegName() = [f[via], [Module :: atom()], [Name :: atom()]g], f[Scope :: atom()], [Name :: atom()]g. data: If the supervisor is a worker then this attribute contains every information about the worker. Type: supPropList() = [fchild id, [atom()]g, frestart, [atom()]g, fshutdown, [integer()]g, fmodules, [atom()]g]. strategy: The restart strategy of the supervisor.

Type: supStrategy() = [f[Strategy :: strategy()], [MaxTime :: integer()], [MaxRestart :: integer()]g] where the strategy() is the type of restart strategies (one for all, one for one, rest for one, simple one for one), MaxTime and MaxRestart describe the maximum intensity of restarts within the given time period. start args: Contains the last argument of the supervisor:start link/2,3 function calls which can start the actual supervisor. The last argument is passed to the init/1 callback function. Type: [term()] New edges from the Supervisor node: sup cb module : Vsup ! Vm : Points to the node of the callback module of the supervisor. sup start : Vsup ! Vf : Points to the function nodes that apply the functions supervisor:start link/2,3 in their de nition (potentially start the supervisor process). sup init : Vsup ! Vf : Points to the function node init/1 in the callback module. sup worker : Vsup ! Vsup : Points from a supervisor to its direct successor supervisor nodes (in the supervision tree). w def : Vsup ! Vf : Points from a worker supervisor node to the function node that de nes the worker. sup def : Vsup ! Vsup : Points from a supervisor to its de ning supervisor node. sup ref : Vsup ! Ve : Points to the function calls (expressions) that can modify or query the supervisor in runtime.

New edges to the Supervisor node: beh sup : Vb ! Vsup : This edge points from the behaviour node to the supervisor node if its type attribute is supervisor.

Formal extension of the SPG Let SP G0 denote the extended semantic program graph:

SP G0 = (V 0; A0N ; A0V ; l0; E0; e0) The set of semantic nodes (Vsem) is extended with the set of Supervisor nodes (Vsup):

Vs0em = Vsem [ Vsup

V 0 = V [ Vs0em The sets of attribute names (AN ) and values (AV ) are extended with the attributes and possible values of attributes of the Supervisor node:

AsNup = fname; type; registered name; data; strategy; start argsg Asup = fatom() [ [supervisorjworker] [ supRegN ame() [ supP ropList() [ supStrateg() [ [term()]g V

The partial function e is extended with the following cases for an arbitrary x 2 Vsup node and an arbitrary n 2 N

natural number: For an arbitrary z 2 Vb node: e0 : x e0 : x e0 : x sup cb module n ! y 2 Vm fsup start; sup init; w def g n ! y 2 Vf fsup worker; sup def g n ! y 2 Vsup e0 : x sup ref n ! y 2 Ve e0 : z beh sup

n ! y 2 Vsup 3.2

The algorithm The analysis is performed in two phases. The rst step is the pre analysis which is performed during the basic semantic analysis. The second step is the post analysis, which extracts all the statically available information about the supervisors. We separated our analysis according to the asynchronous, incremental analyser framework of RefactorErl. The rst phase of the analysis (pre analysis) is the syntax based node identi cation part that can be performed incrementally form by form asynchronously. In this phase we have only a partial view on the semantic layer. The second phase (post analysis) uses the information calculated by the rst phase of the supervisor analysis and other semantic analyses performed on the full SP G in an interfunctional way (such as the data- ow reaching). The pseudo code of the analysis is presented in Algorithm 1. In the pre analysis phase a new Supervisor semantic node is created and inserted into the SP G if the module implements the supervisor behaviour. That is, the module contains the -behaviour(supervisor) attribute. This module is the callback module of the behaviour. A new edge with label sup cb module is inserted between the Supervisor node and the module and the name attribute of the Supervisor is set to the name of the callback module. Finally, a new edge with label beh sup is inserted between the corresponding behaviour node and the Supervisor node.

The pre analysis phase did the initialisation, new Supervisor nodes have been created, initialised and inserted to the SP G. The post phase of the algorithm extracts the most possible information about the supervisors. The algorithm determines the starting arguments, workers, hierarchy, etc. The post phase creates new semantic nodes and extends the semantic information available in the SP G. The Algorithm 1 shows the pseudo code of the post analysis phase for a given Supervisor node.

First of all, the algorithm nds the init/1 callback function of the supervisor using the f ind init f un routine. Once the callback function node is found we insert an edge with label sup init between the examined Supervisor node and the function node.

The second step of the algorithm is to determine the functions that potentially start the supervisor process. The algorithm searches for supervisor:start link/2,3 function applications. For each function application we get its arguments with the get args routine, then we apply data- ow reaching by invoking the f low routine to determine the possible values of the arguments. The M odule variable contains the name of the callback module of the supervisor that is started with the examined application. We lter the result for the actual supervisor (Supervisor) and we update its registered name and args attributes. Finally, we create a link with label sup start between the Supervisor and the node of the function that applies the supervisor:start link/2,3 functions.

The third step of the algorithm is the analysis of restart strategies and child speci cations of the Supervisor. This information can be found in the return value of the init/1 callback function or the supervisor:start child/2 function. At rst, we examine the return points of the init/1 callback function. The return points of the function are queried by the return points routine. For each return point we apply data- ow reaching to determine the possible strategies and child speci cations. We update the attribute restart strategy with the newly found restart strategies (Strategies) and collect the child speci cations in the set GoodChildSpecs. Next, we need to get the child speci cations from the supervisor:start child/2 function applications. We query the function applications and for each function application we try to determine the values of its arguments with data- ow reaching. The gathered child speci cations are inserted to the GoodChildSpec set if the set of registered names of the examined Supervisor and the set SupRef N ames has a non empty section. Next we create a new worker Supervisor node using the create worker routine for every child speci cation from the GoodChildSpec and create a link with label sup worker between every newly created worker and the examined Supervisor. Finally, based on the type of the worker process we use a sup def or w def edge to create a link between the worker and its de nition.

The last step is to analyse the supervisor references. We examine the function from the supervisor module listed in variable F unsW ithArity. For each function we query the function applications, then with data- ow reaching we try to determine the referred supervisor from its arguments. If the function application refers the examined Supervisor, then a link with label sup ref is created between the function application and the Supervisor node. Algorithm 1 analyse supervisor(Supervisor) 1: InitF un f ind init f un(Supervisor) 2: create link(Supervisor; InitF un; sup init) 3: %% Analyse supervisor : start link=2; 3 f unctions %% 4: StartF unCalls get f un calls(supervisor; start link; f2; 3g) 5: for F unCall 2 StartF unCalls do 6: (SupN ame; M odule; Args) f low(get args(F unCall)) 7: if Supervisorname \ M odule 6= ? then 8: Supervisorregistered name Supervisorregistered names [ SupN ame 9: Supervisorargs Supervisorargs [ Args 10: create link(Supervisor; get f un(F unCall); sup start) 11: end if 12: end for 13: %% Analyse child specif ications %% 14: RetP oints return points(InitF un) 15: GoodChildSpecs ? 16: for RetP oint 2 RetP oints do 17: (ok; (Strategies; ChildSpecs)) f low(RetP oint) 18: Supervisorstrategy Supervisorstrategy [ Strategies 19: GoodChildSpecs GoodChildSpecs [ ChildSpecs 20: end for 21: StartChildF unCalls get f un calls(supervisor; start child; 2) 22: for F unCall 2 StartChildF unCalls do 23: (SupRef N ames; ChildSpecs) f low(get args(F unCall)) 24: if Supervisorregistered name \ SupRef N ames 6= ? then 25: GoodChildSpecs GoodChildSpecs [ ChildSpecs 26: end if 27: end for 28: for ChildSpec 2 GoodChildSpecs do 29: N ewW orker create worker(ChildSpec) 30: W orkerDef get worker def (ChildSpec) 31: create link(Supervisor; N ewW orker; sup worker) 32: if supervisor 2 N ewW orkertype then 33: create link(N ewW orker; W orkerDef; sup def ) 34: else if worker 2 N ewW orkertype then 35: create link(N ewW orker; W orkerDef; w def ) 36: end if 37: end for 38: %% Analyse supervisor ref erences %% 39: F unsW ithArity f(start child; 2); (terminate child; 2); (delete child; 2); 40: (restart child; 2); (which children; 1); (count children; 1)g 41: for (F un; Arity) 2 F unsW ithArity do 42: F unCalls get f un calls(supervisor; F un; Arity) 43: for F unCall 2 F unCalls do 44: (SupRef N ames; ) f low(get args(F unApp)) 45: if Supervisorregistered name \ SupRef N ames 6= ? then 46: create link(Supervisor; F unCall; sup ref ) 47: end if 48: end for 49: end for 4

The supervisor graph

In Section 3 we extended the SP G with additional semantic information about supervisors. It is possible to visualise the SP G, however it is too extensive and it is hard to nd relevant information about supervisors. In this section we de ne a supervisor graph which is a view of the SP G that contains information only about the supervisors.

The introduced graphs contain explicit information about the supervisor structure that can be used for code comprehension, debugging or checking the hierarchy of the software. 4.1

De nition of the supervisor graph Let the supervisor graph Gsup be an ordered pair. The rst component of the pair is the set of nodes of the supervisor graph. The second component of the pair is the set of directed edges represented as ordered triples. The set of nodes is the union of supervisor nodes, worker nodes, function nodes and expression nodes, formally:

Gsup = (V sup; Esup)

V sup = (Vssup [ Vwsup [ Vfsup [ Vesup) The set of edges is represented by ordered triples. The components of a triple are the head node, the tail node and the label of the edge:

Esup

V sup

LABELsup The set of edge labels of the supervisor graph is:

LABELsup = fstart; init; supervisor; worker; sup def; worker def; ref erenceg Nodes of the supervisor graph

The de nition of the supervisor node is the following:

Sup node = fname :: string(); strategy :: supStrategy(); registered name :: [term()];

start args = [term()]g The Sup nodes are representing the supervisors in the supervisor graph. The Sup node has the following attributes: { name: The name of the supervisor. This will be the caption of the node in the supervisor graph. { strategy: The restart strategy of the supervisor. { registered name: If the supervisor is registered, this attribute contains the type of the registration along with the registered name.

{ start args: The arguments which are passed to the init/1 callback function.

The de nition of the worker node is the following:

SupW orker node = fname :: string(); data :: supP ropList()g The SupW orker nodes are representing the workers in the supervisor graph. The SupW orker node has the following attributes: { name: The name of the worker. This name will be the caption of the node in the supervisor graph. { data: This attribute contains every information available about the worker, for example: the type of the worker.

The de nition of function node is the following:

SupF unction node = fname :: string()g The function nodes represent the callback functions in the supervisor graph. The SupF unction node has the following attributes: { name: The name of the function. This will be the caption of the node in the supervisor graph. Notations

Vsup:

Vsups Vsupw The de nition of the expression node is the following:

SupExpression node = fname :: string(); f unc :: fmodule(); f un(); arity()g; args : [term()]g The expression nodes represent the supervisor references in the supervisor graph. Supervisor references are function applications, which might modify or query the supervisor arguments during runtime. The SupExpression node has the following attributes: { name: The text of the expression node which will be the caption of the node in the supervisor graph. { func: A module, function and arity triple (fM,F,Ag) that identi es the function which is referred by the application expression.

{ args: The arguments of the function application.

In the previous section we introduced the Vsup set of Supervisor semantic nodes in the SP G. In this section we use the V sup notation for the set of supervisor nodes in the new supervisor graph. We de ne two subsets of set Vsup { set of supervisor nodes which type is supervisor. Formally: Vsup { set of supervisor nodes which type is worker. Formally:

Vsups = f x j x 2 Vsup; supervisor 2 l(x; type)g

Vsupw = f x j x 2 Vsup; worker 2 l(x; type)g Auxiliary functions We introduce some auxiliary functions to shorten the supervisor graph building rules: edges(x; y): The function takes two nodes from the SP G and returns the set of labels of directed edges with head x and tail y. updateSup(Sup; Sup0):

Sup 2 Vsups ^ Sup0 2 Vssup

Sup0[name = Supname; strategy = Supstrategy; registered name = Supregistered name; start args = Supstart args] The rule says that the Sup0 node (type of Sup node) shall have the same name, strategy, registered name and start arguments as the Sup Supervisor node from the SP G. updateSupW orker(Sup; W orker):

Sup 2 Vsupw ^ W orker 2 Vwsup

W orker[name = Supname; data = Supdata] The rule says that the W orker node (type of SupW orker node) shall have the same name and the same data as the Sup Supervisor node (which type is worker) from the SP G. updateSupF unction(F un; F un0): The function updates the name attribute of F un0 node (type of set SupF unction node) with the (M,F,A) of the Func node from the SP G. updateSupExpression(Expr; Expr0): The function updates the attributes of Expr0 node (type of SupExpression node). The name attribute shall be the string created from (M,F,A) of the function that contains the SP G expression Expr. The args attribute contains the arguments of the function application which information is gathered from. Finally, the func attribute contains the ordered triple fM,F,Ag of the function which the supervisor reference refer. E.g.: fsupervisor,start child,2g. We introduce two types of supervisor graphs. The full supervisor graph contains every available information about the supervisors and workers, like callback functions, supervisor references, worker de nitions, etc. The COMPACT supervisor graph only contains information about the hierarchy of the supervisors and workers. We use the COMPACT ag to distinguish the COMPACT and full supervisor graphs in the rules. Rules

1. Supervisor node In the followings we de ne the rules which are used to build the full and COMPACT supervisor graphs. If an x Supervisor node is present in the SP G then the x0 node which is synthesised from x using the updateSup function will be in the supervisor graph. This x0 will be a member of the Sup nodes of the supervisor graph. 2. Supervisor edge (COMPACT ) fx; y; zg

Vsups ^ sup worker 2 edges(x; y) ^ sup def 2 edges(y; z) ^ COM P ACT x0 2 Vssup ^ z0 2 Vssup ^ updateSup(x; x0) ^ updateSup(z; z0)^

(x0; z0; supervisor) 2 Esup Let the x, y and z be Supervisor nodes from SP G. If there is a sup worker edge between x and y and a sup def edge between y and z, then there will be a supervisor edge between the x0 and z0 synthesised Sup node nodes in the COMPACT supervisor graph. The not COMPACT (full) supervisor edge rule can be analogously de ned. 3. Worker edge 4. Supervisor init 5. Supervisor start

x 2 Vsups ^ y 2 Vsupw ^ sup worker 2 edges(x; y) x0 2 Vssup ^ y0 2 Vwsup ^ updateSup(x; x0) ^ updateSupW orker(y; y0)^

(x0; y0; worker) 2 Esup Let x and y be Supervisor nodes in the SP G and the type of y is worker. If there is a sup worker edge between x and y, then there will be a worker edge between the x0 and y0 synthesized Sup node nodes in the supervisor graph.

x 2 Vsups ^ f 2 Vf ^ sup init 2 edges(x; f ) ^ :COM P ACT x0 2 Vssup ^ f 0 2 Vfsup ^ updateSup(x; x0) ^ updateSupF unction(f; f 0)^

(x0; f 0; init) 2 Esup If a Supervisor node is connected to a function node with a sup init edge in the SP G, then there will be an init edge between the synthesised x0 and f 0 nodes in the supervisor graph (if the supervisor graph is not COMPACT ). This rule describes that the supervisor shall be connected to its init callback function.

x 2 Vsups ^ f 2 Vf ^ sup start 2 edges(x; f ) ^ :COM P ACT x0 2 Vssup ^ f 0 2 Vfsup ^ updateSup(x; x0) ^ updateSupF unction(f; f 0)^ (x0; f 0; start) 2 Esup If a Supervisor node is linked to a function node with a sup start edge in the SP G, then there will be an start edge between the synthesised x0 and f 0 nodes in the supervisor graph (if the supervisor graph is not COMPACT ). This rule describes that the supervisor shall be connected to the function that can start the supervisor with an application of the supervisor:start link/2,3 functions. 6. Supervisor reference If the x Supervisor node is linked to an expression node x with a sup ref edge in the SP G, then there will be a ref erence edge between the synthesised x0 and e0 nodes in the supervisor graph (if the supervisor graph is not COMPACT ). 5

Demonstrating Example

In this section we illustrate the algorithm of the analysis with an example implementation of a supervisor. First we introduce the example implementation, then we discuss the algorithm step by step focusing mostly on the post analysis phase of the SPG extension. Finally we present the supervisor graph and the compact supervisor graph. 5.1

Example Supervisor The example supervisor can be seen in Figure 1. This supervisor is a simpli ed model of a theatre. The theatre has a director, a technician, and a bandmaster. The bandmaster supervises the musicians as can be seen in Figure 2.

Pre analysis The pre analysis phase looks for "-behaviour(supervisor)." attribute in each module. In this case both theatre (in line 2) and bandmaster module (in line 2) has such attribute. Two new Supervisor nodes will be inserted into the SP G. One for theatre module and one for bandmaster module, and two sup cb module links will be inserted between the Supervisor nodes and the module nodes. The name attribute of the Supervisor nodes will be equal to the names of the module which they are connected to. Also two beh sup links will be inserted between the Behaviour nodes and the Supervisor nodes.

Post analysis We discuss only the analysis of the theatre supervisor since the analysis of the bandmaster can be executed analogously. The only di erence will be the amount of extracted information about the restart strategy and maximal restart intensity as the information is passed as a parameter to function init/1. This additional information is not always trivially available statically, but the data- ow analysis can calculate the possible values of variables statically.

The rst step of the post analysis is to nd the de nition of the function init/1 from theatre callback module in the SP G. It can be found in the line 10 of Figure 1. We insert a link sup init between the supervisor node and the function node.

The second step of the analysis is the examination of the supervisor:start link/2,3 function applications. We query the references of the function and we get two function applications, one application in the de nition of function theatre:start link/1 (in line 8 in Figure 1) and one other one in the de nition of function bandmaster:start link/1 (in line 8 in Figure 2). The next step is the ltering of the relevant applications. To do that we query the possible values of arguments for each application (using data ow analysis) and we look for the second argument which describes the started supervisor. We nd that the function application from the theatre module refers to the theatre supervisor because the ?MODULE macro is substituted to the name of the containing module. We update the attributes of the theatre supervisor node with the gathered information. The attributes registered name and args are updated with the possible values of the rst and the third arguments of the application. Finally we insert a link sup start between the node of the function theatre:start link/1 (in line 7 in Figure 1) and the node of the supervisor theatre.

The third step is the analysis of restart strategies and child speci cations. To nd the child speci cations we need the possible return values of the init/1 callback function. We query the possible return values and extract the information about the restart strategy and maximum intensity of restarts from the result. In our case it is the tuple fone for all, 3, 500g (in line 21 in Figure 1). We update the strategy attribute of the theatre node with this information. Next we try to extract the child speci cations from the return values using backward data- ow reaching. The Children variable (in line 11 in Figure 1) contains the child speci cations, and we need to determine the possible values of this variable using data- ow analysis. There are three di erent children of the theatre supervisor. For each children we create a new Supervisor node of type worker and update the data attributes with the information can be found about each supervisor. We link the theatre supervisor to the workers with sup worker edges. We insert worker def links between the workers and their function de nition. The name of the worker function will be the second component of the child speci cation.

For example, fdirector, start link, []g (in line 13 in Figure 1) means, that the children will be started

using the director:start link/0 function. We also examine the child speci cations found in the applications of supervisor:start child/2 function, but in our case there is no information about them.

In the nal step of the analysis we examine the supervisor references that potentially query or change the attributes of the theatre supervisor. There is a reference to our supervisor in the theatre:add crew/1 function (in line 23 in Figure 1) . We insert a link sup ref between the supervisor node an the reference.

The post analysis of the bandmaster supervisor can be analogously executed. The only important di erence is that the data ow analysis will have heavier impact on the data extracted, because the init/1 callback function of the bandmaster module have four clauses. Thus, the restart strategy and the maximal restart intensity can be given as parameters to the supervisor. For example the data- ow analysis will determine that the supervisor can have three di erent restart strategies (one for one, rest for one, one for all. 5.2

Results Using the rules de ned in Section 4 the COMPACT and full supervisor graphs can be constructed from the SP G after the supervisor analysis is completed. We show the COMPACT and full supervisor graphs for our example supervisor implementation presented in subsection 5.1.

COMPACT supervisor graph The Figure 3 shows the COMPACT supervisor graph for the theatre supervisor. The graph shows only the hierarchy of the supervisors and the workers. The green diamond nodes represent the supervisors and the blue oval nodes are the workers of the supervisors. The attributes of the nodes contains every information that can be extracted statically about the supervisors or the workers as it can be seen in the picture, we highlighted the attributes of three nodes. Full supervisor graph The Figure 4 shows the full supervisor graph for the theatre supervisor. We can see that the structure of this graph is more complex than in the case of the COMPACT supervisor graph. We can nd additional information in the graph like the functions of the callback module, the function nodes of the worker de nitions and the supervisor references. The orange hexagons represent the function nodes. The supervisor concept is present in other programming languages as well. In this section we examine some tools which are capable of analysing supervisors in Erlang and in other programming languages as well. Percept2 [ 9 ] is a pro ler tool for Erlang programs with focus on processes and process communication. Percept2 is a dynamic analyser tool which analyses the program in runtime. In comparison with our results Percept2 can detect the hierarchy of the supervisors correctly. However it does not know the supervisor speci c information, for example restart strategy, child speci cations, etc. In some cases Percept2 can provide more accurate results, because it can detect new children added to supervisors dynamically. Our tool is a static code analyser, and can detect only those children which speci cation is present in the source code. However our tool can detect child speci cations that are not necessarily added to the supervisors in a certain runtime con gurations. In conclusion, the two tools have slightly di erent focus, as Precept2 is a dynamic and RefactorErl is a static analyser tool. The tools should be used as complementary tools.

Akka Akka [ 10 ] is a library for Scala and Java programs. It supports the implementation of distributed concurrent and fault tolerant software using the Actor model. The Actor model takes care of the thread handling and resource locking, thus the developer can focus on the business logic of the software. Supervisors are present in Akka and it is very similar to the supervisors in Erlang. The processes are represented as actors and the supervisor can monitor these actors. Supervisors can start, stop and restart the monitored processes, and also halt them. Akka supports two restart strategies only. Currently there is no static code analyser support for supervisors in aid of Akka. Several IDE-s can provide dynamic information, like call stacks and class diagrams which can be used to get some information about the supervisors, but it is not nearly as accurate as our results for Erlang supervisors. Cloud Haskell Cloud Haskell [ 11 ] is a library for Haskell programs. It supports the implementation of distributed and concurrent software mainly for clusters. In Cloud Haskell there is a package called distributed-process-supervisor [12] which supports supervisors very similar to the supervisors used in Erlang. There is no static code analyser tool that support supervisor analysis in Cloud Haskell. The SourceGraph [ 13 ] can be used to make function call graphs and detect module import hierarchy. From these graphs there are some information available about the supervisors, but our results for Erlang supervisors are more accurate and user friendly. 7

Conclusions

In this paper we discussed a new method for extracting the relevant information from the source code written in Erlang implementing the supervisor behaviour. For the analysis we used the static analysis tool RefactorErl and the information available in its Semantic Program Graph. RefactorErl performs lexical, syntactic and di erent semantic analyses on the source code and stores the extracted results in the SP G. We presented the algorithm of the new analysis with its pseudocode and explained the steps in detail. We introduced the formal extension of the SP G with the new nodes and the way the new semantic information is embedded in the SP G. To make the supervisor hierarchy more explicit we de ned a supervisor graph that is a view of the SP G that includes only information about supervisors. We de ned two levels of granularity for supervisor graphs, that is a full and a compact view. We discussed the steps and rules how these graphs are generated from the SP G. Finally we demonstrated our approach on an example and showed the generated graphs, both a full and a compact view.

The presented method was examined on di erent open source Erlang applications, and the results seemed to be useful in software comprehension.

[1] RefactorErl - Refactoring Erlang Programs. http://plc.inf.elte.hu/erlang, 2015 . Accessed: 2016 -06- 30.

[2]

Istvan

Bozo , Daniel Horpacsi, Zoltan Horvath, Robert Kitlei, Judit K}oszegi , Mate Tejfel, and Melinda Toth. RefactorErl, Source Code Analysis and Refactoring in Erlang . In Proceeding of the 12th Symposium on Programming Languages and Software Tools , Tallin, Estonia, 2011 .

[3]

Erlang

Programming Language . http://www.erlang.org. Accessed: 2016 -06-30.

[4]

Francesco

Cesarini and

Simon Thompson. Erlang

Programming. O'Reilly Media , 2009 .

[5]

Melinda

Toth and

Istvan

Bozo . Static Analysis of Complex Software Systems Implemented in Erlang. In Central European Functional Programming School , volume 7241 of Lecture Notes in Computer Science, pages 440 { 498 . Springer, 2012 .

[6]

Zoltan

Horvath , Laszlo Lovei, Tamas Kozsik, Robert Kitlei, Istvan Bozo, Melinda Toth, and

Roland

Kiraly . Modeling Semantic Knowledge in Erlang for Refactoring . In International Conference on Knowledge Engineering, Principles and Techniques, KEPT 2009 , Selected papers , pages 38 { 53 , 2009 .

[7]

Toth , I. Bozo , and

Horvath . Applying the Query Language to Support Program Comprehension . In Proceeding of International Scienti c Conference on Computer Science and Engineering, ISBN 978-80- 8086-164-3 , pages 52 { 59 , Stara

Lubovna

, Slovakia, September 2010 .

[8]

Martin

Logan , Eric Merritt, and Richard Carlsson. Erlang and OTP in Action. Manning Publications Co., Greenwich , CT , USA, 1st edition, 2010 .

[9]

Huiqing

Li and

Simon

Thompson . Multicore pro ling for erlang programs using percept2 . In Proceedings of the Twelfth ACM SIGPLAN Workshop on Erlang, Erlang '13 , pages 33 { 42 , New York, NY, USA, 2013 . ACM.

[10] Akka . http://akka.io. Accessed: 2016 -06-30.

[11]

Epstein ,

Andrew P.

Black , and Simon Peyton-Jones. Towards haskell in the cloud . In Proceedings of the 4th ACM Symposium on Haskell, Haskell '11 , pages 118 { 129 , New York, NY, USA, 2011 . ACM.

[13] SourceGraph: Static code analysis using graph-theoretic techniques . https://hackage.haskell.org/ package/SourceGraph. Accessed: 2016 -06-30.