<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Automatic source code reduction⋆</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Jiří Diviš</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Ondřej Bojar</string-name>
          <email>bojar@ufal.mff.cuni.cz</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Charles University, Faculty of Mathematics and Physics</institution>
          <addr-line>Malostranské nám. 25, Praha 1, CZ-118 00</addr-line>
          ,
          <country country="CZ">Czech Republic</country>
        </aff>
      </contrib-group>
      <fpage>9</fpage>
      <lpage>16</lpage>
      <abstract>
<p>The aim of this paper is to introduce Reductor, a program that automatically removes unused parts of the source code of valid programs written in the Mercury language. Reductor implements two main kinds of reductions: statical reduction and dynamical reduction. In the statical reduction, Reductor exploits the semantic analysis of the Melbourne Mercury Compiler to find routines which can be removed from the program. Dynamical reduction additionally uses the Mercury Deep Profiler and some sample input data for the program to remove unused contents of the program routines. Reductor modifies the sources of the program in a way which keeps the formatting of the original program source, so that the reduced code remains editable.</p>
        <p>The user chooses between the statical and the dynamical reduction, not both at the same time. There is also a trivial module reduction, which is done implicitly: it takes all modules from the current directory that are (transitively) imported by the main module of the program and copies them, without any change in their contents, into a specified destination directory. The statical reduction takes as input a Mercury program and removes some inaccessible routines from its sources; inaccessible routines are routines that the program would never call and that can be deleted because they do not appear in any definition of any of the remaining routines. The dynamical reduction removes parts of routines that were never used on several inputs to the program; this reduction is based on profiling data from runs of the program on some specified input. The deep profile is generated by the Mercury Deep Profiler.</p>
        <p>The dynamically reduced program can be compiled, but Reductor guarantees that the reduced program will generate the same output only for the inputs the deep profile was obtained on. On different inputs, the reduced program may terminate with an exception. Reductor tries to preserve the original formatting of the program as much as possible. Code segments corresponding to goals or routines that are to be removed are commented out using /*red: ... :red*/; any comments of the kind /* ... */ in such a segment are transformed into /nested:* ... *:nested/. If Reductor needs to add some code, the inserted term is delimited by two newlines and a comment indicating that the term was inserted by Reductor.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Mercury [1] is a fast logic and functional programming
language with advanced error detection features,
developed for writing large real-world programs. Its
syntax builds upon Prolog syntax (Prolog predicate
clauses), adding some new declarations. These declarations
are used for error checking and to speed up the
execution of a compiled program. Another great feature
of Mercury is that it can compile to C or Java and
thus it easily interfaces with foreign code. Motivations
behind the design of Mercury are very well summed
up in [2].</p>
      <p>Reductor [3] is mainly intended for programmers
who wish to understand and/or reuse code of a big
program or start an independent project based on just
a few features of an existing one. This is why Reductor
modifies the sources of the program in a way which
preserves the original formatting, so that the reduced
code is further editable. But there are also other uses,
like reduction of the size of executables, decreasing
the compilation time of reduced programs or releasing
only a subsection of a huge project.</p>
      <p>We suggest the reader gets acquainted with the basics
of the Mercury language by reading one of the following
summaries: Ralph Becket’s tutorial [4] or Seriál Mercury
on ROOT.CZ [5] (in Czech).</p>
    </sec>
    <sec id="sec-2">
      <title>Overview of Reductor</title>
      <p>⋆ This work has been supported by the grants EuroMatrixPlus (FP7-ICT-2007-3-231720 of the EU and 7E09003 of the Czech Republic) and MSM 0021620838.</p>
      <sec id="sec-2-1">
        <title>Components of Reductor</title>
        <sec id="sec-2-1-1">
          <title>Parts of Reductor</title>
          <p>Reductor consists of the following parts. Throughout, we assume that the output of the original program is determined solely by its input data.</p>
          <p>1. Intermodule Representation of the Program (IMR). This structure represents the entire Mercury program on various levels of representation (see Section 2.1). It is used by all three remaining parts of Reductor, and it is based on the results of the semantic analysis of the Melbourne Mercury Compiler (MMC), which we reuse and modify to suit our needs.</p>
          <p>2. I/O Interface. (Section 3) This part handles the representation of the changes in the program on the term level (see below) and offers an interface for making the changes proposed by the statical and dynamical reduction modules. It also outputs the reduced sources. This representation is designed for storing the source changes while preserving the original source and its formatting as much as possible.</p>
          <p>3. Dynamical Reduction. (Section 4) Using the data from IMR, this module dynamically reduces the program and submits the changes to the I/O interface. This module also extracts the data from the deep profile.</p>
          <p>4. Statical Reduction. (Section 5) Using the data from IMR, this module statically reduces the program and submits the changes to the I/O interface.</p>
          <p>2.1 Levels of representation of a module. Reductor utilizes the results of the semantic analysis of the Melbourne Mercury Compiler (MMC) [1]. We thus extracted the part of the MMC code responsible for the compilation up to the end of the semantic analysis and based Reductor on the extracted code.</p>
          <p>The mentioned part of the compilation of a module consists of the following four main stages. At these stages, the program is represented in certain data structures; we call these data structures the levels of representation of the program. A Mercury program is composed of declarations and clauses; we will use the term item for either of those.</p>
          <p>The first stage is lexical analysis, after which a module is represented as a list of lists of tokens. Each such list of tokens corresponds to an item. We call this representation the Token-Level Representation (ToLR).</p>
          <p>Each item in a program corresponds to what can be considered a standard ISO Prolog term for the purposes of this paper. The second stage consists of parsing the tokens into terms. After this stage a module is represented as a list of terms; we will refer to this as the Term-Level Representation (TeLR). (Some terms may be combined to form a single item.) The parts of a term that are themselves terms may be called subterms, and the term that corresponds to an item may sometimes be called the base term to avoid confusion.</p>
          <p>After the third stage, each term is converted into a syntax tree of the item; we will call this the Item-Level Representation (ILR) of a program. After this stage, each item is categorized based on the information extracted solely from its corresponding term(s), not from other terms of the module. The parse tree still contains most of the syntactic details of the item in the program.</p>
          <p>Finally, after the fourth stage, the data structure called by the designers of MMC the High Level Data Structure (HLDS) is constructed and filled with the results of the semantic analysis of a module. This includes the following: declarations from imported modules are added; predicates and their goals and subgoals are annotated with inferred determinisms and modes; variables in the scopes of each subgoal of each predicate are annotated with their modes and types; and goals are reduced to a certain equivalent subset of the original goals from the code, so a lot of syntactic detail irrelevant to the semantic analysis of the module source is lost. After this stage, still most of the items from the source have their corresponding structures in HLDS, but not vice versa (e.g. there are structures for predicates that were automatically generated). We call the described state of HLDS the High-level Representation (HLR).</p>
          <p>As a part of the fourth stage, based on the mode analysis of the program, a procedure is constructed for each mode of the predicate/function, because clauses have to be reordered for each mode independently. Also, in HLR, all clauses of a single procedure are combined into one goal. Similarly as for terms, we define the subgoal and the top-level goal (base goal). From now on, we will usually ignore the difference between functions and predicates, calling them simply predicates. Finally, we define a call site as a goal which can call a procedure (this includes goals that call lambda expressions).</p>
          <p>2.2 Inter-module representation. Considering the program representation, there is a big difference between the needs of Reductor and the needs of the Compiler in the way modules are handled. The Compiler compiles each module of the program independently, but Reductor's statical reduction needs to gather the semantic analysis for all modules at once and store it for later uses. This fact also has some implications on the memory requirements of Reductor.</p>
          <p>Our approach to combining MMC's data structures of each module into IMR is quite simple. We use map data types to map fully qualified module names to these structures. There are also a few structures which are combined in a more sophisticated way to improve efficiency, but these details are unnecessary for the purposes of this paper.</p>
          <p>From the beginning, it was clear that we would have to reuse some of the MMC source. The question was whether to use only the ILR or also the HLR of the program. We decided to use MMC's semantic analysis in order to achieve powerful enough reductions, especially in the case of the dynamical reduction. On the other hand, the use of HLR makes Reductor more dependent on MMC and makes it harder to understand, which is among other things caused by the fact that MMC is designed for other purposes than simplicity and reusability of the semantic analysis alone.</p>
          <p>As a part of IMR, there is also a structure that stores information about the module dependencies (imports) and the source files that they came from. This structure is extracted from the make-like part of MMC that directs compilation.</p>
          <p>3 I/O interface. The I/O interface allows both the statical and dynamical reductions to specify changes on the term level by adding, deleting or changing a particular term that corresponds to an item of a module. Each change is specified in a data type called term_change, see below. The I/O interface performs the following steps:</p>
          <p>1. For each module in a program, the TeLR and a corresponding modified ToLR (MToLR) are constructed. MToLR extends ToLR by storing the exact position of the first character of each token in the corresponding source file.</p>
          <p>2. A blank structure to hold term changes is set up for each module.</p>
          <p>3. At this point, changes are submitted by the reductions. The I/O interface provides the association of items in TeLR with items in ILR, but the term_change itself has to be provided by the reduction.</p>
          <p>4. From the collected changes, we construct a change list for each module of the program, which represents the changes to be made in the sources in terms of a list of string insertions and deletions. The algorithm for this is explained in Section 3.2. Each element of the change list represents either a range of positions in the sources which is to be removed, or a string and the position where to insert it.</p>
          <p>5. We copy the original sources and apply the changes specified by the change lists.</p>
          <p>3.1 Specifying the changes in a term. The changes in a term are represented by a term_change. This structure mirrors the original term structure and stores, for each subterm of the term, how the subterm is changed. The change can be (1) replacing the term with a string, (2) enclosing the subterm by two strings, or (3) replacing the term with one of its subterms. Formally, the term_change can have one of the following values:</p>
          <p>no_change: The corresponding term and all of its subterms are not modified.</p>
          <p>no_lvl_change(list(term_change)): The term is not modified at the level of its functor; the term_change list stores information about the subterms.</p>
          <p>subst_ins(string, list(term_change), string): The term is to be preceded by the string given as the first parameter and followed by the string in the third parameter. The second argument represents the changes made on the term's arguments.</p>
          <p>replace(string): The corresponding term is to be replaced by the given string.</p>
          <p>subst_del(list(term_change)): The substructures of this term_change, i.e. the list of term changes, have a restricted set of allowed constructors here: there can be only no_change constructors, except that one orig constructor has to be present. The substructures of the orig constructor have no restrictions. The term that corresponds to this term_change is substituted by the term that corresponds to the term_change with the orig constructor (which may be changed further).</p>
          <p>orig(list(term_change)): This is applicable only if contained in a substructure with the subst_del constructor, as described.</p>
          <p>3.2 Construction of change lists. For each source file we need to construct a change list which specifies what will be modified in the file. This is done by constructing the change list for each term in the same order as the terms appear in the module, and concatenating the change lists. We briefly describe our approach to the problem of constructing the change list for a term.</p>
          <p>We observed that each subterm consists of a contiguous block of tokens in the MToLR. The block of tokens is not affected by any subterms other than its own. Our algorithm is based on a synchronous traversal of a base term and its term_change, identifying the lists of consecutive tokens that correspond to the subterms. This gives us for each term its beginning and end positions in the list of tokens. From this we compute the subterm's beginning and end positions in the source file.</p>
          <p>We determine the list of tokens that correspond to each subterm by finding how a given base term can be written using tokens, while matching it against the actual list of tokens in the MToLR. This gives us the beginning and ending token of each subterm.</p>
          <p>While we traverse the subterm and the corresponding term_change in the mentioned way, we additionally mark (according to the term_change) which segments of the token lists are to be removed and at which positions to insert the strings given in the term_change.</p>
          <p>The change list of the term is then constructed from the position information associated with the tokens.</p>
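          <p>To make the term_change constructors concrete, the following is a small, hypothetical Python sketch of the structure and of how such a change could be rendered against a term. The class and field names are our own illustration, not Reductor's actual Mercury code, and the subst_del/orig pair is omitted for brevity.</p>

```python
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Term:
    """A term: a functor with (possibly zero) subterms."""
    functor: str
    args: List["Term"] = field(default_factory=list)

    def show(self) -> str:
        if not self.args:
            return self.functor
        return f"{self.functor}({', '.join(a.show() for a in self.args)})"

@dataclass
class NoChange:            # no_change: term and all subterms untouched
    pass

@dataclass
class NoLvlChange:         # no_lvl_change: functor kept, changes in subterms
    args: List["Change"]

@dataclass
class SubstIns:            # subst_ins: enclose the term between two strings
    before: str
    args: List["Change"]
    after: str

@dataclass
class Replace:             # replace: substitute the term by a string
    text: str

Change = Union[NoChange, NoLvlChange, SubstIns, Replace]

def render(term: Term, change: Change) -> str:
    """Render a term with its term_change applied (illustration only)."""
    if isinstance(change, NoChange):
        return term.show()
    if isinstance(change, Replace):
        return change.text
    if isinstance(change, (NoLvlChange, SubstIns)):
        if term.args:
            inner = f"{term.functor}(" + ", ".join(
                render(a, c) for a, c in zip(term.args, change.args)) + ")"
        else:
            inner = term.functor
        if isinstance(change, SubstIns):
            return change.before + inner + change.after
        return inner
    raise TypeError(change)
```

          <p>For example, commenting out the second argument of f(a, b) with Reductor's red markers would be expressed as a no_lvl_change whose second element is a subst_ins enclosing that subterm in /*red: and :red*/.</p>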
          <p>4 Dynamical reduction</p>
          <p>2. Determine which goals were called in HLR: (Section 4.4) We collect the data from MDP for some of the call sites in the program. The data we collect tells us which ports of the standard box model were used on those call sites. We then use the information from the MMC semantic analysis to improve this information, as the data transfer from MDP to HLR is inaccurate.</p>
          <p>3. Determine which predicates will be reduced: (Sections 4.2 and 4.3) Any procedures may be excluded from the reduction, if necessary, because our dynamical reduction has a local character: the reductions made in a predicate do not depend on reductions made on any other predicates. Currently, predicates that have multiple modes and typeclass methods are unsupported and thus we do not attempt to reduce them. As will be seen later, the reduction of multi-moded predicates can be achieved by trivially combining the data of their individual procedures in HLR.</p>
          <p>4. Transfer the data about called atomic goals into the corresponding clauses in ILR.</p>
          <p>5. Reduce the goals of an item: Identify the subgoals of the processed item clause that are to be substituted with an exception.</p>
          <p>6. Transfer the changes from ILR to TeLR: (Section 4.5) Construct the term_change for the term that corresponds to the item being processed and submit the information to the I/O interface described in Section 3.</p>
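          <p>The change lists of Section 3.2 boil down to string insertions and deletions at source positions. A hypothetical sketch of applying such a list, processing edits from the highest position downwards so that earlier positions stay valid (the tuple encoding is our own, not Reductor's real format):</p>

```python
def apply_change_list(source, ops):
    """Apply a change list to a source string.

    ops is a list of ("delete", start, end) ranges to remove (half-open,
    positions in the ORIGINAL source) and ("insert", pos, text) entries.
    Applying from the highest position down keeps earlier offsets valid.
    """
    for op in sorted(ops, key=lambda o: o[1], reverse=True):
        if op[0] == "delete":
            _, start, end = op
            source = source[:start] + source[end:]
        else:
            _, pos, text = op
            source = source[:pos] + text + source[pos:]
    return source
```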
        </sec>
        <sec id="sec-2-1-2">
          <title>The idea of dynamical reduction</title>
          <p>The idea of the dynamical reduction is to take the
data collected by the Mercury Deep Profiler (MDP) [1]
on multiple runs of the program with various input
data (we will call them test data in the following)
and, based on the profile, remove most of the unused code.</p>
          <p>This is achieved by cleverly substituting goals of
unused branches of code with calls to the throw predicate
(which throws an exception) so that we preserve the
syntactic correctness of the program.</p>
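          <p>As a Python analogue (Python instead of Mercury purely for illustration): a branch that the profile shows was never exercised is replaced by a raise, which preserves the structure of the code and its behaviour on the profiled inputs.</p>

```python
def classify(n):
    """Original routine."""
    if n >= 0:
        return "non-negative"
    return "negative"

def classify_reduced(n):
    """Reduced routine: the profile showed only n >= 0 was ever used."""
    if n >= 0:
        return "non-negative"
    # stands in for Mercury's throw predicate in the unused branch
    raise RuntimeError("branch removed by reduction")
```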
          <p>We will illustrate the basic idea on a simplified model. Imagine a simplified version of the execution model, which executes goals in top-down, left-to-right order. In this model we collect the information from the profile, which indicates for each atomic goal whether it was called during the execution of the runs on the test data. We then reduce each procedure by substituting the topmost subgoals that do not contain any called subgoals with a call to the throw predicate. The resulting program will have the same outputs for the runs on the input data used to construct the profiling data. On different data the program may exit with an exception.</p>
          <p>The problem with this description is that MMC does not create programs that execute goals in this simple top-down, left-to-right sequence. For example, MMC reorders conjuncts to satisfy the mode declarations of the calls in the conjunction. However, these issues do not prohibit an approach similar to the one described.</p>
          <p>4.1 Overview of dynamical reduction. We present the main steps Reductor goes through to dynamically reduce a program. These steps and additional details are then discussed further in the sections below:</p>
          <p>1. Load the profile created on the test data: We integrated MDP into Reductor and we use it to collect and read the profiling data.</p>
          <p>4.2 Finding goals substitutable with exceptions. This algorithm for computing removable goals is based on the observation that in Mercury, the programmer can substitute any goal with a call to the throw predicate without compromising the syntactic correctness of the program, i.e. the resulting program compiles, but it may give a lot of warnings (e.g. about the presence of singleton variables, or various determinism warnings).</p>
          <p>The algorithm processes goals of clauses in the ILR to identify removable goals in program clauses. By removable goals, we mean some subset of goals that can, but does not have to, be substituted with an exception while preserving the program semantics on the test data. By non-removable goals, we mean the complement of this subset. We describe the algorithm here and discuss its correctness in Section 4.3.</p>
          <p>As an input to this algorithm, we assume that we are given the information about which atomic goals were called on the test data. More precisely, we assume that we get a superset of the called atomic goals. We define atomic goals as the goals that do not have any subgoals, i.e. calls and unifications.</p>
          <p>If we consider the ordering of goals that is given by traversing the goal tree in top-down, left-to-right order, we mark all goals that are preceded by any called atomic goal (called goals included) as non-removable. This means that some goals might not contain any called goals and still be non-removable. The reason for this is that if we substituted those goals with a call to throw, we would change the way the goal is reordered, and this could cause premature termination of the reduced program by an exception. Code reordering was the major challenge in the design of the dynamical reduction and we discuss it in Section 4.3.</p>
          <p>The algorithm then traverses the goal in the opposite order to the one described earlier. Each goal gets annotated with information saying whether it is non-removable; we call this information the removal status.</p>
          <p>Strict sequential operational semantics. For the dynamical reduction we need the following three assumptions. We believe that using strict sequential operational semantics in MMC (explained in [6]) ensures them:</p>
          <p>1. No calls are optimized away. Our implementation of dynamical reduction assumes that the data from the deep profile are accurate in the "semantic" sense. If some calls were optimized away, the collected profile would indicate that the goal was not called, which might cause the goal to be considered removable, and as such it could potentially be substituted with an exception. This could then make the reduced program incorrectly throw an exception on the test data.</p>
          <p>2. Conjunctions are reordered minimally, and every time the ordering of conjuncts is the same. This restriction will be clarified later in this section; for now we just note that if we created two profiles on the same test data with different conjunction orderings, we might get different information about which atomic goals are called, which could again potentially lead to different outputs of the algorithm from Section 4.2, which can be problematic.</p>
          <p>3. Disjunctions are not reordered. The reason is the same as in (2).</p>
          <p>The removal status of each kind of goal is determined as follows:</p>
          <p>All goals that contain a single subgoal: The removal status is the same as the removal status of the subgoal.</p>
          <p>`and', `implication', `equivalence': If the second-argument goal is non-removable, then the first-argument goal and all its subgoals are considered non-removable.</p>
          <p>`or': Removable if both of its subgoals are removable, otherwise non-removable.</p>
          <p>`if then else': The `if then else' goal inherits its removal status from its `if' subgoal.</p>
          <p>Atomic goals: They are non-removable if they are called, otherwise they are removable. This can be overridden within `and', `implication', `equivalence', `or'.</p>
          <p>Correctness of dynamical reduction. It is guaranteed that a program dynamically reduced for a given input will give the same results as the original one, if both are compiled with strict sequential operational semantics and the conditions listed in Section 4.3 hold.</p>
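          <p>The removal-status rules can be read as a bottom-up computation over the goal tree. A minimal Python sketch (the goal encoding and names are hypothetical; "and" here also stands for implication and equivalence):</p>

```python
def removable(goal):
    """Return True when the goal may be substituted with an exception.

    Goals are encoded as tuples:
      ("atomic", called)            -- call or unification
      ("neg", sub)                  -- any goal with a single subgoal
      ("and", g1, g2) / ("or", g1, g2)
      ("ite", cond, then_g, else_g)
    """
    kind = goal[0]
    if kind == "atomic":
        return not goal[1]          # removable iff never called
    if kind == "neg":
        return removable(goal[1])   # same status as the single subgoal
    if kind == "and":
        # a non-removable second argument forces the first argument
        # (and its subgoals) to be treated as non-removable as well
        return removable(goal[1]) and removable(goal[2])
    if kind == "or":
        return removable(goal[1]) and removable(goal[2])
    if kind == "ite":
        return removable(goal[1])   # inherits from the 'if' subgoal
    raise ValueError(kind)
```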
        </sec>
        <sec id="sec-2-1-3">
          <title>Issues with code reordering</title>
          <p>In this section, we discuss the problems associated
with the fact that MMC may reorder goals. We discuss
our assumptions about the compilation model of
MMC and the conditions that need to be satisfied for the
algorithm for computing removable goals, introduced
in Section 4.2, to be correct.</p>
          <p>We note that by default, MMC reorders conjunctions to satisfy the mode declarations of the calls in the conjunctions. Additionally, MMC may reorder disjunctions, optimize away some calls and perform other optimizations, as discussed in [6].</p>
          <p>In a nutshell, in order for the algorithm from Section 4.2 to work correctly, we need to assure that in both the original program, compiled to produce the deep profile, and the reduced program, the executed goals are identical and that they are executed in the same order (all with respect to particular input data):</p>
          <p>1. The conjuncts that were called in the original program have the same ordering in the reduced program with respect to each other as in the original program, and the call goals call the same modes of the called predicates.</p>
          <p>2. The originally uncalled conjuncts are not reordered in front of any originally called conjuncts.</p>
          <p>3. In the reduced program, only originally uncalled conjuncts may be substituted with a call to throw.</p>
          <p>The first two conditions ensure that each procedure is called with the same inputs as in the original program. We did not determine whether condition (1) is necessary, but we suspect that if we did not require it, there might be problems with goals with the determinism multi. A failure to comply with the 2nd or 3rd condition causes the reduced program to terminate by an exception on the test data.</p>
          <p>We assume that these restrictions are satisfied by the default mode reordering algorithm of MMC as reused in Reductor, if strict sequential operational semantics are used. We also believe that the use of any algorithm that performs minimal reordering as specified in the Mercury Reference Manual [6] should also satisfy our restrictions. If not, the use of other mode reordering algorithms in MMC might cause problems.</p>
          <p>Also, it should be noted that we reasoned about the correctness of the algorithm based on the ILR, but the actual mode reordering algorithm of MMC operates on the HLR. Thus we should make sure that the transition between ILR and HLR does not cause any trouble. Unfortunately this is very technical and we omit it for brevity.</p>
          <p>4.4 Collecting data from the deep profiler. Data from the profiler. The MDP consists of two parts: (1) the code inserted by MMC for collecting the profiling information into the Deep.data file, and (2) a web interface that presents the collected information to the user. We call the information stored in Deep.data the deep profile and use it in Reductor.</p>
          <p>The deep profile contains, for each call site in each procedure, the list of ports of the standard box model that were used by the procedures called from the call site. More precisely, in addition to the standard box model (call, exit, fail, redo), Mercury has one additional port for a procedure throwing an exception.</p>
          <p>Transferring data into the IMHLDS. We need to transfer the information about the used ports from the deep profile to the corresponding call sites in the IMHLDS. Unfortunately, the data about call sites from the deep profile do not correspond 1:1 with the HLDS, because the deep profiling code is generated after the construction of HLR (i.e. after the semantic analysis, the 4th stage of the compilation). Also, the data do not generally contain enough information for an unambiguous pairing of the two call-site structures. We thus had to design a mechanism that accounts for this.</p>
          <p>Transfer of the call status from atomic goals in HLR to atomic goals in ILR. The algorithm for identifying non-removable goals (Section 4.2) operates on the ILR, but we have the information about called goals in the HLR. Therefore, we need to transfer the data to the item-level representation.</p>
          <p>One atomic goal in ILR can have multiple corresponding atomic goals in HLR. We designed an algorithm based on an approximation, where the lost information about the call status is partially reconstructed: an ILR atomic goal is marked as called if there is a called goal in HLR which corresponds to the ILR goal.</p>
          <p>The matching of the atomic goals of the two representations is based on the fact that, throughout all four stages of compilation we reuse, MMC associates with each goal the information about what source file and line number the goal is located on (for the purposes of error messages). We use this as an (ambiguous) tag of each goal. The matching of the atomic goals of the two representations thus consists of the matching of these tags.</p>
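          <p>The tag matching can be sketched as an over-approximation keyed by (file, line) pairs; the names below are hypothetical. Marking an ILR goal as called whenever any HLR goal with the same tag was called errs on the safe side, since treating a goal as called merely means reducing less.</p>

```python
def transfer_call_status(ilr_goals, hlr_goals):
    """Over-approximate call-status transfer from HLR to ILR.

    ilr_goals: dict goal_id -> (file, line) tag
    hlr_goals: list of ((file, line) tag, called) pairs
    Returns dict goal_id -> bool (True if considered called).
    """
    called_tags = {tag for tag, called in hlr_goals if called}
    return {gid: tag in called_tags for gid, tag in ilr_goals.items()}
```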
          <p>Improving the information on called goals. Because the information from the deep profile that we collected into the HLR is not accurate and does not give us much information about unifications, we improve our information about which atomic goals are called using the information obtained from the MMC semantic analysis.</p>
          <p>4.5 Changing the individual items of the program. At this point it has been decided which predicate goals will be changed, and everything from now on is done locally on individual clauses of the program. We describe the design of the process of transforming each clause of a predicate by constructing a term_change for a term, given the original term and item that correspond to the clause and the HLR with the information about the call status of atomic goals.</p>
          <p>Constructing the term_change for the clause.</p>
          <p>As an input to this, for each subgoal of the clause goal
(in ILR), we know whether the subgoal is removable.</p>
          <p>The ILR of a clause corresponds reasonably well to
the TeLR of a clause. This fact allows us to construct
the term_change by simultaneously and recursively
traversing the clause goal in ILR and its corresponding term,
while building the term_change for the I/O interface.</p>
          <p>If at any point we are not able to pair the ILR goal
with its corresponding term, we just drop the change,
leaving the problematic term unreduced. This is based
on the observation that changing the removal status
of a goal from removable to non-removable is never
harmful; we only reduce less.</p>
          <p>4.6 Final remarks on dynamical reduction. There remains some unexploited potential for dynamical reduction if we allowed code reordering: reordering for mode constraints and maximum dynamical reduction. On the other hand, the implementation of this extension would be rather difficult and we believe that we would not gain much by it, because (1) usually, the programmers write predicates in the "correct" order, and (2) if we did major reordering, the reordering might confuse the programmer. (Interestingly, the matching of items to terms in the I/O interface is based on a principle similar to the tag matching of Section 4.4.)</p>
          <p>Recently, we learned about the new Profiler Feedback Framework in MMC, which might perhaps have saved us some work. The framework feeds back the information from the deep profile into the MMC. Unfortunately, this new and still hidden feature is not available in the version of Mercury that Reductor is based on.</p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Statical reduction</title>
      <sec id="sec-3-1">
        <title>Identifying inaccessible procedures</title>
        <p>This section describes how Reductor identifies inaccessible procedures to decide which clauses and declarations to remove. As indicated by the name, this analysis is performed statically, without the need to run the program in question.</p>
        <p>First, we precisely define the notion of accessible procedures. Procedure B is accessible from procedure A if there is a call site that can call procedure B in a goal of some procedure that is accessible from A. We call a procedure in a program an accessible procedure if it is accessible from the predicate main. It is not generally possible to say which procedures can be called from a call site; Reductor thus determines only a superset of the accessible procedures (elements of this superset may also be called accessible procedures, for brevity).</p>
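        <p>The notion of accessibility above amounts to reachability in the call graph; a minimal Python sketch with a hypothetical adjacency-map encoding:</p>

```python
def accessible(call_graph, roots=("main",)):
    """Depth-first reachability over the call graph.

    call_graph: dict procedure -> list of procedures callable from it.
    roots: starting procedures ("main", or all non-removable procedures
    when the superset is recalculated in step 3).
    """
    seen = set()
    stack = list(roots)
    while stack:
        proc = stack.pop()
        if proc in seen:
            continue
        seen.add(proc)
        stack.extend(call_graph.get(proc, ()))
    return seen
```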
<p>There are four steps that are taken in the process
of statically reducing the procedures of a program:</p>
        <p>1. Calculate the superset of accessible
procedures of the program. Consider the call graph,
which we define as an oriented graph whose set of
vertices is the set of all procedures in a program and
whose set of oriented edges is defined by a relation
R: (A, B) ∈ R if and only if B is called from a call
site of A.</p>
        <p>We use a depth-first search on the call graph
starting from the predicate main. Each procedure that
is processed is added to the set of accessible
procedures. Processing one node of the search graph
consists of finding its call sites (i.e. constructing
the edges for the depth-first search). This is done
by another depth-first search on the goal tree of
the procedure, where subgoals are the vertices of
the tree and atomic goals (i.e. call goals and
unifications) are the leaves of the tree.</p>
        <p>There are two types of call sites: static and
dynamic. For static call sites, the procedure to be
called is known (in HLR). For dynamic call sites, the
procedure is determined at runtime. There are two
kinds of dynamic call sites: (a) call sites for
lambda expressions, and (b) call sites for instances
of a class method.</p>
        <p>Call sites (b) can be considered static, because the
class method called is known (the instance is not).</p>
        <p>Procedures that can be called from (a) are
determined by treating the unification that declares
the lambda expression as a call site of its lambda
expression.</p>
        <p>2. Identify further non-removable procedures.
All accessible procedures are naturally
non-removable. Unfortunately, there are cases when
even an inaccessible procedure cannot be removed. One
example are procedures whose removal would require us
to remove additional declarations (beyond the obvious
pred, func and mode declarations and the procedure's
clauses). Another example are predicates exported
from Mercury to the C interface; for these we cannot
be sure whether they are ever called from some C
code.</p>
        <p>3. Recalculate the superset of accessible
procedures. Marking some procedures as non-removable
may imply non-removability for further procedures. We
use the same depth-first search as above, except that
we start in all non-removable procedures instead of
main.</p>
        <p>4. Remove the items that correspond to the
removable procedures. The removal is done separately
for each module of the program by finding their
corresponding items and submitting them for deletion
to Reductor's I/O interface.</p>
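<p>The reachability analysis of steps 1 to 3 above can be
sketched as follows. This is a minimal illustration, not
Reductor's implementation: the dictionary-based call
graph, the procedure names, and the helper functions are
all hypothetical, and Reductor itself performs this
analysis on the HLR inside the Mercury compiler.</p>

```python
def reachable(call_graph, roots):
    """Depth-first search collecting every procedure reachable from roots."""
    seen = set()
    stack = list(roots)
    while stack:
        proc = stack.pop()
        if proc in seen:
            continue
        seen.add(proc)
        # The callees stand in for the call sites found in the goal tree.
        stack.extend(call_graph.get(proc, ()))
    return seen

def removable_procedures(call_graph, extra_non_removable):
    # Step 1: superset of accessible procedures, starting from main.
    accessible = reachable(call_graph, ["main"])
    # Step 2: accessible procedures plus further non-removable ones,
    # e.g. predicates exported to the C interface.
    non_removable = accessible | set(extra_non_removable)
    # Step 3: recompute reachability from all non-removable procedures.
    kept = reachable(call_graph, non_removable)
    # Step 4 would then delete the items of the remaining procedures.
    return set(call_graph) - kept

# Hypothetical call graph: procedure -> procedures called from its call sites.
graph = {
    "main": ["parse"],
    "parse": ["lex"],
    "lex": [],
    "exported_to_c": ["helper"],  # inaccessible but exported to C
    "helper": [],
    "dead": ["lex"],
}
print(sorted(removable_procedures(graph, ["exported_to_c"])))  # prints ['dead']
```

<p>Note how helper survives only because exported_to_c
was marked non-removable in step 2; this is why step 3
must re-run the depth-first search from all non-removable
procedures rather than from main alone.</p>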
</sec>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Remarks on the interfacing compilers</title>
      <p>We think that in the future, Reductor can be
extended to remove more declarations and to lift most of
the limitations it places on the input program. There
are, however, several areas where we are bounded by the
current design of MMC. Most notably, it is very
problematic to make changes in the contents of clauses
and declarations in the general case, unless we give up
the aim of preserving the original formatting of the
code, i.e. the major goal of Reductor.</p>
      <p>One possible solution would be to extend
Reductor's I/O interface so that the reductions could
submit changes directly to the HLR instead of just the
TeLR. Although we did not design such an interface, we
believe that it would be reasonable to try to build it if
the MMC did not have such an unfavorable design with
regard to this proposal. Certainly, in the presence of
such an interface, the reductions would be quite easy
transformations of HLR. This would, at the very minimum,
save us a lot of work on dynamical reduction.</p>
      <p>We also believe that this proposed enhanced I/O
interface could be easily reused, e.g. for automatic code
restructuring or refactoring (see e.g. [7]) or for
intelligent syntax highlighting. These two kinds of
software can make meaningful use of the semantic analysis
provided by the compiler. The need for such an interface
can be illustrated by the fact that a task as simple as
function renaming requires type analysis (which is a part
of the semantic analysis) to distinguish data
constructors of discriminated unions from function calls
of the same name and arity.</p>
    </sec>
    <sec id="sec-5">
      <title>Conclusion</title>
      <p>We described two main automatic methods for
source code reduction of Mercury programs aimed at
understanding and reusing parts of Mercury projects. The
statical reduction removes predicates that are never
called (based on the statical analysis of the source
code). The dynamical reduction removes parts of
predicates that were never used on some sample input
data. The formal correctness of the code is preserved by
inserting exceptions at these places. The dynamically
reduced code is guaranteed to return the same outputs on
the inputs it was reduced for. On different inputs, the
reduced program may exit with an exception.</p>
      <p>The implemented tool called Reductor performs
both types of the reduction while preserving the original
formatting of the source code as much as possible to
enable further development. This is a unique feature that
complicated the design of Reductor by far the most.</p>
      <p>Reductor has been tested on a few medium-sized
programs and a large set of small programs (mostly from
the MMC test suite). The reductions are sufficiently
powerful. There are a few constraints imposed on the
Mercury language for statical reductions, which concern
rather rare features. Moreover, the presence of such
unsupported constructs usually does not prevent Reductor
from reducing the program. While in these cases the
reduced program may become uncompilable, it is still
useful for manual inspection. In the case of dynamical
reduction, the tests did not reveal any substantial
quantitative degradation in the amount of code reduced by
the implemented method as compared to what could
potentially be reduced by a similar method without the
mentioned approximations.</p>
      <p>The utility of Reductor for large-scale projects
has yet to be confirmed. We believe that the use of
dynamical reduction (which can also be viewed as a
coverage testing tool that “visualizes” its results by
commenting out the unused parts of code) on a well-chosen
subset of all possible input data sets, combined with
statical reduction and some subtle changes in the
original code, can be useful in understanding and/or
reusing the code of big programs. This was the main
motivation behind building this tool.</p>
      <p>We have also commented on some limitations of the
design of MMC, briefly explored the possibilities for
extending Reductor, and gave some thoughts on
commonalities of Reductor code with some other
development tools like automatic code refactoring and
intelligent syntax highlighting.</p>
      <sec id="sec-5-1">
        <title>References</title>
        <p>1. Mercury project WWW page.
http://www.mercury.csse.unimelb.edu.au/</p>
        <p>2. Z. Somogyi, F. Henderson and T. Conway:
Mercury: an efficient purely declarative logic
programming language. ACSC'95, Glenelg, Australia, 1995,
499–512.</p>
        <p>3. J. Diviš: Automatické odstranění
nadbytečných částí programu (this is the formal title;
the contents are written in English). Bachelor thesis at
Charles University in Prague, Faculty of Mathematics and
Physics, 2009.</p>
        <p>4. Ralph Becket's Mercury tutorial.
http://www.mercury.csse.unimelb.edu.au/tutorial/book/book.pdf</p>
        <p>5. Seriál Mercury on ROOT.CZ.
http://www.root.cz/serialy/mercury/</p>
        <p>6. Mercury Reference Manual, version 0.13.1.
http://www.mercury.csse.unimelb.edu.au/information/doc-release/mercury_ref/</p>
        <p>7. T. Mens and T. Tourwé: A Survey of Software
Refactoring, IEEE Transactions on Software Engineering,
30 (2), February 2004.</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>