<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Yet Another Parallel Hypothesis Search for Inverse Entailment</article-title>
      </title-group>
      <contrib-group>
        <aff id="aff0">
          <label>0</label>
          ,
          <addr-line>2641 Yamazaki, Noda-shi, CHIBA, 278-8510</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Faculty of Sci. and Tech. Tokyo University of Science</institution>
        </aff>
      </contrib-group>
      <abstract>
        <p>In this study, we design and implement a powerful Inductive Logic Programming (ILP) system that conducts a parallel hypothesis search for inverse entailment. One of the most important challenges in ILP is speeding up the hypothesis search, and a number of parallel hypothesis exploration methods have been proposed. Recently, the Map-Reduce algorithm has been widely used for large-scale distributed computing, but it is difficult to apply such cloud technology to the ILP hypothesis search problem. In contrast, we designed a method that can dynamically distribute hypothesis-search tasks, and implemented a network system in which modules autonomously cooperate with each other. We also conducted a parallel experiment on a large number of CPUs. The results confirm that the hypothesis search time is shortened according to the number of computers used, without reducing the optimality of the generated hypotheses.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>INTRODUCTION</title>
      <p>
        Recently, various approaches to analyzing large amounts of data have been
developed. For example, the Map-Reduce model, which consists of a large number of
workers and one master, uses distributed computing for quick
analysis. However, Map-Reduce has limited efficiency for Inductive Logic
Programming (ILP) [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]. Searching for relationships in the data cannot be distributed
efficiently in the Map-Reduce scheme, because new
relationships are found only after searching for relationships in the background knowledge
to generate rules. Therefore, applying Map-Reduce to ILP on large data
is not realistic, because ILP requires a long learning time. To solve this problem,
various studies have attempted to speed up ILP [
        <xref ref-type="bibr" rid="ref1 ref3 ref4">1, 3, 4</xref>
        ].
Although these partially solved the problem, the speedup did not scale
with the number of processors provided, and the quality of the generated
rules was not optimal, making the systems difficult to use as practical tools.
      </p>
      <p>Considering these issues, we designed and implemented a new parallel
processing system for ILP. With this system, the worker itself has an autonomous
function, unlike Map-Reduce. When a worker has no task (e.g., immediately
after start-up or immediately after the completion of a task), the worker accepts a
divided task from another worker and starts processing it.</p>
      <p>[Figure: system overview, showing the Server Module, the Master Module, and the Worker Modules connected via a network]</p>
      <p>When the workload (the number of relationship
searches that the worker must still perform) reaches a
fixed quantity, the worker requests other workers to process a divided task.
After the master issues the first task request and one
worker accepts it, autonomous task distribution begins among the workers until all
existing workers are engaged (saturation of the task). Because divided tasks
continue to be exchanged among workers, all workers finally complete their
processing at approximately the same time, and the master receives the processing
results (the generated rules).</p>
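      <p>The worker lifecycle described above can be sketched as a simple division loop. The following is a minimal sketch, not the actual implementation: the class name Worker, the threshold value, and the halving strategy are illustrative assumptions.</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Minimal sketch of the autonomous worker loop: an idle worker accepts a
// divided task from another worker; a busy worker whose pending workload
// reaches a fixed threshold splits off half of it as a divided task to be
// offered to the other workers. (Illustrative sketch, not the real system.)
public class Worker {
    private final Deque<Integer> pending = new ArrayDeque<>(); // pending search items
    private final int divisionThreshold;

    public Worker(int divisionThreshold) {
        this.divisionThreshold = divisionThreshold;
    }

    public boolean isIdle() {
        return pending.isEmpty();
    }

    public int workload() {
        return pending.size();
    }

    // An idle worker (or one with spare capacity) takes on a divided task.
    public void acceptDividedTask(Deque<Integer> task) {
        pending.addAll(task);
    }

    // When the workload reaches the threshold, split off half of it as a
    // divided task; otherwise return an empty task (no division needed).
    public Deque<Integer> maybeDivide() {
        Deque<Integer> divided = new ArrayDeque<>();
        if (pending.size() >= divisionThreshold) {
            int half = pending.size() / 2;
            for (int i = 0; i < half; i++) {
                divided.add(pending.removeLast());
            }
        }
        return divided;
    }
}
```

      <p>Under this scheme, a task offered by one worker is repeatedly re-divided by its acceptors, so the load spreads until every worker is engaged.</p>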
      <p>Our parallel processing system has the following features.
– 1: Unlike in the Map-Reduce scheme, the master does not need to plan the
division of the process in advance.
– 2: All workers remain busy until the end (no idle time).</p>
      <p>The first feature means that the proposed system does not require the
pre-division step performed by the master in conventional Map-Reduce. The
second feature means that a speedup can be achieved. In this study, we implement
the proposed parallel ILP system and demonstrate its effectiveness through
parallel-learning experiments on drug discovery data.</p>
    </sec>
    <sec id="sec-2">
      <title>SYSTEM DESIGN</title>
      <sec id="sec-3-1">
        <title>System Configuration</title>
        <p>– Server module for negotiation between modules</p>
        <p>The server module mediates negotiation messages (e.g., request, accept,
and commit) regarding the tasks shared by workers. This module broadcasts
messages to the connected workers and the master. This function is used during
negotiation to find an acceptor for a divided task.
– Master Module</p>
        <p>The master module sends the main task request to a worker
module and collects the results (generated rules) from the worker modules
when all of their divided tasks are finished. By connecting
to the server module, this module can negotiate with worker modules and
observe the negotiations between worker modules to track the processing
status of all worker modules.
– Worker Module</p>
        <p>The worker module manages an ILP module and negotiates with other
worker modules via the server module. When this module has no task, it
accepts a shared task from another worker module, transfers the task data
to its own ILP module, and starts learning. In addition, this module sends a
request message for a shared task to other workers when its ILP module
needs to divide its task.
– ILP Module</p>
        <p>Basically, the ILP module has the same functions as traditional ILP.
On receiving task data from the worker module, the ILP module starts the
learning process. As it searches for rules, the amount of search
processing that this module must perform grows. If the amount of search
processing exceeds a threshold, this module divides the search processing
and generates a divided task. It then sends a request for the divided task to
the worker module. If all of its own search processing (its own learning
process) finishes while a requested task is still pending (waiting for an accept
response), this module cancels the request via the worker module
and performs the task itself. When all search processing is finished and no
requested task remains, this module declares the end of the task to
the worker modules. This module can then accept divided tasks from
other worker modules.</p>
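        <p>The broadcast role of the server module can be sketched as follows. This is a minimal sketch under assumed names (ServerModule, Message, Kind); the real system exchanges these messages over the network.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the server module's role: it processes no tasks itself;
// it only broadcasts negotiation messages (request, accept, commit) to every
// connected module so that an acceptor for a divided task can be found.
public class ServerModule {
    public enum Kind { REQUEST, ACCEPT, COMMIT }

    public static class Message {
        public final Kind kind;
        public final String sender;
        public final String taskId;
        public Message(Kind kind, String sender, String taskId) {
            this.kind = kind;
            this.sender = sender;
            this.taskId = taskId;
        }
    }

    // Both the master module and the worker modules implement this interface.
    public interface Module {
        void receive(Message m);
    }

    private final List<Module> connected = new ArrayList<>();

    public void connect(Module m) {
        connected.add(m);
    }

    // Broadcast a negotiation message to all connected modules.
    public void broadcast(Message m) {
        for (Module module : connected) {
            module.receive(m);
        }
    }
}
```

        <p>Because every negotiation message passes through this single broadcast point, the master can observe all worker-to-worker negotiations and track the overall processing status.</p>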
      </sec>
      <sec id="sec-3-2">
        <title>Communication Protocol Between Modules</title>
        <p>In our system, a master module communicates with worker modules and worker
modules communicate with each other via the network (see Appendix Fig. 2,3).
We prepared two communication protocols. One is the “Negotiations Protocol
between Modules” to find a worker module that can execute a task (main task
or divided task). The other is the “Task Content Transmission Protocol” to send
task contents to a worker. The details of each protocol are as follows.
– Protocol 1. Negotiations Protocol between Modules</p>
        <p>This protocol's messages pass through the server module. All messages are
broadcast to all worker modules and the master module. This protocol carries the
negotiation messages (e.g., request, accept, and commit) used to find a worker
module that can execute a task.
– Protocol 2. Task Content Transmission Protocol</p>
        <p>After a worker module has been determined by protocol 1, protocol 2 is used to
transmit the task contents to that module. This communication is
carried out peer-to-peer (not via the server
module).</p>
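        <p>The two protocols can be sketched as one handshake: protocol 1 locates an acceptor by broadcast, and protocol 2 then moves the task contents directly between the two modules. The function and worker names below are illustrative assumptions, not those of the implementation.</p>

```java
import java.util.List;

// Minimal sketch of the two-phase handshake: Protocol 1 finds an acceptor of
// a divided task by broadcasting via the server module; Protocol 2 then
// transfers the task contents peer-to-peer, bypassing the server module.
public class TaskHandshake {
    // Protocol 1: broadcast a request; the first idle worker (other than the
    // requester) that answers with an accept message is committed as acceptor.
    static String findAcceptor(String requester, List<String> idleWorkers) {
        for (String worker : idleWorkers) {
            if (!worker.equals(requester)) {
                return worker; // first accept wins; the requester then commits
            }
        }
        return null; // no acceptor found: the requester keeps the task
    }

    // Protocol 2: peer-to-peer transfer of the task contents to the acceptor.
    static String transfer(String taskContents, String acceptor) {
        return acceptor + " received: " + taskContents;
    }
}
```

        <p>Keeping protocol 2 peer-to-peer avoids funneling bulky task data through the server module, which only needs to see the small negotiation messages.</p>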
        <p>As a result, all worker and ILP modules keep busy by repeatedly
requesting and accepting shared tasks (i.e., saturation of the divided tasks). In the saturation
state, every worker module sends request messages for its shared task to
the other worker modules and awaits an accept message. Conversely, each worker
module is ready to receive a task request from any other worker module. Thus,
a worker module that has finished its own task sends an accept message to
another worker module and receives a new shared task (see Appendix Fig. 4,5).</p>
        <p>Shared tasks are requested and received until the learning rules are generated. As
a result, the worker and ILP modules operate without idle time.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>IMPLEMENTATION AND EXPERIMENT</title>
      <p>
        In the present study, we implemented a rule generation engine for ILP [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ] and
each module described above in the Java programming language. In
this implementation, we set the threshold (the number of
searches still to be conducted) for the division request in the ILP module to 200. We
used two 6-CPU computers and two 4-CPU computers in our parallel processing
experiments.
      </p>
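      <p>The division request described above is driven by a simple counter check against the threshold of 200. The following is a minimal sketch; the class and method names are illustrative, not those of our Java implementation.</p>

```java
// Minimal sketch of the ILP module's division trigger: the module tracks how
// many searches remain to be conducted and, once the count reaches the
// configured threshold (200 in our experiments), signals that a divided task
// should be generated and offered to other workers.
public class DivisionTrigger {
    private final int threshold;
    private int pendingSearches;

    public DivisionTrigger(int threshold) {
        this.threshold = threshold;
    }

    // Called as the search expands and new candidate searches are queued.
    public void addSearches(int n) {
        pendingSearches += n;
    }

    // Called after part of the search space is split off into a divided task.
    public void removeSearches(int n) {
        pendingSearches -= n;
    }

    // True when the pending search count reaches the threshold, i.e., the
    // module should divide the search processing.
    public boolean shouldDivide() {
        return pendingSearches >= threshold;
    }
}
```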
      <sec id="sec-4-1">
        <title>Speedup Experiment</title>
        <p>We used two 6-CPU computers to conduct an experiment measuring speedup,
with middle-scale drug discovery analysis data as the training data. We
increased the number of CPUs one by one from 1 to 12, performing 12
experiments. In our system, each worker module (and its ILP module) used one
CPU. The experiment results are presented in Table 1. We obtained a speedup
of 7.4 using 12 CPUs for this exercise. However, this problem was small-scale
and did not allow the full speedup to be realized.</p>
      </sec>
      <sec id="sec-4-2">
        <title>Large-Scale Parallel Experiment</title>
        <p>Using four computers, we performed application experiments on a large-scale
drug discovery problem. In this experiment, we used 1 CPU, 12
CPUs (two 6-CPU computers), and 20 CPUs (two 6-CPU computers + two
4-CPU computers), for a total of three experiments. Table 2 shows the experiment
results. The results confirm that the speedup corresponded to the number
of CPUs used. This drug discovery problem conventionally took at least
15 h using a previous parallel-processing system; with our proposed method,
one learning process required less than 1 h. Thus, experiments can easily be
rerun with adjusted parameters, which can lead to better research results.
For the above parallel processing experiments, we measured the
communication time and processing steps between workers to investigate how the workers
(worker module and ILP module) depend on each CPU. For this measurement,
we used six workers on one 6-CPU computer. In this rerun experiment,
we added measuring functions to each module; thus, the processing time
is longer than in the original 6-CPU results. Table 1 shows the results.</p>
        <p>We also measured the time required for learning and for receiving the task data
of each worker. Table 3 lists the processing time of each worker. The end time
for all tasks was 205.69 sec.</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>CONCLUSIONS</title>
      <p>In the present study, we designed and implemented a parallel Inductive
Logic Programming (ILP) system to realize a parallel hypothesis search for inverse
entailment. Hypothesis generation in ILP requires search processing, and
the result of one hypothesis generation influences the next; thus, ILP
cannot easily be parallelized with the Map-Reduce method developed for
cloud computing. To address these issues, we designed a
method that can dynamically distribute tasks for search processing, and
implemented a network system in which modules autonomously cooperate with each
other. In addition, we enabled parallel experimentation using a large number of
workers (CPUs). The results confirmed that the processing time was shortened
according to the number of computers used, without reducing the quality of the
generated rules.</p>
      <sec id="sec-5-1">
        <title>Appendix A: Image of communication protocol</title>
        <p>[Figure: communication protocol, showing a Worker Module and its ILP Module issuing a ① Request (broadcast)]</p>
      </sec>
      <sec id="sec-5-2">
        <title>Appendix B: Operational status of each worker</title>
        <p>[Figure: operational status of workers W1–W6 across task IDs 1–15]</p>
      </sec>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <given-names>Andreas</given-names>
            <surname>Fidjeland</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Wayne</given-names>
            <surname>Luk</surname>
          </string-name>
          and
          <string-name>
            <given-names>Stephen</given-names>
            <surname>Muggleton</surname>
          </string-name>
          :
          <article-title>Customisable Multi-Processor Acceleration of Inductive Logic Programming</article-title>
          ,
          <source>Latest Advances in Inductive Logic Programming</source>
          , pp.
          <fpage>123</fpage>
          -
          <lpage>141</lpage>
          ,
          <year>2014</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <given-names>Fumio</given-names>
            <surname>Mizoguchi</surname>
          </string-name>
          and
          <string-name>
            <given-names>Hayato</given-names>
            <surname>Ohwada</surname>
          </string-name>
          :
          <article-title>Constrained Relative Least General Generalization for Inducing Constraint Logic Programs</article-title>
          ,
          <source>New Generation Computing</source>
          , Vol.
          <volume>13</volume>
          , pp.
          <fpage>335</fpage>
          -
          <lpage>368</lpage>
          ,
          <year>1995</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <given-names>Hayato</given-names>
            <surname>Ohwada</surname>
          </string-name>
          ,
          <string-name>
            <given-names>Hiroyuki</given-names>
            <surname>Nishiyama</surname>
          </string-name>
          and
          <string-name>
            <given-names>Fumio</given-names>
            <surname>Mizoguchi</surname>
          </string-name>
          :
          <article-title>Concurrent Execution of Optimal Hypothesis Search for Inverse Entailment</article-title>
          ,
          <source>Inductive Logic Programming, Lecture Notes in Computer Science</source>
          , Vol.
          <volume>1866</volume>
          , pp.
          <fpage>165</fpage>
          -
          <lpage>173</lpage>
          ,
          <year>2000</year>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <given-names>Ashwin</given-names>
            <surname>Srinivasan</surname>
          </string-name>
          :
          <article-title>Parallel ILP for distributed-memory architectures</article-title>
          ,
          <source>Machine Learning</source>
          , Vol.
          <volume>74</volume>
          , Issue 3, pp.
          <fpage>257</fpage>
          -
          <lpage>279</lpage>
          ,
          <year>2009</year>
          .
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>