Integration of Everest Platform with BOINC-based
                        Desktop Grids

                                       Oleg Sukhoroslov
      Institute for Information Transmission Problems of the Russian Academy of Sciences
                     Bolshoy Karetny per. 19, build.1, Moscow 127051 Russia
                                      sukhoroslov@iitp.ru


                                                        Abstract
                       Desktop grids is an important class of distributed computing infrastruc-
                       tures (DCIs) used for solving complex scientific problems. The inherent
                       complexity of DCIs and used technologies limit the wide adoption of
                       distributed computing in practice. Everest is a web-based distributed
                       computing platform that uses service-oriented approach and cloud com-
                       puting models to solve this problem. This paper discusses the possible
                       approaches for integration of Everest and BOINC-based desktop grids
                       and presents the current prototype implementation. The proposed inte-
                       gration enables Everest users to seamlessly access computing resources
                       of desktop grids and build generic or domain-specific web services for
                       submission and automation of computations in desktop grids.


1    Introduction
Computational methods are now widely used for solving complex scientific and engineering problems. These
methods often require a large amount of computing resources. Nowadays, there is a wide range of such resources
including servers and personal computers, clusters and supercomputers, grids and clouds. Distributed computing
technologies enable integration of independent resources into distributed computing infrastructures (DCIs) that
can provide significant computing power and increase the efficiency of use of individual resources.
    Desktop grids [KTB+ 04, Fed12] represent an important class of DCIs that aggregate computing power of
idle personal computers. The most widely used technologies for building desktop grids are HTCondor [TTL05]
and BOINC [ACA06]. While HTCondor is most suited for integration of resources within an organization,
i.e. enterprise desktop grids [Iva14], BOINC is originally designed to support volunteer computing projects
[And04, ACA06] with globally distributed resources donated by their owners. Modern desktop grids are capable
of integrating resources of hundreds of thousands of personal computers with an aggregate computing power
comparable to the top supercomputers. The distinctive advantage of desktop grids is the low costs of building
and operating such infrastructures, which makes it affordable for small research projects.
    Despite the abundance of computing resources, the inherent complexity of distributed computing technologies
and infrastructures, along with the lack of required IT expertise among the researchers, limit the wide adoption of
these technologies in practice. This problem can be solved by providing high-level services with domain-specific
interfaces that hide the mentioned complexity from the user. These services should automate all common actions

Copyright c by the paper’s authors. Copying permitted for private and academic purposes.
In: E. Ivashko, A. Rumyantsev (eds.): Proceedings of the Third International Conference BOINC:FAST 2017, Petrozavodsk, Russia,
August 28 - September 01, 2017, published at http://ceur-ws.org


                                                        102
needed to perform computations on remote resources or DCIs. The use of service-oriented approach can also
improve the research productivity by enabling publication, sharing and composition of computing applications
as services.
   Everest [SVA15, Eve] is a web-based distributed computing platform that implements the described approach.
The platform supports publication, execution and composition of computing applications in a distributed envi-
ronment. Unlike other solutions, Everest is based on the Platform as a Service (PaaS) cloud computing model
by providing its functionality via remote web and programming interfaces. A distinctive feature of Everest is the
support for running applications on arbitrary combinations of computing resources attached by users. Currently
the platform supports integration with standalone servers, computing clusters and European Grid Infrastructure
[SSV16].
   This papers presents integration of Everest platform with BOINC-based desktop grids. The proposed integra-
tion has the following benefits. First, it enables Everest users to seamlessly access computing resources of desktop
grids and combine them with other types of resources already supported by the platform. Second, it makes it
possible to use Everest for building generic or domain-specific web services for submission and automation of
computations in BOINC-based desktop grids. Hopefully, the proposed approach will make such DCIs accessible
to a wider group of researchers.
   The paper is structured as follows. Section 2 provides a brief overview and relevant technical details of both
used technologies. Section 3 discusses the possible approaches for integration of Everest and BOINC and presents
the current prototype implementation. Section 4 concludes and discusses future work.

2     Everest and BOINC Overview
2.1   Everest
Everest [SVA15] is a web-based distributed computing platform. It provides users with tools to quickly publish
and share computing applications as web services. The platform also manages execution of applications on
external computing resources attached by users. In contrast to traditional distributed computing platforms,
Everest implements the PaaS model by providing its functionality via remote web and programming interfaces.
A single instance of the platform can be accessed by many users in order to create, run and share applications
with each other. The platform is available online to all interested users [Eve].
   Everest supports development and execution of computing applications following a common model. An
application has a number of inputs that constitute a valid request to the application and a number of outputs
that constitute a result of computation corresponding to some request. Upon each request Everest creates a
new job consisting of one or more computational tasks generated by the application according to the job inputs.
The tasks are executed by the platform on computing resources specified by a user. The dependencies between
tasks are currently managed internally by applications. The results of completed tasks are passed back to the
application and are used to produce job outputs or new tasks if needed. The job is completed when there are
no incomplete tasks are left. The described application model is generic enough to support a wide range of
computing applications, including many-task applications.
   Users can publish applications via provided generic application template that makes it possible to avoid
programming. The template supports running arbitrary applications with command-line interface and produces
a single task corresponding to a single command run. There are two approaches for implementing many-task
applications on Everest. First, it is possible to dynamically add new tasks or invoke other applications from a
running application via the Everest API. This enabled users to create and publish complex many-task applications
with dependencies between tasks, such as workflows. Everest also provides a ready-to-use generic application
[VS15] for running a large number of independent parametrized tasks. i.e. parameter sweep experiments.
   An application is automatically published as a RESTful web service with a unified interface. This enables
programmatic access to applications, integration with third-party tools and composition of applications into
workflows. The platform’s web user interface also generates a web form for running the application via web
browser. The application owner can manage the list of users that are allowed to run the application.
   Instead of using a dedicated computing infrastructure, Everest performs execution of application tasks on
external resources attached by users. The platform implements integration with standalone machines and clusters
through a developed program called agent [SSV16]. The agent runs on the resource and acts as a mediator
between it and Everest enabling the platform to submit and manage computations on the resource. The platform
also supports integration with resources of the European Grid Infrastructure. Everest manages execution of tasks
on remote resources and performs routine actions related to staging of input files, submitting a task, monitoring


                                                    103
a task state and downloading task results. The platform also monitors the state of resources and uses this
information during scheduling.
   Everest users can flexibly bind the attached resources to applications. In particular, a user can specify
multiple resources, possibly of different type, for running an application [SSV16]. In this case the platform
performs dynamic scheduling of application tasks across the specified resource pool.

2.2     BOINC
BOINC (Berkeley Open Infrastructure for Network Computing) [And04, ACA06] is an open-source middleware
platform for volunteer computing projects. Originally developed to support the SETI@home project, it became a
de facto standard for running volunteer computing projects supporting research in different areas of science and
technology. Volunteers participate by running the BOINC client software on their computers. They can attach
each computer to any set of projects, and can control the allocation of resources among the projects. Currently
BOINC brings together about 200-300 thousand active participants and 700-800 thousand active computers
worldwide contributing around 16-17 PetaFLOPS of processing power in total.
   Each BOINC-based project provides its own server to manage execution of project-specific applications on
volunteers’ computers and distribute data files. BOINC clients download application executables and data files
from the project server, carry out tasks or workunits by running applications against the specific data files,
and upload the output files to the server. BOINC software includes server-side components, such as scheduler
and daemon programs that manage job distribution and collection, and web-based interfaces for volunteers and
project administrators. All information about project applications and jobs is stored in a relational database on
the server.

2.2.1    Application Development
The inherent feature of desktop grids is the heterogeneity of computing hosts resulting in different CPU archi-
tectures, OS types, software versions, etc. The original approach of porting an application to BOINC is to build
a separate application binary for each supported platform. This approach requires significant efforts, especially
in cases when the application was originally developed to run in a particular environment.
    A more recent approach is running applications in virtual machines, the so-called ”VM apps” [Boid]. In this
case the developer creates a VM image with all required software that will be downloaded and used by BOINC
client for running the application. This approach avoids tedious porting of application to different platforms,
but requires installation of VirtualBox on hosts, introduces a small runtime overhead and doesn’t support GPU
applications.
    Recently, a new approach for BOINC application deployment based on Docker containers has been introduced
[Boia]. In comparison to virtual machines, Docker images are generally much smaller, can be instantiated
much faster, and have less runtime overhead. The developer creates or reuses a Docker image for running
the application. Docker-based applications are run under a universal BOINC application that uses a generic
VirtualBox VM to host Docker containers. The developer can use arbitrary Docker images when submitting
jobs. The Docker image files are passed inside a BOINC workunit and are cached on the hosts. This approach
is implemented in boinc2docker tool [boib].
    In this work the Docker-based approach was used for packaging and running applications in BOINC, because
it requires a minimal effort from the developer.

2.2.2    Job Submission
The original approach for submitting jobs to a BOINC application uses the create work command-line tool. This
tool is intended for the use by project administrators and requires the submitter to have a login access to the
BOINC server. Recently, the remote job submission mechanism was introduced to support submitting jobs by
programs running outside the BOINC server [Boic]. This mechanism is based on Web RPCs and does not require
login access to the server.

3     Integration of Everest with BOINC
3.1     Prototype Implementation
The most straightforward approach for integration of Everest with BOINC is to reuse the same mechanism that
is used for integration with computing clusters.


                                                  104
                                                                                            BOINC Server Host

                                                                                                  create/monitor
                                                                     download/upload                workunits
      submit job             job state/results                                                                      BOINC
                                                          Everest       task files        Agent                    Database            BOINC
                                                                                                                                       Server
                                                                                          boinc
                                          task input/output files
              Application


                                                                      HTTP
                                                                                                                                   boinc2docker
                                                                                                                     File
                                                                                                    store/read
                                                                                                     workunit
                                                                                                                   Storage
      job (tasks)            task state
                                                                                                       files


                                                                              WebSocket
                                                     File
             Job Manager                           Storage                                  Desktop
                                                                                            Grid
           task              task state                                                                               BOINC Client
                                                                                                                   BOINC
                                                                                                                     BOINC
                                                                                                                        boot2docker
                                                                                                                   Client
                                                                                                                      Client
                    Agent                                                                                              VirtualBox VM
                    Client                                          monitor agent state
                                                                      submit tasks
                                                                    monitor task state


                                     Figure 1: Integration of Everest with BOINC via Everest agent
   In this mechanism, the Everest agent runs on the cluster submission node, receives tasks from the platform
and translates them to jobs submitted to the cluster. The agent supports interaction with various resource
managers through the pluggable adapters. The adapter receives generic resource requests (get resource state,
submit task, get task state, cancel task, etc.) from the agent and translates these requests into commands
specific to a particular resource manager. The developed adapters for computing clusters support integration
with commonly used batch systems such as TORQUE, Slurm and Sun Grid Engine.
   In case of BOINC, the similar approach has been implemented as follows (see Figure 1). The Everest agent
is deployed on a BOINC server and connected to the platform. To support the interaction of the agent with
BOINC, a special boinc adapter has been developed. This adapter converts a task received from Everest to a
BOINC job workunit by placing the task input files in the downloads folder and registering the workunit in
the BOINC server database. The adapter monitors the state of submitted workunits by querying the database.
When the workunit result is assimilated, the adapter passes the output files to the agent which uploads them to
Everest.
   To support the submission of arbitrary tasks from Everest to BOINC, the current implementation relies on the
universal boinc2docker BOINC application [boib]. This application supports running BOINC jobs in arbitrary
Docker images, thereby implementing the Docker-based approach described in Section 2. The required Docker
image is specified during the job submission. Currently, the Docker image used for running Everest tasks is
fixed in the boinc adapter configuration. Also, to be able to locate the output files of each workunit, the current
implementation relies on a custom assimilator that copies output files to a directory checked by the boinc adapter.
   The implementations of the boinc adapter and the custom assimilator are available in the git repository of
the Everest agent1 .

3.2     Experimental Evaluation
The end-to-end testing of the described implementation has been performed by running a real-world parameter
sweep application. The application represented the virtual screening of 100 ligand molecules using the molecular
docking program Autodock Vina.
   The application, including the vina Linux binary, auxiliary Python script and input files, was submitted to
Everest via the generic Parameter Sweep service [VS15]. During the submission, a resource has been selected
that represented the BOINC project attached to Everest using the described implementation. The agent was
configured to submit workunits using the frolvlad/alpine-python2 Docker image, a compact (∼50MB) image with
Alpine Linux and Python 2.7. The project had four active desktop hosts during the testing.
  1 https://gitlab.com/everest/agent/tree/boinc


                                                                             105
   The application was successfully executed via Everest by dispatching all application tasks to BOINC and
collecting the task results. This demonstrated that the described implementation enables using BOINC-based
desktop grids as computing resources for running existing Everest applications.

3.3     Alternative Approaches
In addition to the presented implementation, there are other possible approaches for integration of Everest and
BOINC that are briefly discussed below.

3.3.1     Remote Job Submission
This approach relies on the remote job submission mechanism introduced recently in BOINC server [Boic].
Instead of running the Everest agent on the server, the platform could use this mechanism to directly submit
jobs to BOINC, query their status and download results.
   In comparison to the current implementation, this approach avoids running of additional software (agent) on
the BOINC server and relies only on passing to Everest the authenticator token of submission account. However,
this approach requires a more complex implementation of the integration logic on the server side of Everest.
Besides that, the remote job submission mechanism is relatively new and not stable that further complicates the
implementation of this approach.

3.3.2     Dynamic Agent Deployment
This approach leverages existing Everest agents by dynamically deploying them to hosts attached to a BOINC
project. To do this, Everest submits to the BOINC server generic workunits containing the agent. The number
of submitted agents can depend on the number of unprocessed tasks. When started on a host the agent connects
to Everest and begins to execute tasks assigned by the platform on the host. The input and output files are
transferred directly between the agent and the platform. A similar approach has been used previously for
integration of Everest with European Grid Infrastructure [SSV16].
   In this approach, there is no need to convert tasks of Everest applications to BOINC workunits. Also, the
interaction between the agent and Everest does not differ from the case of a standalone resources. However, the
agent is not deployed by a user and a single Everest resource representing the desktop grid can be backed by a
dynamic pool of agents. Furthermore, when running as a BOINC workunit, the agent has a limited lifetime and
can be suspended. The agent should also terminate its execution in the absence of new tasks from Everest.
   Since this approach bypasses BOINC server for task scheduling and data transfer, its’ potential advantages
are related to more flexible scheduling by means of Everest, such as supporting small tasks or implementing
dynamic load balancing. However, it is not clear how to implement validation of results or grant credits under
this approach. Also, having thousands of agents connected to Everest could hit the current scalability limits of
the platform.

4     Conclusion and Future Work
This paper discussed the possible approaches for integration of Everest and BOINC-based desktop grids and
presented the current prototype implementation. The proposed integration enables Everest users to seamlessly
access computing resources of desktop grids and build generic or domain-specific web services for submission
and automation of computations in desktop grids. Future work will focus on improving the described imple-
mentation (e.g., adding support for integration with arbitrary BOINC applications) and conducting a large-scale
experimental evaluation.

Acknowledgments
This work is supported by the Russian Science Foundation (project No. 16-11-10352).

References
[ACA06]     David P Anderson, Carl Christensen, and Bruce Allen. Designing a runtime system for volunteer
            computing. In SC 2006 Conference, Proceedings of the ACM/IEEE, pages 33–33. IEEE, 2006.
[And04]     David P Anderson. Boinc: A system for public-resource computing and storage. In Grid Computing,
            2004. Proceedings. Fifth IEEE/ACM International Workshop on, pages 4–10. IEEE, 2004.


                                                  106
[Boia]     BOINC and Docker. [online]. https://boinc.berkeley.edu/trac/wiki/BoincDocker.
[boib]     boinc2docker. [online]. https://github.com/marius311/boinc2docker.

[Boic]     RPCs for remote job submission. [online]. https://boinc.berkeley.edu/trac/wiki/RemoteJobs.
[Boid]     Running apps in VirtualBox virtual machines. [online]. http://boinc.berkeley.edu/trac/wiki/
           VboxApps.
[Eve]      Everest. [online]. http://everest.distcomp.org/.

[Fed12]    Gilles Fedak. Desktop grid computing. Chapman and Hall/CRC, 2012.
[Iva14]    Evgeny Evgen’evich Ivashko. Enterprise desktop grids. Programmnye Sistemy: Teoriya i Prilozheniya
           [Program Systems: Theory and Applications], (1):19, 2014.
[KTB+ 04] Derrick Kondo, Michela Taufer, Charles L Brooks, Henri Casanova, and Andrew A Chien. Char-
          acterizing and evaluating desktop grids: An empirical study. In Parallel and Distributed Processing
          Symposium, 2004. Proceedings. 18th International, page 26. IEEE, 2004.
[SSV16]    Sergey Smirnov, Oleg Sukhoroslov, and Sergey Volkov. Integration and combined use of distributed
           computing resources with everest. Procedia Computer Science, 101:359–368, 2016.
[SVA15]    O. Sukhoroslov, S. Volkov, and A. Afanasiev. A web-based platform for publication and distributed
           execution of computing applications. In Parallel and Distributed Computing (ISPDC), 2015 14th
           International Symposium on, pages 175–184, June 2015.
[TTL05]    Douglas Thain, Todd Tannenbaum, and Miron Livny. Distributed computing in practice: the condor
           experience. Concurrency and computation: practice and experience, 17(2-4):323–356, 2005.

[VS15]     Sergey Volkov and Oleg Sukhoroslov. A generic web service for running parameter sweep experiments
           in distributed computing environment. Procedia Computer Science, 66:477–486, 2015.


                                                107