=Paper=
{{Paper
|id=Vol-1800/paper2
|storemode=property
|title=SLA-aware Interactive Workflow Assistant for HPC Parameter Sweeping Experiments
|pdfUrl=https://ceur-ws.org/Vol-1800/paper2.pdf
|volume=Vol-1800
|authors=Bruno Silva,Marco Netto,Renato Cunha
|dblpUrl=https://dblp.org/rec/conf/sc/SilvaNC16
}}
==SLA-aware Interactive Workflow Assistant for HPC Parameter Sweeping Experiments==
SLA-aware Interactive Workflow Assistant for HPC Parameter Sweeping Experiments

Bruno Silva, Marco A. S. Netto, Renato L. F. Cunha (IBM Research)

ABSTRACT

A common workflow in science and engineering is to (i) set up and deploy large experiments with tasks comprising an application and multiple parameter values; (ii) generate intermediate results; (iii) analyze them; and (iv) reprioritize the tasks. These steps are repeated until the desired goal is achieved, which can be the evaluation/simulation of complex systems or model calibration. Due to time and cost constraints, sweeping all possible parameter values of the user application is not always feasible. Experimental Design techniques can help users reorganize submission-execution-analysis workflows to reach a solution in a more timely manner. This paper introduces a novel tool that leverages users' feedback from analyzing intermediate results of parameter sweeping experiments to advise them about their strategies on parameter selection tied to their SLA constraints. We evaluated our tool with three applications of distinct domains and search space shapes. Our main finding is that users following submission-execution-analysis workflows can benefit from interacting with intermediate results and can adapt their strategies according to their domain expertise and SLA constraints.

Keywords

interactive optimization workflow; parametric sweeping application; design of experiments; high performance computing; user expertise; SLAs

1. INTRODUCTION

Evaluation of test scenarios and model calibration are common in several industries including finance, aerospace, health, and energy. Normally users run applications that contain a set of parameters, each able to assume a broad range of values. These applications are known as parameter sweeping applications or parametric applications. In this context, it is necessary to run several application instances varying their parameter values to find a combination that meets a given specification or optimization criterion.

A popular practice in this context is to employ Design of Experiments (DOE) techniques [21] to select the parameter values to be evaluated. These methods help users understand the impact of each parameter value on the generated output. As users can check the results produced by each execution, they can utilize the intermediate results to select the next parameter values to be evaluated. The knowledge of the impact of parameter values helps users explore the search space more effectively. Nevertheless, to the best of our knowledge, existing work does not exploit human knowledge/feedback and DOE techniques to assist users with parameter selection decisions under these submission-execution-analysis workflows with Service Level Agreement (SLA) constraints.

Over the years, optimization methods have been created to solve complex problems in industry and academia. However, fully automatic solutions face obstacles to obtaining satisfactory results for the following reasons [20]:

• It is hard to obtain/create an optimization model that reflects all aspects of a real-world problem, especially when multi-criteria objectives are involved;

• Even when the optimization model is adequate, the time/cost to find an optimal solution may violate the user's time or cost restrictions;

• The analyst's expertise and creativity are hard to codify.

Thus, for problems that involve complex and important decisions (e.g., financial trading decisions), even fully automatic solutions must, in the end, be validated by humans to be adopted.

Generally, it is prohibitive to run application instances with all possible parameter values due to time and cost constraints—cost, in particular, becomes a key factor considering execution of these applications in outsourced environments (e.g., public clouds). Additionally, optimization techniques can be adopted to suggest parameter values in order to reduce the number of evaluated scenarios and consequently reduce the experiment costs. Therefore, SLA-aware prediction methods and mechanisms to suggest parameter values would help engineers and scientists to evaluate parametric applications.
In this work, interactive optimization methods are proposed to take advantage of human expertise/creativity and processing power to solve optimization problems. We study how the decision maker's experience can be leveraged to find solutions for parameter sweeping applications. The analyst's decisions can be informed by the DOE analysis, which shows which parameters have more impact on the output. For instance, an experiment explorer tool can show which parameter values cause more impact on experiment results, and the user can select suitable parameter value combinations based on this information.

We introduce a tool, called Copper (Cognitive Optimizer and Parameter Explorer), to help users in the evaluation of search spaces considering human feedback and domain expertise. As new samples are evaluated, Copper updates the impact of each parameter value on the results and generates new samples for evaluation by using derivative-free optimization (DFO) methods [4]. Moreover, clustering techniques are employed to evaluate the regions of the search space in which the user is interested and to propose that the user change her approach when the SLA is predicted to be violated. The main contributions of this paper are therefore:

• A tool to assist users in the evaluation of scenarios with parameter sweep applications using human feedback and domain expertise (§ 3);

• Clustering methods to evaluate user behavior and propose strategy changes in case of imminent SLA violations (§ 3.2);

• Evaluation of the tool using three applications from different domains and search space shapes (§ 4).

Figure 1: User workflow with job submission, job execution, and intermediate result analysis steps.

2. BACKGROUND

This section presents research efforts related to our work and an overview of the problem investigated in this paper.

2.1 Related Work

Assisting users in executing applications has been a research goal of several groups. Related to our work, there are efforts from multiple areas: computational steering, human-guided search, workflow management, design of experiments, and interactive optimization, among others.

Computational steering [5, 11–13, 22, 32] aims at providing users with tools that enable parameter reconfiguration while experiments are in progress. Parker and Johnson [26] introduced a system called SCIRun that uses a dataflow programming model and visual programming to simplify the tasks of creating, debugging, optimizing, and controlling complex scientific simulations. Van Wijk et al. [30] highlighted that the implementation of computational steering in practice is hard. To overcome this problem, they implemented an environment in which a data manager entity facilitates the interaction between the application and the steering components. Chin et al. [2] incorporated computational steering in mesoscale lattice Boltzmann simulations and showed the benefits of their work. They discussed that large scale simulations require not only computational resources but also tools to manage these simulations and their produced results, in what they called a simulation-analysis loop. Netto et al. [23] introduced a scheduler system able to automatically offer more resources to parametric application jobs based on the quality of their intermediate generated results, so users could reach their desired goal faster. More recently, Mattoso et al. [18] surveyed the use of steering in the context of High Performance Computing (HPC) scientific workflows, highlighting a tighter integration between the user and the underlying workflow execution system. Another area that helps users is the development of workflow management systems [6, 17].

An important research topic related to our work is optimization assisted by humans [19, 20]. Meignan et al. [20] provided a detailed survey and taxonomy of efforts in interactive optimization applied to operations research. They explored the different roles a user can have in an optimization process, such as adjusting or adding new constraints and objectives, helping in the optimization process itself, and guiding the optimization process by providing information related to decision variables. Meignan and Knust [19] proposed a system that employs analyst feedback on the optimization as long-term preferences for future runs. Nascimento and Eades [7] proposed a framework for humans to assist the optimization process by inserting domain knowledge, escaping from local minima, reducing the search space to be explored, and avoiding ambiguity among multiple optimal solutions. Researchers have also explored visualization techniques to bring humans into the optimization process [3, 9, 15, 29]. WorkWays [24, 25] is a science gateway with human-in-the-loop support for running and managing scientific workflows.

Several efforts in the design of experiments have happened over the last years. Kleijnen et al. [16] developed a survey and a user guide on advances in this area until 2005. Using the fractional factorial design technique, Abramson et al. [1] developed a system to facilitate parameter exploration and support for abstracting the underlying computing platform.

Our research builds on these aforementioned efforts to provide feedback to the user on how her interaction with the parameter sweeping experiments impacts the defined time (and cost) constraints.
Figure 2: Overview of Copper and application adapter.

2.2 Problem Description

Problems from several industries are tackled by users, also named analysts, running complex computer simulations and optimizers, which constitute a software system with a set of parameters, where each parameter can receive different values. We consider applications with n parameters, where each parameter has a finite discrete domain Di, i ∈ {1, · · · , n}. A function f : D1 × · · · × Dn → R is adopted to evaluate the quality of parameter values and is specific to each parametric application. The considered optimization problem P is described as follows:

    P :  max f(x),  x ∈ D1 × · · · × Dn

We assume the user is able to generate f(x) from the simulation output. For instance, if the output g(x) of a simulation process is a real value that should be minimized, then f(x) can be assigned as f(x) = −g(x) and the problem definition remains the same. If the parametric application a generates multiple outputs (e.g., a : D1 × · · · × Dn → O1 × · · · × Om), a wrapper function h : O1 × · · · × Om → R should be employed to represent the behavior of f (i.e., f(x) = h(a(x))).

For instance, suppose the analyst wants to calibrate a simulation model that generates a list of predicted values over a given period. In this case, the output of the parametric application corresponds to a list of values, one for each time instant. A wrapper function (e.g., Root Mean Squared Error) can be employed to generate a single quality value that measures the difference between the generated values and a reference series.
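As a concrete illustration of the wrapper h, the minimal Python sketch below computes the Root Mean Squared Error between a simulated series and a reference series and negates it, so that maximizing f(x) corresponds to minimizing the calibration error. The `simulate` callable stands in for the parametric application a(x) and is hypothetical.

    import math

    def rmse(predicted, reference):
        # Root Mean Squared Error between the simulated and the reference series
        assert len(predicted) == len(reference)
        return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference))
                         / len(predicted))

    def f(x, simulate, reference):
        # Wrapper h applied to the multi-output application a(x): one predicted
        # value per time instant, reduced to a single quality value.
        predicted = simulate(x)             # stands in for a(x); hypothetical
        return -rmse(predicted, reference)  # minimizing RMSE == maximizing f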
Running all possible values for each parameter can be unfeasible—even with large high performance computing machines. Therefore, users need strategies to select a subset of values for each parameter that covers the parts of the exploration space that will give them enough information to answer their questions. Such questions can be: what is the best set of parameter values that provides a (close-to-)optimal solution, or which parameter relationships have more influence on the phenomenon being analyzed. The strategies for parameter-value selection can be defined by Design of Experiments (DOE), which corresponds to a systematic method to determine the relationship between the parameters (also known as features) affecting a process and the output of that process.

Analysts may have initial insights about the regions of the search space that lead to good results. However, these insights may be wrong, and can change as intermediate results are evaluated. For several problems, the codification of human insights is not feasible for the following reasons. These insights come from observations of historical data, which may not be available. Even when this data is available, the time to analyze and codify the human insight may not be affordable. Human insights can be based on domain expertise, and the analyst may have no experience or time to explain or codify this knowledge. Therefore, a mixed strategy that combines human expertise and optimization approaches may be useful to help analysts in the evaluation of complex optimization problems.

In this paper, we introduce a tool and a set of techniques to help scientists and engineers properly prioritize their experiments, which can comprise traditional HPC jobs or jobs following high throughput computing [27]. By doing so, they can find the expected solution in a more cost-effective way, and perhaps discard unnecessary work.

Figure 1 presents the submission-execution-analysis workflow investigated in this scenario. In Step 1, the user submits to Copper a request to run an application with her SLA constraints, which can contain deadline, cost, or result quality restrictions. In this work we focus on deadline constraints. Other restrictions (e.g., cost or energy) can be mapped into the time it takes to execute jobs on the computing infrastructure. In a scenario where Copper is executed in the cloud, cost may become an important factor to be considered: users may decide to change their execution strategies so as to meet cost constraints based on the cloud provider instance prices and charging model, with job execution time and resource requirements as input. As another example, an energy utilization rate can be employed to define the maximum amount of energy to run the experiments depending on the execution time. The user also describes in the request the parameters and possible values to run her application, the job submission strategy (e.g., random search or ranges of parameter values), and the specification of the computing environment (e.g., number of processors and their configuration). In Step 2, Copper requests the resource management system to configure the environment and creates compute jobs corresponding to parameter-value pairs.

Once jobs are executing, Copper monitors the state of the running/completed/pending jobs to predict SLA violations (Step 3). If the SLA is about to be violated, a new strategy is defined to run the pending jobs. In that case, a notification of the SLA violation and a new strategy are sent to the user (Step 4) who, based on this information, can act on the strategy to execute her jobs (Step 5). Meanwhile, the analyst can also manually change job priorities based on her domain expertise and on her trust in the ongoing strategy (Step 6). As Copper exploits the analyst's expertise and her feedback is not instantaneous, job priorities are changed asynchronously to avoid unnecessary delays in triggering new jobs. This is particularly important as Copper is designed to manage jobs with long execution times while users follow their submission-execution-analysis workflows.
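Referring back to Step 1, the request the user submits to Copper bundles the SLA constraints, the parameters with their candidate values, the submission strategy, and the environment specification. The sketch below shows what such a request might look like; the paper does not define Copper's request schema, so every field name here is hypothetical.

    # Hypothetical Step 1 request; all field names are illustrative, since the
    # paper does not specify Copper's actual request format.
    request = {
        "application": "ifm",
        "sla": {"deadline_hours": 48},       # this work focuses on deadlines
        "parameters": {                      # candidate values per parameter
            "P1": ["V1_1", "V1_2", "V1_3"],
            "P2": ["V2_1", "V2_2", "V2_3"],
        },
        "strategy": "random_search",         # job submission strategy
        "environment": {"processors": 64},   # computing environment spec
    }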
The specific problems tackled in this paper are therefore:

• How to suggest jobs for execution to the analyst and exploit her expertise to reach a solution faster?

• How to predict SLA violations based on clustering strategies and pending/running/completed jobs?

• How to suggest that the analyst change her strategy to avoid SLA violations, based on intermediate results and her expertise?

3. ARCHITECTURE AND METHODS

The proposed solution comprises the Copper tool, a computing infrastructure, and a web client. Copper is a service, which can be executed in the cloud or on-premise, responsible for learning the user strategy, suggesting strategy changes in case of probable SLA violations, and proposing job executions to the analyst. A simple adapter should be created on the computational infrastructure to connect existing user applications and Copper. This component is specific to each application and allows Copper to trigger job execution by using REST commands. It is also responsible for sending intermediate results and the list of parameters with their possible values to Copper.

Figure 2 presents an overview of the Copper architecture. The manager orchestrates the Copper operation, reads the results, and stores them in a database as soon as jobs are executed. The SLA evaluator and sampler receive those intermediate results to update the selected DFO method state and the cluster models. In order to produce new job candidates (§ 3.1), the sampler utilizes DFO strategies such as greedy randomized adaptive search (GRASP), general pattern search (GPS), or particle swarm (PS) optimization [4, 10]. The analyst selects the optimization strategy according to her preference. In this paper, we implemented a GRASP variation that employs DOE to estimate the quality of a given sample. As the focus of this paper is the interaction between the analyst and Copper, the comparison between the developed GRASP version and other optimization approaches is beyond its scope.

For SLA violation prediction (§ 3.2), the SLA evaluator employs clustering methods to estimate the number of pending jobs according to the user strategy. This number is compared to the number of jobs that can be executed based on the remaining time/budget defined in the SLA contract. If the user does not have resources to evaluate all pending jobs according to a given strategy, Copper sends a warning and suggests that the user change her approach.

Figure 3 shows Copper in the submission-execution-analysis workflow. Initially, the analyst selects the first jobs to be executed (initialEvaluation()). Those jobs can be selected at random by Copper or strategically selected by the analyst. Next, Copper submits the jobs to the computational infrastructure (submitJobs()). Once the first jobs are finished, Copper updates the internal DOE model (updateModel()) and presents the results to the analyst.

After the initial phase, in the Evaluation Loop the analyst submits jobs and evaluates the results. The analyst initially receives a list of job suggestions (jobSuggestion()) from Copper and adapts the list by including or removing jobs according to her expertise (includeUserJobs()). Then, the jobs are submitted, the Copper model is updated, and the results are presented to the analyst (similar to the initial phase). Whenever the analyst considers that the results have reached the desired goal, the loop stops. If the ongoing strategy is predicted to lead to an SLA violation, the jobSuggestion() method returns the violation warning and the new samples to be evaluated according to the analyst's predefined strategy (§ 3.2).

Figure 3: Copper in the submission-execution-analysis workflow.
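The interaction in Figure 3 can be read as a simple client loop. The sketch below restates it in Python, assuming hypothetical `copper` and `analyst` objects that expose the operations named in the figure; the control flow is illustrative, not Copper's actual API.

    # Minimal sketch of the Figure 3 interaction; `copper` and `analyst` are
    # hypothetical objects exposing the operations named in the figure.
    def run_workflow(copper, analyst):
        jobs = copper.initial_evaluation()         # initialEvaluation()
        results = copper.submit_jobs(jobs)         # submitJobs()
        copper.update_model(results)               # updateModel(): refresh DOE model

        while not analyst.is_finished(results):    # isFinished(): analyst's judgment
            suggestions = copper.job_suggestion()  # may carry an SLA violation warning
            jobs = analyst.include_user_jobs(suggestions)  # analyst edits the list
            results = copper.submit_jobs(jobs)     # evaluateJobs()/submitJobs()
            copper.update_model(results)
        return results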
3.1 Default Sampling Method - GRASP

Combinatorial optimization corresponds to a class of problems that consist of finding an optimal solution from a finite set of objects [31]. Generally, brute-force approaches cannot be applied to solve those problems due to time or cost constraints. Therefore, analysts or computer programs should employ heuristics to select candidate solutions that will probably represent good (or optimal) solutions. This section presents how Copper generates evaluation samples for the analyst using the GRASP algorithm. (Samples and jobs both represent an instance of the application with a set of parameter values; these terms are used interchangeably in this paper.) This method is based on the original GRASP algorithm [10], but instead of automatically evaluating the produced samples, the analyst decides whether the samples will be assessed or not. This sampling approach is the default method to suggest evaluations of search space samples to the user. Our approach is not strictly tied to this optimization method, and other techniques can be adopted to generate user samples (e.g., GPS or PS).

The GRASP sampling method is presented in Algorithm 1, which comprises two phases: construction and local search. The construction phase builds a feasible solution for each algorithm execution (Line 1). Next, the local search generates the neighbors of the previous solution (Line 2). Finally, the set of results is returned (including the initial solution). The algorithm inputs are the following: the DOE model (M), the list of parameters (P), the greediness level (β), and the number of generated samples (n). The greediness level (β) is a real value in [0, 1] and controls how random the selection of parameter values is with respect to their estimated quality. If the greediness level is close to one, only high-quality parameter values can be selected. On the other hand, β values close to zero allow the selection of a broad range of values, including low-quality ones.

Algorithm 1: sampling - proposed sampling method.
    Input : M, P, β, n
    Output: samples
    1  solution ← construction(M, β, P);
    2  samples ← localSearch(M, P, n, solution);
    3  return samples;

The construction phase is presented in Algorithm 2 and finds a candidate based on the greedy heuristic. Initially, an empty solution is created (Line 1); it will hold the list of values, one for each parameter, of the resulting solution. For each parameter p (Line 2), a reduced candidate list (RCL) is created, and this list is randomly sampled to get a value that composes the final solution.

Algorithm 2: construction - algorithm for creating a solution candidate.
    Input : M, β, P
    Output: solution
    1  solution ← ∅;
    2  foreach p of P do
    3      RCL ← {e ∈ listValues(p, M) | q(e) ≥ q_min + β(q_max − q_min)};
    4      solution[indexOf(p)] ← getRandomElement(RCL);
    5  end
    6  return solution;

We use the full factorial DOE method [21] to evaluate the quality of each parameter value. q_min and q_max are the minimum and maximum quality values of a given parameter estimated by the DOE method. The RCL is formed by all feasible parameter values that can be used to construct the partial solution. In order for a parameter value to be inserted in the RCL (Line 3), its quality level q should be greater than or equal to the threshold q_min + β(q_max − q_min). Observe that if β ∼ 1, the quality threshold approximates q_max and only the best quality values will be selected (pure greedy approach). On the other hand, if β ∼ 0, the quality threshold is close to q_min and most of the parameter values can be used to compose the final solution (i.e., random approach). Whenever part of the search space has enough evaluated samples to apply the full factorial method, the DOE model is updated. Next, a component of the RCL is randomly sampled to compose the final solution (Line 4), and finally the solution is returned (Line 6).

The last phase of the sampling algorithm corresponds to the local search (Algorithm 3). In this phase, n samples close to the candidate solution found in the previous phase are selected. The getNeighbor() function is employed to get a neighbor of the solution candidate, using the Euclidean distance between samples.

Algorithm 3: localSearch - find neighbors of a given solution.
    Input : M, P, n, solution
    Output: solutions
    1  solutions ← ∅;
    2  for i ← 1 to n do
    3      solutions[i] ← getNeighbor(M, P, solution);
    4  end
    5  return solutions;
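Algorithms 1–3 translate almost directly into code. The following Python sketch mirrors them under the assumption that the DOE model is a mapping from each parameter to the estimated quality of each of its values; the neighbor move is a hypothetical single-parameter perturbation, whereas the paper selects close neighbors by Euclidean distance.

    import random

    def construction(model, parameters, beta):
        """Greedy randomized construction (Algorithm 2).
        `model[p]` is assumed to map each value of parameter p to its
        DOE-estimated quality."""
        solution = {}
        for p in parameters:
            qualities = model[p]
            q_min, q_max = min(qualities.values()), max(qualities.values())
            threshold = q_min + beta * (q_max - q_min)
            rcl = [v for v, q in qualities.items() if q >= threshold]
            solution[p] = random.choice(rcl)   # sample the reduced candidate list
        return solution

    def local_search(model, parameters, n, solution):
        """Collect n neighbors of the constructed solution (Algorithm 3)."""
        return [get_neighbor(model, parameters, solution) for _ in range(n)]

    def sampling(model, parameters, beta, n):
        """GRASP sampling (Algorithm 1): construction, then local search;
        the returned set includes the initial solution."""
        solution = construction(model, parameters, beta)
        return [solution] + local_search(model, parameters, n, solution)

    def get_neighbor(model, parameters, solution):
        # Hypothetical neighbor move: perturb one randomly chosen parameter.
        p = random.choice(list(parameters))
        neighbor = dict(solution)
        neighbor[p] = random.choice(list(model[p]))
        return neighbor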
3.2 SLA violation prediction method

The method to predict SLA violations is based on clustering techniques. We identify the clusters of the search space that the analyst is interested in and compare the number of pending jobs in these clusters to the number of jobs that can be executed according to the SLA contract. In this work, we use the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [14] method to identify the analyst strategy. This clustering method is attractive because the analyst does not need to inform the number of clusters before the evaluation [31]. The DBSCAN algorithm groups samples with many neighbors, determining high-density clusters. Samples that lie in low-density regions are defined as outliers and may not be classified in any cluster. Here, this algorithm is adopted to determine previously evaluated clusters of the search space with good solutions, and Copper suggests these clusters in case of possible SLA violations. The user defines how the pending jobs will be distributed over the identified clusters.

Figure 4 presents the basic idea of the proposed solution. Suppose the search space is composed of two parameters and each particular combination (v1, v2), where v1 and v2 are possible values for parameters 1 and 2 respectively, represents a job that will be submitted to the computational infrastructure. As the analyst submits new jobs, the clustering approach identifies the clusters of the search space that provide good results. The quality of job results can be previously defined by using a threshold value, or the analyst can qualify each job result (e.g., good or bad) whenever it becomes available.

Figure 4: Example of search space with a set of interest clusters.

In this example, the analyst is searching for the best result in three clusters of the search space. The clusters are identified by the clustering algorithm and the number of pending jobs is counted. If this number is higher than the number of remaining jobs allowed by the SLA, the user should change her strategy. In this case, the number of pending jobs is 15 (including the best result); if, due to SLA restrictions, the analyst can run fewer jobs than that, Copper advises the analyst to change her strategy.

The strategy can be defined considering exploration and exploitation aspects. We define the exploitation level (ex) as the proportion of pending jobs that will be executed considering samples inside the detected clusters. For instance, if the exploitation level is 1.0, all the pending jobs will be executed taking the parameters contained in the clusters of interest. If the exploitation level is 0, pending jobs in unexplored regions (outside the clusters) will be given higher priority. Considering all pending jobs in all clusters (i.e., ex × #pending jobs), the user defines how many of these jobs will execute in each cluster. The default approach distributes them using a weighted average according to the mean quality of the evaluated jobs inside each cluster. For jobs outside the clusters (i.e., (1 − ex) × #pending jobs), the default GRASP strategy remains the same (§ 3.1).
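A minimal sketch of this check using scikit-learn's DBSCAN: jobs are treated as points in parameter-value space, interest clusters are taken from the evaluated samples, and pending jobs inside those clusters are counted against the job budget left by the SLA. The eps/min_samples settings and the nearest-point membership test are assumptions; the paper does not report them.

    import numpy as np
    from sklearn.cluster import DBSCAN

    def predict_sla_violation(evaluated, pending, remaining_budget, eps=1.5):
        """Warn when pending work in the interest clusters exceeds the SLA budget."""
        evaluated = np.asarray(evaluated, dtype=float)
        labels = DBSCAN(eps=eps, min_samples=3).fit_predict(evaluated)
        # Interest clusters: DBSCAN clusters among evaluated samples (-1 is noise)
        clusters = [evaluated[labels == c] for c in set(labels) if c != -1]
        # Assumption: a pending job belongs to a cluster when it lies within
        # eps of any of that cluster's members.
        in_clusters = sum(
            1 for p in pending
            if any(np.linalg.norm(cl - np.asarray(p, dtype=float), axis=1).min() <= eps
                   for cl in clusters)
        )
        return in_clusters > remaining_budget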
4. EVALUATION

In this section, we evaluate Copper's effectiveness in helping users run optimization and computer simulations under SLA constraints. We first describe the three applications that served as use cases and the metrics analyzed to understand the benefits of the assistant. In particular, we are interested in analyzing three major aspects of submission-execution-analysis workflows: (i) the interaction of Copper with analysts of different expertise levels; (ii) how the analyst's confidence in Copper affects the evaluation of user applications; (iii) the impact of Copper's SLA violation predictions on user experiments.

4.1 Experiment Setup

This section presents an overview of the setup used to evaluate Copper.

4.1.1 Applications

Here is a short description of the applications we used in the evaluation, with an overview of their input parameters and the outputs users are interested in.

1. ifm: The Integrated Flood Model (IFM) [28] is a hydrological model aimed at providing high resolution flood forecasts. IFM contains a soil and an overland routing model, where the soil model estimates the surface runoff based on incoming precipitation, soil, and land use properties. When IFM is deployed to calculate flooding predictions, it requires a calibration that consists of running several simulations to match data collected by real sensors. To evaluate IFM, we varied the parameters related to incoming precipitation and soil properties with the goal of minimizing the difference between simulated and real data captured by sensors.

2. schedsim: Scheduler Simulator (SchedSim) is a simulator to assist in the creation of policies to manage High Performance Computing clusters. It accepts a variety of parameters, including the number of processors in the cluster, partitions, and scheduling algorithms. Tuning the scheduler and cluster properties to meet client business goals is not a trivial task, and several scenarios must be executed to achieve that. For the evaluation, we had the following input: (i) a workload containing historical data of jobs submitted to a cluster; (ii) a fixed number of cluster processors; and (iii) two variable partitions. We wanted to know what the partition sizes should be and which jobs (by requested time and allocated processors) should go to each partition, with the following goals: (i) minimize the overall job response time and (ii) ensure small jobs (under 1h requested time) wait no more than 3 hours in the cluster queue.

3. mazerunner: Maze Runner is an in-house created game in which a simulated runner has to find the way out of a maze. The objective of the game is to find a runner configuration (e.g., velocity, movement strategy) that leads to a shorter time to find the maze exit. There are enemies in the maze, and the runner needs to escape from them. Additionally, the runner can slip on obstacles and stay vulnerable for some time depending on her velocity. This game is simple and intuitive and was created to demonstrate concepts of the proposed assistant to a broader audience.

The search spaces of the applications used in this work are illustrated in Figure 5. Multiple parameter combinations are represented on a single axis, and each square corresponds to a parameter combination. The quality of a parameter configuration is represented as a given color (dark colors represent better results). For instance, the top-right square of Figure 5c presents the (V1_1, V2_1, V3_2, V4_3) parameter configuration of Maze Runner. In this work, we treat the output of the parametric applications as black-box functions; therefore, we do not go into details about the meaning of the parameters and their values.

By observing the search spaces of the three applications, SchedSim presents a well-behaved variation of the output when the parameter values change. A similar behavior happens in the IFM search space; however, for higher values of P1 the output presents no variation. This 'flat' search space region reduces Copper's capacity to predict good parameter values. Regarding the Maze Runner search space, configurations with parameter P1 = V1_3 provide better results, but the output variation does not have the same gradual characteristics as in the other applications.

Figure 5: Search spaces for the IFM (a), SchedSim (b), and Maze Runner (c) applications. Each application has n parameters (P1 · · · Pn) and each parameter Pj can assume m values (Vj_1, · · · , Vj_m). The search space sizes for IFM, SchedSim, and Maze Runner are 6 × 6 × 6 × 6 = 1296, 22 × 20 × 20 = 8800, and 3 × 3 × 2 × 3 = 54, respectively.
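For spaces of these sizes, the full set of combinations can still be enumerated (even if not all executed), which makes the counts in Figure 5 easy to verify. A small sketch, using integer indices in place of the symbolic values Vj_i:

    from itertools import product

    # Enumerate each discrete search space; sizes match Figure 5.
    spaces = {
        "ifm":        [6, 6, 6, 6],   # 6*6*6*6 = 1296 combinations
        "schedsim":   [22, 20, 20],   # 22*20*20 = 8800 combinations
        "mazerunner": [3, 3, 2, 3],   # 3*3*2*3 = 54 combinations
    }
    for name, dims in spaces.items():
        combos = list(product(*(range(d) for d in dims)))
        print(name, len(combos))      # 1296, 8800, 54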
4.1.2 Experiments Description

We use software agents to simulate human behavior [19] and divide the experiments into two main groups. First, the agents evaluate the tool with no SLA requirements (§ 4.2). In this case, the assessment aims at finding how many jobs are necessary to find an optimal solution. The software agent only interacts with Copper to obtain suggestions of jobs to be executed, and no SLA-aware assessment is performed. The second evaluation aims at finding the best result considering a limited number of jobs (§ 4.3). There, agents interact with Copper to obtain strategy suggestions when the ongoing experiments tend to exceed the number of jobs established in the SLA.

4.2 Finding an Optimal Solution

In this evaluation, we assess the number of jobs that must be executed to find an optimal solution. The agents have insights about which parameter values may result in optimal solutions, and we divided them into three categories based on the quality of those insights. The first group represents users that have a previous intuition about good parameter values, but this intuition leads to bad solutions (bad insights). The second group contains agents that have previous ideas about the parameter values and these values result in proper solutions (good insights). The last group of agents represents users with no insight about the clusters of the search space that may contain the optimal solution (no insights).

For each agent insight level, we vary the user confidence level in the job suggestions that come from Copper and the capacity of an agent to learn from intermediate results. The confidence level determines the percentage of jobs that are submitted taking into account Copper's suggestions. For instance, if the user submits 100 jobs and the confidence level is 40%, then 40 jobs are submitted based on Copper's recommendation and the rest are triggered based on the agent's intuition. This intuition corresponds to a subset of the search space in which the analyst believes the optimal result is contained.
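A sketch of this mixing rule, with hypothetical helpers: at confidence level c, an agent takes a fraction c of its batch from Copper's suggestions and draws the remainder from the region its intuition points to.

    import random

    def next_batch(copper_suggestions, intuition_region, batch_size, confidence):
        """Mix Copper suggestions with agent intuition at a given confidence level.
        `intuition_region` (the subspace the agent believes holds the optimum)
        is a hypothetical input; e.g., 100 jobs at 40% confidence -> 40 from Copper."""
        n_copper = round(confidence * batch_size)
        jobs = list(copper_suggestions[:n_copper])
        while len(jobs) < batch_size:            # remainder from agent intuition
            jobs.append(random.choice(intuition_region))
        return jobs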
We evaluate agents that do or do not learn from intermediate results. When an agent learns from intermediate results, it increases the probability of selecting a given parameter value if that value leads to good intermediate results, and decreases the likelihood of choosing it if it leads to bad results. As the focus of this experiment is to present how the framework benefits from user recommendations, we do not explore in detail here how the agents learn from intermediate results.

Figure 6 shows the number of jobs that must be submitted to find an optimal solution as a function of the percentage of jobs submitted according to Copper's recommendations (confidence level), for all applications. As the proposed sampling algorithm is stochastic, multiple experiments were performed to find a 95% confidence interval for the mean number of jobs needed to find an optimal result. The bootstrap method [8] was adopted to find the confidence interval, and the sample size for each result is 500—this number is high and reduces the size of the confidence interval. The bar graphs show the mean value and the confidence interval for each experiment.

Figure 6: Number of jobs to find an optimal solution as a function of the agent confidence level for all applications. The 100% confidence level experiments correspond to the proposed GRASP solution without analyst intervention.

The first, second, and third rows of Figure 6 correspond respectively to the IFM, SchedSim, and Maze Runner results. The three columns show the experiments for agents with bad, good, and no previous insights, respectively. For agents with bad insights, as the confidence level in Copper's recommendations increases, the number of jobs to find an optimal result decreases; hence, users with bad insights may benefit from using Copper. For agents that do not learn from intermediate results, the SchedSim results present the most significant difference between relying on Copper to provide the job parameters or not. Increasing the confidence in Copper's suggestions also reduces the number of jobs to find the optimal configuration in IFM and Maze Runner, although this difference is less significant. Because the SchedSim objective function varies smoothly with the input parameters, the proposed sampling algorithm takes fewer steps to find the optimal parameter configuration.

For bad insight agents, there is a significant difference between learning from intermediate results and not learning. For example, bad insight agents for SchedSim with a 10% confidence level present a big difference between learning from intermediate results or not. This behavior can be explained by the characteristics of the SchedSim search space: Figure 5 shows that low values of the SchedSim parameters P1, P2, and P3 lead to good results. The agents learn that these values lead to good results and increase the probability of choosing combinations of these values.
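The bootstrap interval used in Figure 6 can be reproduced with a few lines; the resample count below is illustrative, while the per-configuration sample size of 500 comes from the experiments.

    import numpy as np

    def bootstrap_ci(samples, n_resamples=10_000, alpha=0.05):
        """Bootstrap (1 - alpha) confidence interval for the mean [8]."""
        samples = np.asarray(samples, dtype=float)
        means = [np.random.choice(samples, size=len(samples), replace=True).mean()
                 for _ in range(n_resamples)]
        lo, hi = np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])
        return lo, hi

    # e.g., jobs_to_optimum holds 500 measurements for one configuration:
    # mean, (lo, hi) = np.mean(jobs_to_optimum), bootstrap_ci(jobs_to_optimum)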
For agents with good insights, the number of jobs to find the optimal solution increases with the agent confidence level. This result is the opposite of the previous one and indicates that users with good insights may see a significant reduction in the number of steps to find an optimal solution when relying on their own intuition. Agents with good insights about the behavior of SchedSim and Maze Runner present a regular growth in the number of jobs to find the optimal solution as the confidence level in Copper increases. However, IFM does not present this characteristic, as there is an abrupt increase from the 90% to the 100% confidence level. At a 100% confidence level, the agent only takes Copper's suggestions to submit new jobs; in this case, Copper needs more jobs to select parameters outside the flat area of the IFM search space. At a 90% confidence level, agents submit 10% of the jobs based on user insight; the DOE model is then updated considering the non-flat areas (the agent's good insight), which speeds up the optimization process. For agents with good insights, there is no significant difference between learning from intermediate results or not: as the number of jobs to find the optimal solution is small, the number of samples is not sufficient to update the user preference and significantly affect the result.

For agents with no insights, the agent's efficiency is impacted by the confidence level: as the confidence level in Copper increases, the results improve. However, this impact is less notable for no-insight agents than for bad-insight agents. As with the bad-insight agents, learning from intermediate results makes a difference.

4.3 Restricted Experiments

This set of experiments utilizes the same agents to find the best-effort solution, but with a limited number of jobs that can be executed due to SLA constraints. We assume that all jobs take the same time to execute, and the SLA constraints correspond only to time requirements. Other conditions, such as cost or energy consumption, can be used by mapping these requirements into the number of submitted jobs. The quality of a result is given by the ratio between the result found and the optimal result.

For each experiment set, we changed the maximum number of jobs that can be submitted and the exploitation level. The exploitation level is defined as the proportion of pending jobs that will be submitted considering samples inside the clusters defined in § 3.2. As the output of an experiment is not deterministic, each bar in the charts presents the mean value and the 95% confidence interval over 500 runs for each (maximum number of jobs, exploitation level) pair. In these experiments, the proposed clustering method is applied when the number of submitted jobs reaches 70% of the maximum number of jobs, and the agent accepts Copper's recommendation in all experiments. The restricted experiment results for IFM, SchedSim, and Maze Runner are presented in Figures 7, 8, and 9, respectively.

Figure 7: Best results for experiments with SLA restrictions - IFM (quality versus exploitation level, for 50, 100, 150, and 200 jobs).

Figure 8: Best results for experiments with SLA restrictions - SchedSim (quality versus exploitation level, for 30, 60, 90, and 120 jobs).

Figure 9: Best results for experiments with SLA restrictions - Maze Runner (quality versus exploitation level, for 15, 20, 25, and 30 jobs).
The IFM experiment set examines the best-effort scenario with a limited number of jobs. In this case, instead of preferring either exploration or exploitation of the sampling space, it is preferable to adopt a mixed strategy to find good parameter values. For instance, for 50 jobs, the experiment with an exploitation level of 33% presents better quality than the others. Similar to the other experiment sets, the variation in result quality is higher when the number of jobs is reduced (in this case, 50 jobs). Based on these results, we can conclude that for this experiment it is preferable to adopt a mixed strategy to evaluate the search space. Therefore, experienced analysts can take advantage of combined exploitation/exploration search strategies to achieve better results.

For the SchedSim results (Figure 8), it is possible to observe that for a reduced number of jobs (i.e., 30) the exploitation level has a significant influence on the output quality. The impact of the exploitation level decreases as the number of submitted jobs increases, because the probability of finding better results is higher if more jobs are executed. For this experiment set, it is preferable to explore the search space instead of concentrating the submitted jobs in the cluster areas. This result suggests that the best results are spread throughout the solution search space.

Figure 9 presents the results of experiments with a restricted number of jobs for Maze Runner. Unlike the previous application, higher exploitation levels yield better quality. On the other hand, when the SLA constraint is relaxed (more jobs are allowed), the impact of the exploitation level becomes less important. We can conclude that the best results are grouped in the solution search space, so strategies that exploit intermediate results can be adopted to obtain better results.
5. CONCLUSIONS

Several areas in science and engineering rely on many executions of a software system with different values for each supported parameter. These executions are responsible for evaluating test scenarios and calibrating models. Normally, users follow a workflow where they set up a set of experiments, generate intermediate results, analyze them, and reprioritize jobs for the next batch. Reprioritization is a key step, as executing all possible values for all supported parameters is usually not feasible, especially under cost and deadline constraints determined by strict SLAs.

This work introduced a tool to help users execute their software systems by exploring their domain expertise and design of experiments techniques. By learning user strategies for sweeping parameters, our tool is able to detect whether the ongoing strategy can meet SLA requirements and to suggest how jobs can be reprioritized in case an SLA violation is predicted to happen while users run their submission-execution-analysis workflows. Our main lessons while developing the tool and evaluating it with three applications are: (i) it is important to facilitate user interaction with an automated design of experiments technique, as users may bring domain expertise that reaches the desired results faster; (ii) as the desired solution may be found by mixing exploration and exploitation strategies (e.g., breadth-first and depth-first search), users can benefit from a tool that can identify whether the ongoing strategy is on track to meet SLA constraints; (iii) evaluations with a restricted number of experiments are highly impacted by Copper's suggestions. Therefore, the proposed tool and methods are especially suitable for the evaluation of parametric applications with tight deadlines.

Acknowledgements

We would like to thank Miguel Paredes Quinones and the anonymous reviewers for their valuable comments. This work has been partially supported by FINEP/MCTI under grant no. 03.14.0062.00.

REFERENCES

[1] D. Abramson, B. Bethwaite, C. Enticott, S. Garic, and T. Peachey. Parameter exploration in science and engineering using many-task computing. IEEE Transactions on Parallel and Distributed Systems, 22(6):960–973, 2011.

[2] J. Chin, J. Harting, S. Jha, P. V. Coveney, A. R. Porter, and S. M. Pickles. Steering in computational science: mesoscale modelling and simulation. Contemporary Physics, 44(5):417–434, 2003.

[3] L. Colgan, R. Spence, and P. Rankin. The cockpit metaphor. Behaviour & Information Technology, 14(4):251–263, 1995.

[4] A. R. Conn, K. Scheinberg, and L. N. Vicente. Introduction to Derivative-Free Optimization, volume 8. SIAM, 2009.

[5] B. K. Danani and B. D. D'Amora. Computational steering for high performance computing: applications on Blue Gene/Q system. In Proceedings of the Symposium on High Performance Computing, pages 202–209. Society for Computer Simulation International, 2015.

[6] E. Deelman, D. Gannon, M. Shields, and I. Taylor. Workflows and e-science: An overview of workflow system features and capabilities. Future Generation Computer Systems, 25(5):528–540, 2009.

[7] H. A. D. do Nascimento and P. Eades. User hints: a framework for interactive optimization. Future Generation Computer Systems, 21(7):1177–1191, 2005.

[8] B. Efron and R. Tibshirani. An Introduction to the Bootstrap. Chapman & Hall/CRC, 1994.

[9] A. Endert, M. S. Hossain, N. Ramakrishnan, C. North, P. Fiaux, and C. Andrews. The human is the loop: new directions for visual analytics. Journal of Intelligent Information Systems, 43(3):411–435, 2014.

[10] T. A. Feo and M. G. Resende. Greedy randomized adaptive search procedures. Journal of Global Optimization, 6(2):109–133, 1995.

[11] M. García, J. Duque, P. Boulanger, and P. Figueroa. Computational steering of CFD simulations using a grid computing environment. International Journal on Interactive Design and Manufacturing (IJIDeM), 9(3):235–245, 2015.

[12] W. Gu, J. Vetter, and K. Schwan. An annotated bibliography of interactive program steering. ACM SIGPLAN Notices, 29(9):140–148, 1994.

[13] E. Kail, P. Kacsuk, and M. Kozlovszky. A novel approach to user-steering in scientific workflows. In Proceedings of the International Symposium on Applied Computational Intelligence and Informatics (SACI), pages 233–236. IEEE, 2015.

[14] K. Khan, S. U. Rehman, K. Aziz, S. Fong, and S. Sarasvady. DBSCAN: Past, present and future. In Proceedings of the Fifth International Conference on the Applications of Digital Information and Web Technologies (ICADIWT), pages 232–238. IEEE, 2014.

[15] G. W. Klau, N. Lesh, J. Marks, and M. Mitzenmacher. Human-guided search. Journal of Heuristics, 16(3):289–310, 2010.

[16] J. P. Kleijnen, S. M. Sanchez, T. W. Lucas, and T. M. Cioppa. State-of-the-art review: a user's guide to the brave new world of designing simulation experiments. INFORMS Journal on Computing, 17(3):263–289, 2005.

[17] B. Ludäscher, I. Altintas, C. Berkley, D. Higgins, E. Jaeger, M. Jones, E. A. Lee, J. Tao, and Y. Zhao. Scientific workflow management and the Kepler system. Concurrency and Computation: Practice and Experience, 18(10):1039–1065, 2006.

[18] M. Mattoso, J. Dias, K. A. Ocaña, E. Ogasawara, F. Costa, F. Horta, V. Silva, and D. de Oliveira. Dynamic steering of HPC scientific workflows: A survey. Future Generation Computer Systems, 46:100–113, 2015.

[19] D. Meignan and S. Knust. Interactive optimization with long-term preferences inference on a shift scheduling problem. In Proceedings of the 14th European Metaheuristics Workshop, 2013.

[20] D. Meignan, S. Knust, J.-M. Frayret, G. Pesant, and N. Gaud. A review and taxonomy of interactive optimization methods in operations research. ACM Transactions on Interactive Intelligent Systems (TiiS), 5(3):17, 2015.

[21] D. C. Montgomery. Design and Analysis of Experiments. John Wiley & Sons, 2008.

[22] J. D. Mulder, J. J. van Wijk, and R. van Liere. A survey of computational steering environments. Future Generation Computer Systems, 15(1):119–129, 1999.

[23] M. A. S. Netto, A. Breda, and O. de Souza. Scheduling complex computer simulations on heterogeneous non-dedicated machines: a case study in structural bioinformatics. In Proceedings of the IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2005). IEEE, 2005.

[24] H. Nguyen and D. Abramson. WorkWays: Interactive workflow-based science gateways. In Proceedings of the International Conference on E-Science (e-Science). IEEE, 2012.

[25] H. A. Nguyen, D. Abramson, T. Kipouros, A. Janke, and G. Galloway. WorkWays: interacting with scientific workflows. Concurrency and Computation: Practice and Experience, 27(16):4377–4397, 2015.

[26] S. G. Parker and C. R. Johnson. SCIRun: a scientific programming environment for computational steering. In Proceedings of the 1995 ACM/IEEE Conference on Supercomputing, page 52. ACM, 1995.

[27] I. Raicu, I. Foster, M. Wilde, Z. Zhang, K. Iskra, P. Beckman, Y. Zhao, A. Szalay, A. Choudhary, P. Little, et al. Middleware support for many-task computing. Cluster Computing, 13(3):291–314, 2010.

[28] S. Singhal, S. Aneja, F. Liu, L. V. Real, and T. George. IFM: A scalable high resolution flood modeling framework. In Proceedings of the International European Conference on Parallel and Distributed Computing (Euro-Par). Springer, 2014.

[29] L. Tweedie, B. Spence, D. Williams, and R. Bhogal. The attribute explorer. In Conference Companion on Human Factors in Computing Systems, pages 435–436. ACM, 1994.

[30] J. J. van Wijk, R. van Liere, and J. D. Mulder. Bringing computational steering to the user. In Proceedings of the Scientific Visualization Conference, page 304. IEEE, 1997.

[31] L. A. Wolsey and G. L. Nemhauser. Integer and Combinatorial Optimization. John Wiley & Sons, 2014.

[32] H. Wright, R. H. Crompton, S. Kharche, and P. Wenisch. Steering and visualization: Enabling technologies for computational science. Future Generation Computer Systems, 26(3):506–513, 2010.