CEUR Proceedings 4th Workshop HCP Human Centered Processes, February 10-11, 2011

Making context explicit towards decision support for a flexible scientific workflow system

Xiaoliang Fan (1, 2, xiaoliang.fan@gmail.com), Patrick Brézillon (1, patrick.brezillon@lip6.fr), Ruisheng Zhang (2, zhangrs@lzu.edu.cn) and Lian Li (2, lil@lzu.edu.cn)
(1) LIP6, Box 169, Université Pierre et Marie Curie, 4 Place Jussieu, Paris 75005, France
(2) School of Information Science and Engineering, Lanzhou University, 222 South Tianshui Road, Lanzhou 730000, P.R. China

Abstract
A scientific workflow (SWF) system is a workflow management system applied to science. For years, SWF systems have been widely applied in areas such as physics, climate modeling, and the drug discovery process. However, current SWF systems struggle to provide flexibility and lack decision support for scientists. We believe the major reason for this failure is that context is not made explicit. We propose to introduce contextual graphs (CxGs) in the four phases of the SWF lifecycle, each of which is described in a standard format and illustrated by a case study in virtual screening. A contextual graph models scientists' decision-making processes as a uniform representation of knowledge, reasoning, and contexts, so that scientists are closely involved in each phase of the SWF lifecycle and decision support is maximized. Finally, we conclude and highlight that using CxGs is the key human-centered process for SWF systems.

Introduction
Scientific workflow systems free computational scientists from the burden of data-centric operations so that they can concentrate on their scientific problems (Altintas et al., 2004; Goble et al., 2007). However, the result is not yet satisfactory, considering that computational science (Roache, 1998) always proceeds in a flexible and exploratory pattern. In virtual screening (Chen & Shoichet, 2009), for example, the choice of one software package over others depends heavily on contextual information that is highly specific to the situation at hand, and on where, when, how and by whom the scientific workflow is executed. Strong and sustainable decision support is therefore needed for scientists to move from hypotheses to discovery.

Workflow flexibility becomes a critical challenge when dealing with intermittently available resources and execution failures, and when supporting human-centric decision making. However, identifying how scientists make decisions to address workflow flexibility is a very complicated issue. Scientists differ in how they make their decisions: (1) based on their past experience, whether successful or failed; (2) following best practices inherited from scientific communities; (3) from the observed intermediate results; and (4) simply in their own distinctive way. Various approaches (Zhang et al., 2008; Courtney, 2001; Tabak et al., 1985) have been proposed to involve users in describing their decision-making processes. Normally in such applications, a decision (e.g., choosing methods, changing parameters, re-designing the experiment) is represented by a decision node in the workflow design, accompanied by a numerical value (e.g., IF the variable is greater than 5, THEN execute activity A, ELSE execute activity B; WAIT for 2 minutes to execute activity C). However, scientific discovery is by nature a knowledge-intensive process (van der Aalst et al., 2005) in which scientists' decisions rely not only on the data and information available, but also on a learning process in which the user's preferences, knowledge, and situation are captured to adapt the human-centered processes.

These challenges become an obstacle when scientists make adaptive decisions to deliver new outcomes with fresh data and its context (Fan et al., 2010). Brézillon and Pomerol (1999) define context as “what constrains the resolution of a problem without explicit intervention in it”. We believe that the main reason for this failure is the lack of explicit context management. In this paper we propose four ways of making context explicit in scientific workflows, by introducing contextual graphs in the four phases of the scientific workflow lifecycle. Representing and making “context” explicit in an SWF system would provide sustainable decision support for scientists by formalizing their research, strategies, and customization information, where elements of knowledge, reasoning and contexts are represented in a uniform way.

The paper is organized as follows. Section 2 introduces the four phases of the scientific workflow lifecycle. Section 3 investigates the possibility of integrating contextual graphs into the four phases of the scientific workflow lifecycle through a case study in virtual screening. Section 4 discusses previous work on workflow flexibility in order to point out what is reusable and which problems remain in supporting decision making in a flexible scientific workflow system. Section 5 closes the paper with a general conclusion and future work.

Scientific Workflow Lifecycle
The scientific workflow lifecycle derives from the workflow lifecycle (van der Aalst & van Dongen, 2003; Gil et al., 2007; Deelman & Chervenak, 2008). It normally starts from scientific hypotheses (Beaulah et al., 2008; Tadmor & Tidor, 2005; Claus & Johnson, 2008) and aims to reach a specific experimental goal. It includes four phases (see Figure 1):
Figure 1: SWF lifecycle

• Workflow Searching: before designing a brand new workflow, scientists usually first consult a public SWF repository to search for previously published workflows (Wroe et al., 2007). Once one is found, it is easy to reproduce the pre-existing workflow to constitute a new one. Workflow searching relies on SWFs being shared together with their context of use. The more shared SWFs are placed in the SWF repository, the more accurate the search results will be.
• Workflow Designing is then initiated to construct a workflow model (Ludascher et al., 2009). First, an abstract workflow model is designed, which describes the scientific tasks and their execution order, as well as the data and its dependencies. Second, the abstract workflow is mapped to a concrete/executable workflow, where the required resources are selected. By mapping the workflow instance onto the available execution resources, an executable workflow is created for the next phase.
• Workflow Execution is the enactment of the executable workflow by a workflow engine (Deelman & Chervenak, 2008), in which input data is consumed and output data is produced (Tan et al., 2010). The workflow engine follows the order of tasks and their dependencies defined in the workflow model. It is common to re-execute the workflow iteratively, considering evolutionary changes of the workflow model (e.g., in workflow design, adding or skipping tasks, and altering task dependencies) or momentary changes of a running workflow instance (e.g., making local decisions in response to a special situation, altering a decision after analysing an observed intermediate result, reporting exceptional cases).
• Workflow Publishing is a post-execution phase in which scientists interpret workflow results (Tan et al., 2010; Ludascher et al., 2009) and publish the SWF in its context of use (Wroe et al., 2007; Deelman & Gil, 2006). Depending on the workflow outcomes and analysis results, the original hypotheses or experimental goals may be revised or refined, giving rise to another round of workflow design/execution in an iterative manner. Furthermore, it must be easy to publish the workflow on a repository, so that the SWF can be archived for later re-use.

Figure 1 shows the relationship among the phases of the scientific workflow lifecycle: hypotheses arrive as keywords used to search for pre-existing scientific workflows in the SWF repository; the scientist then designs the workflow model and maintains the mapping from an abstract workflow to a concrete one; the workflow execution phase enacts the workflow model on available resources according to data and control dependencies; if a change is encountered, there is an iterative process to re-design the workflow model and re-execute the workflow instance; if execution is successful, the scientist publishes the workflow in the SWF repository for the sake of reproduction in the research communities.

Current studies of the SWF lifecycle (van der Aalst & van Dongen, 2003; Deelman & Chervenak, 2008) are generally weak at managing workflow changes and exceptions. We believe that the major reason is that context is not made explicit in SWF systems.

Make Context Explicit in SWF Lifecycle
Representing and making context explicit in an SWF system is a challenge that, once met, could make the system more flexible and enhance its intelligence so as to facilitate effective decision making. In this section, we discuss how to manage context explicitly throughout the four phases of the SWF lifecycle, each of which is described using a standard format including: motivation, realization approach, example, and discussion.

The example is represented in the Contextual Graphs formalism (Brézillon, 2005) through a case study entitled “Virtual screening research on avian influenza H5N1 virus”, which aims to find dozens of drug candidates for the H5N1 virus (He et al., 2008) by docking 7.7 million small molecules separately on the H5N1 protein (Chen & Shoichet, 2009). Figure 2 shows a docking example, which binds a molecule (ZINC12050767) to a virus protein (H5N1 PAC Polymerase, known as Bird flu) through the Dock 6.2 software. Virtual screening can be considered as millions of docking procedures on the PAC protein.

Figure 2: Docking example

The application is not only a time-consuming workflow application in which intensive computing is performed by docking software, but also a very flexible one: there is no unique solution for each computation, because solutions vary with the docking software selected. For example, scientists should identify the context in which the experiment is organized as a scientific workflow. According to the current focus and context, they link a specific resource (e.g., software, database, and instrument) with the workflow to realize a specific task. The concept of human-centered process is particularly relevant in such domains.

Figure 3 provides the definition of the elements in a contextual graph (actions, contextual elements, sub-graphs, activities and temporal branching). A more complete presentation of this formalism and its implementation can be found in (Brézillon, 2005).

Figure 3: Elements in Contextual graph
Workflow Searching
Motivation: Before the workflow design, context acts as an interface to determine which SWF should be chosen from a library of SWFs, or an SWF repository. In this case, the scientist plays the role of a context provider who guides the choice of the right SWF model according to the current focus and the context at hand, so as to match as closely as possible what the scientific hypotheses indicate.

Realization approach:
• The scientist first searches for an SWF in an SWF repository, using keywords that best describe the hypotheses and are coherent with the context at hand.
• If the pre-existing SWF is exactly what is wanted, the scientist can skip the workflow design phase and simply supply their own parameters for workflow execution.
• Otherwise, if it is similar to what is needed, slight modifications are carried out in the workflow design.

Context graph: virtual screening on protein PAc
1: Is the protein rigid or flexible?
    Rigid     2: Activity: perform first rigid screening
    Flexible  3: Activity: perform second flexible screening
4: analyze the result

Figure 4: (Left) Contextual graph of virtual screening on H5N1 protein; (Right) Choosing one SWF from two SWFs (SWF_1 and SWF_2)

Example: In Figure 4 (Left), CE1 is a contextual element (blue circle with number 1). The instantiation of CE1 (Is the protein rigid or flexible?) leads to the generation of two scientific workflow instances in Figure 4 (Right): one is SWF_1 (i.e., value of CE1 = “Rigid”), and the other is SWF_2 (i.e., value of CE1 = “Flexible”). In the application, if scientists want to do a rigid virtual screening, “rigid” becomes a keyword for the search, and SWF_1 is selected. Similarly, SWF_2 is chosen when searching for a “flexible” screening. As a result, CxGs act as an interface for deciding which SWF to choose from the SWF repository.

Discussion: It is normal for the repository to return nothing; in that case the scientist moves to the next phase and starts the workflow design from scratch.
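The search step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `StoredWorkflow` record, the `search` function and the key `"protein"` are hypothetical names introduced here, and the paper does not specify how the repository is actually queried.

```python
from dataclasses import dataclass


@dataclass
class StoredWorkflow:
    """A published workflow together with its context of use (hypothetical record)."""
    name: str
    context: dict  # contextual element -> instantiated value


def search(repository, focus):
    """Return the workflows whose recorded context agrees with every
    contextual element the scientist has instantiated in `focus`."""
    return [
        wf for wf in repository
        if all(
            wf.context.get(ce, "").lower() == value.lower()
            for ce, value in focus.items()
        )
    ]


# Assumed repository content, mirroring SWF_1 and SWF_2 of Figure 4.
REPOSITORY = [
    StoredWorkflow("SWF_1", {"protein": "Rigid"}),
    StoredWorkflow("SWF_2", {"protein": "Flexible"}),
]

rigid_matches = search(REPOSITORY, {"protein": "rigid"})
print([wf.name for wf in rigid_matches])  # ['SWF_1']

# An empty result corresponds to the Discussion case: design from scratch.
no_matches = search(REPOSITORY, {"protein": "partially flexible"})
print(no_matches)  # []
```

Matching on instantiated contextual elements rather than on free text keeps the search tied to the context of use; a real repository would likely combine both.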
Workflow Designing
Motivation: During workflow design, a certain degree of freedom is given to the user by offering multiple alternative execution paths for a workflow. Classical workflow systems reduce this flexibility by offering powerful design constructs (e.g., start, if/else, repeat until, parallel execution, end), in which decision making is always represented by a decision node accompanied by a numerical value. However, human decisions are so complex that a numerical decision is less descriptive than a simple question. We therefore describe the execution paths of a workflow in contextual graphs (CxGs), which model contextualized information as contextual elements (CEs) and their dependencies. In a contextual graph, the most appropriate execution path can be selected at execution time, from those encoded, to address the context at hand.

Realization approach:
• First, it is necessary to know all the current instances of the CEs at the moment the workflow is applied. An instantiation is the value that a contextual element takes for a specific focus at hand.
• Then, a group of contextualized information is generalized as a set of CEs.
• The CEs are then formalized in a contextual graph according to their dependencies. The contextual graph is ready for workflow execution when an SWF instance corresponds to a specific execution path under the instantiation of the context. In a CxG, the execution path is a sequence of actions, connected by the instantiations of the selected contextual elements.

Example: In Figure 5, a scientist designs the protein preparation workflow as a contextual graph with a set of contextual elements (CE1 and CE4) and their execution dependencies. The possible execution paths are controlled by the value of each contextual element. For example, the instantiation of CE1 (i.e., value of CE1 = “Yes”) and CE4 (i.e., value of CE4 = “Yes”) leads to the execution path “1→2→4→11→5→6→9”.

Contextual graph: protein preparation (old)
1: Can you find the protein by yourself?
    Yes  2: download it from "Protein Data Bank"
    No   3: ask for help until you get the protein
4: Do you need to do "protein preparation"?
    Yes  11: enter parameters during "protein preparation"
         5: Activity: remove unrelated molecules
         6: Activity: add hydrogen and charge
         9: store the protein prepared in the database
    No

Figure 5: Contextual graph: protein preparation (old)

Discussion: Describing the complete set of all possible execution paths during workflow design may be either undesirable or impossible (Schonenberg et al., 2008). For example, a certain number of possible execution paths are unknown before execution. Late modelling (Han et al., 1998) addresses this by allowing a sub-model to be defined dynamically during execution.
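To make the notion of an execution path concrete, the following minimal Python sketch encodes the Figure 5 listing and derives the path for a given instantiation. The class and function names are hypothetical, and the nesting of items 11, 5, 6 and 9 under the "Yes" branch of CE4 is an assumption read off the flattened listing, not something the figure text states unambiguously.

```python
from dataclasses import dataclass


@dataclass
class Action:
    """A leaf of the contextual graph: something that is executed."""
    id: int
    label: str


@dataclass
class ContextualElement:
    """A question whose instantiated value selects one branch."""
    id: int
    question: str
    branches: dict  # value -> list of Action / ContextualElement


def execution_path(nodes, instantiation):
    """Return the ids visited, in order, for the given CE instantiation."""
    path = []
    for node in nodes:
        path.append(node.id)
        if isinstance(node, ContextualElement):
            chosen = node.branches[instantiation[node.id]]
            path.extend(execution_path(chosen, instantiation))
    return path


# Figure 5 (old graph); the branch contents are an assumed reading.
PROTEIN_PREPARATION = [
    ContextualElement(1, "Can you find the protein by yourself?", {
        "Yes": [Action(2, 'download it from "Protein Data Bank"')],
        "No": [Action(3, "ask for help until you get the protein")],
    }),
    ContextualElement(4, 'Do you need to do "protein preparation"?', {
        "Yes": [
            Action(11, 'enter parameters during "protein preparation"'),
            Action(5, "remove unrelated molecules"),
            Action(6, "add hydrogen and charge"),
            Action(9, "store the protein prepared in the database"),
        ],
        "No": [],
    }),
]

path = execution_path(PROTEIN_PREPARATION, {1: "Yes", 4: "Yes"})
print("→".join(str(i) for i in path))  # 1→2→4→11→5→6→9
```

Because the branch labels are arbitrary strings, the same structure accepts answers such as "Rigid" or "Flexible" as readily as "Yes" or "No", which is the point the Motivation makes about questions versus numerical conditions.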
Workflow Execution
Motivation: Scientists frequently re-execute a scientific workflow by adding or ignoring portions of the workflow realized at design time. Context should support the assembly of SWF components, which must be recompiled each time a new context arrives (i.e., a contextual element takes a new instance). As a result, a new execution path, or even a new contextual graph, is inserted or removed as the SWF evolves along with its context.

Realization approach:
• Each time a new instantiation of a CE occurs, the contextual graph is re-executed, and the SWF is recompiled to generate a new SWF instance for execution.
• If the scientist wants to re-design the workflow by adding or ignoring a portion of the SWF, they first stop the current workflow execution.
• Then, a new group of contextualized information, including the information representing the workflow changes, is generalized as a new set of contextual elements.
• If a CE with its subsequent activities/actions is added or ignored, a new contextual graph is produced to address the new focus.

Example: Figure 6 is derived from Figure 5. During the execution phase, the scientist finds something wrong with the intermediate result, because they did not take into account whether the protein is flexible or rigid. The scientist therefore decides to stop the current execution and re-design the experiment. As a result, a new contextual element CE7 (Is it a rigid or flexible screening?) is added. When the value of CE7 is “flexible screening”, Activity13 (Activity: optimize the protein) is invoked as a new SWF component. Furthermore, the contextual graph is updated along with the change of CEs, and it is necessary to record such updates in a knowledge base for the sake of workflow sharing, which is discussed in the next section.

Figure 6: Contextual graph: protein preparation (new)

Discussion: There is a risk of incoherence between the running workflow instance and its results. For example, a decision was made two minutes ago and the contextual graph chose an execution path for the workflow accordingly; but later, right before the workflow execution, a new context arrives and requires adapting to a new contextual graph.
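The re-design step in the example (adding CE7 and re-deriving the path) might look as follows. This is a sketch under assumptions: the nested-list encoding, the helper names `walk` and `insert_after`, and the placement of CE7 directly after action 6 are illustrative choices; the paper does not prescribe an implementation.

```python
def walk(graph, context):
    """Return the ids visited for a given instantiation of contextual elements.

    A graph is a list whose items are either an action id (int) or a
    contextual element written as (ce_id, {value: sub_graph}).
    """
    path = []
    for node in graph:
        if isinstance(node, tuple):
            ce_id, branches = node
            path.append(ce_id)
            path.extend(walk(branches[context[ce_id]], context))
        else:
            path.append(node)
    return path


def insert_after(graph, target, new_node):
    """Return a copy of `graph` with `new_node` placed right after action `target`."""
    result = []
    for node in graph:
        if isinstance(node, tuple):
            ce_id, branches = node
            node = (ce_id, {value: insert_after(sub, target, new_node)
                            for value, sub in branches.items()})
        result.append(node)
        if node == target:
            result.append(new_node)
    return result


# Old graph (assumed nesting of the protein preparation listing).
old_graph = [(1, {"Yes": [2], "No": [3]}),
             (4, {"Yes": [11, 5, 6, 9], "No": []})]

# New contextual element CE7: only the "Flexible" branch adds Activity 13.
ce7 = (7, {"Rigid": [], "Flexible": [13]})
new_graph = insert_after(old_graph, 6, ce7)

print(walk(old_graph, {1: "Yes", 4: "Yes"}))                   # [1, 2, 4, 11, 5, 6, 9]
print(walk(new_graph, {1: "Yes", 4: "Yes", 7: "Flexible"}))    # [1, 2, 4, 11, 5, 6, 7, 13, 9]
print(walk(new_graph, {1: "Yes", 4: "Yes", 7: "Rigid"}))       # [1, 2, 4, 11, 5, 6, 7, 9]
```

Building a new graph instead of mutating the old one keeps both versions available, which matters when the update has to be recorded for later sharing.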
Workflow Publishing
Motivation: If execution is successful, the scientist then analyses the results generated by the workflow execution. Types of result analysis include: 1) evaluating data quality (e.g., does this result make sense?), 2) examining execution traces and data dependencies (e.g., which results were “tainted” by this input dataset?), 3) debugging runs (e.g., why did this step fail?), or 4) simply analysing performance (e.g., which steps took the longest time?). After the result analysis, it is possible to re-design and re-execute the workflow iteratively until the new context is addressed. Incremental knowledge acquisition should be carried out so that the contextual graph grows and becomes more efficient.

Furthermore, one of the reasons scientists rely on SWFs is the sharing, reproduction, transformation, and evolution of an “old” SWF into a brand “new” one. SWFs are expected to be shared according to their contexts of use. In this circumstance, the context defines the status of the knowledge and also maintains the relationship between different kinds of knowledge.

Realization approach:
• An SWF repository is built up to document workflows with their contexts of use.
• When a workflow is re-executed, the contextual graph is adapted incrementally to trace the workflow flexibility. Once a new contextual graph is generated, it is added as a new scenario to the SWF repository.
• Conscientious users might partition the workflow into coherent fragments and publish them.

Example: Once a contextual element is modified, a new CxG is created to address the new focus and its context. Derived from Figure 6, Figure 7 shows a new contextual graph to be added to an SWF repository for future sharing with other scientists.

Contextual graph: protein preparation (new)
1: Can you find the protein by yourself?
    Yes  2: Download it from "Protein Data Bank"
    No   3: Ask for help until you get the protein
4: Do you need to do "protein preparation"?
    Yes  11: Enter parameters during "protein preparation"
         5: Activity: remove unrelated molecules
         6: Activity: add hydrogen and charge
         7: Is it a rigid or flexible screening?
             Rigid
             Flexible  13: Activity: optimize the protein
         9: store the protein prepared in the database
    No

Figure 7: Contextual graph: protein preparation (new)

Discussion: Encouraging the sharing of scientific workflows together with their context would make them a complement to paper-based publications. In such a case, scientific workflows would be archived along with paper-based publications. However, the quality of the shared data and workflows then becomes a new question.
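As an illustration of the realization approach above, the sketch below stores each published contextual graph as a scenario with its context of use and a link to the scenario it was derived from. The `Scenario` and `Repository` classes, their fields, and the example context values are all hypothetical; the paper does not describe the repository's data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Scenario:
    """One published contextual graph with its context of use (hypothetical)."""
    name: str
    context_of_use: dict
    path: list                      # execution path that was actually followed
    derived_from: Optional[str] = None


@dataclass
class Repository:
    scenarios: dict = field(default_factory=dict)

    def publish(self, scenario):
        """Add a scenario; refuse duplicates and unknown parents."""
        if scenario.name in self.scenarios:
            raise ValueError(f"{scenario.name!r} is already published")
        parent = scenario.derived_from
        if parent is not None and parent not in self.scenarios:
            raise ValueError(f"unknown parent scenario {parent!r}")
        self.scenarios[scenario.name] = scenario

    def lineage(self, name):
        """Trace a scenario back through the scenarios it evolved from."""
        chain = []
        while name is not None:
            chain.append(name)
            name = self.scenarios[name].derived_from
        return chain


repo = Repository()
repo.publish(Scenario(
    "protein preparation (old)",
    {"preparation needed": "Yes"},
    [1, 2, 4, 11, 5, 6, 9],
))
repo.publish(Scenario(
    "protein preparation (new)",
    {"preparation needed": "Yes", "screening": "Flexible"},
    [1, 2, 4, 11, 5, 6, 7, 13, 9],
    derived_from="protein preparation (old)",
))

print(repo.lineage("protein preparation (new)"))
# ['protein preparation (new)', 'protein preparation (old)']
```

Keeping the `derived_from` link is one simple way to trace how a workflow evolved; whether the real repository records lineage this way is not stated in the paper.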
Summary
Contextual graphs are a representation formalism that allows the description of decision making in which context influences the line of reasoning (e.g., choice of a method for accomplishing a task). The advantages of contextual graphs are that: (i) CxGs naturally provide learning and explanation capabilities in the system; and (ii) CxGs allow a learning process for integrating new situations by assimilation and accommodation. In short, the notion of context is made explicit during the four phases of the scientific workflow lifecycle by contextual graphs. The Contextual Graphs formalism has already been used in different domains such as medicine, incident management on a subway line, road sign interpretation by a driver, computer security, psychology, cognitive ergonomics, etc.

Related Works
Various approaches, such as BPEL (Zhang et al., 2008), UML (Courtney, 2001), and Petri nets (Tabak et al., 1985), have been proposed to address the issue of workflow flexibility by involving the user in representing decision making. Applications (Yu & Buyya, 2005; Hey et al., 2009) have shown that current systems handle numerical decision making well as control-flow functions, such as “wait 30 seconds, and then proceed to the next task” or “if the value is greater than 5 then execute task_A, else execute task_B”. However, they have difficulty managing common but important decisions, such as “are you satisfied with the result?” and “do you need to do the protein preparation again?”, which are more comprehensible for scientists.

Context has long been considered a key element in supporting decision making in human-centered processes (Brézillon, 2003; Brézillon & Pomerol, 2010). To provide a coherent formalism of context, Sowa (1984) proposes conceptual graphs with their mechanisms of aggregation and expansion. Sowa (2000) then introduces a way to manage context in conceptual graphs. Brézillon (2005) presents a simpler formalism, Contextual Graphs (CxGs), for representing context. Compared with other approaches, the CxG formalism is good at describing decision making in which context influences the line of reasoning.

At the implementation level, a number of applications exist for preparing a formal representation of context. McCarthy (1993) formalizes contexts as formal objects, with the basic relation ist(c, p). It asserts that the proposition p is true in the context c, where c is meant to capture all that is not explicit in p but is required to make p a meaningful statement representing what it is intended to state. Formulas ist(c, p) are always asserted within a context, i.e., something like ist(c', ist(c, p)), written c': ist(c, p). Sharma (1995) gives a list of desirable properties for contexts in a formal language and distinguishes four approaches for formalizing contexts: (1) incrementing arity; (2) variation on implication; (3) modal operator forms; and (4) syntactic treatment. Based on McCarthy's work on context logic, Farquhar et al. (1995) present an approach to integrating disparate heterogeneous information sources.

In Table 1, we compare various approaches to modelling decision making in workflows, as implementations of the “Exclusive Choice workflow pattern” (van der Aalst & Hofstede, 2003).

Table 1: Comparison of various implementations of the “Exclusive Choice workflow pattern”

Approach                         Decision Element      Decision Value    Decision Type
BPEL (Zhang et al., 2008)        ,                     Condition         Numerical value
CxG (Brézillon, 2005)            Contextual Element    Value of CE       Any value
UML (Courtney, 2001)             Decision Node         Condition         Numerical value
Petri-net (Tabak et al., 1985)   Exclusive choice      Arc expression    Numerical value

By comparison, Contextual Graphs play a role equivalent to that of the other approaches in representing decision making. Furthermore, contextual graphs have the following advantages: (1) they allow multiple representations of decision making, not only with a numerical value but also with any kind of answer to a question, so that scientists are involved in a local decision-making process; (2) they are directly readable (e.g., generally something like “If the contextual element C has the value V1, use method M1, and with the value V2 use method M2”); and (3) a contextual graph can easily grow incrementally by the addition of contextual elements and branches representing practices developed by users and not yet known by the system.
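The contrast drawn in Table 1 between a numerical decision node and a contextual element can be illustrated with a small Python sketch. Both functions and the branch labels are hypothetical and only mirror the quoted examples (“if the value is greater than 5 ...” and “If the contextual element C has the value V1, use method M1 ...”); they are not taken from BPEL, UML, Petri nets or any CxG tool.

```python
def numeric_exclusive_choice(value, threshold=5):
    """Decision node driven by a numerical condition."""
    return "task_A" if value > threshold else "task_B"


def contextual_exclusive_choice(question, instantiation, methods):
    """Pick the method attached to the answer the scientist gave to `question`.

    `instantiation` maps questions (contextual elements) to answers, and
    `methods` maps each admissible answer to the method to use.
    """
    answer = instantiation[question]
    if answer not in methods:
        raise KeyError(f"no branch for answer {answer!r} to {question!r}")
    return methods[answer]


print(numeric_exclusive_choice(7))  # task_A
print(numeric_exclusive_choice(3))  # task_B

question = "Do you need to do the protein preparation again?"
methods = {
    "Yes": "repeat protein preparation",
    "No": "store the prepared protein",
}
print(contextual_exclusive_choice(question, {question: "Yes"}, methods))
# repeat protein preparation
```

Adding a practice not yet known to the system then amounts to adding one more entry to `methods`, which is the incremental growth mentioned in advantage (3).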
Conclusion
Human-centered processes must be considered at a global level in order to deal with the user, the task at hand, and the context in which the task is accomplished. In a flexible scientific workflow, for example, scientists cannot move from hypotheses to discovery in the SWF system without taking the context into account. We propose to introduce contextual graphs in the four phases of the SWF lifecycle, each of which is described in a standard format and illustrated by a concrete example in the area of virtual screening. In our application on virtual screening, we use contextual graphs to model the decision-making processes of scientists as a uniform representation of knowledge, reasoning, and contexts. As a result, scientists are closely involved in each phase of the SWF lifecycle, which maximizes the decision support received from the system.

We believe that all data, information and knowledge should be invoked, assembled, organized, structured and situated according to the given focus, and finally be formulated as a chunk of professional knowledge that helps scientists sustain their research.

Future work includes the development of a prototype interface between a scientific workflow system and contextual graphs. Representing and making “context” explicit in an SWF system by means of contextual graphs would enhance workflow flexibility by formalizing scientists' research, strategies, and customization information, where elements of knowledge, reasoning and contexts are represented in a uniform way.

Acknowledgments
This work is supported by grants from the National Natural Science Foundation of China (90912003, 60773108, 90812001, 61011130212), the Centre national de la recherche scientifique (Researcher exchange project with NSFC 2010), and Région Ile-de-France (CP10-201), and by scholarships from the China Scholarship Council (2008618047) and Égide (690544G).
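References
Altintas, I., et al. (2004). Kepler: an extensible system for design and execution of scientific workflows. Proceedings of 16th International Conference on Scientific and Statistical Database Management (pp. 423-424). Danvers, MA: IEEE Computer Society.
Beaulah, S., Correll, M., Munro, R., Sheldon, J. (2008). Addressing informatics challenges in Translational Research with workflow technology. Drug discovery today, 13(17-18), 771-777.
Brézillon, P., and Pomerol, J.-Ch. (1999). Contextual knowledge sharing and cooperation in intelligent assistant systems. Le Travail Humain, 62(3), 223-246.
Brézillon, P. (2003). Focusing on Context in Human-Centered Computing. IEEE Intelligent Systems, 18(3), 62-66.
Brézillon, P. (2005). Task-realization models in contextual graphs. In Dey, A.K., Kokinov, B., Leake, D., and Turner, R. (Eds), Proceedings of Modeling and Using Context: 5th International and Interdisciplinary Conference (pp. 55-68). Berlin: Springer Verlag.
Brézillon, P., and Pomerol, J.-Ch. (2010). Framing decision making at two levels. In Respicio, A., Adam, F., Phillips-Wren, G., Teixeira, C., and Telhada, J. (Eds.), Bridging the Socio-Technical Gap in Decision Support Systems - Challenges for the next Decade. Amsterdam: IOS Press.
Chen, Y., and Shoichet, B.K. (2009). Molecular docking and ligand specificity in fragment-based inhibitor discovery. Nature Chemical Biology, 5(5), 358-364.
Claus, B., and Johnson, S. (2008). Grid computing in large pharmaceutical molecular modeling. Drug discovery today, 13(13-14), 578-583.
Courtney, J.F. (2001). Decision making and knowledge management in inquiring organizations: toward a new decision-making paradigm for DSS. Decision Support Systems, 31(1), 17-38.
Deelman, E., and Chervenak, A.L. (2008). Data Management Challenges of Data-Intensive Scientific Workflows. Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (pp. 687-692). Lyon: IEEE CS Press.
Deelman, E., and Gil, Y. (2006). Managing large-scale scientific workflows in distributed environments: Experiences and challenges. Proceedings of the Second IEEE International Conference on e-Science and Grid Computing (pp. 144). Washington: IEEE Computer Society.
Fan, X., et al. (2010). Context-oriented scientific workflow system and its application in virtual screening. In Respicio, A., Adam, F., Phillips-Wren, G., Teixeira, C., and Telhada, J. (Eds.), Bridging the Socio-Technical Gap in Decision Support Systems - Challenges for the next Decade (pp. 335-345). Amsterdam: IOS Press.
Farquhar, A., Dappert, J., Fikes, R., and Pratt, W. (1995). Integrating information sources using context logic (Tech. Rep. KSL-95-12). Palo Alto, California: Stanford University, Knowledge Systems Laboratories.
Gil, Y., Deelman, E., Ellisman, M., Fahringer, T., Fox, G., Gannon, D., Goble, C., Livny, M., Moreau, L., Myers, J. (2007). Examining the Challenges of Scientific Workflows. Computer, 40(12), 24-32.
Goble, C.A., et al. (2007). myExperiment: social networking for workflow-using e-scientists. Proceedings of 2nd workshop on Workflows in support of large-scale science (pp. 1-2). NY: ACM.
Han, Y., Sheth, A., Bussler, C. (1998). A Taxonomy of Adaptive Workflow Management. In Workshop of the 1998 ACM Conference on Computer Supported Cooperative Work. Seattle, Washington, USA.
He, X., et al. (2008). Crystal structure of the polymerase PA(C)-PB1(N) complex from an avian influenza H5N1 virus. Nature, 454, 1123-1126.
Ludascher, B., Altintas, I., Bowers, S., Cummings, J., Critchlow, T., Deelman, E., De Roure, D., Freire, J., Goble, C., Jones, M., Klasky, S., McPhillips, T., Podhorszki, N., Silva, C., Taylor, I., and Vouk, M. (2009). Scientific process automation and workflow management. In Shoshani, A., and Rotem, D. (Eds), Scientific Data Management: Challenges, Existing Technology, and Deployment, Computational Science Series, chapter 13. Chapman & Hall/CRC.
McCarthy, J. (1993). Notes on formalizing context. In Bajcsy, R. (Ed.), Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence (pp. 555-560). San Mateo, California: Morgan Kaufmann.
Roache, P.J. (1998). Verification and validation in computational science and engineering. Albuquerque, NM: Hermosa Publishers.
Hey, T., et al. (2009). The Fourth Paradigm: Data-Intensive Scientific Discovery. Redmond, Washington: Microsoft Research.
Schonenberg, H., Mans, R., Russell, N., Mulyar, N., van der Aalst, W.M.P. (2008). Process flexibility: A survey of contemporary approaches. In Dietz, J., Albani, A., and Barjis, J. (Eds), Advances in enterprise engineering I. Berlin: Springer-Verlag.
Sharma, N. (1995). On formalising and reasoning with contexts (Tech. Rep. 352). Brisbane, Queensland: University of Queensland, Department of Computer Science.
Sowa, J.F. (1984). Conceptual Structures: Information Processing in Mind and Machine. Reading, MA: Addison Wesley Publishing.
Sowa, J.F. (2000). Knowledge Representation: Logical, Philosophical, and Computational Foundations. Pacific Grove, CA: Brooks Cole Publishing.
Tabak, D., and Levis, A.H. (1985). Petri Net Representation of Decision Models. IEEE Transactions on Systems, Man and Cybernetics, 15(6), 812-818.
Tadmor, B., and Tidor, B. (2005). Interdisciplinary research and education at the biology-engineering-computer science interface: a perspective (reprinted article). Drug discovery today, 10(23-24), 1706-1712.
Tan, W., Missier, P., Foster, I., Madduri, R., De Roure, D., Goble, C. (2010). A comparison of using Taverna and BPEL in building scientific workflows: the case of caGrid. Concurrency and Computation: Practice and Experience, 22(9), 1098-1117.
Yu, J., and Buyya, R. (2005). A Taxonomy of Scientific Workflow Systems for Grid Computing. In special issue on Scientific Workflows, SIGMOD Record, 34(3), 44-49.
van der Aalst, W.M.P., van Dongen, B.F., et al. (2003). Workflow Mining: A Survey of Issues and Approaches. Data & Knowledge Engineering, 47, 237-267.
van der Aalst, W.M.P., Hofstede, A., et al. (2003). Workflow Patterns. Distributed and Parallel Databases, 14(3), 5-51.
van der Aalst, W.M.P., Weske, M., and Grunbauer, D. (2005). Case Handling: A New Paradigm for Business Process Support. Data and Knowledge Engineering, 53(2), 129-162.
Wroe, C., Goble, C., Goderis, A., et al. (2007). Recycling workflows and services through discovery and reuse. Concurrency and Computation: Practice and Experience, 19(2), 181-194.
Zhang, H., et al. (2008). Extending BPEL2.0 for Grid-Based Scientific Workflow Systems. Proceedings of IEEE Asia-Pacific Services Computing Conference (pp. 757-762). Yilan: IEEE CS Press.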