
Antonio Vallecillo
Universidad de Málaga, Spain
av@lcc.uma.es

Martin Gogolla
University of Bremen, Germany
gogolla@informatik.uni-bremen.de

This paper presents an extension of OCL to allow modellers to deal with random numbers and probability distributions in their OCL specifications. We show its implementation in the tool USE and discuss some advantages of this new feature for the validation and verification of models.

There are many situations in which there is a degree of uncertainty about some aspects, features or properties of the system to be modelled. For example, if we are modelling a community of human beings, what percentage of female persons do we want to include in our models? How many of them are expected to be left-handed? This may be important for generating the models used during testing, validation and verification, so that they are as accurate and representative as possible. Similarly, assuming that we are modelling a manufacturing system with UML and OCL [10], how should we handle the requirements that orders are received following an exponential distribution, or that 0.5% of the parts are produced with some kind of defect?

In most modelling and simulation environments, random numbers and probability distributions are used to combine definite knowledge (female, male; left-handed, right-handed) with an uncertain view of the result or the population for a test case. Expectations and assumptions that remain uncertain or imprecise at a high level are made precise, and can be realized, by stating the corresponding percentages or the probability distributions that some properties or parameters follow. In this way, confidence in the validation and verification processes can be increased by experimenting with different percentages (that have different likelihoods) and by inspecting the results obtained. Likewise, random numbers are used to generate test models with varying sizes and characteristics, in order to increase the coverage of the test cases.

In this paper, we present an extension to OCL that allows modellers to handle random numbers in their specifications, as well as probability distributions. We show its implementation in the tool USE [4], [5] (UML-based Specification Environment), and discuss some examples for the validation and verification of models.

Non-determinism and random-like behavior are already present in OCL, for example, through the operation any(). Most proof-theoretic OCL approaches, such as [1] or [3], do not address the evaluation of equations involving any(), for example, Set{7,8}->any(true) = Set{7,8}->any(true). In principle, an OCL evaluator could give different results for the two calls of any(), although USE [4] (and every other OCL evaluator that we are aware of) gives the same result for both calls. There are also other OCL operations that introduce non-deterministic, random-like effects, for example, when one converts a collection without order into an order-aware collection, e.g., in Bag{7,8,7}->asSequence(). However, randomness also involves returning different (valid) values in separate executions of the operations.

The paper is organized in 7 sections. After this introduction, Sect. 2 provides an overview of what OCL currently offers and what should be required. Then, Sect. 3 and Sect. 4 describe the OCL extensions to deal with random numbers and probability distributions, respectively. Section 5 illustrates the proposal with a couple of examples, and Sect. 6 gives details about how the new OCL operations have been implemented. Finally, Sect. 7 concludes and outlines some lines of future work.

II. RANDOMNESS IN OCL

Non-determinism and random-like behavior are already present in OCL, mainly through the collection operations asSequence() and any() [8]. More precisely, OCL defines any() as follows: "Returns any element in the source collection for which body evaluates to true. Returns invalid if any body evaluates to invalid for any element, otherwise if there are one or more elements for which body is true, an indeterminate choice of one of them is returned, otherwise the result is invalid."

Then, the OCL standard [8, 11.9.1] formally specifies it as follows:

    source->any(iterator | body) =
      source->select(iterator | body)->asSequence()->first()

As we can see, its indeterminism rests on the behaviour of operation asSequence(), which is defined for the general type Collection and returns a Sequence that contains all the elements from self, in an order dependent on the particular concrete collection type. Its specification is as follows:

    post: result->forAll(elem | self->includes(elem))
    post: self->forAll(elem | result->includes(elem))

For example, for a given Set it returns a Sequence that contains all the set elements, in undefined order.

    post: result->forAll(elem | self->includes(elem))
    post: self->forAll(elem | result->count(elem) = 1)

As mentioned above, although in theory this allows an OCL evaluator to return a different value for the same Set every time it is executed, in practice this does not happen and the same element is always returned.

The problem is that, for the other collection types, the OCL specification of any() seems to be wrong: it yields an indeterminate choice only for Set and Bag collections, because for the other two there is no indeterminism at all. More precisely, for Bag it may work, since operation asSequence() returns a Sequence that contains all the elements from self, in undefined order.

    post: result->forAll(elem | self->count(elem) = result->count(elem))
    post: self->forAll(elem | self->count(elem) = result->count(elem))

However, the behaviour of asSequence() is completely deterministic for collections OrderedSet and Sequence. For the former, the operation returns a Sequence that contains all the elements from self, in the same order.

    post: Sequence{1..self.size()}->forAll(i | result->at(i) = self->at(i))

Similarly for Sequence collections, where asSequence() returns the Sequence identical to the object itself. This operation exists for convenience reasons.

    post: result = self

This means that any() applied to a Sequence or an OrderedSet will always return its first element, and not an indeterminate choice of one of them as its specification requires.

This is why we propose the following specification for operation any(), which does not have this problem:

    post: self->includes(result)
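The difference between the two readings of any() can be illustrated outside OCL. The following Ruby fragment is only a sketch under our own assumptions: the method names any_via_first and any_via_choice are hypothetical, a Ruby Array stands in for an OCL Sequence, and the second method still filters by the body before choosing, which the postcondition above does not state explicitly.

```ruby
# Sketch: two readings of OCL's any() on an ordered collection.
# All method names are hypothetical and used for illustration only.

# Standard definition: select -> asSequence -> first.
# For a Sequence (here a Ruby Array) asSequence is the identity, so the
# result is always the first element that satisfies the body.
def any_via_first(seq)
  seq.select { |e| yield(e) }.first
end

# Reading based on the postcondition self->includes(result): any member
# (satisfying the body) may be returned; here one is drawn at random.
def any_via_choice(seq, rng)
  seq.select { |e| yield(e) }.sample(random: rng)
end

seq = [7, 8, 9]
rng = Random.new(2024) # arbitrary seed, assumed for repeatability

firsts  = Array.new(20) { any_via_first(seq) { |_| true } }
choices = Array.new(20) { any_via_choice(seq, rng) { |_| true } }

puts "first-based results:  #{firsts.uniq.inspect}"
puts "choice-based results: #{choices.uniq.sort.inspect}"
```

The first method can never return anything but the head of the sequence, whereas the second only guarantees membership, which is all the proposed postcondition asks for.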

III. SPECIFYING RANDOM NUMBERS IN OCL

Random numbers are generated by extending OCL type Real with an operation called rand(). If x.oclIsOfType(Real) then x.rand() returns a random Real number between 0 and x.

    context Real::rand() : Real
    post indeterminism:
      if self > 0.0 then
        (0.0 <= result) and (result < self)
      else if self < 0.0 then
        (result <= 0.0) and (self < result)
      else /* self = 0.0 */
        result = self
      endif endif

For example, 1.rand() returns a random number in the interval [0..1). If you need a number in the interval [a..b) you can use the expression "a + (b-a).rand()".
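The ranges stated in the postcondition, and the interval idiom, can be checked with a small Ruby sketch. The helper names ocl_rand and rand_between are hypothetical (they are not part of USE), and the half-open bounds hold up to floating-point rounding.

```ruby
# Sketch of the rand() postcondition; ocl_rand and rand_between are
# hypothetical helper names, not operations provided by USE.

# x.rand(): a value in [0, x) for x > 0, in (x, 0] for x < 0, 0.0 for x = 0.
def ocl_rand(x, rng = Random)
  x * rng.rand
end

# a + (b - a).rand(): a value in [a, b), assuming a < b.
def rand_between(a, b, rng = Random)
  a + ocl_rand(b - a, rng)
end

rng = Random.new(1) # arbitrary seed, assumed for repeatability
samples = Array.new(5) { rand_between(2.0, 5.0, rng) }
puts samples.inspect
puts samples.all? { |v| v >= 2.0 && v < 5.0 }
```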

Note that every invocation of operation rand() may return a different number, and that randomness imposes a requirement beyond the postcondition (indeterminism) expressed above. This is why operation any() is not enough to implement random numbers. Randomness also requires that the sequence of results obtained by consecutive calls to operation rand() contains no recognizable patterns or regularities, i.e., that the sequence is statistically random [7]. However, specifying this property in OCL deserves its own line of research [2] and is postponed for future work.

In addition, we need seeds. Operation srand() permits knowing and changing the seed for the random number generator. It is defined over Integers:

    context Integer::srand() : Integer

Then, given an integer n, if n > 0 then n.srand() starts a new random sequence with n as the new seed (the seed is an integer), and returns the value of the previous seed. To accommodate the current capabilities of USE, we decided to restrict seeds to integer values below 10^7. Thus, this operation takes the given value modulo 10^7. If n <= 0, this operation generates a seed automatically using the current time and other system values.

To illustrate how these operations work, the following listing shows their results when executed in USE:

IV. PROBABILITY DISTRIBUTIONS IN OCL

Using the random number generator operation, it is easy to build the most commonly used probability distribution functions:

    context Real::normalDistr(s : Real) : Real
    context Real::pdf01() : Real
    context Real::pdf(m : Real, s : Real) : Real
    context Real::cdf01() : Real
    context Real::cdf(m : Real, s : Real) : Real
    context Real::expDistr() : Real

They are all defined as extensions to type Real. With this, if x is a real number, then x.normalDistr(s) returns a value drawn from a Normal (Gaussian) distribution N(x, s) (for example, 0.normalDistr(1) returns a value of an N(0, 1) distribution); x.pdf01() returns the value of the probability density function PDF(x) of a Gaussian distribution N(0, 1), i.e., with mean = 0 and σ = 1; x.pdf(m,s) returns the value of the probability density function PDF(x) of an N(m, s); x.cdf01() returns the value of the cumulative distribution function CDF(x) of a Gaussian distribution N(0, 1); x.cdf(m,s) returns the value of the cumulative distribution function CDF(x) of an N(m, s); and x.expDistr() returns a value drawn from an exponential distribution with mean x, i.e., Exp(1/x), or 0.0 if x = 0.0.
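How such operations can be derived from a uniform random source may be sketched in Ruby as follows. This is an illustration under our own assumptions rather than the code shipped with USE: the method names, the seed and the sample size are hypothetical, the exponential sampler uses plain inverse transform sampling, and the normal sampler uses the Box-Muller transform.

```ruby
# Sketch: building distributions on top of a uniform generator.
# Method names, seed and sample sizes are illustrative assumptions.

# Exponential with the given mean (inverse transform sampling).
# 1.0 - rng.rand lies in (0, 1], which keeps Math.log away from zero.
def exp_sample(mean, rng)
  return 0.0 if mean == 0.0
  -mean * Math.log(1.0 - rng.rand)
end

# Normal N(m, s) via the Box-Muller transform.
def normal_sample(m, s, rng)
  u1 = 1.0 - rng.rand
  u2 = rng.rand
  m + s * Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math::PI * u2)
end

# Density of N(m, s) evaluated at x.
def normal_pdf(x, m, s)
  Math.exp(-0.5 * ((x - m) / s)**2) / (s * Math.sqrt(2.0 * Math::PI))
end

rng = Random.new(12_345)
n = 100_000
exp_mean  = Array.new(n) { exp_sample(5.0, rng) }.sum / n
norm_mean = Array.new(n) { normal_sample(0.0, 1.0, rng) }.sum / n
puts format("empirical mean of Exp samples:    %.3f", exp_mean)
puts format("empirical mean of N(0,1) samples: %.3f", norm_mean)
```

With enough samples, the empirical means should approach the requested mean of the exponential distribution and the mean m of the normal distribution.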

V. TWO SIMPLE EXAMPLES

A. A Production System

To illustrate our proposal let us suppose first a simple production system whose metamodel is depicted in Fig. 1. Producers produce items that are placed in trays (bounded buffers), from where consumers collect them when informed that elements are ready (by operation elementReady()), polish them, and finally place them in the storage trays. We want to simulate the system with some uncertainty about the time producers take to generate items and the probability that producers and consumers introduce defects when handling the items.

For example, suppose that we want producers to produce items according to an exponential distribution with mean 5.0, and that the probability that a machine introduces a defect is 0.05. Using our OCL extension and its implementation in USE, the description of operations Producer::produce() and Consumer::elementReady() is as follows:

    produce() : Item
    begin
      result := new Item;
      self.counter := self.counter + 1;
      result.productionTime := self.meanProductionTime.expDistr();
      result.polished := false;
      result.defective := if 1.rand() < 0.05 then true else false endif;
    end

    elementReady()
    begin
      declare it : Item;
      it := self.input.get();
      it.polished := if 1.rand() < 0.05 then false else true endif;
      it.defective := it.defective or
                      (if 1.rand() < 0.05 then true else false endif);
      self.storageTray.put(it);
      self.counter := self.counter + 1;
    end
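The effect of the two defect sources can be estimated with a short Monte Carlo run. The Ruby fragment below is a simplified stand-in for the SOIL operations, not a translation of the USE model: the Item struct, the method names, the seed and the sample size are assumptions made for this sketch. Since each stage adds a defect independently with probability 0.05, the expected share of defective items after both stages is 1 - 0.95^2 = 0.0975.

```ruby
# Sketch: share of defective items after the producer and consumer stages.
# Item and the method names are simplified stand-ins for the SOIL code.

Item = Struct.new(:production_time, :polished, :defective)

DEFECT_PROB = 0.05

def produce(mean_time, rng)
  time = -mean_time * Math.log(1.0 - rng.rand) # exponential production time
  Item.new(time, false, rng.rand < DEFECT_PROB)
end

def element_ready(item, rng)
  item.polished = !(rng.rand < DEFECT_PROB)    # polishing fails with prob. 0.05
  extra_defect = rng.rand < DEFECT_PROB        # consumer may add a defect
  item.defective = item.defective || extra_defect
  item
end

def defective_share(n, rng)
  items = Array.new(n) { element_ready(produce(5.0, rng), rng) }
  items.count(&:defective).to_f / n
end

rng = Random.new(99) # arbitrary seed, assumed for repeatability
puts format("defective share: %.4f (analytic: %.4f)",
            defective_share(200_000, rng), 1.0 - 0.95**2)
```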

One possible result of executing the system after the production of 3 items is shown in Fig. 2.

B. A Social Network

Random numbers can also be used to determine other parameters of the system, or even the number of objects that we would like to have in our test models.

In Fig. 3 a simple model of a social network is shown. For validation purposes, two object diagrams (shown in Fig. 4) have been generated by operation generate() using the proposed random features. The generated object diagrams differ with respect to attribute values and the structure that is defined by the Friendship links.

The definition of operation generate() is given below. The example demonstrates that the newly introduced OCL features support the generation of test cases with different characteristics.

    generate(numObj : Int, numLink : Int)
    begin
      declare i : Int, p, q : Profile, ps : Seq(Profile);
      ps := Sequence{};
      for i in Sequence{1..numObj} do
        p := new Profile;
        ps := ps->including(p);
        p.firstN := names->at(1 + names->size().rand().floor());
      end;
      for i in Sequence{1..numLink} do
        p := ps->at(1 + ps->size().rand().floor());
        q := ps->at(1 + ps->size().rand().floor());
        if p.inviter->excludes(q) and p.invitee->excludes(q) then
          insert (p, q) into Friendship
        end;
      end;
    end
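The index idiom 1 + n.rand().floor() selects a uniformly distributed position in 1..n, and the guard on inviter and invitee prevents linking the same pair twice. A rough Ruby sketch of the same idea follows; the names random_index and generate, the representation of links as index pairs, and the seed are assumptions of this sketch and not part of the USE model.

```ruby
require "set"

# Sketch of generate(): random profiles and friendship links.
# Names, the pair-based link representation and the seed are assumptions.

# OCL idiom 1 + n.rand().floor(): a uniformly chosen 1-based index in 1..n.
def random_index(n, rng)
  1 + (n * rng.rand).floor
end

def generate(num_obj, num_link, names, rng)
  # Each profile is represented by its randomly chosen first name.
  profiles = Array.new(num_obj) { names[random_index(names.size, rng) - 1] }
  links = Set.new
  num_link.times do
    a = random_index(num_obj, rng)
    b = random_index(num_obj, rng)
    # Mirror the excludes() guard: skip pairs already linked in either direction.
    links << [a, b] unless links.include?([a, b]) || links.include?([b, a])
  end
  [profiles, links]
end

profiles, links = generate(5, 8, %w[Ada Bob Cyd], Random.new(7))
puts profiles.inspect
puts links.to_a.inspect
```

Because rejected pairs are simply skipped, the number of links created is at most numLink, so two runs with the same arguments can differ both in attribute values and in link structure.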

Let us now describe how we have implemented this extension in USE. First, USE provides an extension mechanism that permits adding operations to basic types. The folder oclextensions in the USE directory holds files with the signatures of the new operations and their implementations in Ruby [9].

For example, to add operation sqrt to OCL type Real we use the following piece of code in one of the files (e.g., Real.xml) in the oclextensions folder:

    <operation source="Real" name="sqrt" returnType="Real">
      <body><![CDATA[
        Math.sqrt($self)
      ]]></body>
    </operation>

Making use of this mechanism, and the Random library available in Ruby, the implementation of the rand() and srand() operations is simple:

    <operation source="Real" name="rand" returnType="Real">
      <body><![CDATA[
        $self * Random.rand
      ]]></body>
    </operation>
    <operation source="Integer" name="srand" returnType="Integer">
      <body><![CDATA[
        if $self > 0
          return Random.srand($self) % 10000000
        else
          return Random.srand() % 10000000
        end
      ]]></body>
    </operation>

If self is positive then srand() starts a new random sequence with self as the new seed (the seed is an integer), and it returns the seed that was in use before the call. Given that Ruby's initial seed is a huge integer number that cannot be handled by USE, this operation takes the result modulo 10^7. If self is less than or equal to 0 then the operation uses the default Ruby srand() operation, which generates a seed automatically using the time and other system values.
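The behaviour of seeds can be tried directly in Ruby. In the sketch below, ocl_srand is a hypothetical wrapper that mirrors the description above; it relies on the fact that Ruby's Random.srand returns the previous seed and that Random.rand draws from the generator it seeds.

```ruby
# Sketch of the srand() behaviour using Ruby's global generator.
# SEED_LIMIT mirrors the 10^7 bound described in the text.
SEED_LIMIT = 10_000_000

# Hypothetical wrapper: reseed and report the previous seed modulo 10^7.
def ocl_srand(n)
  previous = n > 0 ? Random.srand(n) : Random.srand
  previous % SEED_LIMIT
end

ocl_srand(42)
first_run = Array.new(3) { Random.rand }

previous = ocl_srand(42) # reseeding returns the seed set by the first call
second_run = Array.new(3) { Random.rand }

puts previous
puts first_run == second_run # same seed, same sequence
```

Reusing a seed reproduces the same sequence of random numbers, which is what makes randomly generated test models repeatable.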

Finally, we have also implemented the probability distributions mentioned in Sect. 4 and show some of them in the following listing.

    <operation source="Real" name="expDistr" returnType="Real">
      <body><![CDATA[
        if $self != 0
          return $self * (6.9077553 - Math.log(Random.rand(1000) + 1))
        else
          return 0.0
        end
      ]]></body>
    </operation>
    <operation source="Real" name="normalDistr" returnType="Real">
      <parameter>
        <par name="s" type="Real" />
      </parameter>
      <body><![CDATA[
        return $self + ($s * Math.sqrt(-2.0 * Math.log(Random.rand))
                           * Math.cos(6.283185307 * Random.rand))
      ]]></body>
    </operation>
    <operation source="Real" name="pdf" returnType="Real">
      <parameter>
        <par name="m" type="Real" />
        <par name="s" type="Real" />
      </parameter>
      <body><![CDATA[
        (1.0 / (Math.sqrt(2 * Math::PI))) *
          Math::exp(-(((($self - $m) / $s) ** 2) / 2.0)) / $s
      ]]></body>
    </operation>
    <operation source="Real" name="cdf" returnType="Real">
      <parameter>
        <par name="m" type="Real" />
        <par name="s" type="Real" />
      </parameter>
      <body><![CDATA[
        # Distribution.Normal.cdf(($self - $m) / $s)
        def cdf01(z)
          # Early exits for the far tails and the midpoint
          return 0.0 if z < -12
          return 1.0 if z > 12
          return 0.5 if z == 0.0
          if z > 0.0
            e = true
          else
            e = false
            z = -z
          end
          z = z.to_f
          z2 = z * z
          t = q = z * Math.exp(-0.5 * z2) / (Math.sqrt(2 * Math::PI))
          # Sum the series until further terms no longer change the result
          3.step(199, 2) do |i|
            prev = q
            t *= z2 / i
            q += t
            if q <= prev
              return (e ? 0.5 + q : 0.5 - q)
            end
          end
          e ? 1.0 : 0.0
        end
        cdf01(($self - $m) / $s)
      ]]></body>
    </operation>
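As a plausibility check, a series expansion of the standard normal CDF can be compared with the closed form based on the error function, CDF(z) = (1 + erf(z / sqrt(2))) / 2. The following Ruby sketch does this for a few values; the method names cdf01_series and cdf01_erf are hypothetical, and the series is written here as a stand-alone method rather than as a USE extension.

```ruby
# Sketch: series for the standard normal CDF versus the Math.erf closed form.
# Method names are illustrative and not part of the USE extension files.

def cdf01_series(z)
  return 0.0 if z < -12
  return 1.0 if z > 12
  return 0.5 if z == 0.0

  positive = z > 0.0
  z = z.abs.to_f
  z2 = z * z
  t = q = z * Math.exp(-0.5 * z2) / Math.sqrt(2 * Math::PI)
  3.step(199, 2) do |i|
    prev = q
    t *= z2 / i
    q += t
    # Stop once adding further terms no longer changes the sum.
    return positive ? 0.5 + q : 0.5 - q if q <= prev
  end
  positive ? 1.0 : 0.0
end

def cdf01_erf(z)
  0.5 * (1.0 + Math.erf(z / Math.sqrt(2.0)))
end

[-2.0, -0.5, 1.0, 3.0].each do |z|
  puts format("z=%5.2f  series=%.10f  erf=%.10f",
              z, cdf01_series(z), cdf01_erf(z))
end
```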

VII. CONCLUSIONS

In this paper we have introduced a simple extension of OCL to deal with random numbers and probability distributions in OCL specifications. It uses the USE extension mechanisms to implement the new operations, employing the underlying Ruby implementation and some of its supported functions. All files and operations described here can be downloaded from https://www.dropbox.com/s/2j9tgejbj507id0/oclextensions.zip?dl=0. To our knowledge, the only similar proposal is [6], a modelling framework for the predictive analysis of architectural properties.

Counting on these new operations offers interesting benefits to model developers and testers. For example, they are now able to capture some assumptions of the real world that correspond to stochastic events, or for which there is little information. We are also able to generate random sets of models, and models with random values in their elements' attributes, thus permitting richer input test suites for model-based testing.

Our current plans for extensions of this work include experimentation with larger case studies, in order to analyze the applicability and expressiveness of our approach, and the addition of further probability distributions that could be required in other situations.

Acknowledgments. This work was supported by Research Project TIN2014-52034-R.

[1] T. Baar. Non-deterministic Constructs in OCL: What Does any() Mean. In Proc. 12th Int. SDL Forum, LNCS 3530, pages 32-46, 2005.

[2] Robert Bill, Achim D. Brucker, Jordi Cabot, Martin Gogolla, Antonio Vallecillo, and Edward D. Willink. Workshop in OCL and Textual Modelling: Report on Recent Trends and Panel Discussions. In Proc. of STAF 2017 Satellite Events, LNCS. Springer, 2017.

[3] A.D. Brucker and B. Wolff. HOL-OCL: A Formal Proof Environment for UML/OCL. In Proc. FASE'08, LNCS 4961, pages 97-100, 2008.

[4] M. Gogolla, F. Büttner, and M. Richters. USE: A UML-based specification environment for validating UML and OCL. Science of Computer Programming, 69:27-34, 2007.