=Paper= {{Paper |id=Vol-2019/flexmde_1 |storemode=property |title=Agile Model-driven Engineering of Financial Applications |pdfUrl=https://ceur-ws.org/Vol-2019/flexmde_1.pdf |volume=Vol-2019 |authors=Kevin Lano,Howard Haughton,Sobhan Yassipour-Tehrani,Hessa Alfraihi |dblpUrl=https://dblp.org/rec/conf/models/LanoHTA17 }} ==Agile Model-driven Engineering of Financial Applications== https://ceur-ws.org/Vol-2019/flexmde_1.pdf
            Agile model-driven engineering of financial applications

                                Kevin Lano, Howard Haughton, Sobhan Yassipour-Tehrani, Hessa Alfraihi

                                                                Dept. of Informatics
                                                        King’s College London, London, UK
                                      Email: { kevin.lano, sobhan.yassipour tehrani, hessa.alfraihi }@kcl.ac.uk

Abstract—Flexibility and agility are key properties for financial applica-    (i) they involve complex mathematical computations, which need to
tions development: systems need to be easy to change and evolve to keep       be specified in a clear and comprehensible manner and related to the
up with changing market conditions and opportunities, and changes in
                                                                              appropriate finance theory, and (ii) they have high requirements for
the regulatory environment.
                                                                              efficiency and accuracy. In addition, (iii) a high degree of flexibility
     In many cases, Excel with VB is the main platform used for the           and agility is typically necessary.
development of applications in the finance industry. This offers advantages
of flexibility and user customisation of applications in rapid development        Our solution to these challenges involved using agile MDD and
cycles. However, this dependence upon a proprietary software package          executable modelling as follows:
causes problems when applications must be migrated to an alternative
platform, or when precise control over functionality is needed. Model-            •     Specification of applications in an executable UML dialect,
driven development (MDD) and model transformations have a good                          UML-RSDS [9], which supports efficient code generation in
potential for improving these aspects of financial development. In this
paper, we describe techniques that enable the use of Excel as an MDD                    Java, C#, C and C++.
front-end for the specification and development of financial applications
                                                                                  •     Adoption of an agile process based on the Scrum method
in a rigorous manner.
                                                                                        [14].

                           I.   I NTRODUCTION                                     •     Using Excel to provide a user-friendly means of defining
                                                                                        UML-RSDS specifications, so that end-users do not need to
    Software development in the financial services sector is usually                    write or understand OCL.
carried out on Excel and VBA-based platforms, with modelling and
prototyping performed using Excel [8], [7]. After prototyping, the                 We considered that this approach was suitable to address issues
Excel/VBA solution may be manually re-coded in an implementation              (i), (ii) and (iii), because UML-RSDS supports high-level declarative
platform (Figure 1). This approach has deficiencies with regard to            specification of systems using UML class diagrams and use cases,
maintainability, software architecture and documentation. Separate            enabling the system specification to be expressed in the context
spreadsheet-based applications are difficult to integrate into composite      of relevant financial concepts, and to be structured in a systematic
systems, and lack the structure and modularity of more advanced               manner. In addition, it supports the execution of specifications by
programming environments.                                                     the synthesis of designs and programming language implementations.
                                                                              UML-RSDS supports agility by enabling rapid modification of system
                                                                              specifications: only one application model (integrating class diagrams
                                                                              and use cases) needs to be modified, in contrast to MDD approaches
                                                                              which involve multiple linked models whose consistency needs to
                                                                              be managed at each specification change. The integration of agility
                                                                              and MDD in this case is achieved by making the MDD aspect as
                                                                              lightweight as possible, and by adopting a ‘model as code’ viewpoint,
                                                                              with agile practices applied to the application models instead of the
                                                                              code.
                                                                                 In Section II we give an overview of the approach. In Sections
                                                                              IV and V we describe an example financial application we have
                                                                              developed. Section VI gives related work, and Section VII gives
                                                                              conclusions.

                                                                               II.    I NTEGRATING AGILE AND MODEL - DRIVEN D EVELOPMENT
                                                                                  The key principles of agile development include (agilemani-
                                                                              festo.org): (i) satisfy the customer through early and continuous
                                                                              software delivery; (ii) welcome changing requirements; (iii) deliver
Figure 1.   Conventional financial application development process            working software frequently (every 2 weeks to every 2 months); (iv)
                                                                              business people and developers to work together daily; (v) rely on
                                                                              face-to-face communication to convey information; (vi) continuous
     Starting in 2014, the MDD team at King’s College were asked
                                                                              attention to software quality; (vii) simplicity is essential.
to look at alternative model-based approaches for specifying and
implementing financial software for multiple platforms. The charac-               Integration of agile and MDD aims to achieve these principles
teristics of such software represent a challenge for MDD, because:            in the context of model-centered development, instead of traditional
code-centered development. Research on integration has produced a              Typically, a migration or refinement model transformation in-
number of agile MDD approaches, such as MDD-SLAP [15] and                  volves the mapping of instances a : A of a source entity type A to
Hybrid MDD [4]. These usually focus on a particular domain, and            instances b : B of a target entity type B. Such mappings are formally
do not cover all lifecycle stages. In contrast, UML-RSDS can be            expressed by postconditions
used as the basis of a general-purpose agile MDD approach, with
                                                                                A ::
specialised interfaces and tools for particular domains. We adopt the
                                                                                       ACond ⇒ B→exists(b | P(self , b))
ideas of three-phase iterations from [15] and parallel tooling teams
from [4] (as shown in Figure 2).                                           of a use case that defines the transformation. The postcondition asserts
                                                                           that (at termination of the transformation) for every instance self of A
                                                                           that satisfies ACond, there must exist an instance b of B that satisfies
                                                                           P(self , b).
                                                                               Many financial applications involve processing of a similar form,
                                                                           where some source data, such as market data, is used as input to
                                                                           derive secondary information and analysis via various functions and
                                                                           procedures. For example, the derivation of the internal rate of return
                                                                           or Macaulay duration for a coupon bond, or the derivation of a price
                                                                           for a derivative security, based on market information.
                                                                               In Excel, the application data is typically laid out as a table,
                                                                           with columns for each significant attribute of the data (Table Bond in
                                                                           Figure 3). Derived data may be placed in cells of additional columns
                                                                           of the table, or in the cells of separate tables (such as BondYield). The
                                                                           second situation corresponds directly to rules of migration/refinement
                                                                           model transformations, where A is a source class (eg., Bond), and B
                                                                           a target (eg., BondYield).


Figure 2.   UML-RSDS agile MDD process: iterations


     To achieve the agile principles, we adopt the following agile
practices with UML-RSDS: (i) short iterations (principles (i) and
(iii)); (ii) model refactoring (principle (vi)); (iii) emphasis on model
and specification simplicity (principle (vii)); (iv) product and itera-
tion backlogs (principles (i), (ii)); (v) Scrumboards (principles (i),
(ii)); (vi) daily standup meetings, continuous integration and testing
(principle (vi)); (vii) Customer involvement in modelling (principle
(iv)).
    The following MDD practices are used: (i) metamodelling; (ii)
transformations; (iii) executable modelling.
    Within each iteration, phases of requirements analysis, specifi-
cation, implementation and testing are applied to each task, using
exploratory prototyping based on partial specifications, and review of
the prototypes/specifications with stakeholders/team members. Figure       Figure 3.    Excel spreadsheet example
2 shows these cycles for the main team on the LHS. Reuse of
components, and the contribution of new/updated components to a
library, can take place during these iterations. A tool development            The correspondence between Excel and UML is as follows:
team applies a similar agile process (on the RHS) to respond to tool           •       An Excel table defines a UML class with attributes for
support requirements from the main team.                                               each column of the table, with the attribute types obtained
    Prototyping and testing of completed specifications is followed                    from the data in an example row of the table. For example,
by integration with other software elements, and iterative revision of                 the Bond table defines a UML class Bond with an integer
specifications as necessary to pass tests and efficiency requirements.                 attribute Settlement, double attribute Price, etc.
                                                                                       Multiple source and target classes can be defined as tables
                                                                                       in the same spreadsheet.
              III.   MDD SPECIFICATION USING E XCEL
                                                                               •       The spreadsheet functionality corresponds to a model trans-
     Class diagrams and use cases are the core specification notations
                                                                                       formation (defined as a use case, sheet) operating on the
used in UML-RSDS to specify applications (including model trans-
                                                                                       class diagram corresponding to the tables.
formations). OCL constraints are used to define the logical postcon-
ditions of operations and use cases, and to define the procedural steps        •       When data item T of table B is defined as a function f (A.S)
(activities) of their designs [9].                                                     of the data S of another table A, this corresponds to a
         postcondition constraint                                           (+ve cash flows to the investor), and repayment of capital at the
                                                                            end of term (Figure 4). Eg.: a £100 bond is purchased for a price
              A ::
                                                                            of £105, with an investment term of 10 years, it pays 8% annual
                     true ⇒ B→exists(b | b.BId = AId & b.T = f (S))
                                                                            interest bi-annually (20 payments of £4), then £100 capital repayment
         in cases where f expects single data values, and to a              (redemption) at the end of the term. In contrast, a zero-coupon bond
         constraint                                                         only pays back accumulated gains and capital at the end of its term.
              ::
                     AS = f (A→collect(S)) ⇒ B→exists(b | b.T = AS)
         in cases where f expects aggregate data (eg., f is a function
         such as the SUM, AVERAGE, etc of a column/sequence of
         individual data items).
         In the first case, each instance (table row) of A is mapped
         to an instance (table row) of B. In the second, a single B
         instance is created with its attribute values derived from
         aggregates of A data.

The dataflow dependencies of the spreadsheet must be acyclic [7], that
is, if data item y depends on data x directly or indirectly, there cannot
be a dependency in the reverse direction. In the UML representation,
this means that the constraints can be ordered in a sequence where
all data that is read by one constraint Ci is directly available or has
been produced by earlier constraints Cj for j < i. This is the usual        Figure 4.   Finance domain bond classes in ExcelLib library
organisation of use case constraints in UML-RSDS, using the Phased
Construction transformation design pattern [10].                                Using discrete compounding, the IRR is the rate r such that:

    An example is shown in Figure 5, which is derived from Figure 3.             price = Σflows.size
                                                                                          i=1        flows[i].amount/(1 + r)flows[i].timePoint +
The use case sheet expresses the overall functionality of the spread-                                       redemption/(1 + r)flows.last.timePoint
sheet application. The derivation of UML from Excel spreadsheets in
CSV format is performed automatically by the UML-RSDS tools.                 Time can be measured in days, months, years, etc.: r will
    Excel functions such as SUM, AVERAGE, DATE, RATE, DURA-             be the rate wrt this measure. Generally, r can be estimated by
TION, PRICE, etc have been formalised using OCL. These definitions      numerical approximation techniques, eg., the secant method or
are provided in a library class ExcelLib. For example:                  bisection. Excel uses the secant procedure to estimate the IRR,
                                                                        in the function RATE(npays : int, payment : double,
query SUMPRODUCT(s1 : Sequence(double),                                 price : double, fv : double) : double where npays
                s2 : Sequence(double)) : double                         is the number of coupon payments to be made, payment is the constant
pre: s1.size = s2.size                                                  amount of each payment, price is the present value of the investment,
post:                                                                   and fv (optional) is the redemption amount at termination of the
   result = Integer.subrange(1,s1.size)->collect( i | investment. The result value is an approximation to the IRR r. In
      s1[i]*s2[i] )->sum()
                                                                        ExcelLib we provide a mathematical specification of RATE based
                                                                        on the secant procedure. From the user’s perspective, they can use
    In our style of Excel specification, logical expressions such       RATE in the same way as in Excel. Behind the scenes, they are
as Bond.Settlement are used to denote data items/groups, instead        using the ExcelLib library specification written in OCL, and from this,
of physical spreadsheet locations/ranges such as B7:B17. Logical        procedural code in Java, C#, C or C++ can be automatically generated.
specifications of Excel functions and other financial and numerical     Note that in our version of OCL, computational numeric types are
functions are provided in ExcelLib using OCL, facilitating analysis     used: 32-bit and 64-bit integers, and IEEE 754 standard double-
and formal verification. In the reverse direction, the tables of a      precision floating-point numbers. This reduces the semantic gap
spreadsheet can be synthesised from a class diagram: each concrete      between the specification and implementation, improving verifiability.
class maps to a table, with the attributes of the class mapped to table
columns.                                                                     The operation defineFlows1() of Figure 4 is specified as:
    In Sections IV and V we describe two iterations of an agile
                                                                            defineFlows1()
MDD case study in finance, for a bond-pricing application. This
                                                                            pre: true
was developed in collaboration with an industrial expert in financial
                                                                            post:
software construction.
                                                                             Integer.subrange(1,
                                                                              (maturity*frequency).floor)->forAll( t |
  IV.   C ASE STUDY ITERATION 1: I NTERNAL RATE OF RETURN
                                                                                $CashFlow->exists( f | f.amount = coupon &
     The bond-pricing application iteration 1 computes the internal               f.timePoint = (t*1.0)/frequency &
rate of return (IRR) of a bond. The IRR measures the quality of an                f : flows ) )
investment: the effective rate at which the investment returns value
over its term. A coupon bond consists of a series of cash flows: an ini-    This creates all the positive cash flows from coupon payments over
tial payment (-ve cash flow), followed by coupon payments/dividends         the lifetime of the bond. (maturity in $Bond is the term of the bond).
    To derive the IRR of a series of bonds, the user specifies two
tables in a spreadsheet, as in Figure 3. The Bond table identifies
the attributes of the Bond class, and their types, whilst the BondYield
table also defines the mapping of Bond to BondYield, using the RATE
function. The function in cell D12 is

RATE((Bond.Maturity - Bond.Settlement) *
  Bond.Frequency,
    Bond.Coupon,
    Bond.Price, 100)

    The UML-RSDS tools convert the spreadsheet to the class dia-
gram of Figure 5, and derive the following postcondition constraint
of the use case sheet:

Application::
  bondx : Bond =>
     BondYield->exists( bondyieldx |
        bondyieldx.BondId = bondx.BondId &
        bondyieldx.Yield = RATE(                                              Figure 5.   Class diagram derived from Excel spreadsheet
          ( bondx.Maturity - bondx.Settlement ) *
             bondx.Frequency,
               bondx.Coupon,bondx.Price,100) &                                the equation:
        bondyieldx.Duration = DURATION(
              DATE(bondx.Settlement,1,1),                                          y(t) = β1 + β2 ∗ (1 − exp(−t/ λ1 ))/(t/ λ1 ) +
               DATE(bondx.Maturity,1,1),
               bondx.Coupon,                                                                  β3 ∗ ((1 − exp(−t/ λ2 ))/(t/ λ2 ) − exp(−t/ λ2 ))
               bondyieldx.Yield,
               bondx.Frequency) )                                             This models how the yield y(t) of a bond varies depending on its
                                                                              duration t.
The meaning of this postcondition is that for each Bond instance, a                The yield curve in this model has a long-term rate component
BondYield instance will be produced with its Yield value set to the           (β1 ), a short-term (2nd factor), and a ‘hump’ (3rd factor). The
IRR computed using RATE. In addition, we also compute the discrete            problem is to estimate the βi and λj , given market data – ‘fitting the
Macaulay duration of the bond (the time to maturity of the equivalent         curve’ to this data. Estimation procedures to find the NSS parameters
zero-coupon bond):                                                            that fit a given set of market bond data include genetic algorithms
                                                                              (GA) and Matlab’s fminsearch using a simplex algorithm [5].
     duration =
                                                                              We use a combination of genetic algorithms and the Nelder-Mead
         (Σflows.size
            i=1       flows[i].timePoint ∗ flows[i].amount/
                                                                              simplex algorithm [12]. These algorithms are specified using OCL in
                       (1 + yield)flows[i].timePoint )/
                                                                              ExcelLib.
               (Σflows.size
                 i=1        flows[i].amount/(1 + yield)flows[i].timePoint )
                                                                                   The adapted Nelson-Siegal-Svensson model is specified in UML-
This computation can use the previously-computed yield value of               RSDS by the nelsonsiegalx and nsx functions of the finance library
each BondYield instance. In the spreadsheet, the value of the Duration        class $Bond (Figure 4). These are then used to evaluate the fitness
column (cell E12) is expressed as:                                            of a candidate solution (a tuple of values for the parameters β1 , β2 ,
                                                                              β3 , λ1 , λ2 ), based on the sum of squares of differences between the
DURATION(DATE(Bond.Settlement,1,1),                                           actual and predicted interest rates for the market data.
    DATE(Bond.Maturity,1,1),
    Bond.Coupon, BondYield@pre.Yield,                                             NSS-estimation is performed by an ExcelLib function
    Bond.Frequency)                                                           ESTIMATENSX(yields:              Sequence(double), durations:
                                                                              Sequence(double)) : Sequence(double). This is called on
The @pre annotation indicates that the Yield derived by the preceding         BondYield.Yield, BondYield.Duration (considered as sequence
equation should be used. A continuous duration function DURA-                 data) to populate a result table NelsonSiegal.
TIONC is also available.                                                          ESTIMATENSX initialises a population for a genetic algorithm
                                                                              using the bond interest rate (yield) and duration data, and iterates
      V.   C ASE STUDY ITERATION 2: Y IELD CURVE FITTING                      the genetic algorithm until there is no further improvement in the
                                                                              maximum fitness in the population. The best result is then passed to
    In iteration 2 of the bond-pricing application, the yield and             a numerical optimiser, in our case the simplex algorithm.
duration derived in iteration 1 are used as input datapoints for
a process which attempts to find the best-fitting yield curve for                 Having derived the NSS parameters, these can then be used
interest rate data, according to some yield curve model such as               to calculate the fair price of bonds of different durations, using a
the Nelson-Siegal or Nelson-Siegal-Svensson models. The adapted               formalisation of the Excel PRICE function. This functionality was
Nelson-Siegal-Svensson (NSS) model [5] which we use is defined by             developed in iteration 3.
    We compared our formalisation of this bond-pricing procedure            These could then be used directly in the Excel-style specifications
with a traditional Excel/VBA solution, in terms of software size,           used by our approach.
accuracy and efficiency A Java implementation of the UML-RSDS
specification was used, applied to a test case of 8 coupon bonds                                       VII.    C ONCLUSIONS
ranging from 1 year to 12 year terms. Code size was reduced from
34K to 22K, accuracy improved from 2 ∗ 10−4 to 6 ∗ 10−6 , however               We have described techniques to combine agile MDD with
execution time increased from 550ms to 730ms, mainly due to use             conventional Excel-based financial application development. Excel
of a genetic algorithm.                                                     spreadsheets are interpreted as UML class diagrams operated on by
                                                                            model transformation specifications, and executable implementations
   The curve produced from the estimated parameters using our               of these transformations can then be synthesised in production-quality
approach is shown in Figure 6.                                              languages. In future work we will evaluate the usability of the
                                                                            approach with financial engineers, and extend the OCL finance library
                                                                            to provide comprehensive facilities for the use of MDD in the finance
                                                                            domain.

                                                                                                            R EFERENCES
                                                                             [1]   Cincom, JP Morgan derives clear benefits from Cincom Smalltalk,
                                                                                   www.cincom.com/pdf/CS040819-1.pdf, accessed October 2016.
                                                                             [2]   J. Cunha, J. Saraiva, J. Visser, From spreadsheets to relational databases
                                                                                   and back, PEPM 2009.
                                                                             [3]   J. Cunha, J. Fernandes, J. Mendes, H. Pacheco, J. Saraiva, Bidirectional
                                                                                   transformations of model-driven spreadsheets, ICMT 2012.
                                                                             [4]   G. Guta, W. Schreiner, D. Draheim, A lightweight MDSD process
                                                                                   applied in small projects, Proceedings 35th Euromicro conference on
                                                                                   Software Engineering and Advanced Applications, IEEE, 2009.
                                                                             [5]   M. Gilli, S. Grosse, E. Schumann, Calibrating the Nelson-Siegal-
                                                                                   Svensson model, COMISEF working paper WPS-031, 2010.
                                                                             [6]   O. Hammarlid, Aggregating sectors in the infectious defaults model,
Figure 6.   Estimated NSS curve
                                                                                   Quantitative Finance, vol. 4, no. 1, 2004, pp. 64–69.
                                                                             [7]   Y. Horry, Financial information description language and visualisa-
                                                                                   tion/analysis tools, Computer Languages, Systems and Structures 50
                        VI.   R ELATED WORK                                        (2017), pp. 31–52.
                                                                             [8]   A. Kumiega, B. Van Vliet, A software development methodology for
    The Kapital system used by JP Morgan has been a highly success-                research and prototyping in financial markets, Proc. European Spread-
ful alternative approach in finance [1], enabling financial engineers to           sheet Risks Int. Grp., pp. 107–127, 2006.
quickly prototype and implement new financial products. However,             [9]   K. Lano and S. Kolahdouz-Rahimi, Constraint-based specification of
Kapital is based on a specialised platform (Smalltalk). We consider                model transformations, Journal of Systems and Software, vol. 88, no.
that basing a financial engineering platform on language-independent               2, February 2013, pp. 412–436.
UML and OCL specifications will provide more flexible and durable           [10]   K. Lano, S. Kolahdouz-Rahimi, Model-transformation Design Patterns,
                                                                                   IEEE Transactions in Software Engineering, vol 40, 2014.
support than a language-specific solution. In [11] we describe the risk
                                                                            [11]   K. Lano, H. Alfraihi, S. Yassipour-Tehrani, H. Haughton, Improving
evaluation of multiple investment instruments [6] using UML-RSDS,                  the Application of Agile Model-based Development: Experiences from
this direct MDD approach was technically effective but was consid-                 Case Studies, ICSEA 2015.
ered too difficult for financial engineers to use. As an alternative, we    [12]   J. Nelder, R. Mead, A simplex method for function minimisation,
investigated the Excel-based approach described here. An application               Computer Journal, 7, pp. 308–313.
of agile MDD in finance is given in [13], however this is focussed on       [13]   M. B. Nakicenovic, An Agile Driven Architecture Modernization to a
the management of data and message formats instead of numerical                    Model-Driven Development Solution, International Journal on Advances
computations. In [2], spreadsheets are mapped to relational databases              in Software, vol 5, nos. 3, 4, 2012, pp. 308–322.
in order to optimise the spreadsheets. In [3], spreadsheet evolution is     [14]   K. Schwaber, M. Beedble, Agile software development with Scrum,
                                                                                   Pearson, 2012.
supported via a mapping to a semantic representation. These papers
                                                                            [15]   Y. Zhang, S. Patel, Agile model-driven development in practice, IEEE
use transformations to support spreadsheet-based development, whilst
                                                                                   Software, vol. 28, no. 2, pp. 84–91, 2011.
our work uses spreadsheets to define model transformations. A similar
data-flow logical specification of financial applications is described in
[7], although all the data derivations are performed within a single
table, whilst we separate data into possibly multiple source, target
and intermediate tables.

    QuantLib (quantlib.org) is a widely-used library of financial
analysis functions, written in C++ (interfaces and partially-ported
versions also exist for other programming languages). In contrast to
QuantLib, we define financial functions in a platform-independent
formalism (OCL), from which language-specific implementations
can be automatically generated. In future work, we aim to include
substantial parts of the QuantLib libraries in our OCL financial library.