=Paper= {{Paper |id=Vol-1560/paper6 |storemode=property |title=Testing Extensible Language Debuggers |pdfUrl=https://ceur-ws.org/Vol-1560/paper6.pdf |volume=Vol-1560 |authors=Domenik Pavletic,Syed Aoun Raza,Kolja Dummann,Kim Haßlbauer |dblpUrl=https://dblp.org/rec/conf/models/PavleticRDH15 }} ==Testing Extensible Language Debuggers== https://ceur-ws.org/Vol-1560/paper6.pdf
                Testing Extensible Language Debuggers
Domenik Pavletic, itemis AG, Stuttgart, Germany, pavletic@itemis.com
Syed Aoun Raza, Stuttgart, Germany, aoun.raza@gmail.com
Kolja Dummann, itemis AG, Stuttgart, Germany, dummann@itemis.com
Kim Haßlbauer, Stuttgart, Germany, kim.hasslbauer@gmail.com




   Abstract—Extensible languages allow incremental extensions of a host language with domain specific abstractions. Debuggers for such languages must be extensible as well to support debugging of different language extensions at their corresponding abstraction level. As such languages evolve over time, it is essential to constantly verify their debugging behavior. For this purpose, a General Purpose Language (GPL) can be used, however this increases the complexity and decreases the readability of tests. To reduce continuous verification effort, in this paper, we introduce DeTeL, an extensible Domain-Specific Language (DSL) for testing extensible language debuggers.

   Index Terms—Formal languages, Software debugging, Software testing.

I. INTRODUCTION

   Software development faces the challenge that GPLs do not provide the appropriate abstractions for domain-specific problems. Traditionally there are two approaches to overcome this issue. One is to use frameworks that provide domain-specific abstractions expressed with a GPL. This approach has very limited support for static semantics, e.g., no support for modifying constraints or type system. The second approach is to use external DSLs for expressing solutions to domain problems. This approach has some other drawbacks: these DSLs are not inherently extensible. Extensible languages solve these problems. Instead of having a single monolithic DSL, extensible languages enable modular and incremental extensions of a host language with domain specific abstractions [1].

   To make debugging extensible languages useful to the language user, it is not enough to debug programs after extensions have been translated back to the host language (using an existing debugger for the base language). A debugger for an extensible language must be extensible as well, to support debugging of modular language extensions at the same abstraction level (extension-level). Minimally, this means users can step through constructs provided by the extension and see watch expressions (e.g., variables) related to the extensions.

   Because language extensions can be based on other extensions and languages evolve over time, it is essential to constantly test if debugger behavior matches the expected behavior. To test debugging behavior, a GPL can be used, however this raises the same issues discussed above. We therefore propose in this paper DeTeL (Debugger Testing Language), an extensible DSL for testing debuggers.

II. MBEDDR

   mbeddr [2] is an extensible version of C that can be extended with modular, domain-specific extensions. It is built on top of JetBrains Meta Programming System (MPS) [3] and ships with a set of language extensions dedicated to embedded software development. mbeddr includes an extensible C99 implementation. Further, it also includes a set of predefined language extensions on top of C. These extensions include state machines, components and physical units.

   In MPS, language implementations are separated into aspects. The major aspects are Structure, Type System, Constraints, Generator and Editor. However, for building debugging support, the Editor aspect is irrelevant.

III. LANGUAGE EXTENSION FOR UNIT TESTING

   To give an idea of building language and debugger extensions, we first build MUnit, a language for writing unit tests, and a corresponding debugger extension. Later, we will describe how to test this debugger extension with a DSL.

A. Structure

   Fig. 1 shows the language structure: AssertStatement is derived from Statement and can therefore be used where Statements are expected. It contains an Expression for the condition. Testcase holds a StatementList that contains the Statements that make up the test. Further, to have the same scope as Function, Testcase implements IModuleContent. ExecuteTestExpression contains a list of TestcaseRef, which refer to Testcases to be executed.

Fig. 1. Language structure

B. Type System and Constraints

   AssertStatement requires a constraint and a type system rule. It restricts the usages only inside Testcases, meaning an AssertStatement can only be used in a Testcase:

 parentNode.ancestor.isNotNull

   It also restricts the type of its child expr (condition) to BooleanType, so only valid conditions can be entered:

 check(typeof(assertStatement.expr) :<=: );
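The two AssertStatement rules above can be pictured with a small Python sketch. It is not MPS code: the `Node` class, `has_ancestor`, `check_assert` and `make_assert` are hypothetical names introduced only for illustration, and the type rule is simplified to an exact match on a type name rather than the subtype check used in the rule above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical, heavily simplified AST node (not the MPS node API)."""
    concept: str                               # e.g. "Testcase" or "AssertStatement"
    type_name: Optional[str] = None            # static type, only set for expressions
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach child to this node and return the child."""
        child.parent = self
        self.children.append(child)
        return child


def has_ancestor(node: Node, concept: str) -> bool:
    """Constraint helper: is there an enclosing node of the given concept?"""
    current = node.parent
    while current is not None:
        if current.concept == concept:
            return True
        current = current.parent
    return False


def check_assert(assert_stmt: Node) -> List[str]:
    """Return the violations of the two AssertStatement rules."""
    errors = []
    if not has_ancestor(assert_stmt, "Testcase"):
        errors.append("AssertStatement is only allowed inside a Testcase")
    condition = assert_stmt.children[0]
    # Simplification: exact match instead of a subtype check.
    if condition.type_name != "boolean":
        errors.append("assert condition must be boolean")
    return errors


def make_assert(container: Node, condition_type: str) -> Node:
    """Build container -> StatementList -> AssertStatement -> Expression."""
    statements = container.add(Node("StatementList"))
    assert_stmt = statements.add(Node("AssertStatement"))
    assert_stmt.add(Node("Expression", type_name=condition_type))
    return assert_stmt


print(check_assert(make_assert(Node("Testcase"), "boolean")))   # no violations
print(check_assert(make_assert(Node("Function"), "int32")))     # both rules violated
```

In the sketch, the constraint walks up the parent chain, while the type rule only inspects the condition child. This mirrors the split between the Constraints and Type System aspects.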




   ExecuteTestExpression returns the number of failed unit tests, hence we specify Int32tType as its type (see rule below). Later, the same type is used in the generator.

 check(typeof(executeTestExpression) :==: );

C. Generator

   The MUnit generator consists of many different transformation rules, which translate code written with the language directly to mbeddr C. Listing 1 shows on the left hand side an example program, written with mbeddr C and MUnit. The right hand side shows the C program generated from it. While regular mbeddr C code is not colored, the boxes indicate how Abstract Syntax Tree (AST) nodes from the left are translated to C code on the right.

 int32 main(int32 argc,             int32_t main(int32_t argc,
     string[] argv) {                   char *(argv[])) {
   return test[forTest];              return blockexpr_2();
 }                                  }

                                    int32_t blockexpr_2(void) {
                                      int32_t _f = 0;
                                      _f += test_forTest();
                                      return _f;
                                    }

 testcase forTest {                 int32_t test_forTest() {
                                      int32_t _f = 0;
   int32 sum = 0;                     int32_t sum = 0;
   assert: sum == 0;                  if(!(sum == 0)) { _f++; }
   int32[] nums = {1, 2, 3};          int32_t[] nums = {1, 2, 3};
   for(int32 i=0;i<3;i++){            for(int32_t i=0;i<3;i++){
     sum += nums[i];                    sum += nums[i];
   }                                  }
   assert: sum == 6;                  if(!(sum == 6)) { _f++; }
                                      return _f;
 }                                  }

 Listing 1. Example mbeddr program using the unit test language on the left and the C code that has been generated from it on the right

IV. MBEDDR DEBUGGER FRAMEWORK

   mbeddr comes with a debugger, which allows users to debug their mbeddr code on the abstraction levels of the used languages. For that, each language contributes a debugger extension, which is built with a framework also provided by mbeddr [4]. Those extensions are always language-specific in contrast to domain-specific debuggers (e.g., the moldable debugger [5]), which provide application-specific debug actions and views on the program state. Hence, debugging support is implemented specifically for the language by lifting the call stack/program state from the base-level to the extension-level (see Fig. 2) and stepping/breakpoints vice versa.

Fig. 2. Flow of debug information between base and extension level [4]

   The debugger framework can be separated into two different parts: First, a DSL and a set of interfaces (shown in Fig. 3) for specifying the debugging semantics of language concepts. Second, a runtime for executing those specifications and thus achieving the mapping described in Fig. 2.

   In this section, we provide an overview of the specification part (see Fig. 3) that is required for understanding how the debugger extension for MUnit is built. While this paper concentrates on testing debuggers for extensible languages, we have published another paper [4] describing details about the debugger framework and its implementation with MPS.

Fig. 3. Meta-model used for specifying the debugging semantics of language concepts [4]. Colors indicate the different debugging aspects

A. Breakpoints

   Breakables are concepts (e.g., Statements) on which we can set breakpoints to suspend the program execution.

B. Watches

   WatchProviders are translated to low-level watches (e.g., Argument) or represent watches on the extension-level. They are declared inside WatchProviderScopes (e.g., StatementList), which is a nestable context.

C. Stepping

   Steppables define where program execution must suspend next, after the user steps over an instance of Steppable (e.g., Statement). If a Steppable contains a StepIntoable (e.g., FunctionCall), then the Steppable also supports step into. StepIntoables are concepts that branch execution into a SteppableComposite (e.g., Function).

   All stepping is implemented by setting low-level breakpoints and then resuming execution until one of these breakpoints is hit (approach is based on [6]). The particular stepping behavior is realized through stepping-related concepts by utilizing DebugStrategies.

D. Call Stack

   StackFrameContributors are concepts that have callable semantics on the extension-level or are translated to low-level callables (e.g., Functions). While the latter do not contribute any StackFrames to the high level call stack, the former contribute at least one StackFrame.

V. DEBUGGER EXTENSION FOR THE MUNIT LANGUAGE

   This section describes the implementation of a debugger extension for the MUnit language. This extension is defined with the mbeddr debugger specification DSL and the abstractions of the debugging meta-model shown in Fig. 3.
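The stepping scheme from Section IV-C (set low-level breakpoints, then resume until one of them is hit) can be illustrated with a small Python sketch. It is not the mbeddr implementation: `ToyBackend`, `step_over`, the line numbers and the `next_lines` table are assumptions made for illustration, and the backend replays a recorded trace of low-level lines instead of running a program.

```python
from typing import Dict, List, Optional, Set


class ToyBackend:
    """Hypothetical stand-in for a low-level debugger; replays a fixed trace."""

    def __init__(self, trace: List[int]) -> None:
        self.trace = trace                  # low-level lines in execution order
        self.position = 0                   # index of the line we are suspended on
        self.breakpoints: Set[int] = set()

    def current_line(self) -> int:
        return self.trace[self.position]

    def resume(self) -> Optional[int]:
        """Run until a breakpoint line is reached; None if the program ends."""
        for index in range(self.position + 1, len(self.trace)):
            if self.trace[index] in self.breakpoints:
                self.position = index
                return self.trace[index]
        self.position = len(self.trace) - 1
        return None


def step_over(backend: ToyBackend,
              next_lines: Dict[int, List[int]]) -> Optional[int]:
    """Step over the statement at the current line.

    next_lines maps the first low-level line of a statement to the lines
    where execution may continue once that statement has finished.
    """
    targets = set(next_lines.get(backend.current_line(), []))
    temporary = targets - backend.breakpoints   # keep user breakpoints intact
    backend.breakpoints |= temporary
    try:
        return backend.resume()
    finally:
        backend.breakpoints -= temporary


# One extension-level statement generated as lines 10-12, followed by 13 and 14.
backend = ToyBackend([10, 11, 12, 13, 14])
table = {10: [13], 13: [14]}
print(step_over(backend, table))   # suspends on the next statement
print(step_over(backend, table))
```

The point of the sketch is that the user never sees lines 11 and 12: a step over on the extension level skips all low-level lines that belong to the same statement.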



A. Breakpoints

   To enable breakpoints on AssertStatements, an implementation of the Breakable interface is required. AssertStatement is derived from Statement, which already implements this interface, thus breakpoints are already supported.

B. Watches

   Since ExecuteTestExpression's stack frame is not shown in the high-level call stack, none of its watches are mapped. In contrast, stack frames for Testcases are visible, thus we need to consider their watches. In case of Testcase, the LocalVariableDeclaration _f has no corresponding representation on the extension-level, and is therefore not shown (specified in listing below).

   The mbeddr debugger framework uses a pessimistic approach for lifting watches: those that should not be shown in the UI are marked as hidden. Otherwise, the debugger shows the low-level watch (in this case the C local variable _f) with its respective value.

 hide local variable with identifier "_f";

C. Stepping

   AssertStatement is a Statement, which already provides step over behavior. However, to be able to step into the condition, we overwrite Statement's step into behavior:

 break on nodes to step-into: this.expr;

   break on nodes searches in the condition for instances of StepIntoable and contributes their step into strategies.

   ExecuteTestExpression implements StepIntoable to allow step into the referenced Testcases. A minimal implementation puts a breakpoint in each Testcase:

 foreach testRef in this.tests {
    break on node: testRef.test.body.statements.first;
 }

D. Call Stack

   Testcase and ExecuteTestExpression are translated to base-level callables and therefore implement StackFrameContributor. They contribute StackFrames, each of which is linked to a base-level stack frame and states whether it is visible in the extension-level call stack or not.

   The implementation of ExecuteTestExpression links the low-level stack frame to the respective instance (see listing below). Further, it hides the frame from the high-level call stack, since ExecuteTestExpression has no callable semantics.

 contribute frame mapping for frames.select(name=getName());

   Similarly, the mapping for Testcase also requires linking the low-level stack frame to the respective instance. However, it declares to show the stack frame in the high-level call stack:

 String frameName = "test_" + this.name;
 contribute frame mapping for frames.select(name=frameName);

   Further, we provide the name of the actual Testcase, which is represented in the call stack view: Consider Listing 1, where we would show the name forTest instead of test_forTest.

VI. REQUIREMENTS

   The debugger testing DSL must allow us to verify at least four aspects: call stack, program state, breakpoints and stepping. To cover these aspects in DeTeL, this section delineates the corresponding requirements. While we consider some of those requirements as required (R), others are either context-specific (CS) or mbeddr-specific (MS).

A. Required

   R1 Debug state validation: Changes in generators can modify names of generated procedures or variables and this way, e.g., invalidate program state lifting in the debugger. To be able to identify those problems, we need a mechanism to validate the call stack, and for each of its frames the program state and the location where execution is suspended. For the call stack, a specification of expected stack frames with their respective names is required. In terms of program state, we need to verify the names of watches and their respective values, which can either be simple or complex. Further, a location specifies where program execution is expected to suspend and tests can be written for a specific platform.

   R2 Debug control: As in R1, generator changes also affect the stepping behavior. Consider changing the FunctionCall generator to inline the body of called functions instead of calling them. This change would require modifications in the implementation of step into as well. To be able to identify those problems, we need the ability to execute stepping commands (in, over and out) and specify locations where to break.

   R3 Language integration: The DSL must integrate with language extensions. This integration is required for specifying in programs under test locations where to break (see R2) and for validating where program execution is suspended (see R1).

B. Context Specific

   CS1 Reusability: For writing debugger tests in an efficient way, we expect from DeTeL the ability to reuse: (1) test data, (2) validation rules and (3) the structure of tests. The first covers the ability to have one mbeddr program as test data for multiple test cases. The second refers to single definition and multiple usage of validation rules among different test cases. Finally, the third refers to extending test cases and having the possibility to specialize them.

   CS2 Extensibility: Languages should provide support for contributing new validation rules, thus achieving extensibility. Those new rules can be used for testing further debugger functionality not covered by DeTeL (e.g., mbeddr's upcoming support for multi-level debugging [7]) or for writing tests more efficiently.

   CS3 Automated test execution: For fast feedback about newly introduced debugger bugs, we require the ability to integrate our tests into an automatic execution environment (e.g., an IDE or a build server).
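To make R1 more tangible, the following Python sketch compares an expected call stack with the stack a debugger reports. It is only an illustration of the kind of check R1 asks for, not DeTeL itself: `Frame`, `ExpectedFrame`, `ANY`, `validate`, the location strings and the concrete watch values are all assumptions introduced here. The frame and watch names are taken from Listing 1.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

ANY = object()   # marker: the watch must exist, its value is not checked


@dataclass
class Frame:
    """A stack frame as reported by the (lifted) debugger; hypothetical shape."""
    name: str
    location: str
    watches: Dict[str, object] = field(default_factory=dict)


@dataclass
class ExpectedFrame:
    """What a test expects for one frame."""
    name: str
    location: Optional[str] = None                  # None: location is not checked
    watches: Dict[str, object] = field(default_factory=dict)


def validate(expected: List[ExpectedFrame], actual: List[Frame]) -> List[str]:
    """Return a list of mismatches; an empty list means the stack is as expected."""
    problems = []
    if len(expected) != len(actual):
        problems.append(f"expected {len(expected)} frames, found {len(actual)}")
    for depth, (exp, act) in enumerate(zip(expected, actual)):
        if exp.name != act.name:
            problems.append(f"frame {depth}: name {act.name!r} != {exp.name!r}")
        if exp.location is not None and exp.location != act.location:
            problems.append(f"frame {depth}: suspended at {act.location!r}")
        for watch, value in exp.watches.items():
            if watch not in act.watches:
                problems.append(f"frame {depth}: missing watch {watch!r}")
            elif value is not ANY and act.watches[watch] != value:
                problems.append(f"frame {depth}: wrong value for {watch!r}")
        for watch in sorted(act.watches.keys() - exp.watches.keys()):
            problems.append(f"frame {depth}: unexpected watch {watch!r}")
    return problems


expected = [
    ExpectedFrame("forTest", watches={"sum": ANY, "nums": ANY}),
    ExpectedFrame("main", watches={"argc": ANY, "argv": ANY}),
]
good = [
    Frame("forTest", "first statement", {"sum": 0, "nums": [1, 2, 3]}),
    Frame("main", "return statement", {"argc": 1, "argv": []}),
]
# Same stack, but the generated counter _f is no longer hidden.
leaky = [
    Frame("forTest", "first statement", {"sum": 0, "nums": [1, 2, 3], "_f": 0}),
    Frame("main", "return statement", {"argc": 1, "argv": []}),
]
print(validate(expected, good))
print(validate(expected, leaky))
```

A generator change that renames `_f`, or a broken hide rule, would surface in such a check as an unexpected watch, which is the kind of regression R1 is meant to catch.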



C. Mbeddr Specific

   MS1 Exchangeable debugger backends: mbeddr targets the embedded domain where platform vendors require different compilers and debuggers. Hence, we require the ability to run our tests against different debugger backends and on different platforms.

VII. DEBUGGER TESTING DSL

   DeTeL is open-source and is shipped as part of mbeddr [8]. It is integrated in MPS and interacts with the mbeddr debugger API. DeTeL is currently tightly coupled to mbeddr; however, it could interact with a generic debugger API and could be implemented independently of MPS. This section describes the structure of DeTeL and the implementation of requirements discussed in Section VI. The language syntax is not documented, but can easily be derived by looking at its editor definitions in MPS.

A. DebuggerTest

   Fig. 4 shows the structure of DebuggerTest, which is a module that contains IDebuggerTestContents, currently implemented by DebuggerTestcase and CallStack (described later). This interface facilitates extensibility inside DebuggerTest (CS2). Further, DebuggerTest refers to a Binary, which is a concept from mbeddr representing the compiled mbeddr program under test (R3), the imports of IDebuggerTestContents from other DebuggerTests (CS1) and an IDebuggerBackend that specifies the debugger backend (CS2, MS1). The latter is implemented by GdbBackend, which allows running debugger tests with the GNU Debugger (GDB) [9].

Fig. 4. Structure of DebuggerTest

   MPS already contains the language mps.lang.test for writing type system and editor tests. This allows (1) automatic execution of tests on the command-line and (2) visualization of test results in a table view inside MPS. All of that functionality is built for future implementations of ITestcase, an interface from mps.lang.test. By implementing this interface in DebuggerTest (our container for DebuggerTestcases), we benefit from available features (CS3).

B. CallStack

   CallStack implements IDebuggerTestContent (see Fig. 5) and contains IStackFrames (CS2, R1), which have two implementations: StackFrame and StackFrameExtension. An extending CallStack inherits all StackFrames from the extended CallStack in the form of StackFrameExtensions, with the possibility of specializing inherited properties (CS1), and can declare additional StackFrames.

Fig. 5. Structure of CallStack

   IStackFrame has three parts, each with two different implementations: a name (IName), a location where program execution should suspend (ILocation) and visible watches (IWatches).

   IName implementations: SpecificName verifies that the specified name matches the actual one and AnyName ignores it completely. ILocation implementations: AnyLocation, which does not perform any validation, and ProgramMarkerRef, which refers via ProgramMarker to a specific location in a program under test (R3). These markers just annotate nodes in the AST and have no influence on code generation. IWatches implementations: AnyWatches performs no validations and WatchList contains a list of Watches, each of which specifies a name/value (IValue) pair. The value can be either PrimitiveValue (e.g., numbers) or ComplexValue (e.g., arrays).

C. DebuggerTestcase

   Fig. 6 shows the structure of DebuggerTestcase: it can extend other DebuggerTestcases (CS1), has a name, and can be abstract. Further, it contains the following parts: SuspendConfig, SteppingConfig and ValidationConfig. Concrete DebuggerTestcases require a SuspendConfig and a ValidationConfig (can be inherited), while an abstract DebuggerTestcase requires none of these.

Fig. 6. Structure of DebuggerTestcase

   SuspendConfig contains a ProgramMarkerRef that points to the first program location where execution suspends (R2).

   SteppingConfig is optional and contains a list of ISteppingCommands (CS2) that are executed after suspending on



     location (R2). This interface is implemented by StepInto,
                                                                 1 call stack csInMainFunction {
     StepOver, and StepOut (each performs the respective com- 2       0:main
     mand n times).                                              3       location: onReturnInMain
                                                                 4       watches: {argc, argv}
        ValidationConfig contains a list of IValidations 5 }
     (CS2, R1), implemented by CallStack, CallStackRef and 6
                                                                 7 call stack csInTestcase extends csInMainFunction {
     OnPlatform. CallStackRef refers to a CallStack that can- 8       1:forTest
     not be modified. Finally, OnPlatform specifies a Platform 9         location: 
                                                                10       watches: {sum, nums}
     (Mac, Unix or Windows) and contains validations, which are 11    0:main
                                                                12 }
     only executed on the specific platform (R1).
                                                                                                 Listing 4. CallStack declarations
                  VIII. W RITING D EBUGGER T ESTS
                                                                                  Listing 5 contains the DebuggerTestcase stepIntoTest-
       In this section, we describe an application scenario where
                                                                               case, which uses the CallStack csInTestcase to verify step
     we apply DeTeL to test the debugger extension of MUnit.
                                                                               into for instances of ExecuteTestExpression. As a first
       Before writing debugger tests, we first take the program
                                                                               step, program execution is suspended at onReturnInMain, next,
     using MUnit from Listing 1 and annotate it in Listing 2
                                                                               a single StepInto is performed before the actual call stack
     with ProgramMarkers. Those markers are later used by
                                                                               is validated against a custom CallStack derived from csIn-
     DebuggerTestcases for specification and verification of code
                                                                               Testcase. This custom declaration specializes the StackFrame
     locations where program execution should suspend.
                                                                               forTest i. e., program execution is expected to suspend at
 1    int32 main(int32 argc, string[ ] argv) {                                 onSumDeclaration.
 2       [return test[forTest];] onReturnInMain
 3    }                                                                    1    testcase stepIntoTestcase {
 4    int32 add(int32 a, int32 b) {                                        2       suspend at:
 5       [return a+b;] inAdd                                               3          onReturnInMain
 6    }                                                                    4       then perform:
 7    testcase forTest {                                                   5          step into 1 times
 8       [int32 sum = 0;] onSumDeclaration                                 6       finally validate:
 9       [assert: sum == 0;] firstAssert                                   7          call stack csOnSumDeclInTestcase extends csInTestcase {
10       [int32[ ] nums = {1, 2, 3};] onArrayDecl                          8              1:forTest
11       for(int32_t i=0;i<3;i++) { sum += nums[i]; }                      9                 overwrite location: onSumDeclaration
12       [assert: sum == 6;] secondAssert                                 10                 watches: {sum, nums}
13    }                                                                   11              0:main
                                                                          12          }
                        Listing 2. Annotated program                      13    }


                 Listing 5. Step into ExecuteTestExpression

   Next, Listing 3 creates a stub of the DebuggerTest UnitTesting that will
later contain all DebuggerTestcases described in this section. UnitTesting
tests against the Binary UnitTestingBinary, which is compiled from Listing 2.
Additionally, it instructs the debugger runtime to execute tests with the
GdbBackend.

   DebuggerTest UnitTesting tests binary: UnitTestingBinary {
                            uses debugger: gdb
   }

                        Listing 3. DebuggerTest stub

A. Step Into ExecuteTestExpression
   For testing step into on instances of ExecuteTestExpression, we create in
Listing 4 a CallStack that specifies the stack organization after performing
step into on onReturnInMain. To reuse information and minimize redundancy in
subsequent DebuggerTestcases, two separate CallStacks are created. First,
csInMainFunction contains a single StackFrame that expects (1) program
execution to suspend at onReturnInMain and (2) two Watches (argc and argv).
Second, csInTestcase extends csInMainFunction by adding an additional
StackFrame forTest on top of the StackFrameExtension main (colored in gray).
This StackFrame specifies two Watches (sum and nums) and no specific location
(AnyLocation).

B. Step into/over AssertStatement
   After verifying step into for ExecuteTestExpression in the previous
section, we now test step into and step over for AssertStatement. Both
stepping commands have the same result when performed at firstAssert, hence
common test behavior is extracted into the abstract DebuggerTestcase
stepOnAssert, as shown in Listing 6: (1) program execution suspends at
firstAssert, (2) a custom CallStack verifies that program execution suspended
in forTest on onArrayDecl and (3) the Watch sum holds the PrimitiveValue zero.

   abstract testcase stepOnAssert {
      suspend at:
         firstAssert
      finally validate:
         call stack csOnArrayDeclInTestcase extends csInTestcase {
            1:forTest
               overwrite location: onArrayDecl
               overwrite watches: {sum=0,nums}
            0:main
         }
   }

                   Listing 6. Abstract DebuggerTestcase

   The DebuggerTestcase stepIntoAssert, which extends stepOnAssert, performs
a StepInto command, and stepOverAssert performs a StepOver:



   testcase stepIntoAssert extends stepOnAssert {
      then perform:
         step into 1 times
   }
   testcase stepOverAssert extends stepOnAssert {
      then perform:
         step over 1 times
   }

                   Listing 7. Extending DebuggerTestcases

C. Step on last Statement in Testcase
   The last testing scenario verifies that stepping on the last Statement
(secondAssert) inside a Testcase suspends execution on the
ExecuteTestExpression (onReturnInMain). Again, we create an abstract
DebuggerTestcase steppingOnLastStmnt that suspends execution on secondAssert
and verifies that the actual call stack has the same structure as CallStack
csInMainFunction:

   abstract testcase steppingOnLastStmnt {
      suspend at:
         secondAssert
      finally validate:
         call stack csInMainFunction
   }

      Listing 8. Assumptions after suspending program execution in main

   Next, separate DebuggerTestcases are created for step over, step into and
step out. Each extends steppingOnLastStmnt and specifies only the respective
ISteppingCommand:

   testcase stepOverLastStmnt extends steppingOnLastStmnt {
      then perform:
         step over 1 times
   }

   testcase stepIntoLastStmnt extends steppingOnLastStmnt {
      then perform:
         step into 1 times
   }

   testcase stepOutFromLastStmnt extends steppingOnLastStmnt {
      then perform:
         step out 1 times
   }

      Listing 9. Test stepping commands on last Statement in Testcase

   In each DebuggerTestcase of Listing 9, execution suspends on the same
Statement (onReturnInMain), although different stepping commands are
performed. Remember, since secondAssert does not contain any children of type
StepIntoable (e. g., FunctionCall), performing a step into on the Statement
has the same effect as a step over.

                     IX. EXECUTING DEBUGGER TESTS
   Our test cases from the previous section are generated to plain Java code
and can be executed in MPS with an action from the context menu. This
functionality is obtained by implementing ITestcase in DebuggerTest (see
Section VII-A). When this action is executed, test results are visualized in
a table view provided by MPS: for each DebuggerTestcase, the result (success
or fail) is indicated with a colored bubble and a text field shows the process
output.
   As indicated by a green bubble on the left side of Fig. 7, all of our
previously written DebuggerTestcases pass. We show in the next section how
language evolution will invalidate the debugger definition and thereby cause
all of our tests to fail.

          Fig. 7. Successful execution of DebuggerTestcases in MPS

                        X. LANGUAGE EVOLUTION
   The previous sections have shown how to build a language extension for
mbeddr in MPS, define a debugger for this extension and use DeTeL to test its
debugging behavior. This section demonstrates how DeTeL is used to identify
invalid definitions in debugger extensions after evolving the language.

A. Evolving MUnit
   In this section we modify the MUnit generator to demonstrate how this
affects the debugger. Currently, the generator reduces a Testcase to a
Function: its name is prefixed with test_, followed by the Testcase name (see
Listing 1). We now change this generator, so the Function name is prefixed
with testcase_ instead of test_. The listing below shows how our example
program from Listing 1 is now generated to C.

   int32_t main(int32_t argc, char *(argv[])) {
      return blockexpr_2() ;
   }

   int32_t blockexpr_2(void) {
      int32_t _f = 0;
      _f += testcase_forTest();
      return _f;
   }

   int32_t testcase_forTest() {
      int32_t _f = 0;
      int32_t sum = 0;
      if(!( sum == 0 )) { _f++; }
      int32_t[] nums = {1, 2, 3};
      for(int32_t i=0;i<3;i++){
         sum += nums[i];
      }
      if(!( sum == 6 )) { _f++; }
      return _f;
   }

   Listing 10. C code that has been generated with the modified Testcase
   generator for the example program from Listing 1

   Because of our generator modification, Testcases are now generated to
Functions with a different identifier than before. However, we have not
updated the debugger extension. Therefore, the call stack construction for all
Testcases fails, and with it all of our DebuggerTests (see Fig. 8). Although
those debugger tests fail, they are still valid, since they are written on the
abstraction level of the languages, not the generator. The next section shows
how we update the debugger extension to solve the call stack construction.

       Fig. 8. Failing DebuggerTestcases after modifying the generator
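   The effect of the generator change on name-based call stack lifting can be
illustrated with a small, self-contained sketch. It is not MPS or mbeddr code:
lift_call_stack, LiftingError and the frame lists are hypothetical names
introduced only for this illustration, and the sketch assumes that lifting
simply looks for a low-level frame whose name equals a prefix plus the
Testcase name.

```python
class LiftingError(Exception):
    """Raised when no low-level frame matches the expected name (hypothetical)."""


def lift_call_stack(c_frames, testcase_names, prefix):
    """Map C-level frame names back to Testcase names.

    c_frames: frame names as a C debugger would report them, innermost first.
    testcase_names: names of the Testcases in the source program.
    prefix: the prefix the debugger extension assumes the generator uses.
    """
    lifted = []
    for name in testcase_names:
        expected = prefix + name
        if expected not in c_frames:
            # A single missing frame aborts the whole lifting step.
            raise LiftingError("no frame named " + expected)
        lifted.append(name)
    return lifted


# Assumed frame names, derived from the function names in the generated C code.
old_frames = ["test_forTest", "blockexpr_2", "main"]
new_frames = ["testcase_forTest", "blockexpr_2", "main"]

# Generator and debugger extension agree on the prefix: lifting succeeds.
lifted_before = lift_call_stack(old_frames, ["forTest"], "test_")

# Generator changed, extension still expects the old prefix: lifting fails.
try:
    lift_call_stack(new_frames, ["forTest"], "test_")
    stale_extension_failed = False
except LiftingError:
    stale_extension_failed = True

# Same frames, but the extension now expects the new prefix: lifting succeeds.
lifted_after = lift_call_stack(new_frames, ["forTest"], "testcase_")
```

   In this sketch only the prefix passed to the lifting function changes. The
expectations on the lifted stack (a frame named forTest) stay the same, which
mirrors why the DebuggerTestcases remain valid after the generator change.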



B. Updating the Debugger Extension
   The MUnit debugger extension tries to lift for each Testcase a stack frame
whose name is prefixed with test_, followed by the name of the respective
Testcase (see Section V-D). However, due to our generator modification, this
frame is not present and therefore the whole call stack construction fails
with an error. To solve this problem, we update the name used for matching the
stack frame name:

 String frameName = "testcase_" + this.name;
 contribute frame mapping for frames.select(name=frameName);

   Other aspects, such as stepping, breakpoints or watches, are not affected
by the generator modification and hence do not need to be changed. Therefore,
after fixing the call stack lifting for Testcase, our debugger tests pass
again.

                          XI. RELATED WORK
   Wu et al. describe a unit testing framework for DSLs [10] with focus on
testing the semantics of the language. However, from our perspective, it is
necessary to have testing DSLs for all aspects of the language definition,
e. g., editor (concrete syntax), type system, scoping, transformation rules,
and finally the debugger.¹ mbeddr contains tests for the editor, type system,
scoping and transformation rules; our work contributes the language for
testing the debugger aspect.
   The Low Level Virtual Machine (LLVM) project [11] comes with a C debugger
named Low Level Debugger (LLDB). Test cases for this debugger are written in
Python and the unit test framework of Python. While those tests verify the
command line interface and the scripting Application Programming Interface
(API) of the debugger, they also test other functionality, such as using the
help menu or changing the debugger settings. Further, some of the LLDB tests
verify the debugging behavior on different platforms, such as Darwin or Linux.
In contrast, we only concentrate on testing the debugging behavior, but also
support writing tests for specific platforms. Our approach for testing the
debugging behavior is derived from the LLDB project: write a program in the
source language (mbeddr), compile it to an executable and debug it through
test cases, which verify the debugging behavior.
   The GDB debugger takes a similar approach to LLDB: tests cover different
aspects of the debugger functionality and are written in a scripting language
[9]. In contrast to our approach of testing debugging for one extensible
language, the GDB project tests debugging behavior for all of its supported
languages, such as C, C++, Java, Ada etc. Further, those tests run on
different platforms and target configurations. Our work supports writing tests
against different platforms, but does not allow users to change the target
configuration via the DSL.

   ¹ Specific language workbenches might require testing of additional aspects

                    XII. SUMMARY AND FUTURE WORK
   The mbeddr extensible language comes with an extensible debugger. To test
this debugger, we have introduced in this paper a generic and extensible
testing DSL. The language is implemented in MPS with focus on mbeddr, but the
underlying approach is applicable for testing any imperative language
debugger. Further, we have shown in this paper (1) the implementation of a
language extension, (2) how debugging support is built for it and (3) how the
debugger is tested using our DSL. The language is designed for extensibility,
so others can contribute their own context-specific validation rules. In
addition, we concentrated on reuse, so test data, test structures and
validation rules can be shared among tests.
   In the future, we plan to investigate ways of integrating the debugger
specification DSL with the DSL for testing the debugger extension. From this
integration we expect to (1) gain advances in validating debugger test cases
and (2) be able to automatically generate test cases from formal debugger
specifications (based on work from [12], [13] and [14]). In addition, we will
continue to research languages for testing non-functional aspects, such as
testing the performance of stepping commands and lifting of program state.

                             REFERENCES
 [1] M. Voelter, “Language and IDE Development, Modularization and Composition
     with MPS,” in Generative and Transformational Techniques in Software
     Engineering, ser. Lecture Notes in Computer Science, 2011.
 [2] M. Voelter, D. Ratiu, B. Schaetz, and B. Kolb, “Mbeddr: An extensible
     c-based programming language and ide for embedded systems,” in
     Proceedings of the 3rd Annual Conference on Systems, Programming, and
     Applications: Software for Humanity, ser. SPLASH ’12. New York, NY, USA:
     ACM, 2012, pp. 121–140.
 [3] JetBrains, “Meta Programming System,” 2015. [Online]. Available:
     http://www.jetbrains.com/mps
 [4] D. Pavletic, M. Voelter, S. A. Raza, B. Kolb, and T. Kehrer, “Extensible
     debugger framework for extensible languages,” in Reliable Software
     Technologies - Ada-Europe 2015 - 20th Ada-Europe International Conference
     on Reliable Software Technologies, Madrid Spain, June 22-26, 2015,
     Proceedings, ser. Lecture Notes in Computer Science, J. A. de la Puente
     and T. Vardanega, Eds., vol. 9111. Springer, 2015, pp. 33–49.
 [5] A. Chis, T. Gîrba, and O. Nierstrasz, “The moldable debugger: A framework
     for developing domain-specific debuggers,” in Software Language
     Engineering - 7th International Conference, SLE 2014, Västerås, Sweden,
     September 15-16, 2014. Proceedings, 2014, pp. 102–121.
 [6] H. Wu, “Grammar-driven Generation of Domain-specific Language Testing
     Tools,” in 20th Annual ACM Special Interest Group on Programming
     Languages (SIGPLAN) Conference on Object-oriented Programming, Systems,
     Languages, and Applications. San Diego, CA, USA: ACM, 2005, pp. 210–211.
 [7] D. Pavletic and S. A. Raza, “Multi-Level Debugging for Extensible
     Languages,” Softwaretechnik-Trends, vol. 35, no. 1, 2015.
 [8] B. Kolb, M. Voelter, D. Ratiu, D. Pavletic, Z. Molotnikov, K. Dummann,
     N. Stotz, S. Lisson, S. Eberle, T. Szabo, A. Shatalin, K. Miyamoto, and
     S. Kaufmann, “mbeddr.core - An extensible C,”
     https://github.com/mbeddr/mbeddr.core, GitHub repository, 2015.
 [9] Free Software Foundation, “The GNU Project Debugger,” 2015. [Online].
     Available: https://www.gnu.org/software/gdb/
[10] H. Wu, J. G. Gray, and M. Mernik, “Unit testing for domain-specific
     languages,” in Domain-Specific Languages, IFIP TC 2 Working Conference,
     DSL 2009, Oxford, UK, July 15-17, 2009, Proceedings, ser. Lecture Notes
     in Computer Science, W. M. Taha, Ed., vol. 5658. Springer, 2009,
     pp. 125–147.
[11] LLVM Compiler Infrastructure, “The LLDB Debugger,” 2015. [Online].
     Available: http://lldb.llvm.org
[12] H. Wu and J. Gray, “Automated generation of testing tools for
     domain-specific languages.” in ASE, D. F. Redmiles, T. Ellman, and
     A. Zisman, Eds. ACM, 2005, pp. 436–439.
[13] P. R. Henriques, M. J. V. Pereira, M. Mernik, M. Lenic, J. Gray, and
     H. Wu, “Automatic generation of language-based tools using the LISA
     system,” Software, IEE Proceedings -, vol. 152, no. 2, pp. 54–69, 2005.
[14] H. Wu, J. Gray, and M. Mernik, “Grammar-driven generation of
     domain-specific language debuggers.” Software: Practice and Experience,
     vol. 38, no. 10, pp. 1073–1103, 2008.



