=Paper=
{{Paper
|id=Vol-1560/paper6
|storemode=property
|title=Testing Extensible Language Debuggers
|pdfUrl=https://ceur-ws.org/Vol-1560/paper6.pdf
|volume=Vol-1560
|authors=Domenik Pavletic,Syed Aoun Raza,Kolja Dummann,Kim Haßlbauer
|dblpUrl=https://dblp.org/rec/conf/models/PavleticRDH15
}}
==Testing Extensible Language Debuggers==
Domenik Pavletic, itemis AG, Stuttgart, Germany, pavletic@itemis.com

Syed Aoun Raza, Stuttgart, Germany, aoun.raza@gmail.com

Kolja Dummann, itemis AG, Stuttgart, Germany, dummann@itemis.com

Kim Haßlbauer, Stuttgart, Germany, kim.hasslbauer@gmail.com

Abstract: Extensible languages allow incremental extensions of a host language with domain specific abstractions. Debuggers for such languages must be extensible as well to support debugging of different language extensions at their corresponding abstraction level. As such languages evolve over time, it is essential to constantly verify their debugging behavior. For this purpose, a General Purpose Language (GPL) can be used, however this increases the complexity and decreases the readability of tests. To reduce continuous verification effort, in this paper, we introduce DeTeL, an extensible Domain-Specific Language (DSL) for testing extensible language debuggers.

Index Terms: Formal languages, Software debugging, Software testing.

I. INTRODUCTION

Software development faces the challenge that GPLs do not provide the appropriate abstractions for domain-specific problems. Traditionally there are two approaches to overcome this issue. One is to use frameworks that provide domain-specific abstractions expressed with a GPL. This approach has very limited support for static semantics, e.g., no support for modifying constraints or type system. The second approach is to use external DSLs for expressing solutions to domain problems. This approach has some other drawbacks: these DSLs are not inherently extensible. Extensible languages solve these problems. Instead of having a single monolithic DSL, extensible languages enable modular and incremental extensions of a host language with domain specific abstractions [1].

To make debugging extensible languages useful to the language user, it is not enough to debug programs after extensions have been translated back to the host language (using an existing debugger for the base language). A debugger for an extensible language must be extensible as well, to support debugging of modular language extensions at the same abstraction level (extension-level). Minimally, this means users can step through constructs provided by the extension and see watch expressions (e.g., variables) related to the extensions.

Because language extensions can be based on other extensions and languages evolve over time, it is essential to constantly test if debugger behavior matches the expected behavior. To test debugging behavior, a GPL can be used, however this raises the same issues discussed above. We therefore propose in this paper DeTeL (Debugger Testing Language), an extensible DSL for testing debuggers.

II. MBEDDR

mbeddr [2] is an extensible version of C that can be extended with modular, domain-specific extensions. It is built on top of JetBrains Meta Programming System (MPS) [3] and ships with a set of language extensions dedicated to embedded software development. mbeddr includes an extensible C99 implementation. Further, it also includes a set of predefined language extensions on top of C. These extensions include state machines, components and physical units.

In MPS, language implementations are separated into aspects. The major aspects are Structure, Type System, Constraints, Generator and Editor. However, for building debugging support, the Editor aspect is irrelevant.

III. LANGUAGE EXTENSION FOR UNIT TESTING

To give an idea of building language and debugger extensions, we first build MUnit, a language for writing unit tests, and a corresponding debugger extension. Later, we will describe how to test this debugger extension with a DSL.

A. Structure

Fig. 1 shows the language structure: AssertStatement is derived from Statement and can therefore be used where Statements are expected. It contains an Expression for the condition. Testcase holds a StatementList that contains the Statements that make up the test. Further, to have the same scope as Function, Testcase implements IModuleContent. ExecuteTestExpression contains a list of TestcaseRef, which refer to Testcases to be executed.

Fig. 1. Language structure

B. Type System and Constraints

AssertStatement requires a constraint and a type system rule. It restricts the usages only inside Testcases, meaning an AssertStatement can only be used in a Testcase:

    parentNode.ancestor.isNotNull

It also restricts the type of its child expr (condition) to BooleanType, so only valid conditions can be entered:

    check(typeof(assertStatement.expr) :<=: );
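The constraint and the type system rule for AssertStatement can be approximated outside MPS with a small Python model. This is only a sketch under stated assumptions: the `Node` class and the helpers `has_ancestor`, `can_be_used_here` and `check_condition_type` are hypothetical names introduced for illustration and are not part of MPS or mbeddr. The sketch also simplifies the subtype check of the original rule to a plain equality test on the type name.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical AST node: concept name, optional static type, tree links."""
    concept: str
    type_name: Optional[str] = None
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it so calls can be chained."""
        child.parent = self
        self.children.append(child)
        return child


def has_ancestor(node: Node, concept: str) -> bool:
    """Walk up the parent links and look for an ancestor of the given concept."""
    current = node.parent
    while current is not None:
        if current.concept == concept:
            return True
        current = current.parent
    return False


def can_be_used_here(assert_stmt: Node) -> bool:
    """Constraint: an AssertStatement is only allowed inside a Testcase."""
    return has_ancestor(assert_stmt, "Testcase")


def check_condition_type(assert_stmt: Node) -> List[str]:
    """Type rule (simplified to equality): the condition must be a BooleanType."""
    condition = assert_stmt.children[0]
    if condition.type_name != "BooleanType":
        return [f"condition has type {condition.type_name}, expected BooleanType"]
    return []


# A well-placed assert with a Boolean condition.
testcase = Node("Testcase")
good = testcase.add(Node("StatementList")).add(Node("AssertStatement"))
good.add(Node("Expression", type_name="BooleanType"))

# A misplaced assert (inside a Function) with a non-Boolean condition.
function = Node("Function")
misplaced = function.add(Node("StatementList")).add(Node("AssertStatement"))
misplaced.add(Node("Expression", type_name="Int32tType"))
```

In this model, `good` satisfies both checks, while `misplaced` violates the constraint and produces one type error.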
ExecuteTestExpression returns the number of failed unit tests, hence we specify Int32tType as its type (see rule below). Later, the same type is used in the generator.

    check(typeof(executeTestExpression) :==: );

C. Generator

The MUnit generator consists of many different transformation rules, which translate code written with the language directly to mbeddr C. Listing 1 shows on the left hand side an example program, written with mbeddr C and MUnit. The right hand side shows the C program generated from it. While regular mbeddr C code is not colored, the boxes indicate how Abstract Syntax Tree (AST) nodes from the left are translated to C code on the right.

    int32 main(int32 argc,            int32_t main(int32_t argc,
               string[] argv) {                    char *(argv[])) {
      return test[ forTest ];           return blockexpr_2();
    }                                 }

                                      int32_t blockexpr_2(void) {
                                        int32_t _f = 0;
                                        _f += test_forTest();
                                        return _f;
                                      }

    testcase forTest {                int32_t test_forTest() {
                                        int32_t _f = 0;
      int32 sum = 0;                    int32_t sum = 0;
      assert: sum == 0;                 if(!( sum == 0 )) { _f++; }
      int32[] nums = {1, 2, 3};         int32_t[] nums = {1, 2, 3};
      for(int32 i=0;i<3;i++){           for(int32_t i=0;i<3;i++){
        sum += nums[i];                   sum += nums[i];
      }                                 }
      assert: sum == 6;                 if(!( sum == 6 )) { _f++; }
                                        return _f;
    }                                 }

Listing 1. Example mbeddr program using the unit test language on the left and the C code that has been generated from it on the right

IV. MBEDDR DEBUGGER FRAMEWORK

mbeddr comes with a debugger, which allows users to debug their mbeddr code on the abstraction levels of the used languages. For that, each language contributes a debugger extension, which is built with a framework also provided by mbeddr [4]. Those extensions are always language-specific in contrast to domain-specific debuggers (e.g., the moldable debugger [5]), which provide application-specific debug actions and views on the program state. Hence, debugging support is implemented specifically for the language by lifting the call stack/program state from the base-level to the extension-level (see Fig. 2) and stepping/breakpoints vice versa.

Fig. 2. Flow of debug information between base and extension level [4]

The debugger framework can be separated into two different parts: First, a DSL and a set of interfaces (shown in Fig. 3) for specifying the debugging semantics of language concepts. Second, a runtime for executing those specifications and thus achieving the mapping described in Fig. 2.

In this section, we provide an overview of the specification part (see Fig. 3) that is required for understanding how the debugger extension for MUnit is built. While this paper concentrates on testing debuggers for extensible languages, we have published another paper [4] describing details about the debugger framework and its implementation with MPS.

A. Breakpoints

Breakables are concepts (e.g., Statements) on which we can set breakpoints to suspend the program execution.

B. Watches

WatchProviders are translated to low-level watches (e.g., Argument) or represent watches on the extension-level. They are declared inside WatchProviderScopes (e.g., StatementList), which is a nestable context.

C. Stepping

Steppables define where program execution must suspend next, after the user steps over an instance of Steppable (e.g., Statement). If a Steppable contains a StepIntoable (e.g., FunctionCall), then the Steppable also supports step into. StepIntoables are concepts that branch execution into a SteppableComposite (e.g., Function).

All stepping is implemented by setting low-level breakpoints and then resuming execution until one of these breakpoints is hit (approach is based on [6]). The particular stepping behavior is realized through stepping-related concepts by utilizing DebugStrategies.

D. Call Stack

StackFrameContributors are concepts that have callable semantics on the extension-level or are translated to low-level callables (e.g., Functions). While the latter do not contribute any StackFrames to the high level call stack, the former contribute at least one StackFrame.

Fig. 3. Meta-model used for specifying the debugging semantics of language concepts [4]. Colors indicate the different debugging aspects

V. DEBUGGER EXTENSION FOR THE MUNIT LANGUAGE

This section describes the implementation of a debugger extension for the MUnit language. This extension is defined with the mbeddr debugger specification DSL and the abstractions of the debugging meta-model shown in Fig. 3.
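The stepping mechanism of Section IV-C, in which every step is realized by setting low-level breakpoints and resuming until one of them is hit, can be illustrated with a toy model. The following Python sketch is not the mbeddr implementation: it replays a recorded trace of base-level line numbers instead of controlling a real debugger, and the function name `step`, the constant `TRACE` and the line numbers themselves are assumptions chosen only for illustration.

```python
from typing import Iterable, List, Optional


def step(trace: List[int], position: int,
         breakpoints: Iterable[int]) -> Optional[int]:
    """Resume from `position` and stop at the first line carrying a breakpoint.

    `trace` is the sequence of base-level lines the program executes.
    Returns the index in the trace where execution suspends, or None if
    the program runs to completion without hitting a breakpoint.
    """
    active = set(breakpoints)
    for index in range(position + 1, len(trace)):
        if trace[index] in active:
            return index
    return None


# Hypothetical execution trace: line 3 calls into a helper (7, 8), which calls
# the test function (13 to 21, with a loop over 17 and 18), then returns (9, 3).
TRACE = [3, 7, 8, 13, 14, 15, 16, 17, 18, 17, 18, 17, 18, 17, 20, 21, 9, 3]

# Step into from line 3: the strategy places a breakpoint on the first
# extension-level statement of the callee (assumed to be line 14).
after_step_into = step(TRACE, 0, [14])

# Step over the statement on line 14: breakpoint on the following statement.
after_step_over = step(TRACE, after_step_into, [15])

# Stepping on the last statement (line 20): breakpoint back in the caller.
after_last_statement = step(TRACE, 14, [3])
```

The different stepping commands differ only in which locations receive breakpoints; the resume loop is the same for all of them, which is why a step into and a step over can end at the same location when no StepIntoable is involved.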
A. Breakpoints

To enable breakpoints on AssertStatements, an implementation of the Breakable interface is required. AssertStatement is derived from Statement, which already implements this interface, thus breakpoints are already supported.

B. Watches

Since ExecuteTestExpression’s stack frame is not shown in the high-level call stack, none of its watches are mapped. In contrast, stack frames for Testcases are visible, thus we need to consider their watches. In case of Testcase, the LocalVariableDeclaration _f has no corresponding representation on the extension-level, and is therefore not shown (specified in listing below).

The mbeddr debugger framework uses a pessimistic approach for lifting watches: those that should not be shown in the UI are marked as hidden. Otherwise, the debugger shows the low-level watch (in this case the C local variable _f) with its respective value.

    hide local variable with identifier "_f";

C. Stepping

AssertStatement is a Statement, which already provides step over behavior. However, to be able to step into the condition we overwrite Statement’s step into behavior:

    break on nodes to step-into: this.expr;

break on nodes searches in condition for instances of StepIntoable and contributes their step into strategies.

ExecuteTestExpression implements StepIntoable to allow step into the referenced Testcases. A minimal implementation puts a breakpoint in each Testcase:

    foreach testRef in this.tests {
      break on node: testRef.test.body.statements.first;
    }

D. Call Stack

Testcase and ExecuteTestExpression are translated to base-level callables and therefore implement StackFrameContributor. They contribute StackFrames; each is linked to a base-level stack frame and states whether it is visible in the extension-level call stack or not.

The implementation of ExecuteTestExpression links the low-level stack frame to the respective instance (see listing below). Further, it hides the frame from the high-level call stack, since ExecuteTestExpression has no callable semantics.

    contribute frame mapping for frames.select(name=getName());

Similarly, the mapping for Testcase also requires linking the low-level stack frame to the respective instance. However, it declares to show the stack frame in the high-level call stack:

    String frameName = "test_" + this.name;
    contribute frame mapping for frames.select(name=frameName);

Further, we provide the name of the actual Testcase, which is represented in the call stack view: Consider Listing 1, where we would show the name forTest instead of test_forTest.

VI. REQUIREMENTS

The debugger testing DSL must allow us to verify at least four aspects: call stack, program state, breakpoints and stepping. This section delineates the requirements that DeTeL must cover. Some of them are required (R), while others are either context specific (CS) or mbeddr specific (MS).

A. Required

R1 Debug state validation: Changes in generators can modify names of generated procedures or variables and this way, e.g., invalidate program state lifting in the debugger. For being able to identify those problems, we need a mechanism to validate the call stack, and for each of its frames the program state and the location where execution is suspended. For the call stack, a specification of expected stack frames with their respective names is required. In terms of program state, we need to verify the names of watches and their respective values, which can either be simple or complex. Further, a location specifies where program execution is expected to suspend and tests can be written for a specific platform.

R2 Debug control: Similarly as in R1, generator changes also affect the stepping behavior. Consider changing the FunctionCall generator to inline the body of called functions instead of calling them. This change would require modifications in the implementation of step into as well. For being able to identify those problems, we need the ability to execute stepping commands (in, over and out) and specify locations where to break.

R3 Language integration: The DSL must integrate with language extensions. This integration is required for specifying in programs under test locations where to break (see R2) and for validating where program execution is suspended (see R1).

B. Context Specific

CS1 Reusability: For writing debugger tests in an efficient way, we expect from DeTeL the ability to provide reuse: (1) test data, (2) validation rules and (3) the structure of tests. The first covers the ability to have one mbeddr program as test data for multiple test cases. The second refers to single definition and multiple usage of validation rules among different test cases. Finally, the third refers to extending test cases and having the possibility to specialize them.

CS2 Extensibility: Languages should provide support for contributing new validation rules thus achieving extensibility. Those new rules can be used for testing further debugger functionality not covered by DeTeL (e.g., mbeddr’s upcoming support for multi-level debugging [7]) or for writing tests more efficiently.

CS3 Automated test execution: For fast feedback about newly introduced debugger bugs, we require the ability to integrate our tests into an automatic execution environment (e.g., an IDE or a build server).
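Requirement R1 asks for a way to compare an expected call stack (frame names, suspend locations, watch names and values) with the one a debugger actually reports. The following Python sketch shows one possible shape of such a check. It is an illustration under stated assumptions, not DeTeL's implementation: the classes `ExpectedFrame` and `ActualFrame`, the sentinel `ANY` and the function `validate_call_stack` are hypothetical names, and frames are listed with the top of the stack first.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

ANY = object()  # sentinel: the watch must exist, its value is not checked


@dataclass
class ActualFrame:
    """A stack frame as reported by the debugger (hypothetical shape)."""
    name: str
    location: str
    watches: Dict[str, object]


@dataclass
class ExpectedFrame:
    """An expectation; a field left as None is not validated."""
    name: Optional[str] = None
    location: Optional[str] = None
    watches: Optional[Dict[str, object]] = None


def validate_call_stack(expected: List[ExpectedFrame],
                        actual: List[ActualFrame]) -> List[str]:
    """Return a list of human-readable mismatches (empty list: stack is valid)."""
    if len(expected) != len(actual):
        return [f"expected {len(expected)} frames, found {len(actual)}"]
    errors = []
    for depth, (exp, act) in enumerate(zip(expected, actual)):
        if exp.name is not None and exp.name != act.name:
            errors.append(f"frame {depth}: name {act.name!r}, expected {exp.name!r}")
        if exp.location is not None and exp.location != act.location:
            errors.append(f"frame {depth}: suspended at {act.location!r}, "
                          f"expected {exp.location!r}")
        if exp.watches is None:
            continue
        if set(exp.watches) != set(act.watches):
            errors.append(f"frame {depth}: watches {sorted(act.watches)}, "
                          f"expected {sorted(exp.watches)}")
            continue
        for watch, value in exp.watches.items():
            if value is not ANY and act.watches[watch] != value:
                errors.append(f"frame {depth}: {watch} = {act.watches[watch]!r}, "
                              f"expected {value!r}")
    return errors


expected = [
    ExpectedFrame("forTest", "onArrayDecl", {"sum": 0, "nums": ANY}),
    ExpectedFrame("main"),  # only the name is checked
]
lifted_ok = [
    ActualFrame("forTest", "onArrayDecl", {"sum": 0, "nums": [1, 2, 3]}),
    ActualFrame("main", "onReturnInMain", {"argc": 1, "argv": ["prog"]}),
]
lifted_broken = [
    ActualFrame("test_forTest", "onArrayDecl", {"sum": 0, "nums": [1, 2, 3]}),
    ActualFrame("main", "onReturnInMain", {"argc": 1, "argv": ["prog"]}),
]
```

Collecting all mismatches instead of stopping at the first one is a design choice of this sketch; it makes a failing test report every deviation in a single run.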
C. Mbeddr Specific

MS1 Exchangeable debugger backends: mbeddr targets the embedded domain where platform vendors require different compilers and debuggers. Hence, we require the ability to run our tests against different debugger backends and on different platforms.

VII. DEBUGGER TESTING DSL

DeTeL is open-source and is shipped as part of mbeddr [8]. It is integrated in MPS and interacts with the mbeddr debugger API. DeTeL is currently tightly coupled to mbeddr, however it could interact with a generic debugger API and could be implemented independent of MPS. This section describes the structure of DeTeL and the implementation of requirements discussed in Section VI. The language syntax is not documented, but can easily be derived by looking at its editor definitions in MPS.

A. DebuggerTest

Fig. 4 shows the structure of DebuggerTest, which is a module that contains IDebuggerTestContents, currently implemented by DebuggerTestcase and CallStack (described later). This interface facilitates extensibility inside DebuggerTest (CS2). Further, DebuggerTest refers to a Binary, which is a concept from mbeddr representing the compiled mbeddr program under test (R3), the imports of IDebuggerTestContents from other DebuggerTests (CS1) and an IDebuggerBackend that specifies the debugger backend (CS2, MS1). The latter is implemented by GdbBackend, which makes it possible to run debugger tests with the GNU Debugger (GDB) [9].

Fig. 4. Structure of DebuggerTest

MPS already contains the language mps.lang.test for writing type system and editor tests. This allows (1) automatic execution of tests on the command-line and (2) visualization of test results in a table view inside MPS. All of that functionality is built for future implementations of ITestcase, an interface from mps.lang.test. By implementing this interface in DebuggerTest (our container for DebuggerTestcases), we benefit from available features (CS3).

B. CallStack

CallStack implements IDebuggerTestContent (see Fig. 5) and contains IStackFrames (CS2, R1), which has two implementations: StackFrame and StackFrameExtension. An extending CallStack inherits all StackFrames from the extended CallStack in the form of StackFrameExtensions, with the possibility of specializing inherited properties (CS1), and can declare additional StackFrames.

Fig. 5. Structure of CallStack

IStackFrame has three parts, each with two different implementations: a name (IName), a location where program execution should suspend (ILocation) and visible watches (IWatches).

IName implementations: SpecificName verifies that the specified name matches the actual one and AnyName ignores it completely. ILocation implementations: AnyLocation, which does not perform any validation, and ProgramMarkerRef, which refers via ProgramMarker to a specific location in a program under test (R3). These markers just annotate nodes in the AST and have no influence on code generation. IWatches implementations: AnyWatches performs no validations and WatchList contains a list of Watches, each of which specifies a name/value (IValue) pair. The value can be either PrimitiveValue (e.g., numbers) or ComplexValue (e.g., arrays).

C. DebuggerTestcase

Fig. 6 shows the structure of DebuggerTestcase: it can extend other DebuggerTestcases (CS1), has a name, and can be abstract. Further it contains the following parts: SuspendConfig, SteppingConfig and ValidationConfig. Concrete DebuggerTestcases require a SuspendConfig and a ValidationConfig (can be inherited), while an abstract DebuggerTestcase requires none of these.

Fig. 6. Structure of DebuggerTestcase

SuspendConfig contains a ProgramMarkerRef that points to the first program location where execution suspends (R2). SteppingConfig is optional and contains a list of ISteppingCommands (CS2) that are executed after suspending on
location (R2). This interface is implemented by StepInto, StepOver, and StepOut (each performs the respective command n times).

ValidationConfig contains a list of IValidations (CS2, R1), implemented by CallStack, CallStackRef and OnPlatform. CallStackRef refers to a CallStack that cannot be modified. Finally, OnPlatform specifies a Platform (Mac, Unix or Windows) and contains validations, which are only executed on the specific platform (R1).

VIII. WRITING DEBUGGER TESTS

In this section, we describe an application scenario where we apply DeTeL to test the debugger extension of MUnit. Before writing debugger tests, we first take the program using MUnit from Listing 1 and annotate it in Listing 2 with ProgramMarkers. Those markers are later used by DebuggerTestcases for specification and verification of code locations where program execution should suspend.

    int32 main(int32 argc, string[ ] argv) {
      [return test[forTest];] onReturnInMain
    }
    int32 add(int32 a, int32 b) {
      [return a+b;] inAdd
    }
    testcase forTest {
      [int32 sum = 0;] onSumDeclaration
      [assert: sum == 0;] firstAssert
      [int32[ ] nums = {1, 2, 3};] onArrayDecl
      for(int32_t i=0;i<3;i++) { sum += nums[i]; }
      [assert: sum == 6;] secondAssert
    }

Listing 2. Annotated program

Next, Listing 3 creates a stub of the DebuggerTest UnitTesting that will later contain all DebuggerTestcases described in this section. UnitTesting tests against the Binary UnitTestingBinary, which is compiled from Listing 2. Additionally, it instructs the debugger runtime to execute tests with the GdbBackend.

    DebuggerTest UnitTesting tests binary: UnitTestingBinary {
      uses debugger: gdb
    }

Listing 3. DebuggerTest stub

A. Step Into ExecuteTestExpression

For testing step into on instances of ExecuteTestExpression, we create in Listing 4 a CallStack that specifies the stack organization after performing step into on onReturnInMain. To reuse information and minimize redundancy in subsequent DebuggerTestcases, two separate CallStacks are created: First, csInMainFunction contains a single StackFrame that expects (1) program execution to suspend at onReturnInMain and (2) two Watches (argc and argv). Second, csInTestcase extends csInMainFunction by adding an additional StackFrame forTest on top of the StackFrameExtension main (colored in gray). This StackFrame specifies two Watches (sum and nums) and no specific location (AnyLocation).

    call stack csInMainFunction {
      0:main
        location: onReturnInMain
        watches: {argc, argv}
    }

    call stack csInTestcase extends csInMainFunction {
      1:forTest
        location:
        watches: {sum, nums}
      0:main
    }

Listing 4. CallStack declarations

Listing 5 contains the DebuggerTestcase stepIntoTestcase, which uses the CallStack csInTestcase to verify step into for instances of ExecuteTestExpression. As a first step, program execution is suspended at onReturnInMain. Next, a single StepInto is performed before the actual call stack is validated against a custom CallStack derived from csInTestcase. This custom declaration specializes the StackFrame forTest, i.e., program execution is expected to suspend at onSumDeclaration.

    testcase stepIntoTestcase {
      suspend at:
        onReturnInMain
      then perform:
        step into 1 times
      finally validate:
        call stack csOnSumDeclInTestcase extends csInTestcase {
          1:forTest
            overwrite location: onSumDeclaration
            watches: {sum, nums}
          0:main
        }
    }

Listing 5. Step into ExecuteTestExpression

B. Step into/over AssertStatement

After verifying step into for ExecuteTestExpression in the previous section, we now test step into and over for AssertStatement. Both stepping commands have the same result when performed at firstAssert, hence common test behavior is extracted into the abstract DebuggerTestcase stepOnAssert as shown in Listing 6: (1) program execution suspends at firstAssert, (2) a custom CallStack verifies program execution suspended in forTest on onArrayDecl and (3) the Watch sum holds the PrimitiveValue zero.

    abstract testcase stepOnAssert {
      suspend at:
        firstAssert
      finally validate:
        call stack csOnArrayDeclInTestcase extends csInTestcase {
          1:forTest
            overwrite location: onArrayDecl
            overwrite watches: {sum=0,nums}
          0:main
        }
    }

Listing 6. Abstract DebuggerTestcase

The DebuggerTestcase stepIntoAssert extending stepOnAssert performs a StepInto command and stepOverAssert performs a StepOver:
    testcase stepIntoAssert extends stepOnAssert {
      then perform:
        step into 1 times
    }
    testcase stepOverAssert extends stepOnAssert {
      then perform:
        step over 1 times
    }

Listing 7. Extending DebuggerTestcases

C. Step on last Statement in Testcase

The last testing scenario verifies that stepping on the last Statement (secondAssert) inside a Testcase suspends execution on the ExecuteTestExpression (onReturnInMain). Again, we create an abstract DebuggerTestcase steppingOnLastStmnt that suspends execution on secondAssert and verifies that the actual call stack has the same structure as CallStack csInMainFunction:

    abstract testcase steppingOnLastStmnt {
      suspend at:
        secondAssert
      finally validate:
        call stack csInMainFunction
    }

Listing 8. Assumptions after suspending program execution in main

Next, separate DebuggerTestcases are created, each for step over, into and out, which extend steppingOnLastStmnt and specify only the respective ISteppingCommand:

    testcase stepOverLastStmnt extends steppingOnLastStmnt {
      then perform:
        step over 1 times
    }

    testcase stepIntoLastStmnt extends steppingOnLastStmnt {
      then perform:
        step into 1 times
    }

    testcase stepOutFromLastStmnt extends steppingOnLastStmnt {
      then perform:
        step out 1 times
    }

Listing 9. Test stepping commands on last Statement in Testcase

In each DebuggerTestcase from the listing above execution suspends on the same Statement (onReturnInMain), although different stepping commands are performed. Remember, since secondAssert does not contain any children of type StepIntoable (e.g., FunctionCall), performing a step into on the Statement has the same effect as a step over.

IX. EXECUTING DEBUGGER TESTS

Our test cases from the previous section are generated to plain Java code and can be executed in MPS with an action from the context menu. This functionality is obtained by implementing ITestcase in DebuggerTest (see Section VII-A). By executing this action, test results are visualized in a table view, provided by MPS: for each DebuggerTestcase, the result (success or fail) is indicated with a colored bubble and a text field shows the process output. As indicated by a green bubble on the left side of Fig. 7, all of our previously written DebuggerTestcases pass. We show in the next section how language evolution will invalidate the debugger definition and this way cause all of our tests to fail.

Fig. 7. Successful execution of DebuggerTestcases in MPS

X. LANGUAGE EVOLUTION

The previous sections have shown how to build a language extension for mbeddr in MPS, define a debugger for this extension and use DeTeL to test its debugging behavior. This section demonstrates how DeTeL is used to identify invalid definitions in debugger extensions after evolving the language.

A. Evolving MUnit

In this section we modify the MUnit generator to demonstrate how this affects the debugger. Currently, the generator reduces a Testcase to a Function: its name is prefixed with test_, followed by the Testcase name (see Listing 1). We now change this generator, so the Function name is prefixed with testcase_, instead of test_. The listing below shows how our example program from Listing 1 is now generated to C.

    int32_t main(int32_t argc, char *(argv[])) {
      return blockexpr_2();
    }

    int32_t blockexpr_2(void) {
      int32_t _f = 0;
      _f += testcase_forTest();
      return _f;
    }

    int32_t testcase_forTest() {
      int32_t _f = 0;
      int32_t sum = 0;
      if(!( sum == 0 )) { _f++; }
      int32_t[] nums = {1, 2, 3};
      for(int32_t i=0;i<3;i++){
        sum += nums[i];
      }
      if(!( sum == 6 )) { _f++; }
      return _f;
    }

Listing 10. C code that has been generated with the modified Testcase generator for the example program from Listing 1

Because of our generator modification, Testcases are now generated to Functions with a different identifier than before. However, we have not updated the debugger extension, therefore the call stack construction for all Testcases fails and this way all of our DebuggerTests fail as well (see Fig. 8). Although those debugger tests fail, they are still valid, since they are written on the abstraction level of the languages, not the generator. The next section shows how we update the debugger extension to solve the call stack construction.

Fig. 8. Failing DebuggerTestcases after modifying the generator
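The failure caused by the renamed generator output can be reproduced with a simplified model of call stack lifting. The Python sketch below is an illustration only and does not reflect the mbeddr framework's API: `lift_call_stack`, `LiftingError` and the rule that frames starting with `blockexpr_` are hidden are assumptions derived from the listings in this paper. Frames are listed with the top of the stack first.

```python
from typing import List


class LiftingError(Exception):
    """Raised when an expected base-level frame cannot be found."""


def lift_call_stack(base_frames: List[str], testcase: str, prefix: str) -> List[str]:
    """Lift base-level frame names to the extension level.

    The frame generated for `testcase` is looked up by `prefix + testcase`
    and shown under the Testcase's own name; frames generated for
    ExecuteTestExpressions (assumed to start with "blockexpr_") are hidden;
    all other frames are passed through unchanged.
    """
    frame_name = prefix + testcase
    if frame_name not in base_frames:
        raise LiftingError(f"no base-level frame named {frame_name!r}")
    lifted = []
    for name in base_frames:
        if name == frame_name:
            lifted.append(testcase)
        elif name.startswith("blockexpr_"):
            continue  # no callable semantics on the extension level
        else:
            lifted.append(name)
    return lifted


# Base-level stacks while suspended inside the test, before and after the
# generator change.
old_frames = ["test_forTest", "blockexpr_2", "main"]
new_frames = ["testcase_forTest", "blockexpr_2", "main"]

lifted_before = lift_call_stack(old_frames, "forTest", prefix="test_")

try:
    lift_call_stack(new_frames, "forTest", prefix="test_")
    lifting_failed = False
except LiftingError:
    lifting_failed = True  # the outdated prefix no longer matches any frame

lifted_after_fix = lift_call_stack(new_frames, "forTest", prefix="testcase_")
```

Because the expected call stacks in the debugger tests only mention extension-level names such as forTest, the same expectations hold before the change and after the lifting prefix is updated; only the lifting step in between is sensitive to the generator.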
B. Updating the Debugger Extension

The MUnit debugger extension tries to lift for each Testcase a stack frame whose name is prefixed with test_, followed by the name of the respective Testcase (see Section V-D). However, due to our generator modification, this frame is not present and therefore the whole call stack construction fails with an error. To solve this problem, we update the name used for matching the stack frame name:

    String frameName = "testcase_" + this.name;
    contribute frame mapping for frames.select(name=frameName);

Other aspects, such as stepping, breakpoints or watches are not affected by the generator modification and hence do not need to be changed. Therefore, after fixing the call stack lifting for Testcase our debugger tests pass again.

XI. RELATED WORK

Wu et al. describe a unit testing framework for DSLs [10] with focus on testing the semantics of the language. However, from our perspective, it is necessary to have testing DSLs for all aspects of the language definition, e.g., editor (concrete syntax), type system, scoping, transformation rules, and finally the debugger (specific language workbenches might require testing of additional aspects). mbeddr contains tests for the editor, type system, scoping and transformation rules; our work contributes the language for testing the debugger aspect.

The Low Level Virtual Machine (LLVM) project [11] comes with a C debugger named Low Level Debugger (LLDB). Test cases for this debugger are written in Python and the unit test framework of Python. While those tests verify the command line interface and the scripting Application Programming Interface (API) of the debugger, they also test other functionality, such as using the help menu or changing the debugger settings. Further, some of the LLDB tests verify the debugging behavior on different platforms, such as Darwin or Linux. In contrast, we only concentrate on testing the debugging behavior, but also support writing tests for specific platforms. Our approach for testing the debugging behavior is derived from the LLDB project: write a program in the source-language (mbeddr), compile it to an executable and debug it through test cases, which verify the debugging behavior.

The GDB debugger takes a similar approach to LLDB: tests cover different aspects of the debugger functionality and are written in a scripting language [9]. In contrast to our approach of testing debugging for one extensible language, the GDB project tests debugging behavior for all of its supported languages, such as C, C++, Java, Ada etc. Further, those tests run on different platforms and target configurations. Our work supports writing tests against different platforms, but does not allow users to change the target configuration via the DSL.

XII. SUMMARY AND FUTURE WORK

The mbeddr extensible language comes with an extensible debugger. To test this debugger, we have introduced in this paper a generic and extensible testing DSL. The language is implemented in MPS with focus on mbeddr, but the underlying approach is applicable for testing any imperative language debugger. Further, we have shown in this paper (1) the implementation of a language extension, (2) how debugging support is built for it and (3) how the debugger is tested with use of our DSL. The language is designed for extensibility, so others can contribute their own context-specific validation rules. In addition, we concentrated on reuse, so test data, test structures and validation rules can be shared among tests.

In the future, we plan to investigate ways for integrating the debugger specification DSL with the DSL for testing the debugger extension. From this integration we expect to (1) gain advances in validating debugger test cases and (2) the possibility to automatically generate test cases from formal debugger specifications (based on work from [12], [13] and [14]). In addition, we will continue researching languages for testing non-functional aspects, such as testing the performance of stepping commands and lifting of program state.

REFERENCES

[1] M. Voelter, “Language and IDE Development, Modularization and Composition with MPS,” in Generative and Transformational Techniques in Software Engineering, ser. Lecture Notes in Computer Science, 2011.

[2] M. Voelter, D. Ratiu, B. Schaetz, and B. Kolb, “Mbeddr: An extensible c-based programming language and ide for embedded systems,” in Proceedings of the 3rd Annual Conference on Systems, Programming, and Applications: Software for Humanity, ser. SPLASH ’12. New York, NY, USA: ACM, 2012, pp. 121–140.

[3] JetBrains, “Meta Programming System,” 2015. [Online]. Available: http://www.jetbrains.com/mps

[4] D. Pavletic, M. Voelter, S. A. Raza, B. Kolb, and T. Kehrer, “Extensible debugger framework for extensible languages,” in Reliable Software Technologies - Ada-Europe 2015 - 20th Ada-Europe International Conference on Reliable Software Technologies, Madrid Spain, June 22-26, 2015, Proceedings, ser. Lecture Notes in Computer Science, J. A. de la Puente and T. Vardanega, Eds., vol. 9111. Springer, 2015, pp. 33–49.

[5] A. Chis, T. Gîrba, and O. Nierstrasz, “The moldable debugger: A framework for developing domain-specific debuggers,” in Software Language Engineering - 7th International Conference, SLE 2014, Västerås, Sweden, September 15-16, 2014. Proceedings, 2014, pp. 102–121.

[6] H. Wu, “Grammar-driven Generation of Domain-specific Language Testing Tools,” in 20th Annual ACM Special Interest Group on Programming Languages (SIGPLAN) Conference on Object-oriented Programming, Systems, Languages, and Applications. San Diego, CA, USA: ACM, 2005, pp. 210–211.

[7] D. Pavletic and S. A. Raza, “Multi-Level Debugging for Extensible Languages,” Softwaretechnik-Trends, vol. 35, no. 1, 2015.

[8] B. Kolb, M. Voelter, D. Ratiu, D. Pavletic, Z. Molotnikov, K. Dummann, N. Stotz, S. Lisson, S. Eberle, T. Szabo, A. Shatalin, K. Miyamoto, and S. Kaufmann, “mbeddr.core - An extensible C,” https://github.com/mbeddr/mbeddr.core, GitHub repository, 2015.

[9] Free Software Foundation, “The GNU Project Debugger,” 2015. [Online]. Available: https://www.gnu.org/software/gdb/

[10] H. Wu, J. G. Gray, and M. Mernik, “Unit testing for domain-specific languages,” in Domain-Specific Languages, IFIP TC 2 Working Conference, DSL 2009, Oxford, UK, July 15-17, 2009, Proceedings, ser. Lecture Notes in Computer Science, W. M. Taha, Ed., vol. 5658. Springer, 2009, pp. 125–147.

[11] LLVM Compiler Infrastructure, “The LLDB Debugger,” 2015. [Online]. Available: http://lldb.llvm.org

[12] H. Wu and J. Gray, “Automated generation of testing tools for domain-specific languages,” in ASE, D. F. Redmiles, T. Ellman, and A. Zisman, Eds. ACM, 2005, pp. 436–439.

[13] P. R. Henriques, M. J. V. Pereira, M. Mernik, M. Lenic, J. Gray, and H. Wu, “Automatic generation of language-based tools using the LISA system,” Software, IEE Proceedings -, vol. 152, no. 2, pp. 54–69, 2005.

[14] H. Wu, J. Gray, and M. Mernik, “Grammar-driven generation of domain-specific language debuggers,” Software: Practice and Experience, vol. 38, no. 10, pp. 1073–1103, 2008.