=Paper=
{{Paper
|id=Vol-3642/paper1
|storemode=property
|title=Locating Loop Errors in Programs: A Scalable and Expressive Approach using LocFaults
|pdfUrl=https://ceur-ws.org/Vol-3642/paper1.pdf
|volume=Vol-3642
|authors=Mohammed Bekkouche
|dblpUrl=https://dblp.org/rec/conf/tacc/Bekkouche23
}}
==Locating Loop Errors in Programs: A Scalable and Expressive Approach using LocFaults
==
Locating Loop Errors in Programs: A Scalable and
Expressive Approach using LocFaults
Mohammed Bekkouche1,*
1
LabRI-SBA Laboratory, Γcole SupΓ©rieure en Informatique, Sidi Bel Abbes 22000, Algeria
Abstract
A model checker can generate a lengthy and complicated trace of counterexamples for an erroneous
program, with the loop instructions being the largest part of this trace. Consequently, the location of
errors in loops is critical to analyzing the overall program. In this paper, we delve into the scalability
potential of LocFaults, our error localization approach that utilizes Control Flow Graph (CFG) paths
from counterexamples to calculate the Minimal Correction Deviations (MCDs) and Minimal Correction
Subsets (MCSs) for each MCD found. The study presents the efficiency of LocFaults on programs with
While-loops unfolded b times and deviated conditions ranging from 0 to n. Preliminary results show
that LocFaults, constraint-based and flow-driven, is faster and provides more expressive information
for the user compared to BugAssist, which is based on SAT and transforms the entire program into a
Boolean formula.
Keywords
Error localization, LocFaults, BugAssist, Off-by-one bug, Minimal Correction Deviations, Minimal
Correction Subsets
1. Introduction
Errors are inevitable in a program; they can disrupt proper operation and lead to serious finan-
cial consequences, posing a threat to human well-being [1]. Some software bug stories are cited
in this link [2]. Consequently, the debugging process, which involves detecting, localizing, and
correcting errors, is essential. Localizing errors is the most costly step, requiring the identifi-
cation of the precise locations of suspicious instructions [3] to help the user understand why
the program failed and facilitate error correction. When a program π does not conform to its
specification (i.e., contains errors), a model checker can produce a trace of a counterexample,
which is often long and challenging to comprehend, even for experienced programmers. To
address this issue, we have proposed an approach named LocFaults [4], which is based on
constraints that explore the paths of the programβs Control Flow Graph (CFG) from the coun-
terexample to calculate the minimal subsets necessary to restore the programβs compliance with
its postcondition. Ensuring that our method is highly scalable to meet the enormous complexity
of software systems is a crucial criterion for its quality [5].
TACC 2023: Tunisian-Algerian Joint Conference on Applied Computing, November 06β08, 2023, Sousse, Tunisia
*
Corresponding author.
$ m.bekkouche@esi-sba.dz (M. Bekkouche)
Β https://www.esi-sba.dz/fr/index.php/personnel/bekkouche-mohammed/ (M. Bekkouche)
0000-0002-8305-0542 (M. Bekkouche)
Β© 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR
Workshop
Proceedings
http://ceur-ws.org
ISSN 1613-0073
CEUR Workshop Proceedings (CEUR-WS.org)
CEUR
ceur-ws.org
Workshop ISSN 1613-0073
Proceedings
Several statistical approaches for error localization have been proposed, such as Tarantula [6,
7], Ochiai [8], AMPLE [8], Pinpoint [9], FLCN-S [10], FTFL [11], ConsilientSFL [12], and
Poster [13]. Among them, Tarantula is the most famous and uses different metrics to calculate
the degree of suspicion of each instruction in the program while running a series of tests.
However, these approaches have a drawback in that they require a large number of test cases,
while our approach only uses one counterexample. Another challenge with statistical approaches
is the need for an oracle to determine if a test caseβs result is correct or not. To address this issue,
we utilize the Bounded Model Checking (BMC) framework, which only requires a postcondition
or assertion to verify.
Our approach aims to simplify the problem of error localization by reducing it to computing
a minimal set that explains why a Constraint Satisfaction Problem (CSP) is infeasible. The CSP
represents the constraints of the program, counterexample, and the assertion or postcondition
violated. The calculated set can be either a Minimal Correction Subset (MCS) or a Minimal
Unsatisfiable Subset (MUS). Generally, testing the feasibility of a CSP over a finite domain is an
NP-complete problem, which is one of the most difficult NP problems. Thus, explaining the
infeasibility in a CSP is equally challenging, if not harder, and can be classified as an NP-hard
problem. While BugAssist [14, 15] is a BMC-based error localization method that employs a
Max-SAT solver to compute the merger of MCSs of the Boolean formula of the entire program
with the counterexample, it becomes inefficient for large programs. LocFaults also works from
a counterexample to calculate MCSs.
In this paper, we investigate the scalability of LocFaults on programs with While-loops that
are unfolded b times, and a number of deviated conditions ranging from 0 to 3. Our approach
contributes in the following ways compared to BugAssist:
β’ We avoid transforming the entire program into a system of constraints. Instead, we use the
CFG (Control Flow Graph) of the program to gather the constraints of the counterexample
path and paths derived from it. We assume that at most k conditionals may contain errors,
and we calculate MCSs only on the counterexample path and paths that correct the
program.
β’ We do not convert program instructions into a SAT (Boolean satisfiability) formula.
Instead, we use numerical constraints that will be handled by constraint solvers.
β’ We do not rely on MaxSAT solvers as black boxes. Instead, we use a generic algorithm
that uses a constraint solver to calculate MCSs.
β’ We limit the size of the generated MCSs and the number of deviated conditions.
β’ We can work together multiple solvers during the localization process and choose the
most efficient one according to the category of the CSP (Constraint Satisfaction Problem)
constructed. For example, if the CSP of the detected path is linear over integers, we use
a MIP (Mixed Integer Programming) solver. If it is nonlinear, we use a CP (Constraint
Programming) and/or MINLP (Mixed Integer Nonlinear Programming) solver.
Based on our practical experience, as demonstrated in Section 5, we have found that the
restrictions and distinctions employed by LocFaults make it faster and more expressive.
The paper is organized as follows. Section 2 introduces the definition of MUS and MCS. In
Section 3, we define the β€ k-MCD problem. In section 4, we describe our contribution to treating
erroneous loops, including the Off-by-one bug. The results of our experimental evaluation are
presented in Section 5. Section 6 includes the conclusion and discussion of future work.
2. Definitions
In this section, we introduce the definition of an IIS/MUS and MCS.
CSP. A CSP (Constraint Satisfaction Problem) π is defined as a triple < π, πΆ, π· >, where:
β’ π : a set of π variables π₯1 , π₯2 , ..., π₯π .
β’ πΆ = {π1 , π2 , ..., ππ } is the set of constraints.
β’ π· : the tuple < π·π₯1 , π·π₯2 , ..., π·π₯π >. The set π·π₯π contains the values of the variable π₯π .
A solution for π is an instantiation of the variables πΌ β π· that satisfies all the constraints in
πΆ. π is infeasible if it has no solutions. A sub-set of constraints πΆ β² in πΆ is also said infeasible
for the same reason except that it is limited to the constraints in πΆ β² .
We denote as:
β’ πππ(< π, πΆ β² , π· >) = β
, to specify that πΆ β² has no solutions, so it is infeasible.
β’ πππ(< π, πΆ β² , π· >) ΜΈ= β
, to specify that πΆ β² has at least one solution, so it is feasible.
A Linear Program, denoted as LP, is said to be linear if all the constraints in the set πΆ are
expressed as linear equations or inequalities. It is considered continuous if the domain of all
variables is real. If at least one variable in the set π is an integer or a binary (which is a special
case of an integer), and the constraints are linear, then π is referred to as a Mixed-Integer Linear
Program (MIP). If the constraints are expressed as nonlinear equations or inequalities, then π is
referred to as a Nonlinear Program (NLP).
Let π =< π, πΆ, π· > an infeasible CSP, we define for π :
IS. An IS (Inconsistent Set) is an infeasible subset of constraints in the constraint set infeasible
πΆ. πΆ β² is an IS iff:
β’ πΆ β² β πΆ.
β’ πππ(< π, πΆ β² , π· >) = β
.
IIS or MUS. An IIS (Irreducible Inconsistent Set) or MUS (Minimal Unsatisfiable Subset) is an
infeasible subset of constraints of πΆ, and all its strict subsets are feasible. πΆ β² is an IIS iff :
β’ πΆ β² is an IS.
β’ βπΆ β²β² β πΆ β² .πππ(< π, πΆ β²β² , π· >) ΜΈ= β
, (each of its parts contributes to the infeasibility), πΆ β²
is called irreducible.
MCS. πΆ β² is a MCS (Minimal Correction Set) iff :
β’ πΆ β² β πΆ.
β’ πππ(< π, πΆβπΆ β² , π· >) ΜΈ= β
.
β’ β πΆ β²β² β πΆ β² such as πππ(< π, πΆβπΆ β²β² , π· >) = β
.
3. The problem β€ k-MCD
Given an erroneous program modeled in a CFG1 πΊ = (πΆ, π΄, πΈ), where πΆ is the set of conditional
nodes, π΄ is the set of assignment blocks, and πΈ is the set of arcs, along with a counterexample, a
Minimal Correction Deviation (MCD) is a set π· β πΆ such that propagating the counterexample
on all the instructions of πΊ from the root, while having negated each condition2 in π·, allows
the output to satisfy the postcondition. A MCD is called minimal (or irreducible) if no element
can be removed from π· without losing this property. In other words, π· is a minimal program
correctness in the set of conditions. The size of the minimal deviation is its cardinality. The
problem of finding all MCDs of size smaller or equal to π is denoted as β€ k-MCD.
As an illustration (refer to Fig. 1), consider the CFG of the program AbsMinus (refer to
Fig. 1b). When provided with the counterexample {π = 0, π = 1}, this program has one
minimal deviation of size 1. While the deviation {π0 β€ π0 , π1 = 1 β§ π0 ΜΈ= π0 } does correct the
program, it is not minimal. In fact, the only minimal correction deviation for this program is
{π1 = 1 β§ π0 ΜΈ= π0 }.
Table 1
The progress of LocFaults for the program AbsMinus
Deviated conditions MCD MCS Figure
β
/ {π1 = π0 β π0 : 13} Fig. 1c
{π0 β€ π0 : 8} No / Fig. 1d
{π0 = 0 : 7},
{π1 = 1 β§ π0 ! = π0 : 10} Yes Fig. 1e
{π1 = π0 + 2 : 9}
{π0 β€ π0 : 8,
No / Fig. 1f
π1 = 1 β§ π0 ! = π0 : 10}
Table 1 provides a summary of the progress of LocFaults for the program AbsMinus, with
at most 2 conditions deviated from the counterexample {π = 0, π = 1}. The table displays
the conditions deviated, indicating whether they are minimal or non-minimal deviations, and
the MCSs (Minimal Correction Sets) calculated from the constructed constraint system : see
columns 1, 2, and 3, respectively. Column 4 shows the figure that illustrates the path explored
for each deviation. Additionally, the first and the third columns show the instruction and its
corresponding line in the program.
For example, the first line in the table indicates that a single MCS ({π1 = π0 β π0 : 13}) was
found on the path of the counterexample.
4. Error localization in loops
In the context of Bounded Model Checking (BMC) for programs, unfolding can be applied to the
entire program or to loops separately [5]. Our algorithm, LocFaults [4], for error localization
1
We use Dynamic Single Assignment (DSA) form [16] transformation that ensures that each variable is assigned
only once on each path of the CFG.
2
To navigate to the intended branch, we negate the condition to take the opposite branch.
π0 = 0 π0 = 0
1 class AbsMinus {
2 /*@ ensures
3 @ ((i < j) ==> (\result == j-i
)) && π0 β€ π0 β€
π0 π0
4 @ ((i >= j) ==> (\result == i- If Else If Else
j)); */
5 int AbsMinus (int i, int j) { π1 = π0 + 2 π1 = π0 π1 = π0 + 2 π1 = π0
6 int result;
7 int k = 0;
8 if (i <= j) { π1 = π1 =
1 && 1 &&
9 k = k+2; } // error : k = k π0 ! = π0 ! =
+2 instead of k=k+1 If π0 Else If π0 Else
10 if (k == 1 && i != j) {
11 result = j-i; } π1 = π0 β π0 π1 = π0 β π0 π1 = π0 β π0 π1 = π0 β π0
12 else {
13 result = i-j; } POST:{π1 = |π0 β π0 |} POST:{π1 = |π0 β π0 |}
14 return result; } }
(b) The CFG in DSA form of Ab- (c) The path of the counterexam-
(a) The program AbsMinus sMinus ple
π0 = 0 π0 = 0 π0 = 0
π0 β€ π0 β€ π0 β€
π0 π0 π0
If Else If Else If Else
π1 = π0 + 2 π1 = π0 π1 = π0 + 2 π1 = π0 π1 = π0 + 2 π1 = π0
π1 = π1 = π1 =
1 && 1 && 1 &&
π0 ! = π0 ! = π0 ! =
If π0 Else If π0 Else If π0 else
π1 = π0 β π0 π1 = π0 β π0 π1 = π0 β π0 π1 = π0 β π0 π1 = π0 β π0 π1 = π0 β π0
POST:{π1 = POST:{π1 = POST:{π1 =
|π0 β π0 |} is UNSAT |π0 β π0 |} is SAT |π0 β π0 |} is SAT
(d) The path obtained by deviat- (e) The path by deviating the con- (f) The path of a non-minimal de-
ing the condition π0 β€ π0 dition π1 = 1 β§ π0 ! = π0 viation
Figure 1: The exploration of AbsMinus CFG performed by LocFaults
follows the latter approach, where we use a bound π to unfold loops by replacing them with
nested conditional statements of depth π. For example, consider the program Minimum (refer to
Fig. 2), which contains a single loop that calculates the minimum value in an array of integers.
The effect on the control flow graph of the program Minimum before and after unfolding is
illustrated in Figures 2 and 3, respectively. The while-loop is unfolded three times, since three
iterations are required to calculate the minimum value for an array of size 4.
LocFaults takes the CFG of the erroneous program, πΆπΈ (a counterexample), ππππ (a bound
on the number of deviated conditions), and ππππ (a bound on the size of calculated MCSs) as
input. It enables us to explore the CFG in depth by diverting a maximum of ππππ conditions
from the counterexampleβs path by performing the following steps:
β’ Propagating CE on the CFG until the postcondition is reached. Then, it calculates the
πππ = π‘ππ[0]
1 class Minimum {
π=1
2 /*The minimum in an array of n integers
*/
3 /*@ ensures
4 @ (\forall int k;(k>=0 && k=min); π‘ππ.πππππ‘β β
5 @*/ 1
6 int Minimum(int[] tab){
7 int min=tab[0]; If
8 int i=1;
9 while(ival);*/
3 public static int squareRoot ()
4 {
5 int val = 50;
6 int i = 1;
7 int v = 0;
8 int res;
9 while (v < val){
10 v = v + 2*i + 1;
11 i = i + 1;
12 }
13 res = i; //error: should be res = i-1
14 return res;
15 }
16 }
Figure 5: The SquareRoot program.
With an unwinding limit of 50, BugAssist identifies the following suspicious instructions
for this program: {9, 10, 11, 13}. The localization time is 36.16π and the pre-treatment time is
0.12π .
LocFaults identifies suspicious instructions by providing their location in the program (in-
struction line), as well as the line of the condition and the iteration number of the loop leading
to that instruction. For example, {9 : 2.11} indicates that the suspicious instruction is on line
11 of the program, which is inside a loop with the stop condition at line 9 and the iteration
number is 2. The sets of suspected instructions identified by LocFaults are listed in Table 3. The
Table 3
MCD and MCSs calculated by LocFaults for SquareRoot.
{5}, {6}, {9 : 1.11}, {9 : 2.11}, {9 : 3.11},
β
{9 : 4.11}, {9 : 5.11}, {9 : 6.11}, {9 : 7.11}, {13}
{5}, {6}, {7}, {9 : 1.10}, {9 : 2.10}, {9 : 3.10},
{9 : 7} {9 : 4.10}, {9 : 5.10}, {9 : 6.10}, {9 : 1.11},
{9 : 2.11}, {9 : 3.11}, {9 : 4.11}, {9 : 5.11}, {9 : 6.11}
pretreatment time is 0.769s. The time for exploring the CFG and calculating MCSs is 1.299s.
We conducted a study of the times for LocFaults and BugAssist with values of "val" ranging
from 10 to 100 (where the number of unfoldings "b" used is equal to "val"), in order to analyze
the combinatorial behavior of each tool for this program.
βοΈThe
π
Sum program receives a positive integer π from the user and calculates the value of
π=1 as per theβοΈ
π postcondition. The
βοΈerror in Sum lies in the condition of its loop, causing it to
πβ1 π
calculate the sum π=1 π instead of π=1 π. The program contains linear numerical instructions
within the core of the loop, along with a nonlinear postcondition.
The time results for the SquareRoot and Sum benchmarks are presented in Tables 4 and 5,
respectively. It is observed that the execution time of BugAssist increases rapidly, while the
times of LocFaults remain relatively constant. Furthermore, the times of LocFaults with at most
0, 1, and 2 conditions deviated are comparable to those of LocFaults with at most 3 conditions
deviated.
Table 4
Computation time for SquareRoot Benchmark.
LocFaults BugAssist
Program b L
P P L
=0 β€1 β€2 β€3
V0 10 1.096π 1.737π 2.098π 2.113π 2.066π 0.05π 3.51π
V10 20 0.724π 0.974π 1.131π 1.117π 1.099π 0.05π 6.54π
V20 30 0.771π 1.048π 1.16π 1.171π 1.223π 0.08π 12.32π
V30 40 0.765π 1.048π 1.248π 1.266π 1.28π 0.09π 23.35π
V40 50 0.769π 1.089π 1.271π 1.291π 1.299π 0.12π 36.16π
V50 60 0.741π 1.041π 1.251π 1.265π 1.281π 0.14π 38.22π
V70 80 0.769π 1.114π 1.407π 1.424π 1.386π 0.19π 57.09π
V80 90 0.744π 1.085π 1.454π 1.393π 1.505π 0.22π 64.94π
V90 100 0.791π 1.168π 1.605π 1.616π 1.613π 0.24π 80.81π
Table 5
Computation time for Sum Benchmark.
LocFaults BugAssist
Program b L
P P L
=0 β€1 β€2 β€3
V0 6 0.765π 0.427π 0.766π 0.547π 0.608π 0.04π 2.19π
V10 16 0.9π 0.785π 1.731π 1.845π 1.615π 0.08π 17.88π
V20 26 1.11π 1.449π 7.27π 7.264π 6.34π 0.12π 53.85π
V30 36 1.255π 0.389π 8.727π 4.89π 4.103π 0.13π 108.31π
V40 46 1.052π 0.129π 5.258π 5.746π 13.558π 0.23π 206.77π
V50 56 1.06π 0.163π 7.328π 6.891π 6.781π 0.22π 341.41π
V60 66 1.588π 0.235π 13.998π 13.343π 14.698π 0.36π 593.82π
V70 76 0.82π 0.141π 10.066π 9.453π 10.531π 0.24π 455.76π
V80 86 0.789π 0.141π 13.03π 12.643π 12.843π 0.24π 548.83π
V90 96 0.803π 0.157π 34.994π 28.939π 18.141π 0.31π 785.64π
6. Conclusion
The LocFaults method detects suspicious subsets by analyzing the paths of the CFG to identify
the MCDs and MCSs from each MCD, utilizing constraint solvers. On the other hand, the
BugAssist method calculates the merger of MCSs by transforming the entire program into a
Boolean formula and leveraging Max-SAT solvers. Both methods start from a counterexample to
identify potential issues. In this paper, we have presented a scalability exploration of LocFaults,
with a focus on handling loops with the Off-by-one bug. The initial results indicate that LocFaults
is more effective than BugAssist for programs with loops. The execution times of BugAssist
tend to rapidly increase with the number of loop unfoldings, while LocFaults shows better
scalability in this aspect.
As part of our future work, we plan to validate our results on programs with more complex
loops. We also intend to compare the performance of LocFaults with existing statistical methods.
To further enhance our tool, we are developing an interactive version that presents suspect
subsets one by one, leveraging the userβs knowledge to select the conditions that should be
deviated. Additionally, we are considering ways to extend our method to handle numerical
instructions involving calculations on floating-point
References
[1] R. N. Charette, Why software fails, IEEE spectrum 42 (2005) 36.
[2] M. Bekkouche, Bug stories, 2015. URL: http://www.capv.toile-libre.org/Bug_stories.html.
[3] W. E. Wong, R. Gao, Y. Li, R. Abreu, F. Wotawa, A survey on software fault localization,
IEEE Transactions on Software Engineering 42 (2016) 707β740.
[4] M. Bekkouche, H. Collavizza, M. Rueher, Locfaults: A new flow-driven and constraint-
based error localization approach, in: Proceedings of the 30th Annual ACM Symposium
on Applied Computing, 2015, pp. 1773β1780.
[5] V. Dβsilva, D. Kroening, G. Weissenbacher, A survey of automated techniques for formal
software verification, IEEE Transactions on Computer-Aided Design of Integrated Circuits
and Systems 27 (2008) 1165β1178.
[6] J. A. Jones, M. J. Harrold, J. Stasko, Visualization of test information to assist fault
localization, in: Proceedings of the 24th international conference on Software engineering,
2002, pp. 467β477.
[7] J. A. Jones, M. J. Harrold, Empirical evaluation of the tarantula automatic fault-localization
technique, in: Proceedings of the 20th IEEE/ACM international Conference on Automated
software engineering, 2005, pp. 273β282.
[8] R. Abreu, P. Zoeteweij, A. J. Van Gemund, On the accuracy of spectrum-based fault
localization, in: Testing: Academic and industrial conference practice and research
techniques-MUTATION (TAICPART-MUTATION 2007), IEEE, 2007, pp. 89β98.
[9] M. Y. Chen, E. Kiciman, E. Fratkin, A. Fox, E. Brewer, Pinpoint: Problem determination in
large, dynamic internet services, in: Proceedings International Conference on Dependable
Systems and Networks, IEEE, 2002, pp. 595β604.
[10] A. Zakari, S. P. Lee, I. A. T. Hashem, A single fault localization technique based on failed
test input, Array 3 (2019) 100008.
[11] A. Dutta, K. Kunal, S. S. Srivastava, S. Shankar, R. Mall, Ftfl: A fisherβs test-based approach
for fault localization, Innovations in Systems and Software Engineering 17 (2021) 381β405.
[12] A. Majd, M. Vahidi-Asl, A. Khalilian, B. Bagheri, Consilientsfl: using preferential voting
system to generate combinatorial ranking metrics for spectrum-based fault localization,
Applied Intelligence 52 (2022) 11068β11088.
[13] Q. I. Sarhan, Γ. BeszΓ©des, Poster: Improving spectrum based fault localization for python
programs using weighted code elements, in: 2023 IEEE Conference on Software Testing,
Verification and Validation (ICST), IEEE, 2023, pp. 478β481.
[14] M. Jose, R. Majumdar, Cause clue clauses: error localization using maximum satisfiability,
ACM SIGPLAN Notices 46 (2011) 437β446.
[15] M. Jose, R. Majumdar, Bug-assist: Assisting fault localization in ansi-c programs, in:
Computer Aided Verification: 23rd International Conference, CAV 2011, Snowbird, UT,
USA, July 14-20, 2011. Proceedings 23, Springer, 2011, pp. 504β509.
[16] M. Barnett, K. R. M. Leino, Weakest-precondition of unstructured programs, in: Proceed-
ings of the 6th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools
and engineering, 2005, pp. 82β87.
[17] H. Collavizza, M. Rueher, P. Van Hentenryck, Cpbpv: a constraint-programming framework
for bounded program verification, Constraints 15 (2010) 238β264.
[18] M. H. Liffiton, K. A. Sakallah, Algorithms for computing minimal unsatisfiable subsets of
constraints, Journal of Automated Reasoning 40 (2008) 1.
[19] E. Clarke, D. Kroening, F. Lerda, A tool for checking ansi-c programs, in: Tools and
Algorithms for the Construction and Analysis of Systems: 10th International Conference,
TACAS 2004, Held as Part of the Joint European Conferences on Theory and Practice of
Software, ETAPS 2004, Barcelona, Spain, March 29-April 2, 2004. Proceedings 10, Springer,
2004, pp. 168β176.
[20] J. Marques-Silva, The msuncore maxsat solver, SAT (2009) 151.