<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>CEUR Workshop Proceedings</journal-title>
      </journal-title-group>
      <issn pub-type="ppub">1613-0073</issn>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.1007/s13198-023-01887-3</article-id>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Tetiana Hovorushchenko</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Yurii Voichur</string-name>
          <email>voichury@khmnu.edu.ua</email>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Khmelnytskyi National University</institution>
          ,
          <addr-line>Institutska str., 11, Khmelnytskyi, 29016</addr-line>
          ,
          <country country="UA">Ukraine</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2023</year>
      </pub-date>
      <volume>13</volume>
      <issue>2023</issue>
      <fpage>0000</fpage>
      <lpage>0002</lpage>
      <abstract>
        <p>Improving the veracity of software development effort estimation is currently an urgent task. The purpose of this study is to develop a method for determining the number of lines of manually written source code, that is, the actual number of lines of code created by the programmer(s), which will allow a more accurate and reliable assessment of the effort of software development. The method developed in this article makes it possible to determine not only the LOC-estimation (number of code lines) of the source code written by a programmer, but also the LOC-estimation of the automatically generated source code, as well as the number of uses of each automatically generated construction of a particular programming language. The method is universal because it can be customized for any programming language in use.</p>
      </abstract>
      <kwd-group>
        <kwd>Software development effort (labor intensity)</kwd>
        <kwd>LOC-estimation</kwd>
        <kwd>source code</kwd>
        <kwd>manually written source code</kwd>
        <kwd>automatically generated source code</kwd>
      </kwd-group>
      <conference>
        <conf-acronym>ITTAP'2023</conf-acronym>
        <conf-name>3rd International Workshop on Information Technologies: Theoretical and Applied Problems</conf-name>
        <conf-date>November 22-24, 2023</conf-date>
        <conf-loc>Ternopil, Ukraine, Opole, Poland</conf-loc>
      </conference>
    </article-meta>
  </front>
  <body>
    <sec id="sec-2">
      <title>1. Introduction</title>
      <p>Analyzing the software development process and estimating the effort required to complete it is an
important task. An early estimate of effort (labor intensity) is important for effective planning of
resource utilization in a software project [1, 2].</p>
      <p>In today's highly competitive market, it is very important for a software project development team
to ensure timely delivery of their software product and stay within the planned budget, but in practice
they are constantly faced with schedule delays and budget overruns [3].</p>
      <p>Estimating the effort of a software project is an important process that involves predicting how much
time and money it will take to complete the project. Effort estimation is vital for both clients and
developers. Underestimating the effort can lead to poorly designed processes, low quality, schedule
delays and budget overruns, inadequate project approval by management and customers, an insufficient
project team size, excessively tight development timelines and, as a result, reputational damage and
loss of trust in the developers when budgets and schedules are violated. Overestimating the effort of
software development is no better: if more resources are allocated to a project than are actually
needed, the project becomes more expensive and time-consuming, and the start of the next project may be
delayed or that project may be abandoned altogether. Effort estimation also supports the exchange of
information necessary for the successful achievement of project results [4].</p>
      <p>The success of a software project depends on three main elements: time, budget, and functionality.
When changes are made to one of the elements, the other elements necessarily change as well, and
the nature of the impact of such changes depends mainly on the specifics of the project and the
circumstances. For example, a reduction in project execution time may in one case lead to a decrease
in its budget due to a reduction in functionality, and in another case to an increase in its budget due to
the involvement of more developers to maintain the planned functionality [5-9].</p>
      <p>2023 Copyright for this paper by its authors. CEUR, ceur-ws.org.</p>
      <p>Estimation of effort is used to solve many problems, including the following [3, 10]: development
of a budget and schedule of a software project; analysis of the degree of risk and selection of a
compromise solution; planning and management of a software project; analysis of the costs of
improving the quality of software.</p>
      <p>Estimating project effort remains a difficult task for project managers. In the early stages of a
project, a high level of uncertainty and a lack of experience lead to inaccurate effort estimates [11].
The rapid growth of the software industry drives the need for new technologies that improve the accuracy
of software effort estimation methodologies.</p>
      <p>Thus, improving the veracity of software development effort estimation is an urgent task.</p>
    </sec>
    <sec id="sec-3">
      <title>2. Case Study</title>
      <p>One of the main methods for estimating the effort of a software project is algorithmic modeling
[12-16], which determines the dependence of the project effort on some quantitative
indicator of the software (usually the size of the code). This indicator is estimated for a given project,
after which the model predicts future costs. Most models for determining the effort (labor intensity) of
software development can be reduced to a function of five main parameters [17-20]:</p>
      <list list-type="order">
        <list-item><p>the size of the final product, usually the number of lines of code or the number of function
points required to implement a given functionality;</p></list-item>
        <list-item><p>features of the process used to obtain the final product, in particular its ability to avoid
unproductive activities (rework, interaction costs);</p></list-item>
        <list-item><p>capabilities of the personnel involved in software development, especially their professional
experience and knowledge of the project subject area;</p></list-item>
        <list-item><p>the environment, which consists of the tools and methods used to perform software
development effectively and to automate the process;</p></list-item>
        <list-item><p>the required quality of the product, which includes its functionality, performance, reliability
and adaptability.</p></list-item>
      </list>
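      <p>The dependence of effort on code size described above can be illustrated with a minimal sketch. It assumes the Basic COCOMO form, effort = a * KLOC^b, with Boehm's published "organic mode" coefficients a = 2.4 and b = 1.05. This model and these coefficients are chosen here purely for illustration and are not prescribed by the paper; the function name is hypothetical.</p>
      <preformat><![CDATA[
```python
def basic_cocomo_effort(kloc: float, a: float = 2.4, b: float = 1.05) -> float:
    """Return estimated effort in person-months for a size given in KLOC.

    Assumption: Basic COCOMO, organic mode (a = 2.4, b = 1.05). Other
    algorithmic models use different coefficients and extra cost drivers.
    """
    if not kloc > 0:
        raise ValueError("size in KLOC must be positive")
    # Effort grows slightly faster than linearly with code size.
    return a * kloc ** b


if __name__ == "__main__":
    for size in (1.0, 10.0, 50.0):
        effort = basic_cocomo_effort(size)
        print(f"{size:5.1f} KLOC: {effort:7.1f} person-months")
```
]]></preformat>
      <p>Under these assumed coefficients, a 10 KLOC project comes out at about 26.9 person-months. Because size is the only input, any lines that were not actually written by the team inflate the estimate directly.</p>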
      <p>The most influential factor in effort estimation in these models is the size of the software [21-24].
The main units of measurement of software size are the number of lines of code (LOC) and function
points (FP). The number of lines of code (LOC-estimation) is the best-known and most widely
used unit of measurement [4, 11, 25, 26].</p>
      <p>The advantages of using LOC-estimations as units of measurement [4, 11, 25]: widespread and easy
adaptability; the ability to compare methods of measuring size and performance in different groups of
developers; direct connection with the final product; easy evaluation before the end of the project; estimation
of software size based on the developer's point of view - a physical assessment of the created product.</p>
      <p>Along with these advantages, the use of LOC has a number of problems [4, 11, 25]: LOC-estimation
is difficult to use for estimating software size at the early stages of development; the number of source
code lines depends on the programming language, the design method, and the style and skill
of the programmer; LOC-indicators cannot be used for normalization if different development platforms or
languages are used; estimation methods based on counting lines of code are
not regulated by industry standards; software development can involve high costs that do not
directly depend on the size of the source code, such as the preparation of requirements specifications and
user documents, which are not included in the direct costs of coding; programmers may be undeservedly
rewarded for a high LOC if management mistakenly considers it a sign of high productivity
even when the project is not carefully designed (source code is not an end in itself when creating a
product; functional properties and performance indicators play the major role); code generators often
produce an excessive amount of code, which distorts LOC-estimations; when counting the number of lines of
code, one should distinguish between automatically generated and manually written code.</p>
      <p>Most development environments include sets of standard elements. Whenever a new project is
created, the respective development environment automatically creates the "skeleton" of the future
application, and this code can be immediately compiled and run without errors. In this case, the software
project contains automatically generated source code, which is the basis of the future program. More
complex controls have so-called "wizards" that help you customize the behavior of controls by
automatically generating code depending on the selected options. Automatic generation of source code
saves developers' time, eliminates the need to re-create typical source code every time and, of course,
reduces the effort of software development. To determine the LOC-estimation of the source code
correctly, and thus to estimate the effort of software development reliably,
it is necessary to distinguish between automatically generated source code and source code written
manually by the programmer(s).</p>
      <p>Therefore, the purpose of our study is to develop a method for determining the number of lines of
manually written source code to calculate the actual number of lines of code created by the programmer,
which will allow a more accurate and reliable assessment of the effort of software development.</p>
    </sec>
    <sec id="sec-5">
      <title>3. Method for Determining the Number of Lines of Manually Written Source Code</title>
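      <p>Since a software project contains automatically generated source code, which is the basis
("skeleton") of the future program, and source code that is added manually by the programmer(s), any
source code can be represented as the union of the set of automatically generated source code and the set
of source code written manually by the programmer(s):
<disp-formula><label>(1)</label><tex-math>SC = SC_{auto} \cup SC_{manual},</tex-math></disp-formula>
where <inline-formula><tex-math>SC_{auto}</tex-math></inline-formula> is the automatically generated source code and
<inline-formula><tex-math>SC_{manual}</tex-math></inline-formula> is the manually written source code.</p>
      <p>Therefore, the LOC-estimation of the source code can be represented in the form:
<disp-formula><label>(2)</label><tex-math>LOC = LOC_{auto} + LOC_{manual},</tex-math></disp-formula>
where <inline-formula><tex-math>LOC_{auto}</tex-math></inline-formula> is the LOC-estimation of the automatically generated source code and
<inline-formula><tex-math>LOC_{manual}</tex-math></inline-formula> is the LOC-estimation of the manually written source code.</p>
      <sec id="sec-5-1">
        <p>The overall LOC-estimation <inline-formula><tex-math>LOC</tex-math></inline-formula> can be obtained using a variety of static analyzers and other tools
that provide metric analysis of the source code.</p>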
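        <p>To determine the number of lines of manually written source code, the LOC-estimation of the
automatically generated source code is determined first. Lines of automatically generated code consist of
certain constructions that form the following set (this set will have different content for different
languages and programming environments):
<disp-formula><label>(3)</label><tex-math>AGC = \{agc_1, \ldots, agc_m\},</tex-math></disp-formula>
where <inline-formula><tex-math>agc_j</tex-math></inline-formula> is the j-th automatically generated source code construction for the particular programming
language under consideration (the main restriction on such a construction is that it must be a single line
of code), and <inline-formula><tex-math>m</tex-math></inline-formula> is the number of constructions that can be automatically generated (this number is different
for different environments and programming languages).</p>
        <p>The entire source code can also be represented as a set of its lines:
<disp-formula><label>(4)</label><tex-math>SC = \{l_1, \ldots, l_n\},</tex-math></disp-formula>
where <inline-formula><tex-math>l_i</tex-math></inline-formula> is the i-th line of source code and <inline-formula><tex-math>n = LOC</tex-math></inline-formula> is the total number of lines; moreover,
<inline-formula><tex-math>l_i</tex-math></inline-formula> is, in turn, also a set consisting of the
constructions of a particular programming language, letters, numbers and symbols allowed by the
alphabet of that programming language. The main requirement for the source code is mandatory
compliance with the Code Style rules, in particular in terms of code formatting (a new construction is
written on a new line, etc.).</p>
        <p>The following universal rules for estimating the number of lines of automatically generated source code were
developed:
<disp-formula><tex-math>\begin{array}{l}
\text{if } agc_1 \in l_i, \text{ then } k_1 = k_1 + 1; \\
\ldots \\
\text{if } agc_j \in l_i, \text{ then } k_j = k_j + 1; \\
\ldots \\
\text{if } agc_m \in l_i, \text{ then } k_m = k_m + 1,
\end{array}</tex-math></disp-formula>
where <inline-formula><tex-math>i = 1..n</tex-math></inline-formula>, <inline-formula><tex-math>j = 1..m</tex-math></inline-formula>, and <inline-formula><tex-math>k_j</tex-math></inline-formula> is the counter of the number of source code lines
in which the j-th automatically generated construction is used.</p>
        <p>The developed rules are universal for each language and programming environment, but the content
of the set <inline-formula><tex-math>AGC = \{agc_1, \ldots, agc_m\}</tex-math></inline-formula> will be different (individual) for each specific language and
programming environment.</p>
        <p>The method for determining the number of lines of manually written source code consists of the
following steps:</p>
        <list list-type="order">
          <list-item><p>to form the set of automatically generated constructions
<inline-formula><tex-math>AGC_{lang} = \{agc_{1\_lang}, \ldots, agc_{m\_lang}\}</tex-math></inline-formula> for the programming language <italic>lang</italic>;</p></list-item>
          <list-item><p>using breadth-first search in the forward direction in the set of rules for estimating
the number of lines of automatically generated source code, to search the rules for each of the
elements of the set <inline-formula><tex-math>AGC_{lang}</tex-math></inline-formula> in the set <inline-formula><tex-math>SC = \{l_1, \ldots, l_n\}</tex-math></inline-formula>, according to which the counters
<inline-formula><tex-math>k_1, \ldots, k_m</tex-math></inline-formula> in the set <inline-formula><tex-math>K = \{k_1, \ldots, k_m\}</tex-math></inline-formula> of the number of lines
(used in the source code) of each automatically generated construction of the programming
language are incremented;</p></list-item>
          <list-item><p>to determine the LOC-estimation of the automatically generated source code by the formula:
<disp-formula><tex-math>LOC_{auto} = \sum_{j=1}^{m} k_j,</tex-math></disp-formula>
where <inline-formula><tex-math>k_j</tex-math></inline-formula> is the number of uses in the source code of each automatically generated
construction of a particular programming language;</p></list-item>
          <list-item><p>to determine the LOC-estimation of the manually written source code using the formula derived
from formula (2):
<disp-formula><tex-math>LOC_{manual} = LOC - LOC_{auto},</tex-math></disp-formula>
where <inline-formula><tex-math>LOC</tex-math></inline-formula> is the total number of lines of source code (as mentioned earlier, there are many
tools that provide LOC-estimation of source code).</p></list-item>
        </list>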
        <p>The developed method for determining the number of lines of manually written source code is
scientifically new and makes it possible to determine not only the LOC-estimation of the source code
written by a programmer(s), but also the LOC-estimation of automatically generated source code, as
well as the number of uses of each automatically generated construction of a particular programming
language. The method is universal because it can be customized for any used programming language.</p>
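        <p>The steps of the method can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' tool: a plain nested loop stands in for the breadth-first search over the rule set, a line is taken to contain a construction when the construction text is a substring of that line, and each line is counted for at most one construction. The function name and the sample constructions are hypothetical and are not the set used in the paper.</p>
        <preformat><![CDATA[
```python
from typing import Dict, Iterable, Sequence, Tuple


def estimate_manual_loc(
    lines: Sequence[str], auto_constructions: Iterable[str]
) -> Tuple[Dict[str, int], int, int]:
    """Return (counters, loc_auto, loc_manual) for the given source lines.

    Assumptions (not from the paper): substring matching decides whether a
    line uses a construction, and a line is counted for at most one
    construction, so loc_auto never exceeds len(lines).
    """
    # One counter per automatically generated construction, starting at zero.
    counters = {construction: 0 for construction in auto_constructions}
    for line in lines:                    # every source line
        for construction in counters:     # every known construction
            if construction in line:      # rule: construction occurs in line
                counters[construction] += 1
                break                     # count each line only once
    loc_auto = sum(counters.values())     # lines of auto-generated code
    loc_manual = len(lines) - loc_auto    # total LOC minus auto-generated LOC
    return counters, loc_auto, loc_manual


if __name__ == "__main__":
    # Hypothetical sample; NOT the eleven constructions used in the paper.
    sample = [
        '#pragma once',
        '#include "stdafx.h"',
        'int add(int a, int b)',
        '{',
        '    return a + b;',
        '}',
    ]
    result = estimate_manual_loc(sample, ['#pragma once', '#include "stdafx.h"'])
    print(result)
```
]]></preformat>
        <p>Customizing the sketch for another language only requires passing a different list of constructions, which mirrors the way the method itself is adapted to a particular language and environment.</p>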
      </sec>
    </sec>
    <sec id="sec-6">
      <title>4. Results &amp; Discussion</title>
      <p>A real case of using the developed method for determining the number of lines of manually written
source code was considered, using the Visual C++ programming language as an example. The analyzed
Visual C++ source code, which consists of 225 lines, is represented as the set
<inline-formula><tex-math>SC = \{l_1, \ldots, l_{225}\}</tex-math></inline-formula>.</p>
      <p>First, a subset of the automatically generated constructions
<inline-formula><tex-math>AGC_{C++} = \{agc_{1\_C++}, \ldots, agc_{11\_C++}\}</tex-math></inline-formula> was formed for the Visual C++ programming language (this subset for the
example under consideration does not include all automatically generated Visual C++ constructs, but
only 11 such constructs, which is sufficient to demonstrate the operation of the proposed method).</p>
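      <p>Using breadth-first search in the forward direction in the set of rules for estimating the
number of lines of automatically generated source code, we searched the rules for each of the elements
of the set <inline-formula><tex-math>AGC_{C++}</tex-math></inline-formula> in the set <inline-formula><tex-math>SC = \{l_1, \ldots, l_{225}\}</tex-math></inline-formula>, according to which the counters of the number
of lines of each automatically generated construction were incremented. As a result, the 225 lines of the
analyzed source code comprise 84 lines (225 - 141) of
automatically generated code and 141 lines of manually written code.</p>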
    </sec>
    <sec id="sec-7">
      <title>5. Conclusions</title>
      <p>Improving the veracity of estimating the effort of software development is currently an urgent task.
The purpose of this study is to develop a method for determining the number of lines of manually
written source code to calculate the actual number of lines of code created by the programmer(s), which
will allow a more accurate and reliable assessment of the effort of software development.</p>
      <p>The method developed in this article for determining the number of lines of manually written source
code makes it possible to determine not only the LOC-estimation (number of code lines) of the source
code written by a programmer, but also the LOC-estimation of the automatically
generated source code, as well as the number of uses of each automatically generated construction of a
particular programming language. The method is universal because it can be customized for any
programming language in use.</p>
      <p>The directions of the authors' future research are: formation of complete sets of automatically
generated constructions for different programming languages in current use; automation of the analysis of
source code in different languages to search for automatically generated constructions in it;
development of a tool for determining the number of lines of manually written source code, which will
work on the basis of the rules and method developed in this article.</p>
    </sec>
    <sec id="sec-8">
      <title>6. References</title>
      <p>[1] O. Pomorova, T. Hovorushchenko. The Way to Detection of Software Emergent Properties, in:
Proceedings of the 2015 IEEE 8-th International Conference on Intelligent Data Acquisition and
Advanced Computing Systems: Technology and Applications IDAACS-2015, Warsaw, 2015, vol.
2, pp. 779-784. doi: 10.1109/IDAACS.2015.7341409.</p>
      <p>[2] O. Pomorova, T. Hovorushchenko. Research of Artificial Neural Network's Component of
Software Quality Evaluation and Prediction Method, in: Proceedings of the 2011 IEEE 6-th
International Conference on Intelligent Data Acquisition and Advanced Computing Systems:
Technology and Applications, IDAACS-2011, Prague, 2011, vol. 2, pp. 959-962. doi:
10.1109/IDAACS.2011.6072916.</p>
      <p>[3] R. Arora, R. Mittal, A. G. Aggarwal. Investigating the impact of effort slippages in software
development project. International Journal of System Assurance Engineering and Management 14
(2023). doi: 10.1007/s13198-023-01887-3.</p>
      <p>[5] T. Hai, J. Zhou, N. Li, S. K. Jain, S. Agrawal, I. B. Dhaou. Cloud-based bug tracking software
defects analysis using deep learning. Journal of Cloud Computing 11 1 (2022) 32. doi:
10.1186/s13677-022-00311-8.</p>
      <p>[6] L. Aversano, M. L. Bernardi, M. Cimitile, M. Iammarino, D. Montano. Forecasting technical debt
evolution in software systems: an empirical study. Frontiers of Computer Science 17 3 (2023)
173210. doi: 10.1007/s11704-022-1541-7.</p>
      <p>[7] T. Hovorushchenko, O. Pomorova. Methodology of Evaluating the Sufficiency of Information on
Quality in the Software Requirements Specifications, in: Proceedings of 2018 IEEE 9th International
Conference on Dependable Systems, Services and Technologies DeSSerT-2018, Kyiv, 2018, pp.
385-389. doi: 10.1109/DESSERT.2018.8409161.</p>
      <p>[8] T. Hovorushchenko, O. Pomorova. Information Technology of Evaluating the Sufficiency of Information
on Quality in the Software Requirements Specifications. CEUR-WS 2104 (2018) 555-570.</p>
      <p>[9] T. Hovorushchenko. Methodology of Evaluating the Sufficiency of Information for Software
Quality Assessment According to ISO 25010. Journal of Information and Organizational Sciences
42 1 (2018) 63-85. doi: 10.31341/jios.42.1.4.</p>
      <p>[10] P. G. F. Matsubara, I. Steinmacher, B. Gadelha, T. Conte. Much more than a prediction:
Expert-based software effort estimation as a behavioral act. Empirical Software Engineering 28 4 (2023)
98. doi: 10.1007/s10664-023-10332-9.</p>
      <p>[11] B. Şengüneş, N. Öztürk. An Artificial Neural Network Model for Project Effort Estimation.
Systems 11 2 (2023) 91. doi: 10.3390/systems11020091.</p>
      <p>[12] M. Padmaja, D. Haritha. Software Effort Estimation Using Grey Relational Analysis. International
Journal of Information Technology and Computer Science 9 5 (2017) 52-60. doi:
10.5815/ijitcs.2017.05.07.</p>
      <p>[13] S. Basri, N. Kama, H. Md Sarkan, S. Adli, F. Haneem. An Algorithmic-Based Change Effort
Estimation Model for Software Development, in: Proceedings of 2016 23rd Asia-Pacific Software
Engineering Conference APSEC-2016, Hamilton, 2016, pp. 177-184. doi: 10.1109/apsec.2016.034.</p>
      <p>[14] P. Singal, A. C. Kumari, P. Sharma. Estimation of Software Development Effort: A Differential
Evolution Approach. Procedia Computer Science 167 (2020) 2643-2652. doi:
10.1016/j.procs.2020.03.343.</p>
      <p>[15] R. Silhavy, Z. Prokopova, P. Silhavy. Algorithmic optimization method for effort estimation.
Programming and Computer Software 42 3 (2016) 161-166. doi: 10.1134/s0361768816030087.</p>
      <p>[16] Y. Mahmood, N. Kama, A. Azmi, A. S. Khan, M. Ali. Software effort estimation accuracy
prediction of machine learning techniques: A systematic performance evaluation. Software:
Practice and Experience 52 (2022) 39-65. doi: 10.1002/spe.3009.</p>
      <p>[17] K. H. Kumar, K. Srinivas. An accurate analogy based software effort estimation using hybrid
optimization and machine learning techniques. Multimedia Tools and Applications 82 (2023)
30463-30490. doi: 10.1007/s11042-023-14522-x.</p>
      <p>[18] R. Lalitha, P. Sreelekha. A methodology to analyse and estimate software development process
using machine learning techniques. International Journal of Software Engineering and Knowledge
Engineering 33 6 (2023) 815-835. doi: 10.1142/s021819402350016x.</p>
      <p>[19] H. Sone, Y. Tamura, S. Yamada. Study of Effort Calculation and Estimation in Open Source
Projects. International Journal of Reliability, Quality and Safety Engineering 30 3 (2023) 2350011.
doi: 10.1142/s0218539323500110.</p>
      <p>[20] R. K. Gora, R. R. Sinha. A Study of Evaluation Measures for Software Effort Estimation Using
Machine Learning. International Journal of Intelligent Systems and Applications in Engineering
11 6s (2023) 267-275.</p>
      <p>[21] V. Yadav, R. Singh, V. Yadav. Estimation Model for Enhanced Predictive Object Point Metric in
OO Software Size Estimation Using Deep Learning. The International Arab Journal of Information
Technology 20 3 (2023) 293-302. doi: 10.34028/iajit/20/3/1.</p>
      <p>[22] M. Jørgensen. Improved measurement of software development effort estimation bias. Information
and Software Technology 157 (2023) 107157. doi: 10.1016/j.infsof.2023.107157.</p>
      <p>[23] E. Rodríguez Sánchez, E. F. Vázquez Santacruz, H. Cervantes Maceda. Effort and Cost Estimation
Using Decision Tree Techniques and Story Points in Agile Software Development. Mathematics
11 6 (2023) 1477. doi: 10.3390/math11061477.</p>
      <p>[24] I. Abnane, A. Idri, I. Chlioui, A. Abran. Evaluating ensemble imputation in software effort
estimation. Empirical Software Engineering 28 2 (2023) 56. doi: 10.1007/s10664-022-10260-0.</p>
      <p>[25] S. Singh, K. Kumar. Software Cost Estimation: A Literature Review and Current Trends, in:
Proceedings of 2023 Third International Conference on Secure Cyber Computing and
Communication ICSCCC-2023, Jalandhar, 2023, pp. 469-474. doi:
10.1109/icsccc58608.2023.10176495.</p>
      <p>[26] X. Yuan, J. Su, C. Yu, S. Ye. Power Grid Software Cost Estimation Based on Improved COCOMO
Model, in: Proceedings of 2023 IEEE 3rd International Conference on Electronic Technology,
Communication and Information ICETCI-2023, Changchun, 2023, pp. 1265-1269. doi:
10.1109/icetci57876.2023.10176686.</p>
    </sec>
  </body>
  <back>
    <ref-list>
    </ref-list>
  </back>
</article>