Case-Based System of User Interface Reverse Engineering

Pavel Myznikov
Novosibirsk State University
1 Pirogova str., Novosibirsk 630090, Russia
miznikov72@gmail.com

Abstract

The paper is devoted to the implementation of case-based reasoning in reverse engineering of graphical user interfaces; in particular, web interfaces are considered. We suggest automating HTML/CSS markup building by aggregating code samples from previous cases. The proposal is based on the hypothesis that analogy is employed to conceptualize HTML markup construction. The article presents the original theory of building an image structure, the modification of the case-based reasoning approach, and the results of practical experiments.

Introduction

Web technology is one of the most developed areas of modern computer science. It is used not only in website development but also as an important node of any IT infrastructure. Consequently, the whole information technology industry uses web development in one form or another.

Nevertheless, automation of web development itself remains incomplete. One of the key tasks, the creation of HTML/CSS markup, does not have an extensive solution yet. Several properties of this process prevent the use of classical methods of automation: non-formalized requirements for a layout, variability of industrial standards for code writing, and cross-browser and cross-platform compatibility.

The impact of automating HTML/CSS markup building is not limited to speeding up web application development. It also makes development more flexible, allowing developers to scale results, verify hypotheses, and support the testing of applications.

The results of this work can be extended to a more general domain: any technology where markup languages are used for building GUIs.

The methodological basis of the work is the Case-Based Reasoning (CBR) approach (Kolodner 1992). Briefly, it is a way of solving problems by adapting solutions of similar problems from the past to the current situation. This approach was chosen due to the hypothesis that formalizing images (which is, in fact, the creation of HTML markup) using transductive reasoning, or reasoning by analogy, produces a result most similar to what a human does. Such a scheme can also solve the automation problems described above.

State-of-the-art solutions generate code without human involvement. They work as black boxes, where the final result depends entirely on the collected dataset. The CBR approach fills this gap by acquiring experts' knowledge in a case storage. The ultimate role of CBR, therefore, is to help specialists formalize their knowledge in a relatively small dataset and then automatically compose complex interfaces as if the specialists had done it manually.

Copyright (c) 2020 held by the author(s). In A. Martin, K. Hinkelmann, H.-G. Fill, A. Gerber, D. Lenat, R. Stolle, F. van Harmelen (Eds.), Proceedings of the AAAI 2020 Spring Symposium on Combining Machine Learning and Knowledge Engineering in Practice (AAAI-MAKE 2020). Stanford University, Palo Alto, California, USA, March 23-25, 2020. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Related work

To fill a gap in knowledge for readers who are not familiar with the field, we overview two types of related work. The first is papers on interface reverse engineering: these are the benchmarks against which we compare our approach. The second is an outline of CBR. Since CBR is the core of the given approach, this information is crucial for understanding the paper.

Interfaces reverse engineering

We consider two types of interfaces that are typical objects of the reverse engineering task: mobile interfaces and web interfaces.

Mobile interfaces. Mobile development is a more common area for interface reengineering than the web. This can be explained by the fact that mobile interfaces are usually simpler and more unified, while web pages have a great variance of layouts. Here we point to a few mobile interface reverse engineering studies that we consider robust baselines for comparing the OCR and code generation parts.

The framework of (Nguyen and Csallner 2015), REMAUI, is conceptually similar to our approach, especially in the OCR part. REMAUI also faces the problem of false-positive candidates and contains a merging step while parsing an input. In both questions, its authors use a set of heuristics specific to the mobile domain. There are differences, however. Firstly, our OCR algorithm mainly focuses on image detection rather than text detection. Secondly, we suggest another model of hierarchy building. Still, the idea of filtering candidates in the OCR stage was added to our framework after familiarization with their paper.

Based on REMAUI, (Natarajan and Csallner 2018) developed a tool, P2A, for generating code for animated mobile applications. This is the next step of the problem: not only to recognize a layout but also to find a relation between two similar interface screenshots. Importantly, this tool implies interaction with a user while performing a task, as does our approach. The difference is that we suggest a long-term setup of the system at the beginning and no involvement afterwards, while P2A requires the user's involvement in each session of image processing.

The work of (Chen and Burrell 2001) solves a very specific subtask of mobile interface reverse engineering: automatic conversion of an Android application into an iOS application, and vice versa, based on interface screenshots. With such restricted input and output, the authors were able to implement a more powerful model. Namely, they reduced the task to component detection and component classification, using convolutional neural networks for the second part.

Web interfaces. Further, we focus on web reverse engineering, because solving this task can have a more considerable outcome, as web technologies are applied in an extremely wide range of domains. Research in this field is not numerous, however.

Asiroglu et al. (Asiroglu et al. 2019) presented a quite unusual reverse engineering case. Their tool accepts a hand-drawn sketch and generates an HTML document using a deep convolutional neural network. Such an application seems very helpful in industry. Unfortunately, it cannot be used to solve the given research problem: in our case, we should detect exact features of the interface, e.g. color, font, and size, while that tool perceives only a general idea of an interface.

The most recent successful industrial attempt at the problem is the result of the UIzard Technologies company (Beltramelli 2017) titled "pix2code". The authors note the similarity between generating code from an interface screenshot and generating natural-language text from an image. Consequently, they use a similar method: the basis of the solution is a cascade of long short-term memory recurrent neural networks. At the moment, this project is only at a proof-of-concept stage, so it is too early to draw valid conclusions about the viability of the idea.

As seen above, there are many points of view on the interface reverse engineering problem, each suitable for specific conditions and goals. The contribution of the given paper is building a reverse engineering framework as an expert system rather than an automation tool. Compared with alternatives, it may eventually generate less accurate results in terms of pixel-to-pixel similarity, but, importantly, specialists can contribute their knowledge to a case storage, thereby managing the style or methodology that they expect to see in the final result.

Case-Based Reasoning

As mentioned above, case-based reasoning is the core of the approach. This section outlines the basics of CBR.

The origin of CBR is the work of Roger Schank (Schank 1982). His idea is to represent knowledge as memory organization packets (MOPs) that hold the results of a person's experience. When a person solves a problem, he or she uses MOPs to reapply a previously successful solution scheme in a new similar context. This contradicts the rule-based approach, where a person uses predefined scripts for all cases. More information can be found in (Watson and Marir 1994).

Many works devoted to this approach were written in the 1990s (Kolodner 1992; Watson and Marir 1994; Aamodt and Plaza 1994; Ram and Santamaría 1997). In the 2000s, case-based reasoning was implemented in different domains: medicine (Gómez-Vallejo et al. 2016; Delir Haghighi et al. 2013), business (Chou 2009; Chang, Liu, and Lai 2008; Carrascosa et al. 2008), finance (Sartori, Mazzucchelli, and Gregorio 2016; Sun et al. 2014), government (Lary et al. 2016), education (Zeyen, Müller, and Bergmann 2017), information technologies (De Renzis et al. 2016; Navarro-Cáceres et al. 2018; He 2013), systems design (Shen et al. 2017), and geosciences (Lary et al. 2016).

In short, CBR is a method of solving a problem by adapting solutions of similar past problems to the current situation.

The core concept in CBR is a case. A case is a triple of elements:
• Problem is a state of the world when the case occurs.
• Solution is a suitable answer for the problem in the given context.
• Outcome is a state of the world after the problem is solved.

The process of finding the solution is called a CBR cycle and consists of 4 stages (Kolodner 1992):
1. Retrieve. The best matching case is extracted from the case base.
2. Reuse. The solution of the retrieved case is adapted to solve the current problem.
3. Revise. The adapted solution is tested: if it is not satisfactory, then either additional cases are retrieved, or the retrieved solution is adapted again.
4. Retain. The case with the given solution is added to the case base.

Figure 1: CBR cycle according to Aamodt and Plaza (Aamodt and Plaza 1994).
Task definition

When defining the task, we aim to reproduce the way a human performs HTML markup building. All criteria, constraints, and conditions given below are based on how the industry operates. For more information, see (Ali 2017).

The following life cycle of interaction with the system is suggested. In the very beginning, an expert (or a group of experts) populates a case storage with code templates to set up the system; alternatively, they can use a ready case storage populated by someone else. This step is required only once. Then the system takes an interface design as input and generates code automatically. At this point, we do not expect an absolutely accurate result. A specialist, in the end, makes final improvements to the code or uses it as a basis for more high-level tasks.

Consequently, the main requirements for the image processing part are that the algorithm must recognize an image structure, find as many features (color, type, margins, etc.) of objects as possible, and generate code implementing the recognized structure. It is not necessary to
• recognize all features
• recreate the picture pixel-to-pixel

The introduction and related work sections describe conditions that should be considered in solving the task. It is necessary to formulate a set of requirements for the system such that positive results can be obtained under these conditions.
• Condition 1. Non-formalized requirements. An image itself does not contain complete information about what the final HTML markup should be. Therefore, it must be possible to supply the system with additional knowledge about the requirements.
• Condition 2. Variability of industrial standards for writing code. There are a few methodologies of HTML markup (adaptive, BEM (block-element-modifier), Bootstrap, and so on). Moreover, as a rule, each development team has its own standards. Thus, the system must generate code in several styles and standards.
• Condition 3. Cross-browser and cross-platform compatibility. HTML code must be not only valid but also identically processed in all modern browsers and operating systems.

In addition, we set input constraints at the current stage. An image must not include
• gradient backgrounds
• animation elements
• popup elements
In the future, we intend to make the algorithm work beyond these conditions.

Architecture

The basis of the architecture is an attempt to recreate the process of writing code by a human. The hypothesis is that a human uses transductive reasoning when formalizing visual information: one extracts structures of different levels from an image and describes these structures by analogy with one's own or others' previous experience.

In particular, such a situation is observed in writing HTML code. Meeting an unknown layout pattern, a specialist finds in the literature how to implement it in HTML. Over time, referrals to references become less frequent, but the principle remains the same: facing another pattern, a human extracts a suitable example from memory and adapts it to the current situation.

The process is implemented in the following way (Fig. 2):
1. The elements are extracted from the image (text, pictures, blocks, etc.).
2. The extracted elements are combined into a tree-like structure according to a special algorithm.
3. A prefix tree traversal is performed; on each iteration, the case base is queried for a suitable HTML description of the architectural pattern in the tree node.
4. Artifacts received at the previous step are saved to files.

Figure 2: Scheme of generating HTML code from a bitmap. 1. Extracting elements. 2. Building a hierarchy of elements. 3a and 3b. Transforming a hierarchy into an HTML document using a case storage. 4. Assembling final files.

The next sections are devoted to the description of these steps.

Image structure building

The goal of this step is to obtain a so-called structure of a bitmap, i.e. a mathematical object describing the mutual arrangement of image parts. In (Myznikov 2018), we described an approach to solving this problem. We detect borders of objects (embedded images, texts, background) and then use a greedy algorithm to build a hierarchy of these objects (see Algorithms 1 and 2). The hierarchy is then processed with the CBR cycle (see the next section).

Algorithm 1 Algorithm of hierarchy construction
Require: nodes – set of elements ordered by the horizontal and vertical axes of the top-left coordinate
Ensure: |nodes| == 1 and nodes_1 is the root node of the tree containing all elements from the original set
1:  while |nodes| > 1 do
2:    for orient ← [HOR, VERT] do
3:      i ← 0
4:      while i < |nodes| do
5:        suitNode ← findNode(nodes_i, orient)
6:        if suitNode is not NULL then
7:          if nodes_i is composite then
8:            nodes_i.addChild(suitNode)
9:          else
10:           newNode ← new Node
11:           newNode.addChild(nodes_i)
12:           newNode.addChild(suitNode)
13:           newNode.orientation ← orient
14:           nodes_i ← newNode
15:          nodes ← nodes \ suitNode
16:        i ← i + 1

Algorithm 2 Search for a suitable node (findNode)
Require: node_i – a node for which a suitable node is being sought; node^x1, node^y1 – the top-left coordinate; node^x2, node^y2 – the bottom-right coordinate
Require: orientation – a direction of joining nodes
Ensure: node_j is the node demanded, or NULL if there is no such node
1:  if orientation is HOR then
2:    node_j ← the next node to the right of node_i
3:  else if orientation is VERT then
4:    node_j ← the next node below node_i
5:  x1 ← min(node_i^x1, node_j^x1)
6:  y1 ← min(node_i^y1, node_j^y1)
7:  x2 ← max(node_i^x2, node_j^x2)
8:  y2 ← max(node_i^y2, node_j^y2)
9:  R ← rectangle (x1, y1)–(x2, y2)
10: for all node_k ∈ nodes do
11:   if node_k ∩ R ≠ ∅ then
12:     node_j ← NULL
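As a rough illustration, the greedy merge of Algorithms 1 and 2 can be sketched in Python. This is a simplified reading, not the paper's implementation: boxes are plain (x1, y1, x2, y2) tuples, the "composite node" shortcut of Algorithm 1 (line 7) is omitted, and all helper names are ours.

```python
class Node:
    def __init__(self, box, children=None, orientation=None):
        self.box = box                  # (x1, y1, x2, y2)
        self.children = children or []
        self.orientation = orientation  # "HOR" or "VERT"

def bounding_box(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

def intersects(a, b):
    return not (a[2] <= b[0] or b[2] <= a[0] or a[3] <= b[1] or b[3] <= a[1])

def find_node(node, nodes, orientation):
    """Next node to the right (HOR) or below (VERT) whose joint
    bounding rectangle overlaps no third element (cf. Algorithm 2)."""
    if orientation == "HOR":
        candidates = [n for n in nodes if n is not node and n.box[0] >= node.box[2]]
        candidates.sort(key=lambda n: n.box[0])
    else:
        candidates = [n for n in nodes if n is not node and n.box[1] >= node.box[3]]
        candidates.sort(key=lambda n: n.box[1])
    if not candidates:
        return None
    cand = candidates[0]
    r = bounding_box(node.box, cand.box)
    for other in nodes:
        if other is not node and other is not cand and intersects(other.box, r):
            return None  # some third element falls inside the joint rectangle
    return cand

def build_hierarchy(nodes):
    """Greedy merging until a single root remains (cf. Algorithm 1)."""
    nodes = list(nodes)
    while len(nodes) > 1:
        merged = False
        for orientation in ("HOR", "VERT"):
            i = 0
            while i < len(nodes):
                mate = find_node(nodes[i], nodes, orientation)
                if mate is not None:
                    nodes[i] = Node(bounding_box(nodes[i].box, mate.box),
                                    [nodes[i], mate], orientation)
                    nodes.remove(mate)
                    merged = True
                i += 1
        if not merged:  # guard against non-termination on odd layouts
            break
    return nodes[0]
```

On three boxes — two side by side above a third — the sketch first joins the top pair horizontally and then joins the result with the bottom box vertically, yielding a single root.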
Development of the CBR-system

Remember that generation of code is performed with the prefix tree traversal, where the tree is an image structure. On each iteration, the CBR cycle runs. We need to define the problem, solution, and outcome in the context of the task; in other words, to describe a case format.
• Problem is a feature vector of a node and its children.
• Solution is a set of HTML and CSS code templates that implement markup of the structure described.
• Outcome is HTML/CSS code generated from the template applied to a specific context.

Thus, when each node of a tree is processed, a "problem" is formed: a description of the current element and the elements one level below it in the image structure. The storage holds cases whose problems are typical cases of element layout and whose solutions are typical ways of markup. The task of a CBR cycle is to get HTML/CSS code that builds the markup of the current structure using the stored solutions. Let us describe the cycle in detail. As a notation, denote the case being processed as the new case. Further terms will be introduced as they appear.

Retrieval phase. A new case has a problem but no solution and outcome. The task of this stage is to retrieve the cases most similar to the problem of the new case. In general, it is a task of n-class classification, where n is the number of cases in storage. At the proof-of-concept stage, we choose the method of k-nearest neighbors with Euclidean distance. For this purpose, we vectorize and normalize problems. This means that, first, all categorical features are transformed to a numerical type; second, absolute values are replaced with relative ones by the formula x_i = (x_i − x_min) / (x_max − x_min). Then we calculate the distance between the given vector and the vectors in the case base, d(x, y) = sqrt(Σ_{i=1}^{n} (x_i − y_i)^2), and select the case where the distance is minimal. In terms of CBR, this case is called retrieved.

Adaptation phase. The adaptation stage is performed by an algorithm that receives the problem of a new case and the solution of a retrieved case and returns an outcome. A case containing the problem of the new case, the solution of the retrieved case, and the generated outcome is a solved case, and its solution is called suggested.

In the current work, a derivative type of adaptation is suggested, meaning an old solution is regenerated under new conditions. Since a solution is a code template, the result is built using a templating language. The difficulty is that the results of different cases may conflict with each other during processing. This is especially true for conflicts in CSS code: situations when different results describe the same class differently. That is why conflict resolution is an important part of the adaptation stage.

Conflict resolution is based on distinguishing three cases: 1) absence of conflict; 2) a complete conflict, i.e. the classes describe incompatible styles; 3) one class is a special case of another. The following strategies are applied accordingly: 1) the code is saved as is; 2) different classes with their own styles are created, and appropriate replacements in the HTML document are made; 3) a new class that is a special case is added to the style set, and appropriate inserts in the HTML document are made.

Evaluation phase. In classic CBR, an evaluation step serves as validation of the suggested result. A case that has passed the test is called tested, and its solution is approved. In this work, the task of evaluating the quality of HTML/CSS code deserves a separate study. In the current state of the research, this step is skipped, and validation of the result is performed for the whole document rather than on each iteration of the CBR cycle.

Updating phase. The goal of the last step of the cycle is to save an approved solution in case storage so that it can be reused in the future. In the current work, this step differs from the classic approach. Instead of using a centralized case storage, the developed CBR system maintains a "global" and a few "local" case storages (see Table 1). The global case storage is used for all images. The local ones are used only while processing a specific document. We can say that a "local" storage is a context of a document: it contains the results of problems solved during its processing.

Table 1: Differences of local and global storages

Local storage                                | Global storage
Stored in RAM                                | Stored in external memory
Destroyed after a document is processed      | Exists regardless of document processing
Filled during document processing            | Filled with a special procedure before documents are processed
Cases have a higher priority when extracting | Cases have a lower priority when extracting

Specifically, a tested case is always saved only into a local storage. On the retrieval step, a search is performed in both the global and local repositories, where the local ones have priority. This approach has several advantages:
1. The risk of generating different code for identical blocks of an image is decreased. The system may process identical blocks differently because noise in an image can affect the retrieval of a suitable case. At the same time, the size limitation of a local storage prevents generating redundant solutions.
2. System performance is increased:
(a) a global storage is located in external memory, while a local one is in RAM; as a result, the number of requests to a hard disk is reduced;
(b) graphical user interfaces tend to contain many similar blocks; when processing the next block, a search in a relatively small local storage that already contains a suitable case is performed much faster than a repeated search in a global storage; also, the adaptation stage then requires fewer resources than the initial adaptation of an "unprepared" case.
3. A global storage is not "clogged" with specific cases, which, first, positively influences the size of the global storage and, second, reduces the level of overfitting of the system.

In other words, a global storage serves as a source of cases that hold solutions for code generation of typical layouts. Local storage is then used for accurate and rapid adaptation of these solutions in the context of a specific document.

Case storage population. The case storage is populated by experts with a share of automation. Namely, a set of cases with their problem parts is built automatically, and the corresponding solution parts are filled in by specialists. The steps for generating cases are the following:
1. A list of real websites is collected.
2. DOM trees of random pages are saved.
3. Trees are split into sub-trees by prefix traversal of the whole tree, selecting each node with its children.
4. The resulting set of sub-trees is clustered with the DBSCAN algorithm, where tree edit distance is used to define the similarity between objects. (See the tree edit distance overview in the section below.)
5. From each cluster, a random object is selected. The selected objects form the case storage, where each object is the problem part of a single case.

By managing the parameters eps and min_samples of DBSCAN, one can control the size of a case storage. The bigger the case storage, the more accurate the system, but the more work for the experts. One must find the right balance between these two criteria to set up the system adequately for the existing conditions. Then, when the set of cases with their problems is built, experts populate it with solutions, i.e. HTML/CSS code.
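The retrieval phase described above — min-max normalization followed by a nearest-neighbour search with Euclidean distance — can be sketched as follows. It assumes the categorical features have already been converted to numbers; the function names are ours.

```python
from math import sqrt

def normalize(vectors):
    """Min-max normalise each feature column to [0, 1]:
    x_i <- (x_i - x_min) / (x_max - x_min)."""
    cols = list(zip(*vectors))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    span = [h - l or 1.0 for h, l in zip(hi, lo)]  # avoid division by zero
    return [tuple((x - l) / s for x, l, s in zip(v, lo, span)) for v in vectors]

def retrieve(problem, case_problems):
    """Return the index of the stored problem nearest to the new one
    (1-nearest-neighbour with Euclidean distance over normalised features)."""
    all_vecs = normalize(case_problems + [problem])
    stored, query = all_vecs[:-1], all_vecs[-1]
    dists = [sqrt(sum((a - b) ** 2 for a, b in zip(v, query))) for v in stored]
    return dists.index(min(dists))
```

The returned index identifies the retrieved case whose solution template is then passed to the adaptation phase.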
Results and further work

Before the experimental evaluation, let us illustrate how the system works in practice. As an example, consider a page of the Novosibirsk State University site.

As an output of the first stage of image processing, elements were extracted and grouped into nodes (Fig. 3). Then, the algorithm described earlier in the paper created a tree-like structure (Fig. 4). We estimate that the result is of good quality because, despite some mistakes (a wrong font was selected, some pictures were missing, tiny errors in size existed), the system managed to solve the main task: to recreate the structure of elements and generate human-like code (Fig. 5).

Figure 3: The first step of processing: element detection. The processed image (left) and selected elements and nodes (right). The nodes are bounded with colored boxes.

Figure 4: The second step of processing: structure building. The tree represents hierarchical relations between elements.

Figure 5: The final result. A web-page image generated by the system.

Figure 6: Top-left parts of the input image (left) and the generated result (right). Some features like margins, fonts and others have been incorrectly reproduced, but in general the layout is generated correctly.

Evaluation method

The question of evaluation is open: what should we compare to estimate the result? One option is to compare images: the initial screenshot and a screenshot of the interface generated by the system. It is probably the most obvious way, but, as stated in the task definition, we do not aim to recreate an interface pixel-to-pixel: it is much more important to recognize the idea of the layout. Another option is to compare source codes. We can collect a base of real applications with their source codes, make screenshots, generate code, and find the difference between the texts. This way is too sensitive to implementation specifics: there are numerous ways to write code for the same interface, and there is no argument for considering the result incorrect if the generated code differs from the real one.

We believe the best way is to compare tree models of the source codes, because a tree model represents the way the code is organized. We collect the data by crawling web pages, so that along with the screenshots we get HTML code, from which we extract the DOM model. As a rule (except for some tricky cases), the DOM model adequately describes the element structure. The idea is to map the structures produced by our algorithm to DOM trees and roughly estimate their similarity. Such a procedure allows us to evaluate our approach, although we understand that it is limited to a specific subset of images (only web interfaces), and some technical tricks of web technologies can affect the correctness of DOM trees and thereby decrease the estimates. Nevertheless, it is the best way of evaluating the method so far, which can prove the viability of the solution and outline the next steps.

Selection of a measure

The crucial thing is the selection of a quality measure. There are a few approaches to comparing trees: some count operations needed to transform one tree into another; others compare the longest common paths from the root to a tree node; there are also variable-length doesn't care (VLDC) based methods. We select two measures: edit distance and bottom-up distance.

Tree edit distance. A review of edit distance methods can be found in the survey of Philip Bille (Bille 2005). This is how he defines the problem. Given a labelled ordered tree T (ordered means that the order among siblings matters), define tree operations as follows:
• Relabel. Change the label of a node.
• Delete. Delete a non-root node v in T with parent v', keeping the order of the children of v.
• Insert. Insert a node v as a child of v' in T, making v the parent of a consecutive subsequence of the children of v'.
Assume there is an edit cost c : V × V → R, where c(v, w) denotes a relabel, c(v, ⊥) a delete, and c(⊥, w) an insert operation. An edit script S between T1 and T2 is a sequence of edit operations s_1, s_2, ..., s_n, s_i ∈ V × V, and the cost of S is d(S) = Σ_{i=1}^{n} c(s_i). The optimal edit script between T1 and T2 is S_opt(T1, T2) : d(S_opt) = min_S d(S), and d(S_opt) is the tree edit distance (Bille 2005).

Based on the survey (Bille 2005), we consider the best solution of the tree edit distance problem for our case to be Klein's algorithm (Klein 1998), which has a worst-case time bound of O(|T1|^2 |T2| log |T2|) and a space bound of O(|T1||T2|).

In our case, we suggest the following edit cost function. First, we denote a fixed finite alphabet Σ containing values for labels:

Σ = {t, p, i}

where t stands for an element with text, p stands for an element with an embedded picture, and i stands for an internal node.

We denote the edit cost as follows:

c(v, w) = 1,   if v ≠ ⊥ ∧ w = ⊥
          0.8, if v = ⊥ ∧ w ≠ ⊥
          0.1, if v ≠ ⊥ ∧ w ≠ ⊥ ∧ v, w ∈ {p, t}
          0,   otherwise    (1)

Also, denote that the tree received by our algorithm is the first parameter and the tree from the data set is the second one. This edit cost function penalizes the method the most if the resulting tree misses a node, a little less if it contains an extra node, and very little if it mixes up text with a picture. Note that it does not penalize mixing up an internal node with a non-internal one, because in this case the algorithm either misses a node or adds an extra one, which already implies a big penalty.

Bottom-up distance. This method is presented in the work of Valiente (Valiente). The complexity of the algorithm is O(|T1||T2| log(|T1| + |T2|)). For two trees T1 and T2, the bottom-up distance equals 1 − f / max(|T1|, |T2|), where f is the size of the largest common forest of T1 and T2. We slightly change this formula and make it asymmetric:

b_asym(T1, T2) = 1 − f / |T2|    (2)

Edit and bottom-up distances as defined above can be roughly interpreted as "precision" and "recall", respectively, in terms of machine learning evaluation. Technically, they are not the same, but the analogy gives a good idea of how to read the experimental results. Indeed, for a tree that correctly represents a part of another tree but completely misses the rest, edit distance would be relatively low, while bottom-up distance would be relatively high. Conversely, if a tree contains all nodes of another tree, but the structure is different and some extra nodes exist, edit distance would be relatively high and bottom-up distance relatively low.

To estimate the approach from different points of view, we use both measures as well as an F-score that generalizes a common penalty. The problem with aggregating the scores is that they have different scales: [0, +∞) for the edit distance and [0, 1] for the bottom-up one. We solve this problem by transforming the edit distance measure. First, we normalize its value by the size of the sample tree:

d_norm(T1, T2) = d(S_opt(T1, T2)) / |T2|    (3)

This allows us to compare the edit distance for trees of different sizes, but the measure is still not bounded from above. Therefore, we apply the logarithm:

d_log(T1, T2) = −log(1 / (1 + d_norm(T1, T2)))    (4)

As d is always positive, the range of d_log is [0, 1). Moreover, the logarithmic form of the measure perfectly suits our idea of estimating the method: we assume that the more the trees differ, the less important the exact value of the difference.

Finally, we can define the F-measure. In order to move from cost scores to quality measures, we subtract the distances from one:

F = 2 · (1 − b)(1 − d_log) / ((1 − b) + (1 − d_log)) = 2 · (1 − b)(1 − d_log) / (2 − b − d_log)    (5)

Data set collection

We used the Alexa service (https://www.alexa.com/topsites) to get a list of the 1000 most popular websites. Then we made screenshots of the main page and two random pages of each website. We also crawled the HTML/CSS code of each page and transformed it into trees using a DOM parser in Python. Due to technical restrictions in some cases, we were only able to collect a dataset of 2640 examples.
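For illustration, the aggregation of Eqs. (3)–(5) fits in a few lines of Python. The natural logarithm and the function name are our assumptions, and the score is only meaningful while d_log stays below 1, in line with the range discussed above.

```python
from math import log

def f_score(edit_distance, tree_size, bottom_up):
    """Fold a raw tree edit distance and a bottom-up distance b in [0, 1]
    into one F-style quality score, following Eqs. (3)-(5)."""
    d_norm = edit_distance / tree_size  # Eq. (3): normalise by |T2|
    d_log = -log(1.0 / (1.0 + d_norm))  # Eq. (4): dampen large distances
    p = 1.0 - bottom_up                 # move from costs to qualities
    r = 1.0 - d_log
    return 2 * p * r / (p + r)          # Eq. (5): harmonic mean
```

Identical trees give f_score(0, n, 0.0) = 1, and the score decays as either distance grows.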
The complexity of the algo- more insights we scored each web page with a measure rithm is O(|T1 ||T2 | log(T1 + T2 )). For two trees T1 and T2 “Text-to-Image ratio”: the bottom-up distance equals 1−f / max(|T1 |, |T1 |), where T f is the size of the largest common forest in T1 and T2 . We tti = (6) slightly change this formula and make it asymmetric: T +I where T is the number of nodes with text content and I is basym (T1 , T2 ) = 1 − f /|T2 | (2) the number of nodes with embedded pictures. Edit and bottom-up distances as defined above can be The reason why we use this score is to estimate how a roughly interpreted as “precision” and “recall” respectively share of text and images affects the result and to define the in terms of machine learning evaluation. Technically, they next steps. It is important to understand which part of the are not the same, but it gives us a good idea of how to read method is the most problematic, and where efforts should be an experiment results. Indeed, for a tree that correctly rep- focused to increase the quality as much as possible. resents a part of another tree but completely does not con- tain another part, edit distance would be relatively low, while Experimental results bottom-up distance would be relatively high. Otherwise, if a In general, results are as follows (see Table 2): tree contains all nodes of another tree, but the structure is 1 different and some extra nodes exist, edit distance would be https://www.alexa.com/topsites Table 2: The experiment results Mean Variance Edit distance 0.28 0.028 Bottom-up distance 0.15 0.043 F score 0.78 0.033 Figure 8: F-score mean and variance depending on “Text-to- Image ratio” Next steps In addition to making improvements from the previous sec- tion, the next steps include: Figure 7: Mean and variance of edit and bottom-up distances depending on “Text-to-Image ratio” 1. Development of a procedure of automatic filling of case storage. 
The current paper considers the development of a sys- The mean of F-score is 0.78, and the variance is 0.033, tem where case storage is populated manually by experts. which appears to be a good result. Note, that the bottom- As explained above, this approach has strong advantages. up distance is much better than the edit distance. It means However, in general, we would like to collect cases auto- that the recognition part, on average, works better than the matically, because it would be less time-consuming and structure building part. less prone to human errors. Our suggestion is to analyze Let us analyze the dependence of scores on “Text-to- existing web-sites elements with one of the methods of Image ratio” (figures 7 and 8). clustering and use centers of clusters as reference cases. We see that the biggest issues are with the cases when an image has both text and embedded pictures, approximately 2. Development of a better similarity measure. in a proportion of seven to three. Comparing extreme cases, One of the crucial elements of a case-based reasoning so- when an image consists of mainly embedded pictures or lution is the selection of an appropriate similarity mea- mainly text, in the first case the approach works far better sure. The current version uses a simple KNN principle. than in the second one. Herewith, edit distance is decreasing Consequently, there is room for optimizing the measure when moving from mixed cases to more text-based, while construction, because the nearest neighbors algorithm is for the bottom-up distance this effect is not so strong. insensitive to categorical features. As categorical features Also note the heteroskedasticity of the data: the variance are the majority of elements properties processed, we are is bigger in “mixed” cases and smaller in extreme cases. planning to use classifier models based on decision trees. That is, the method is more unpredictable when a picture has diverse content. 3. 
Development of a revise stage in the CBR cycle.
When building the CBR cycle, the revise stage has been skipped, because the evaluation of the correctness of an HTML document is an unsolved problem. This question should be investigated to make the cycle complete.

Based on these outputs, we make the following conclusions:
• The method demonstrated satisfactory results.
• The results can be advanced by focusing effort on the following areas:
  – enhancement of the text detection part;
  – improvement of the processing of the cases when an image has diverse content.

In conclusion, we developed a system that generates markup language source code for a given interface screenshot. The distinctive feature of the approach is the use of experts' knowledge kept in a specific case storage. The experiments demonstrated the satisfactory quality of the current solution and provided grounds for further development.

References

Aamodt, A., and Plaza, E. 1994. CBR: Foundational issues, methodological variations and system approaches. AI Communications 7(1):39–59.
Ali, K. 2017. A study of software development life cycle process models. International Journal of Advanced Research in Computer Science 8(1).
Asiroglu, B.; Mete, B. R.; Yildiz, E.; Nalcakan, Y.; Sezen, A.; Dagtekin, M.; and Ensari, T. 2019. Automatic HTML Code Generation from Mock-Up Images Using Machine Learning Techniques. In 2019 Scientific Meeting on Electrical-Electronics & Biomedical Engineering and Computer Science (EBBT), 1–4. IEEE.
Beltramelli, T. 2017. pix2code: Generating Code from a Graphical User Interface Screenshot. 1–9.
Bille, P. 2005. A survey on tree edit distance and related problems. Theoretical Computer Science 337(1-3):217–239.
Carrascosa, C.; Bajo, J.; Julian, V.; Corchado, J.; and Botti, V. 2008. Hybrid multi-agent architecture as a real-time problem-solving model. Expert Systems with Applications 34(1):2–17.
Chang, P.-C.; Liu, C.-H.; and Lai, R. K. 2008. A fuzzy case-based reasoning model for sales forecasting in print circuit board industries. Expert Systems with Applications 34(3):2049–2058.
Chen, D., and Burrell, P. 2001. Case-Based Reasoning System and Artificial Neural Networks: A Review. Neural Computing & Applications 10:264–276.
Chou, J.-S. 2009. Web-based CBR system applied to early cost budgeting for pavement maintenance project. Expert Systems with Applications 36(2):2947–2960.
De Renzis, A.; Garriga, M.; Flores, A.; Cechich, A.; and Zunino, A. 2016. Case-based Reasoning for Web Service Discovery and Selection. Electronic Notes in Theoretical Computer Science 321:89–112.
Delir Haghighi, P.; Burstein, F.; Zaslavsky, A.; and Arbon, P. 2013. Development and evaluation of ontology for intelligent decision support in medical emergency management for mass gatherings. Decision Support Systems 54(2):1192–1204.
Gómez-Vallejo, H.; Uriel-Latorre, B.; Sande-Meijide, M.; Villamarín-Bello, B.; Pavón, R.; Fdez-Riverola, F.; and Glez-Peña, D. 2016. A case-based reasoning system for aiding detection and classification of nosocomial infections. Decision Support Systems 84:104–116.
He, W. 2013. Improving user experience with case-based reasoning systems using text mining and Web 2.0. Expert Systems with Applications 40(2):500–507.
Klein, P. N. 1998. Computing the edit-distance between unrooted ordered trees. In ESA.
Kolodner, J. L. 1992. An introduction to case-based reasoning. Artificial Intelligence Review 6(1):3–34.
Lary, D. J.; Alavi, A. H.; Gandomi, A. H.; and Walker, A. L. 2016. Machine learning in geosciences and remote sensing. Geoscience Frontiers 7(1):3–10.
Myznikov, P. 2018. Development of the Case-Based Approach of Web Interfaces Reverse Reengineering. Vestnik NSU. Series: Information Technologies 16(4):115–126.
Natarajan, S., and Csallner, C. 2018. P2A: A Tool for Converting Pixels to Animated Mobile Application User Interfaces. In 2018 IEEE/ACM 5th International Conference on Mobile Software Engineering and Systems (MOBILESoft), 224–235. Gothenburg, Sweden: IEEE.
Navarro-Cáceres, M.; Rodríguez, S.; Bajo, J.; and Corchado, J. M. 2018. Applying case-based reasoning in social computing to transform colors into music. Engineering Applications of Artificial Intelligence 72:1–9.
Nguyen, T. A., and Csallner, C. 2015. Reverse engineering mobile application user interfaces with REMAUI (T). In 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE), 248–259.
Ram, A., and Santamaría, J. 1997. Continuous case-based reasoning. Artificial Intelligence 90(1-2):25–77.
Sartori, F.; Mazzucchelli, A.; and Gregorio, A. D. 2016. Bankruptcy forecasting using case-based reasoning: The CRePERIE approach. Expert Systems with Applications 64:400–411.
Schank, R. 1982. Dynamic Memory: A Theory of Reminding and Learning in Computers and People. Cambridge: Cambridge University Press.
Shen, L.; Yan, H.; Fan, H.; Wu, Y.; and Zhang, Y. 2017. An integrated system of text mining technique and case-based reasoning (TM-CBR) for supporting green building design. Building and Environment 124:388–401.
Sun, J.; Li, H.; Huang, Q. H.; and He, K. Y. 2014. Predicting financial distress and corporate failure: A review from the state-of-the-art definitions, modeling, sampling, and featuring approaches. Knowledge-Based Systems 57:41–56.
Valiente, G. An efficient bottom-up distance between trees. In Proceedings of the Eighth Symposium on String Processing and Information Retrieval, 212–219. IEEE.
Watson, I., and Marir, F. 1994. Case-based reasoning: A review. The Knowledge Engineering Review 9(4):327–354.
Zeyen, C.; Müller, G.; and Bergmann, R. 2017. Conversational Process-Oriented Case-Based Reasoning. Springer, Cham. 403–419.
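For concreteness, the evaluation measures of Equations (1), (2), (4), (5) and (6) can be sketched in Python, the language already used above for DOM parsing. This is an illustrative reconstruction, not the authors' code: the function names are ours, and the largest-common-forest size f and the normalized edit distance dnorm are assumed to be precomputed by a separate tree edit distance implementation.

```python
import math

def edit_cost(v, w):
    """Edit cost c(v, w) from Eq. (1).

    v is a node label of the tree built by the algorithm, w a label of the
    reference tree from the data set; None plays the role of the gap symbol
    (the bottom element in the original notation). Labels: 't' text element,
    'p' embedded picture, 'i' internal node.
    """
    if v is not None and w is None:
        return 1.0   # the built tree misses a node: the largest penalty
    if v is None and w is not None:
        return 0.8   # the built tree contains an extra node
    if v is not None and w is not None and v in ('p', 't') and w in ('p', 't'):
        return 0.1   # both labels are leaf content: text/picture mix-up
    return 0.0

def basym(f, t2_size):
    """Asymmetric bottom-up distance, Eq. (2): 1 - f/|T2|,
    where f is the size of the largest common forest."""
    return 1.0 - f / t2_size

def dlog(dnorm):
    """Logarithmic form of the normalized edit distance, Eq. (4)."""
    return -math.log(1.0 / (1.0 + dnorm))

def f_measure(b, d):
    """Harmonic mean of the two quality scores, Eq. (5);
    b is the bottom-up distance, d the logarithmic edit distance."""
    return 2.0 * (1.0 - b) * (1.0 - d) / (2.0 - b - d)

def tti(n_text, n_image):
    """Text-to-Image ratio, Eq. (6)."""
    return n_text / (n_text + n_image)

# Sanity check against Table 2: with the mean bottom-up distance 0.15 and
# the mean edit distance 0.28, Eq. (5) reproduces the reported mean F score.
print(round(f_measure(0.15, 0.28), 2))  # 0.78
```

The final line shows that the three mean values in Table 2 are mutually consistent: plugging the two reported mean distances into Eq. (5) yields the reported mean F score of 0.78.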