=Paper= {{Paper |id=Vol-2076/paper-13 |storemode=property |title=Optimization of Processing the Large Data Stream in Web-interface |pdfUrl=https://ceur-ws.org/Vol-2076/paper-13.pdf |volume=Vol-2076 |authors=Nataliya V. Papulovskaya,Artem A. Rapoport }} ==Optimization of Processing the Large Data Stream in Web-interface== https://ceur-ws.org/Vol-2076/paper-13.pdf
        Optimization of Processing the Large Data
                Stream in Web-interface

                   Nataliya V. Papulovskaya¹, Artem A. Rapoport²
    ¹ Ural Federal University named after the first President of Russia B.N.Yeltsin,
                        Yekaterinburg, Russia; pani28@yandex.ru
                        ² Uberall GmbH, Berlin, Germany




          Abstract. The paper describes the problems that arise when a
          Web-interface frequently receives large amounts of data from a
          Web-server, which leads to insufficient performance of the
          Web-interface. A comparative analysis of methods for updating the
          data in the Web-interface is made, and the optimal method for
          updating the data in a real-time Web-application is chosen. The
          paper also provides an example of optimizing data processing by
          means of data buffering and an implementation of this example in
          modern JavaScript.

          Keywords: data processing, data buffering, big data, Web-interface,
          Web-programming, WebSocket



1       Introduction

Modern Web-application development faces multiple performance issues. These
include a slow first page load because of large amounts of data, low
responsiveness or even freezing of the interface when a lot of data is
updated, and others.
    Web-applications can store more than 900 GB of data on the server side
(or in a database), for example, the data of students from all around the
world [1]. Therefore, one of the most significant problems in developing the
frontend part of Web-applications is the low speed of handling a large data
flow sent from the server to the Web-client (most often a Web-browser).
Updating the visual data in a Web-client can be implemented in different ways:
updating (and re-rendering) the data only after a specific user action;
polling (the Web-client sends requests to the server at a fixed time
interval); using a network messaging protocol other than HTTP (HyperText
Transfer Protocol) that allows the Web-server and the Web-browser to
communicate in real time (e.g. WebSocket); and others [2].
    This research shows the advantages of implementing data updates with
WebSocket compared to other implementations of real-time data
Web-applications, and the efficiency of additionally optimizing data handling
by means of buffering.

2     Comparison of data updating implementations

The visual data in a Web-client can be updated by ordinary HTTP requests fired
by user actions. For instance, this method is used by the Google search
engine: by pressing the search button, the user sends an HTTP request to the
server, and after a short period of time the Web-client receives the response
and updates the information on the page. Listing 2-1 contains an example of a
simple HTTP request written in JavaScript.

              Listing 2-1. HTTP-request example, written in JavaScript
import axios from 'axios'; // HTTP-requests library
import ClientStore from './ClientStore'; // store of the Web-client

/**
 * Function that requests the data from the server and receives an answer
 * @function getInfoFromServer
 * @return {void}
 */
function getInfoFromServer() {
    axios.get('/api/info')
        // Got a 'positive' response with the data from the server
        .then((response) => {
            // Rewrite the data in the Web-client store
            ClientStore.rewriteContent(response.data);
        })
        // Got a 'negative' response from the server (an error)
        .catch((error) => {
            // Inform the user about the error; axios attaches the server
            // response (if any) to the error object as error.response
            console.error('Error while receiving response = ',
                error.response ? error.response.data : error.message);
        });
}

    The visual data in a Web-client can also be updated by means of HTTP
polling. In this case, the same simple HTTP data request is used, but it is
repeated at a time interval so that the Web-client can show real-time
information (Listing 2-2 contains an example written in JavaScript). This
method is easy to implement and can be used when developing a simple
Web-interface of a network device that shows relatively small amounts of data,
for example, to monitor a single parameter table of a network device in real
time.

            Listing 2-2. HTTP-polling example, written in JavaScript
import axios from 'axios'; // HTTP-requests library
import ClientStore from './ClientStore'; // store of the Web-client

let refreshId = null; // id of the polling interval

/**
 * @function setInfoPolling
 * @return {void}
 */
function setInfoPolling() {
    refreshId = setInterval(() => {
        axios.get('/api/info')
            // Got a 'positive' response with the data from the server
            .then((response) => {
                // Rewrite the data in the Web-client store
                ClientStore.rewriteContent(response.data);
            })
            // Got a 'negative' response from the server
            .catch((error) => {
                // Inform the user about the error; axios attaches the
                // server response (if any) as error.response
                console.error('Error while receiving response = ',
                    error.response ? error.response.data : error.message);
            });
    }, 1000);
}

/**
 * @function stopInfoPolling
 * @return {void}
 */
function stopInfoPolling() {
    if (refreshId) {
        clearInterval(refreshId);
    }
}

Finally, the visual data can be updated over WebSocket. In this case, the
Web-client establishes a connection with the server and subscribes to the
necessary topics of data, and the server sends the data when and only when it
is needed, for example, when the data has been changed in the database
(Listing 2-3 contains an example WebSocket implementation written in
JavaScript). Any Web-application with real-time data is a good candidate for
WebSocket (as in the short example), because WebSocket is the most efficient
transport in terms of both performance and server load [3],[4]. Yandex.Mail
uses WebSocket to load messages (e-mails) from the server in real time
(Fig. 1).




                Fig. 1. WebSocket-connection on the Yandex.Mail page

   Listing 2-3. Example of the WebSocket session handling, written in JavaScript

const DATA_URL = 'data';
const GET_SNAPSHOT_RPC = 'getSnapshot';

/* Handler that abstracts the AutobahnJS (WAMP) methods
 * @class SessionHandler */
class SessionHandler {
    /* Private field that stores the session object
     * @private */
    session = null;

    /* @constructor
     * @param {?Session} [session = null]
     *        Client-server session. Null by default. */
    constructor(session = null) {
        this.setSession(session);
    }

    /* Method that stores the session in the
     *        SessionHandler's instance
     * @method setSession
     * @param {?Session} [session = null]
     *        Client-server session. Null by default.
     * @return {void} */
    setSession = (session = null) => {
        this.session = session;
    };

    /* Callback-function that handles subscription messages
     * @see http://autobahn.ws/js/reference.html#subscribe
     *        Autobahn documentation
     * @callback subscriptionCallbackFn
     * @param {Array} args       array with event payload
     * @param {Object} kwargs    object with event payload
     * @param {Object} details   object with event metadata
     * @return {void} */

    /* Method that subscribes to data update messages
     * @method subscribeToData
     * @param {subscriptionCallbackFn} callbackFunction
     *        function that should be executed when a message is received
     * @return {?Promise} null if no session was specified */
    subscribeToData = (callbackFunction) => {
        if (!this.session) {
            console.error('Tried to subscribe to device data but ' +
                'no session was specified in SessionHandler');
            return null;
        }
        return this.session.subscribe(DATA_URL, callbackFunction);
    };

    /* Method that requests a snapshot (for the cold start)
     * @method getSnapshot
     * @return {?Promise} null if no session was specified */
    getSnapshot = () => {
        if (!this.session) {
            console.error('Tried to get snapshot but no session ' +
                'was specified in SessionHandler');
            return null;
        }
        return this.session.call(GET_SNAPSHOT_RPC);
    };
}
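
The two calls wrapped by the handler can be combined at application startup. The following is a minimal sketch of one possible order (snapshot first, then subscription), assuming a session object whose `call` and `subscribe` methods return promises, as AutobahnJS sessions do. The `coldStart` function, the store methods `rewriteContent` and `applyUpdate`, and the fake session are hypothetical and serve only as an illustration; they are not part of the original implementation.

```javascript
const DATA_URL = 'data';
const GET_SNAPSHOT_RPC = 'getSnapshot';

// Hypothetical helper: loads the initial snapshot, then subscribes to updates.
async function coldStart(session, store) {
  const snapshot = await session.call(GET_SNAPSHOT_RPC);
  store.rewriteContent(snapshot);
  // The subscription callback receives (args, kwargs, details).
  return session.subscribe(DATA_URL, (args, kwargs, details) => {
    store.applyUpdate(args, kwargs);
  });
}

// Stand-in for a real WAMP session, for illustration only.
function createFakeSession(snapshot) {
  const handlers = new Map();
  return {
    call: (procedure) =>
      Promise.resolve(procedure === GET_SNAPSHOT_RPC ? snapshot : null),
    subscribe: (topic, handler) => {
      handlers.set(topic, handler);
      return Promise.resolve({ topic });
    },
    // Simulates a message pushed by the server to a subscribed topic.
    publishLocally: (topic, args) => {
      const handler = handlers.get(topic);
      if (handler) { handler(args, {}, {}); }
    },
  };
}

// Usage with a hypothetical in-memory store.
const demoStore = {
  content: null,
  updates: [],
  rewriteContent(data) { this.content = data; },
  applyUpdate(args) { this.updates.push(args); },
};
const demoSession = createFakeSession({ nodes: [] });
coldStart(demoSession, demoStore).then(() => {
  demoSession.publishLocally(DATA_URL, [{ id: 'node-1', val: 'up' }]);
});
```

With a real server, the fake session would be replaced by the session object that the WAMP client library provides once the connection is open.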

    Comparing the options described above by several criteria (Table 1) leads
to the conclusion that WebSocket is the most efficient option for handling big
data, because the flow of messages between the Web-client and the server is
the smallest and the most justified. Updating data only by user action is not
suitable for an application with real-time data, and polling becomes
unsuitable when the amount of data grows. Moreover, with polling the
Web-client always sends requests, even when the data in the database has not
changed. On the other hand, the existence of multiple non-standardized
WebSocket protocol "wrappers" is an apparent drawback for a developer, who has
to choose a protocol supported both on the Web-client and on the server side.
Despite the small number of out-of-the-box implementations of data transfer
with this protocol, WAMP (the Web Application Messaging Protocol) [6] might be
a suitable open-standard protocol that allows one to implement messaging
through WebSocket relatively simply [7].
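
The difference in traffic can be illustrated with simple arithmetic. The sketch below is not a measurement; it only counts how many requests a polling client sends in a given period versus how many messages a push-based server sends when the data actually changes. The function names and the sample change timestamps are hypothetical.

```javascript
// Number of HTTP requests a polling client sends during `durationMs`,
// regardless of whether the data changed.
function countPollingRequests(durationMs, intervalMs) {
  return Math.floor(durationMs / intervalMs);
}

// Traffic of a push-based connection: one request from the client to
// establish the connection, then one server message per actual data change.
function countPushTraffic(changeTimestamps, durationMs) {
  const messages = changeTimestamps
    .filter((t) => t >= 0 && t < durationMs)
    .length;
  return { connectionRequests: 1, messages };
}

const oneMinute = 60 * 1000;
const changes = [5000, 42000]; // hypothetical: the data changed twice

console.log(countPollingRequests(oneMinute, 1000)); // 60
console.log(countPushTraffic(changes, oneMinute));
// { connectionRequests: 1, messages: 2 }
```

The picture reverses when the data changes much more often than the polling interval, which is the case the buffering described in the next section addresses.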


3     Solution overview

To solve the issue of an excessive flow of messages from the server to the
Web-client, for example when monitoring a huge network, the handling of each
received message should involve a minimum amount of calculations. This
decreases the overall message handling time, which gives the single-threaded
Web-client more time to execute other tasks (for example, such heavy tasks as
DOM (Document Object Model) rendering). This can be done efficiently with data
buffering, which works like buffering in a CPU [8]. More specifically, the
Web-client does not handle the messages immediately after it receives them
from the server but stores them in a buffer: not in an array, but merged into
a single message object. Applying this object to the store data is then simple
and has to be done only once per time interval.
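
A minimal sketch of such a buffer follows. It assumes messages of the shape `{ id, val }`, where `val` is an object with the changed fields; the `createMergeBuffer` name and the field-by-field merge are assumptions made for illustration (Listing 3-4 of this paper simply overwrites the value stored for an id).

```javascript
// Assumed message shape: { id: <entity id>, val: <object with changed fields> }.
function createMergeBuffer() {
  let pending = {}; // one accumulated "message object", keyed by entity id

  return {
    // Merge an incoming message into the buffer; newer fields win.
    add(message) {
      pending[message.id] = Object.assign({}, pending[message.id], message.val);
    },
    // Return the accumulated object and start a new empty buffer.
    flush() {
      const merged = pending;
      pending = {};
      return merged;
    },
  };
}

const buffer = createMergeBuffer();
buffer.add({ id: 'node-1', val: { status: 'up' } });
buffer.add({ id: 'node-2', val: { status: 'down' } });
buffer.add({ id: 'node-1', val: { load: 0.7 } });

const merged = buffer.flush();
// merged: { 'node-1': { status: 'up', load: 0.7 }, 'node-2': { status: 'down' } }
```

Because messages about the same entity collapse into one entry, the size of the buffer is bounded by the number of entities rather than by the number of messages received during the interval.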
    Let us assume that the Web-application to be developed can have either a
huge or a small amount of data in the database, as, for example, a network
monitoring application with real-time data. If the number of network nodes is
low (say, from 1 to 50), buffering will not change much in terms of
performance, because modern computers have enough computing resources to
handle the telemetry of that many nodes in a short time (provided, of course,
that not every message leads to recalculation and re-rendering of the network
map graph).


   Table 1. Comparison of the different ways of data updating in the Web-client

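--------------------------------------------------------------------------------------------------------------
Comparison criteria           Updates only by user    Polling                       WebSocket
                              actions
--------------------------------------------------------------------------------------------------------------
Possible to create an         no                      yes                           yes
application with real-time
data updates
--------------------------------------------------------------------------------------------------------------
Large number of ready-made    yes                     yes                           no
implementations
(boilerplates) and,
therefore, simplicity of
development
--------------------------------------------------------------------------------------------------------------
Number of requests from the   Depends on the          Requests are sent at an       There is only one request:
Web-client to the server      frequency of the user   interval, e.g. every 2        to establish the
                              actions                 seconds                       connection to the server
--------------------------------------------------------------------------------------------------------------
Need to implement big data    The data flow can be    Polling is used in real-time  The server sends the data
flow optimization methods in  large only when there   data applications;            to the Web-client on its
the Web-client                is a high number of     therefore, the data should    own initiative; therefore,
                              direct user requests.   be updated quite often. On    if the server observes
                              It should be handled    the other hand, if the        frequent changes of the
                              on the server, so that  server contains too much      data in the database, all
                              other users do not      data, polling, even with      the changes will be sent
                              experience lags while   data clustering implemented,  to the Web-client as a
                              working with the        might become an               large flow of messages.
                              interface.              inappropriate choice.
--------------------------------------------------------------------------------------------------------------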



    It follows that the influence of buffering on the user experience of the
Web-interface should be minimized when there are not many network nodes or,
more specifically, not many messages from the server (when the messages arrive
less frequently than the buffering interval). On the other hand, the effect of
buffering on the frequency of data updates should grow when the messages start
to arrive more frequently, in order to avoid a pile-up of calculation
operations and, thus, freezing of the interface. To implement this, the data
should be updated when no new message was received within the buffering
interval (the shortest delay), and a forced update should be performed when
the total waiting time, including all timeout restarts, reaches a critical
value (the longest delay). For instance, assume that the minimum update
interval is 200 ms. If the Web-client receives only one message during these
200 ms, the data from this message is applied to the Web-client's store
immediately after this interval. Otherwise, if the Web-client gets more than
one message within these 200 ms, the timeout is restarted on each message and
waits for the next one. This goes on until the total waiting time becomes
critical, e.g. equal to 1 second. At that point, all the data merged from the
messages received during this second is applied to the Web-client's data
store. Thus, when the message flow is large, the update occurs only once per
second, and when the message flow is smaller, the update frequency ranges from
once per 200 ms to once per second.
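
The timing rule above can be checked with a small model that, given message arrival times, computes the moments at which the accumulated data would be applied. This is a simplified sketch of the rule as described in the text, not the implementation used in this paper; the `computeFlushTimes` name and its parameters are hypothetical.

```javascript
// Simplified model of the update timing: every message restarts a short
// timeout (minWaitMs), but the buffer is never held longer than maxWaitMs
// counted from the first unapplied message.
function computeFlushTimes(arrivals, minWaitMs = 200, maxWaitMs = 1000) {
  const flushes = [];
  let windowStart = null; // arrival time of the first unapplied message
  let deadline = null;    // moment at which the pending data will be applied

  for (const t of arrivals) { // arrivals must be sorted in ascending order
    if (deadline !== null && t >= deadline) {
      flushes.push(deadline); // the timeout fired before this message
      windowStart = null;
    }
    if (windowStart === null) {
      windowStart = t;
    }
    deadline = Math.min(t + minWaitMs, windowStart + maxWaitMs);
  }
  if (deadline !== null) {
    flushes.push(deadline); // apply whatever is still pending
  }
  return flushes;
}

// Sparse messages: each one is applied 200 ms after it arrives.
const sparse = computeFlushTimes([0, 500, 1500]); // [200, 700, 1700]

// Dense flow, one message every 100 ms for 2 seconds: the forced update
// limits the delay to 1 second.
const denseArrivals = [];
for (let t = 0; t <= 2000; t += 100) {
  denseArrivals.push(t);
}
const dense = computeFlushTimes(denseArrivals); // [1000, 2000, 2200]
```

In the sparse case the user sees each change after the minimum delay, while in the dense case 21 messages result in only three store updates.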
    Listing 3-4 contains an implementation of the algorithm above written in
JavaScript (ECMAScript 2015 standard [9]). There, the UpdateHandler class
accumulates the data and applies the accumulated data to the Web-client's data
store. In this implementation, the class handles messages about events that
occur in a massive network; therefore, buffering had to be introduced so that
the delay of operations is minimal. The external WebSocket message handlers
pass the messages with data updates of events that occurred in the network to
the "bufferedUpdate" method. The constant Constants.DEFAULT_UPDATE_TIMEOUT is
assumed to be equal to 200 ms.

Listing 3-4. Example of update handling with buffering, written in JavaScript


import * as Constants from '../Constants';
import * as eventsActions from 'actions/eventsActions';

class UpdateHandler {
    /* id of the 'internal' timeout with the minimum interval */
    interval = null;

    /* id of the 'external' timeout with the maximum interval
     * (when the force update occurs) */
    maxInterval = null;

    /* @type {Boolean}
     * Boolean flag of the force update */
    forceApply = false;

    /* Main data store instance */
    store = null;

    /* @type {String}
     * Updating mode (can be either set
     * to updating or accumulating) */
    mode = null;

    /* @type {Object}
     * Object with accumulated updates of type
     * {events: Map} */
    accumulatedUpdates;

    constructor(store) {
        /* initializing the value for the accumulated updates */
        this.accumulatedUpdates = { events: new Map() };
        // saving the instance of the main store in the field
        this.store = store;
        // stopping updates
        clearTimeout(this.interval);
        /* initiating 'external' timeout with the interval
           of the force update */
        this.startBatchedWaitingTime();
    }

    /* @private
     * @method applyAccumulatedUpdatesIfNeeded
     * Method that applies the data that was accumulated
     * during the interval of updating.
     * If there was no message from the server,
     * the data does not need to be applied */
    applyAccumulatedUpdatesIfNeeded = () => {
        if (this.accumulatedUpdates.events.size > 0) {
            /* stopping 'external' timeout with the
               interval of the force update */
            clearTimeout(this.maxInterval);
            // stopping 'internal' timeout
            clearTimeout(this.interval);
            /* initiating 'external' timeout with
               the interval of the force update */
            this.startBatchedWaitingTime();
            // applying accumulated data updates to the store
            this.applyAccumulatedUpdates();
        }
    };

    startBatchedWaitingTime = () => {
        // setting force update flag to false
        this.forceApply = false;
        // setting the force update flag to true after 1 second
        this.maxInterval = setTimeout(
            this.clearBatchedWaitingTime,
            Constants.DEFAULT_UPDATE_TIMEOUT * 5);
    };

    clearBatchedWaitingTime = () => {
        this.forceApply = true;
    };

    bufferedUpdate(updates) {
        // stopping 'internal' timeout
        clearTimeout(this.interval);
        // adding updates (accumulating)
        this.accumulateUpdates(updates);
        if (this.forceApply === true) {
            /* if the force update flag is set to true,
               updating the data */
            this.applyAccumulatedUpdatesIfNeeded();
            // exiting the current method
            return;
        }
        /* creating 'internal' timeout,
           after which the data update is called
           (a one-shot timer, so setTimeout is used) */
        this.interval = setTimeout(() => {
            this.applyAccumulatedUpdatesIfNeeded();
        }, Constants.DEFAULT_UPDATE_TIMEOUT);
    }

    accumulateUpdates(updates) {
        if (!updates) { return; }
        updates.forEach(update => {
            /* else, for each update add the data
               to the accumulated updates object */
            this.accumulatedUpdates.events =
                UpdateHandler.appendEventMessage(
                    this.accumulatedUpdates.events,
                    update);
        });
    }

    static appendEventMessage(initialValue, newMessage) {
        /* Here, depending on the message structure,
           new data about the network events is added */
        return initialValue.set(newMessage.id, newMessage.val);
    }

    applyAccumulatedUpdates() {
        const { events } = this.accumulatedUpdates;
        // applying data to the store
        this.store.dispatch(eventsActions.updateEvents(events));
        /* initializing the value for the accumulated updates */
        this.accumulatedUpdates.events = new Map();
    }
}

    This algorithm was successfully implemented and used for the Web-interfaces
of a network map and a geographical map that may contain up to 10 thousand
network devices and around 50 thousand wireless broadband links between them.
In a configuration with such a big network, the Web-server can send more than
one message per millisecond. Therefore, without buffering, the single-threaded
Web-client would not be able to handle and apply the data without a
significant visual delay.
    Applying adaptive methods of data handling, on the other hand, allows one
to increase performance while decreasing the amount of calculations for any
intensity of the data flow.


4     Conclusion
Modern Web-programming technologies allow developers to implement
Web-applications that avoid excessive calculations and optimization
operations, which is especially important when the amount of data is small,
and that also avoid significant calculation delays when the data flow is
large. Different development tasks require different techniques, methods, and
approaches to handle information and optimize the data flow. Nevertheless,
using WebSocket together with buffering in the Web-client is an efficient way
to organize and handle data updates in a Web-application and to optimize the
data flow.

References
 1. AWS Case Study: Kaplan.
    https://aws.amazon.com/solutions/case-studies/kaplan/
 2. Liping, G., Dongfang, G., Naixue, X., Changhoon, L.: CoWebDraw: a real-time
    collaborative graphical editing system supporting multi-clients based on HTML5.
    Multimedia Tools and Applications. Vol. 77, 4, 5067–5082 (2018)
 3. Chto takoe Long-Polling, WebSockets, SSE i Comet.
    https://myrusakov.ru/long-polling-websockets-sse-and-comet.html
 4. Postojannoe soedinenie mezhdu brauzerom i serverom.
    https://www.insight-it.ru/interactive/2012/postoyannoe-soedinenie-mezhdubrauzerom-i-serverom/
 5. Kotov, A., Krasil’nikov, N.: Klasterizacija dannyh.
    http://yury.name/internet/02ia-seminar-note.pdf
 6. WAMP - The Web Application Messaging Protocol. http://wampproto.org/
 7. GitHub - WAMP in JavaScript for Browsers and NodeJS.
    https://github.com/crossbario/autobahn-js
 8. Muller, H., Flynn, M. J.: Processor Architecture and Data Buffering. IEEE
    Transactions on Computers. Vol. 41, 10, 1211–1222 (1992)
 9. ECMAScript 2015 Language Specification – ECMA-262 6th Edition.
    http://www.ecma-international.org/ecma-262/6.0/