Constructing Stereo Presentations of Textual Information on an Autostereoscopic Monitor

S.V. Andreev1, N.A. Bondareva2
esa@keldysh.ru | niki.99@mail.ru
1 Keldysh Institute of Applied Mathematics RAS, Moscow, Russia;
2 Russian Technological University MIREA, Moscow, Russia

This paper presents practical experience in constructing stereo images of texts and formulas on an autostereoscopic monitor, for use in stereo presentations that display the results of numerical simulation. Constructing stereo images of texts and formulas is a structural subtask of a general study devoted to methods and algorithms for building stereo presentations of the results of scientific research. The autostereoscopic monitor allows one to observe a stereo image without glasses, with a quality that is not inferior to that of a classic 3D projection stereo system. Two methods of obtaining stereo images supported by the monitor were tested: the multi-view presentation of the object and the construction of depth maps. The results for both methods are presented.

Keywords: stereo image, textual information, autostereoscopic monitor, multi-view presentation, depth maps.

Copyright © 2019 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

1. Introduction

Stereo animation of scientific results is beginning to play an important role in a number of industries and in fundamental and applied research.

This is primarily because a flat image of an object often does not provide sufficient information to reach the research goals, and limits the analysis to a schematic image or a reduced viewing angle.

A stereo image allows one to estimate the relative position of objects in space with greater accuracy and, if necessary, to fill in data missing from a structure that lies outside the viewer's field of view, for example, when modeling structural chemical compounds. In contrast to a two-dimensional flat image, a stereo image allows one to explore and simulate objects with a developed spatial structure more effectively, or to interpret photographs taken at different points in time during an experiment and combined into an animation.

Many articles are devoted to this topic. They consider methods for constructing images for various types of stereo devices, both passive and active. Passive devices present material according to a predetermined scenario, and the viewer has no opportunity to influence the course of the show.

Active (or interactive) installations let the viewer work with the demonstration material and influence the demonstration directly. General aspects of imaging for various types of stereo are presented in [1]. Specific problems arise when a computer system is used to generate and visualize a composite multi-screen stereo frame; methods for solving them are described in sufficient detail in [1-3].

The gradual spread of modern stereo devices has generated great interest in practical algorithms for the stereo presentation of static and animated images, which is reflected in papers [4-9]. These papers address the construction of stereo images in various fields of research, from a supernova explosion to the displacement of oil from porous media.

The proliferation of stereo animation and virtual reality has also revealed a rather important structural subtask. A report, a presentation or a virtual walk usually contains not only visual information but also symbolic information (letters, numbers, symbols, metric values, names, state standards), without which the viewer's perception and assimilation of the material is significantly complicated.

It is reliably known that human vision adapts to a stereo image in about 30 seconds. The same time is needed to adapt back to a mono image. This makes it impossible to switch the viewer's attention quickly from the stereo image to ordinary flat text. In other words, all information that accompanies the stereo image and is expressed by graphic symbols (letters, signs or numbers) must also be presented in three-dimensional stereo space, so that the human brain is not subjected to additional load.

However, while the construction of stereo representations is developed in sufficient detail for geometry and fields of physical quantities, very few works are devoted to specific practical approaches and algorithms for constructing stereo images of texts and formulas with sufficient expressiveness and the necessary stereo effect.

Here we can mention only the work [10], which studies the perceptibility of stereo fonts on the screens of stereoscopic mobile devices in Japan, and our previous works [11, 12] on the presentation of textual information for the classic type of stereo. By classic stereo we mean a 3D projection stereo system for displaying stereo presentations that includes a graphics station and a projection subsystem of two projectors with a specialized screen.

For the classic type, a linear stereo base was used. The parameters that provide the best effect were found. The main requirements for the fonts and a number of conditions that must be satisfied to achieve an optimal result were also identified [11, 12].

In the studies presented in [11, 12], test labels were constructed with variations of the font, the background, the degree of relief of the font, the rotation angle of the caption and the shift along the linear stereo base. Samples of stereo frames were checked on the stereo system in order to evaluate the stereo effect.

Because a full-fledged expert assessment could not be organized, the test results were selected on the basis of the initially chosen principles and of the visually perceived distance of the test inscription from the screen plane.

Empirically, a value of the shift along the linear stereo base was found at which viewers located 4 meters from the screen perceived the inscription as being approximately 1 meter away from the screen plane, which was considered sufficient. It was found that the best way to represent inscriptions is the relief font Arial Narrow on a light background, shaded at the bottom, without turning the inscription. The sample recognized as the most successful in all parameters during the course of the work is presented in Figure 1.

Fig. 1. An example of the construction of the inscription on the basis of the relief font Arial Narrow with the rotation of the inscription for the classic stereo.

This paper continues the cycle of research devoted to the creation of stereo representations, both static and animated, of the results of scientific research. The research is carried out at the Keldysh Institute of Applied Mathematics (KIAM RAS) on the basis of two available types of stereo devices. The first type of device is a 3D projection stereo system for displaying stereo presentations, educational applications, graphics and films. It is a model of a classic stereographic system using two projectors, a screen and linear polarization glasses.

The second type of device is a Dimenco DM654MAS autostereoscopic monitor. Autostereoscopic monitors display stereo images without the need to adjust the equipment to the position of the observer. The viewer can move from one segment to another and view the displayed object in 3D from different angles. The principle of operation of an autostereoscopic monitor is the use of parallax barriers or Fresnel lenses installed behind a protective glass screen. One of the most important advantages of autostereoscopic monitors is that viewing images on such devices does not require special glasses.

The research at KIAM RAS was performed to achieve the most effective result, namely volumetric stereo images, and to study the properties of the devices themselves in order to find the most effective algorithms for constructing stereo images for both types of devices. Different methods of construction were used: finding a linear stereo base, depth maps, and some others. These problems and their solutions are described in more detail in [1-3, 7, 11, 12].

At this stage of research, the results of constructing stereo images on an autostereoscopic monitor are presented. The results are obtained by two basic methods of construction: a multi-view presentation and a depth map.

2. Multi-view presentation

An autostereoscopic monitor can demonstrate an object of visualization using a composite frame that contains views of the object from different angles, which together form a certain viewing sector. Usually there are nine views. These nine views form eight stereo pairs ([1 | 2], [2 | 3], [3 | 4], ..., [8 | 9]), and the viewer sees only one of the stereo pairs, depending on the viewer's position in one or another angular sector of observation. Moving from sector to sector, the observer receives volumetric information about the object using all nine angles, that is, as if looking at the object from different sides.

At this stage of research, a combination of linear and angular displacement was used for the first time. This made it possible to achieve a volume effect comparable to the effect achieved on a classic stereo system.

Nine frames are combined into one image according to the principle of a 3 × 3 matrix (Fig. 2). In the first and last (ninth) frames the inscription is in its extreme positions. The construction of the multi-view representation was organized as follows. In the first frame the inscription is in the extreme right position; then it shifts to the left at equal intervals until it reaches the leftmost position. At the same time, in each frame the inscription rotates to the left by the same angle. The axis of rotation passes through the center of the entire inscription, so the leftmost letters are rotated more strongly, which creates the effect of immersing them in the screen.

Fig. 2. Constructing a stereo image of an inscription using a multi-view representation.

The task is to ensure that the viewer does not see the transition between the stereo pairs when the viewing angle changes, so that the strongest realistic effect is created. When the linear and angular shift parameters are too large, artifacts arise: the viewer has a feeling that the image blurs on transitions from one stereo pair to another, and at the slightest change in position the viewer notices glare and the boundaries of each stereo pair. Figure 3 shows an unsuccessful pattern using the example of a single letter, where each of these problems stands out particularly clearly: too much rotation of the letter itself and too large a linear and angular shift.

Fig. 3. Constructing a stereo image of a single letter using the multi-view representation.

Experimentally, it was found that if the frames are combined in reverse order, so that the first and ninth frames, the second and eighth, the third and seventh, and so on are swapped, then the opposite effect is achieved: the inscription "floats" deep into the screen (Fig. 4).

Fig. 4. Variants of gluing frames with a sequence of 1-9 (A) and reverse, 9-1 (B).

During the experiments, different parameters were varied: the font size, the angle of rotation for each frame in the multi-view presentation, and the distance between frames for the linear stereo base.

Because there is no theoretical basis in this area, the necessary parameters of the linear and angular shift for the multi-view representation had to be found by trial selection. Optimal parameters were found at which the transitions between stereo pairs became invisible to the viewer, and the inscription itself acquired volume and became convenient for perception. In particular, the maximal difference between the first and ninth frames should be no more than 2.4 mm with a step of 0.3 mm, and the angle of rotation decreases from 9 degrees to 1 degree with a step of 1 degree.

It was found that these parameters, in principle, remain the same for very different fonts. A similar effect was obtained for the inscription with a font size of 32 and for the most successful variant of the inscription, with a font size of 66 (Fig. 5 and Fig. 6).

Fig. 5. Constructing a stereo image of an inscription with a font size of 32.

Fig. 6. Constructing a stereo image of an inscription with a font size of 66.

3. Depth maps

When creating a multi-view video for autostereoscopic monitors, a large amount of data has to be stored: a video stream for each angle. Although modern methods of digital video compression can effectively exploit temporal and spatial redundancy, the amount of data for multi-angle video still increases many times. This is especially critical for autostereoscopic monitors.

One effective way to solve the problem of large amounts of data is to use the so-called 2D + Z format. Any conventional (2D) image can be supplemented with information about the distance of each pixel from the observer (the Z coordinate). Such a representation of the image is called the "2D + Z format", and the Z coordinate plane is called the "depth map". It can be represented as a monochrome image. In the depth map, the gray level indicates the distance of the image points from the observer: the point closest to the viewer is white, and the farthest is black.

The idea underlying the construction of a depth map from a stereo pair is fairly obvious. For each point in one image, the matching point is searched for in the other image. From a pair of corresponding points, the coordinates of their pre-image in three-dimensional space can be determined. Given the three-dimensional coordinates of the pre-image, the depth is calculated as the distance to the camera plane.

The matching point must be sought on the epipolar line [13]. Accordingly, to simplify the search, the images are aligned so that all epipolar lines are parallel to the sides of the image (usually horizontal). This image alignment process is called rectification.

After the images are rectified, the corresponding pairs of points are searched for. The simplest method is as follows: for each pixel of the left image with coordinates (x0, y0), a pixel is searched for in the right image. It is assumed that the pixel in the right image has coordinates (x0 - d, y0), where d is a quantity called disparity. The corresponding pixel is found by maximizing a response function, which can be, for example, the correlation of the pixel neighborhoods. The result is a disparity map [14].

This algorithm was used to build depth maps for individual letters and whole labels. To create depth maps, the ready-made software package 3D Stereoscopic Photography [14] was used, which its author created for processing real stereo photographs. This package allows one to create a depth map from the right and left parts of a stereo pair. The interactive part of the package allows one to mark on a stereo pair both the point of the visualized object closest to the observer and the point furthest from the observer. Both points are marked on the left and on the right image of the stereo pair, which provides parallax data for the subsequent batch processing that creates the depth map.

The algorithm for constructing depth maps is as follows. First, we create a 3D model of the text label; then a stereo pair is constructed from the frame with the 3D text using a linear or angular stereo base. Then, using the procedures of the software package [14], the points closest to and farthest from the observer are marked on the stereo pair. This allows us to create depth maps using the procedures of [14].

To create a depth map, the object must have pronounced depth and volume. If they are insufficient, it is impossible to create the necessary image using the algorithm outlined above. For example, a variant of the text that is suitable for a multi-view presentation (Fig. 6) turned out to be completely unsuitable for constructing a depth map because the letters had insufficient volume.

Figure 7 shows the variant in which a single letter was chosen as the experimental sample and enlarged so that the image had pronounced volume.

Fig. 7. Constructing a single letter depth map.

As a result, it turned out that an angular stereo base alone is enough to construct a depth map (Fig. 8). In contrast to the multi-view representation, the combination of linear and angular stereo bases did not give a positive result when constructing depth maps. It was assumed that a linear stereo base is not necessary and does not affect the efficiency of construction.

Fig. 8. Constructing of angular stereo base to create a depth map.

As mentioned above, in each of the frames the point farthest from the viewer and the point closest to the viewer are marked. Then the offset of the object between the frames is calculated (in this case for two letters), and a depth map is constructed (Fig. 9).

Fig. 9. Constructing a depth map for left frame of stereo pair.

Depending on the need, a depth map can be built for either of the frames, left or right; the 3D object is then displayed on the autostereoscopic monitor in the position for which the depth map was built (Fig. 9 and Fig. 10). Moreover, if the right view is used, the depth map is the inverse of the depth map for the left view: the point closest to the observer becomes black, and the farthest becomes white.

Fig. 10. Constructing a depth map for right frame of stereo pair.

As practice has shown, a depth map can also be constructed for the entire inscription, provided that it has sufficient size and volume (Fig. 11).

Fig. 11. Constructing a depth map for the whole inscription.

As a result of all experiments, a visible stereo effect was achieved and the conditions necessary for constructing depth maps were identified.

However, a problem was discovered in the process: artifacts that are clearly visible on the autostereoscopic monitor. In the images above they are also visible, but only on the depth maps.

The reason is that the algorithms for smoothing the constructed depth map used in the software package [14] work well mainly with real stereo photographs, where the visualized object does not contain sharp edges with a contrasting color change. In our case, the object of visualization is text, which most often has exactly such edges. Thus, further studies will involve selecting and applying more suitable smoothing algorithms.

4. Conclusion

The results of constructing textual information for an autostereoscopic monitor are presented using two methods for constructing a stereo image: a multi-view representation and depth maps. In practical experiments, optimal parameters for constructing stereo images of text with the most visible stereo effect were obtained empirically for both methods.

The purpose of this research is to develop an effective technology for constructing stereo images of text for stereo presentations. In the longer term, we suggest improving the quality of the created depth maps (smoothing and eliminating artifacts), as well as solving the problem of combining images and text information in one stereo frame.

5. References

[1] Generation of Stereo-Presentations in Photorealistic Rendering and Scientific Visualization / Andreev S.V. [et al.] // Keldysh Institute preprints, 2010. № 61. 16 p.
[2] Synthesis of photorealistic three-dimensional images in modern presentation systems / Andreev S.V. [et al.] // Software&Systems, 2007. № 3. p. 37-40. [in Russian].
[3] Andreev S., Filina A. Using stereo presentations for visualization of scientific calculations results // Scientific Visualization, 2012. v.4. № 1. p. 12-21.
[4] Hardware-software complex of 3D presentations based on a virtual studio and virtual environment / Vandanov V.G. [et al.] // Proceedings of the 1-st international conference «3D visualization of scientific, technical and social reality. Cluster technologies of modeling», Izhevsk, 2009. p. 73-77. [in Russian].
[5] Mezhenin A.V., Tozik V.G. 3D Visualization using the stereo image effect // Proceedings of the 2-nd international conference «3D visualization of scientific, technical and social reality. Cluster technologies of modeling», Izhevsk, 2010. [in Russian].
[6] Mikhaylyuk M.V., Huraskin I.A. Synthesis of stereo images for virtual reality systems using an optical tracking system // Software&Systems, 2006. № 3. p. 10-14. [in Russian].
[7] Andreev S.V. [et al.] Modelling and visualization of blade assembly with complicated shape for power turbine // Scientific Visualization, 2015. v.7. № 4. p. 1-12.
[8] Torgashev M.A., Timokhin P.Y. The technology of stereo video files' synthesis for the system of 3D real-time visualization // Software Products and Systems, 2012. № 3. p. 74-80. [in Russian].
[9] Mikhaylyuk M.V., Maltsev A.V., Timokhin P.Yu. The methods of 3D stereo visualization of data obtained in simulation of unstable oil displacement from porous media // Proceedings of Scientific Research Institute for System Analysis RAS, 2018. v.8. № 2. p. 125-129. [in Russian].
[10] Visibility Experiment and Evaluation of 3D Character Representation on Mobile Displays / Hiromu Ishio [et al.] // C. Stephanidis (Ed.): Posters, Part II, HCII 2011, CCIS 174, 2011. p. 46-51.
[11] Andreev S.V., Bondareva N.A. Constructing a representation of textual information in stereo presentations // Proceedings of the 28-th International Conference GraphiCon-2018, Tomsk, Russia, 23-28 September 2018. p. 86-89. [in Russian].
[12] Andreev S.V., Bondarev A.E., Galaktionov V.A., Bondareva N.A. The problems of stereo animations construction on modern stereo devices // Scientific Visualization, 2018. v.10. № 4. p. 40-52. DOI: 10.26583/sv.10.4.04
[13] Kalinichenko A.V., Sveshnikova N.V., Yurin D.V. Epipolar geometry and evaluation of its reliability from the results of the restoration of the three-dimensional scene by factorization algorithms // Proceedings of the 16-th International Conference GraphiCon-2006, 2006. [in Russian].
[14] 3D Stereoscopic Photography, http://3dstereophoto.blogspot.com/ (last access July 01 2019).
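As an illustration of the constructions described in Sections 2 and 3, the following Python sketch shows one possible way to compute the nine-frame shift and rotation schedule, to glue nine frames into a 3 × 3 composite, to search for disparity along one rectified row, and to map disparities to gray levels of a depth map. It is a minimal sketch under stated assumptions, not the procedure used in the experiments or in the package [14]: all function names are hypothetical, the row-major order of the composite and the pairing of the 9 degree angle with the first frame are assumptions, and a sum of absolute differences is used as the response function in place of the correlation mentioned in Section 3.

```python
# Illustrative sketch only; names and conventions are assumptions.

def multiview_schedule(n_frames=9, step_mm=0.3, max_angle_deg=9, angle_step_deg=1):
    """Return a list of (offset_mm, angle_deg), first frame first.

    Offsets are measured from the leftmost position, so frame 1 (extreme
    right) gets 2.4 mm and frame 9 gets 0.0 mm. Pairing the largest angle
    with frame 1 is an assumption; the text gives only the range and step.
    """
    schedule = []
    for i in range(n_frames):
        steps_left = n_frames - 1 - i
        offset_mm = round(steps_left * step_mm, 6)
        angle_deg = max_angle_deg - i * angle_step_deg
        schedule.append((offset_mm, angle_deg))
    return schedule


def tile_3x3(frames):
    """Glue nine equally sized frames (lists of pixel rows) into one image.

    Frames are placed row by row (assumed order): 1-3 on top, 7-9 at the bottom.
    Passing frames[::-1] gives the reversed 9-1 gluing.
    """
    if len(frames) != 9:
        raise ValueError("exactly nine frames are expected")
    composite = []
    for r in range(3):
        a, b, c = frames[3 * r: 3 * r + 3]
        for y in range(len(a)):
            composite.append(a[y] + b[y] + c[y])
    return composite


def disparity_row(left_row, right_row, max_d, half_win=1):
    """For each x0 in the left row, pick d so that right[x0 - d] matches best.

    The response is the negated sum of absolute differences over a small
    window (a stand-in for correlation); the largest response wins.
    """
    n = len(left_row)
    result = []
    for x0 in range(n):
        best_d, best_score = 0, float("-inf")
        for d in range(min(max_d, x0) + 1):
            score = 0
            for k in range(-half_win, half_win + 1):
                xl = min(max(x0 + k, 0), n - 1)      # clamp at the borders
                xr = min(max(x0 - d + k, 0), n - 1)
                score -= abs(left_row[xl] - right_row[xr])
            if score > best_score:
                best_d, best_score = d, score
        result.append(best_d)
    return result


def disparity_to_gray(disparities):
    """Map disparities to 0..255: largest disparity (nearest) is white."""
    d_min, d_max = min(disparities), max(disparities)
    if d_max == d_min:
        return [0] * len(disparities)
    return [round(255 * (d - d_min) / (d_max - d_min)) for d in disparities]


if __name__ == "__main__":
    for number, (offset, angle) in enumerate(multiview_schedule(), start=1):
        print(f"frame {number}: offset {offset} mm, rotation {angle} deg")
```

With the default arguments the schedule spans 2.4 mm in eight steps of 0.3 mm, matching the limits reported in Section 2. Inverting the gray levels returned by `disparity_to_gray` would give the black-for-nearest convention described for the right view.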