<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Object-wise Individual Appearance Manipulation with Layer Detection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Takahiro Nagata</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Toshiyuki Amano</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Wakayama University</institution>
          ,
          <addr-line>930, Sakaedani, Wakayama-shi, Wakayama</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>Appearance manipulation makes it possible to change the perceived color, texture, and shape of an object by illumination projection. However, it is still unclear how to apply a manipulation to each object independently rather than a single manipulation to the whole scene. This paper proposes a method that applies appearance manipulation independently to the foreground and background using the layers in a scene, which can be detected from the pixel correspondence between two projector-camera systems. Furthermore, our method uses this layer detection to remove the cast-shadow-like illumination unevenness created by the foreground.</p>
      </abstract>
      <kwd-group>
        <kwd>Spatial augmented reality</kwd>
        <kwd>Projector-camera system</kwd>
        <kwd>Human-centered computing</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>1. Introduction</title>
      <p>The Shader Lamps, which enabled color manipulation of buildings by texture mapping on white walls [1], demonstrated the potential of spatial augmented reality (SAR) through light projection. Since then, various techniques have been proposed for SAR applications [2].</p>
      <p>Unlike conventional projection mapping, Amano et al. proposed an alternative projection technique that manipulates the apparent color of an object by illumination projection with projector-camera feedback [3]. It has the further potential to alter the appearance of the real world and our visual perception of it. Many applications of appearance manipulation have since been proposed [4, 5, 6].</p>
      <p>However, these apply a uniform appearance manipulation to the whole area of the scene. Object-wise individual manipulation pushes the boundary of appearance control and potentially enables applications never attempted before. This study aims to extend the concept of appearance manipulation to enable object-wise individual appearance manipulation. Specifically, this paper focuses on detecting the region of each object in a scene that consists of multiple objects, and on exploring techniques to manipulate the appearance of each object individually. For instance, this technique could be applied to illumination in theaters, amusement parks, photography, etc.</p>
      <p>Semantic segmentation [7] is a key technology in computer vision, enabling the precise detection of each object with a label. However, it is not guaranteed to work correctly under illumination projection, which changes the apparent color or texture. Meanwhile, an appearance manipulation system holds a pixel correspondence between the camera and projector, which can be used to validate assumptions about which layer an object is placed in.</p>
      <p>When a foreground object exists in a scene, cast shadows occur on the background object. Solving this problem is a longstanding research challenge in SAR. Sukthankar et al. [8] proposed a method that removes shadows caused by occluding objects by using two projectors in overlapping projection. Audet et al. [9] achieved this with tracking for dynamic scenes, and Flagg et al. [10] proposed another adaptive technique with an IR camera. However, these methods aim to display a given video source on the screen and do not involve appearance manipulation.</p>
      <p>This paper proposes a method that discriminates between foreground and background objects using two independently working projector-camera systems and achieves separate appearance manipulations for each object. Since an appearance manipulation system comprises projectors and cameras, it holds pixel correspondences among the cameras and projectors. This paper attempts to identify the shadow areas caused by the foreground. Furthermore, we address adjusting the light intensity in the superimposed regions to eliminate the brightness difference without being affected by changes in appearance.</p>
      <sec id="sec-1-1">
        <title>2. Related Work</title>
        <p>Amano et al. achieved appearance manipulation of objects by using projector-camera feedback. In this method, reflectance estimation is introduced to generate the control reference for Model Predictive Control (MPC) [3]. Figure 1 illustrates the block diagram of appearance manipulation. The main processing first creates an estimated appearance image under white projection from the captured image and the previous step's projected image P. Next, the desired image processing is applied to this estimate to obtain the control reference, and the next projection image is computed by the MPC controller in consideration of robustness. Finally, appearance manipulation is realized by illumination projection from the projector after the geometrical deformation based on pixel correspondence.</p>
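        <p>The feedback loop described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the implementation of [3]: it assumes an identity color mixing between projector and camera, images normalized to [0, 1], and a simulated scene, and it replaces the MPC controller with a plain proportional update. The names <monospace>estimate_reflectance</monospace>, <monospace>feedback_step</monospace>, <monospace>simulate</monospace>, and <monospace>gain</monospace> are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

EPS = 1e-6  # guard against division by zero (illustrative choice)


def estimate_reflectance(captured, projected, ambient):
    """Per-pixel reflectance estimate, assuming identity color mixing."""
    return captured / np.maximum(projected + ambient, EPS)


def feedback_step(captured, projected, ambient, process, gain=0.5):
    """One iteration: estimate, apply image processing, update projection.

    A proportional update stands in for the MPC controller (assumption).
    """
    k_hat = estimate_reflectance(captured, projected, ambient)
    # Estimated appearance under white (full-intensity) projection.
    c_est = k_hat * (np.ones_like(projected) + ambient)
    # Desired appearance: the chosen image processing applied to the estimate.
    reference = np.clip(process(c_est), 0.0, 1.0)
    # Move the projection so that the captured image approaches the reference.
    step = gain * (reference - captured) / np.maximum(k_hat, EPS)
    return np.clip(projected + step, 0.0, 1.0), reference


def simulate(k_true, ambient, process, iterations=20):
    """Run the loop against a simulated scene C = K (P + C0)."""
    projected = np.ones_like(k_true)
    for _ in range(iterations):
        captured = k_true * (projected + ambient)
        projected, reference = feedback_step(captured, projected, ambient, process)
    return k_true * (projected + ambient), reference
```
]]></preformat>
        <p>For example, calling <monospace>simulate</monospace> with a processing function that darkens the estimate drives the simulated captured image toward the darkened reference within a few iterations.</p>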
        <p>This study aims to achieve independent appearance
manipulation for both the foreground and background objects
using layers in the scene that can be detected from the
pixel correspondence between the two projector-camera
systems. Additionally, we aim to remove brightness
variations caused by cast shadows created by foreground objects
in the projection.</p>
      </sec>
    </sec>
    <sec id="sec-2">
      <title>3. Proposed method</title>
      <p>We assume that a foreground object is positioned in front of the background within a setup of two pairs of projectors (Prj1, Prj2) and cameras (Cam1, Cam2), as shown in Figure 2. In this situation, there are three projection states. In Region I, the foreground obstructs one projection without affecting the other, so the conventional method can be applied. In Region II, overlapping projection causes over-illumination, which requires a novel compensation technique. In Region III, no projection can reach the surface, so cast shadows cannot be removed. In this paper, we focus on region discrimination and on illumination suppression in Region II to address cast-shadow-like projection unevenness.</p>
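      <p>The three projection states can be expressed as a small labeling rule. The following is a minimal sketch, assuming that a boolean coverage mask is already available for each projector (how these masks are obtained is not shown here); the function name <monospace>classify_regions</monospace> and the integer labels are hypothetical.</p>
      <preformat preformat-type="code"><![CDATA[
```python
import numpy as np

# Region labels follow the text: I = exactly one projector reaches the surface,
# II = both projections overlap, III = neither projector reaches it.
REGION_I, REGION_II, REGION_III = 1, 2, 3


def classify_regions(reach_prj1, reach_prj2):
    """Label each pixel from two boolean coverage masks."""
    count = reach_prj1.astype(int) + reach_prj2.astype(int)
    # Index by the number of projectors that reach the pixel (0, 1, or 2).
    lookup = np.array([REGION_III, REGION_I, REGION_II])
    return lookup[count]
```
]]></preformat>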
      <sec id="sec-2-1">
        <title>3.1. Layer-based Discrimination</title>
        <sec id="sec-2-1-1">
          <p>To identify the aforementioned regions (I, II, and III) as well as the foreground and background in the Cam1 and Cam2 images, this study uses pixel mappings acquired in advance for both the foreground and background planes. Each pixel mapping is stored as a look-up table that describes the pixel correspondences between camera and projector.</p>
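          <p>When the captured pixel coordinates are <inline-formula><tex-math>(x, y)</tex-math></inline-formula> and the projected pixel coordinates are <inline-formula><tex-math>(x_p, y_p)</tex-math></inline-formula>, the pixel mapping from the camera to the projector is denoted as
<disp-formula><label>(1)</label><tex-math>(x_p, y_p) = \mathrm{C2P}(x, y),</tex-math></disp-formula>
and those obtained in the foreground and background planes are denoted as <inline-formula><tex-math>\mathrm{C2P}^{fg}</tex-math></inline-formula> and <inline-formula><tex-math>\mathrm{C2P}^{bg}</tex-math></inline-formula>, respectively, as shown in Fig. 2.</p>
          <p>In this study, foreground and background are discriminated by comparing the geometric calibration results of the actual scene with the intermediate value
<disp-formula><label>(2)</label><tex-math>(\bar{x}_p, \bar{y}_p) = \{\mathrm{C2P}^{fg}(x, y) + \mathrm{C2P}^{bg}(x, y)\}/2.</tex-math></disp-formula></p>
          <p>[Figure 2: Arrangement of the two projector-camera units (Prj1/Cam1 and Prj2/Cam2) with the foreground and background planes.]</p>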
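          <p>The discrimination against the intermediate value can be illustrated as follows. This is a minimal sketch under assumptions: the look-up tables are taken to be arrays of shape (H, W, 2) holding projector coordinates for each camera pixel, and the decision rule (the side of the midpoint on which the measured mapping falls, along the foreground-background direction) is one plausible reading of the comparison, not a rule stated in the text. The names <monospace>midpoint_map</monospace> and <monospace>is_foreground</monospace> are hypothetical.</p>
          <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def midpoint_map(c2p_fg, c2p_bg):
    """Intermediate value between the foreground and background mappings."""
    return (c2p_fg + c2p_bg) / 2.0


def is_foreground(c2p_scene, c2p_fg, c2p_bg):
    """Boolean mask: True where the measured mapping lies on the foreground side.

    All inputs have shape (H, W, 2): projector coordinates per camera pixel.
    """
    mid = midpoint_map(c2p_fg, c2p_bg)
    axis = c2p_fg - c2p_bg  # direction from background to foreground mapping
    # Signed position of the measurement relative to the midpoint.
    side = np.sum((c2p_scene - mid) * axis, axis=-1)
    return side > 0
```
]]></preformat>
          <p>With this rule a pixel is assigned to whichever plane's mapping is nearer to the measurement, so small calibration noise does not change the label as long as the two mappings are well separated.</p>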
        </sec>
      </sec>
      <sec id="sec-2-2">
        <title>3.2. Illumination Suppression</title>
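        <p>In this section, we briefly present the illumination suppression method proposed by Uesaka et al. [11], which is specifically designed to address the illumination overlapping region denoted as Region II in Section 3.1. Given a captured image <inline-formula><tex-math>C_1</tex-math></inline-formula> from Cam1, ambient light <inline-formula><tex-math>C_0</tex-math></inline-formula>, and projected light <inline-formula><tex-math>P_1</tex-math></inline-formula>, the reflectance <inline-formula><tex-math>K</tex-math></inline-formula> of the object surface can be estimated as
<disp-formula><label>(3)</label><tex-math>\hat{K} = \operatorname{diag}[C_1 ./ (M P_1 + C_0)],</tex-math></disp-formula>
where <inline-formula><tex-math>M \in \mathbb{R}^{3 \times 3}</tex-math></inline-formula> denotes the color mixing matrix between projector and camera, and <inline-formula><tex-math>./</tex-math></inline-formula> denotes component-wise division. However, in the overlapping regions, the reflectance <inline-formula><tex-math>K</tex-math></inline-formula> is overestimated because of the light projected from the two units, resulting in over-projection. Therefore, considering the projection <inline-formula><tex-math>P_2</tex-math></inline-formula> from Prj2, <inline-formula><tex-math>C_1</tex-math></inline-formula> can be estimated as
<disp-formula><label>(4)</label><tex-math>C_1 = K\{\cos(\theta_1) M P_1 + \cos(\theta_2) M P_2 + C_0\}</tex-math></disp-formula></p>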
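        <p>The overestimation in the overlapping region can be checked numerically. The following is a minimal sketch, not the suppression method of [11]: it assumes that the overlap model of Eq. (4) has the form C1 = K(cos θ1 M P1 + cos θ2 M P2 + C0), that images are arrays of shape (H, W, 3), and that the reflectance acts per channel. The function names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def estimate_reflectance(c1, p1, c0, m):
    """Single-unit estimate in the spirit of Eq. (3): C1 ./ (M P1 + C0)."""
    return c1 / (p1 @ m.T + c0)


def capture_overlap(k, p1, p2, c0, m, theta1=0.0, theta2=0.0):
    """Simulated Cam1 image when both projectors light the same surface."""
    light = np.cos(theta1) * (p1 @ m.T) + np.cos(theta2) * (p2 @ m.T) + c0
    return k * light


# Example values (illustrative): reflectance 0.4, full white from both
# projectors, ambient 0.1, identity color mixing, normal incidence.
m = np.eye(3)
k_true = np.full((1, 1, 3), 0.4)
white = np.ones((1, 1, 3))
ambient = np.full((1, 1, 3), 0.1)

c1 = capture_overlap(k_true, white, white, ambient, m)
k_single = estimate_reflectance(c1, white, ambient, m)  # ignores P2
k_both = c1 / (white @ m.T + white @ m.T + ambient)     # accounts for P2
```
]]></preformat>
        <p>In this example <monospace>k_single</monospace> is larger than the true reflectance because the light from Prj2 is attributed to the surface, whereas <monospace>k_both</monospace> recovers the true value.</p>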
      </sec>
      <sec id="sec-2-3">
        <title>4.1. Experimental setup</title>
      </sec>
      <sec id="sec-2-4">
        <title>4.2. Manipulation results</title>
        <p>Based on the results of the previous section, we controlled the projection in Region II and performed separate appearance manipulation for the foreground and background. The results are shown in Figure 6. From the manipulation results for the color chart (matte photo paper) shown in the upper row, we confirmed that independent image processing, such as bright saturation, monolize, and color phase control, was applied to the foreground and background. As shown in the middle row, the results have potential applications in stage effects. Moreover, we verified that an object with specular reflection, such as the origami shown in the bottom row, can be correctly manipulated when the specular reflection is not directed toward the camera or the viewer. The brightness difference problem is improved compared with that under white illumination, but it has not been fully eliminated.</p>
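        <p>Applying different image processing to each layer amounts to composing the control reference from a foreground mask. The following is a minimal sketch, assuming RGB images normalized to [0, 1] and a boolean foreground mask from the layer discrimination; the two processing functions are simple stand-ins for the monochrome and saturation operations named above, and all function names are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def to_gray(img):
    """Monochrome stand-in: replace each pixel by its channel mean."""
    gray = img.mean(axis=-1, keepdims=True)
    return np.repeat(gray, 3, axis=-1)


def boost_saturation(img, gain=1.5):
    """Saturation stand-in: scale each pixel's deviation from its gray level."""
    gray = img.mean(axis=-1, keepdims=True)
    return np.clip(gray + gain * (img - gray), 0.0, 1.0)


def layered_reference(img, fg_mask, fg_op, bg_op):
    """Use fg_op where the mask is True and bg_op elsewhere."""
    return np.where(fg_mask[..., None], fg_op(img), bg_op(img))
```
]]></preformat>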
      </sec>
    </sec>
    <sec id="sec-3">
      <title>5. Discussions</title>
      <sec id="sec-3-1">
        <title>5.1. Over-illumination suppression</title>
        <sec id="sec-3-1-1">
          <p>To evaluate the effectiveness of our illumination suppression in overlapping areas, we compared our method using Equation (6) with the conventional method (Figure 7).</p>
          <p>Observing the boundary between Regions I and II, we can see that it is clearly visible in Fig. 7c, while the brightness difference is reduced in Fig. 7d. However, Fig. 7b still shows a marked color difference. This is because <inline-formula><tex-math>\theta_1 = \theta_2</tex-math></inline-formula> was assumed for simplicity in Eq. (6), so the change in radiance due to Lambert's cosine law was not considered. In addition, the color difference between the individual projectors is not considered, which leads to colored shadows. Future research will address these issues.</p>
          <p>Similarly, by utilizing the captured image <inline-formula><tex-math>C_2</tex-math></inline-formula> from Cam2, our calculation process enables the accurate estimation of reflectance using only each unit's own information. A key advantage of our approach is that the reflectance estimation in each unit is performed solely using its own captured image and projection image. In this study, for simplicity, we approximate <inline-formula><tex-math>\theta_1 = \theta_2</tex-math></inline-formula> and suppress over-projection by estimating the reflectance in each system and correcting for it.</p>
          <p>[Figure labels: regions ①, ②, ③; layers fg and bg.]</p>
          <p>[Figure 6: Manipulation results. Panel labels: White illumination, Bright Saturation, Color Phase Control, Monolize.]</p>
          <p>[Figure 7: Comparing lightness transition with white projection. (c) A-A' in (a); (d) A-A' in (b).]</p>
        </sec>
      </sec>
      <sec id="sec-3-2">
        <title>5.2. Adaptive foreground detection</title>
        <p>In the current method, the discrimination between the foreground and background is determined in the calibration phase. However, this static scene assumption becomes insufficient when foreground objects are in motion during operation. Therefore, our next step involves the development of an adaptive discrimination. If the foreground and background objects can be assumed to be planar, a possible solution is to compare <inline-formula><tex-math>C_2</tex-math></inline-formula>, deformed using <inline-formula><tex-math>\mathrm{C_2 2C_1}^{fg}</tex-math></inline-formula> and <inline-formula><tex-math>\mathrm{C_2 2C_1}^{bg}</tex-math></inline-formula> obtained by geometric calibration, with the actual <inline-formula><tex-math>C_1</tex-math></inline-formula>. Specifically, when the captured image is as shown in Fig. 5, the estimated images <inline-formula><tex-math>C_1^{fg}</tex-math></inline-formula> and <inline-formula><tex-math>C_1^{bg}</tex-math></inline-formula> are obtained as shown in Figure 8. It is then possible to determine which of <inline-formula><tex-math>C_1^{fg}</tex-math></inline-formula> and <inline-formula><tex-math>C_1^{bg}</tex-math></inline-formula> is closer to <inline-formula><tex-math>C_1</tex-math></inline-formula> and thereby identify the foreground and background. A homography transformation is an alternative solution for discrimination. However, it does not provide accurate pixel mapping because lens distortion is not considered. Therefore, pixel mapping is still required to achieve precise results.</p>
        <p>[Figure 8: Comparison of images obtained using <inline-formula><tex-math>\mathrm{C_2 2C_1}</tex-math></inline-formula> of each plane. (a) Estimated image <inline-formula><tex-math>C_1^{fg}</tex-math></inline-formula>; (b) Estimated image <inline-formula><tex-math>C_1^{bg}</tex-math></inline-formula>.]</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>6. Conclusion</title>
      <p>In this study, we proposed a method to achieve independent appearance manipulation of objects by distinguishing foreground and background objects, as a preliminary step toward moving away from uniform processing of appearance. We also proposed a method to identify the shadow caused by the foreground and to suppress luminance differences between overlapping regions.</p>
      <p>The experimental results confirmed that the foreground and background could be independently manipulated. Moreover, we were able to reduce the brightness difference in the background caused by the projections by suppressing the projected light intensity based on the overlapped projection determination.</p>
      <p>However, the distinction between foreground and background relies on pre-acquired pixel maps of the actual scene, which limits our ability to handle dynamic movements of foreground objects. Additionally, because of the simplification of not considering the radiance change caused by the cosine of the incident angle, visible brightness differences remained in the overlapped regions. Future research will work on implementing dynamic foreground object distinction and more accurate methods for reducing brightness differences.
</p>
      <p>[1] R. Raskar, G. Welch, K.-L. Low, D. Bandyopadhyay, Shader lamps: Animating real objects with image-based illumination, in: Rendering Techniques 2001: Proceedings of the Eurographics Workshop in London, United Kingdom, June 25–27, 2001 12, Springer, 2001, pp. 89–102.</p>
      <p>[2] A. Grundhöfer, D. Iwai, Recent advances in projection mapping algorithms, hardware and applications, Computer Graphics Forum 37 (2018) 653–675. doi:10.1111/cgf.13387.</p>
      <p>[3] T. Amano, H. Kato, Appearance control using projection with model predictive control (2010) 2832–2835. URL: https://cir.nii.ac.jp/crid/1050577309352921344.</p>
      <p>[4] T. Amano, Projection based real-time material appearance manipulation, in: 2013 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), IEEE Computer Society, Los Alamitos, CA, USA, 2013, pp. 918–923. URL: https://doi.ieeecomputersociety.org/10.1109/CVPRW.2013.135. doi:10.1109/CVPRW.2013.135.</p>
      <p>[5] T. Amano, I. Shimana, S. Ushida, K. Kono, Successive Wide Viewing Angle Appearance Manipulation with Dual Projector Camera Systems, in: T. Nojima, D. Reiners, O. Staadt (Eds.), ICAT-EGVE 2014 - International Conference on Artificial Reality and Telexistence and Eurographics Symposium on Virtual Environments, The Eurographics Association, 2014. doi:10.2312/ve.20141364.</p>
      <p>[6] K. Murakami, T. Amano, Materiality Manipulation by Light-Field Projection from Reflectance Analysis, in: G. Bruder, S. Yoshimoto, S. Cobb (Eds.), ICAT-EGVE 2018 - International Conference on Artificial Reality and Telexistence and Eurographics Symposium on Virtual Environments, The Eurographics Association, 2018. doi:10.2312/egve.20181321.</p>
      <p>[7] F. Cao, Q. Bao, A survey on image semantic segmentation methods with convolutional neural network, in: 2020 International Conference on Communications, Information System and Computer Engineering (CISCE), 2020, pp. 458–462. doi:10.1109/CISCE50729.2020.00103.</p>
      <p>[8] R. Sukthankar, T. J. Cham, G. Sukthankar, Dynamic shadow elimination for multi-projector displays, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR (2001). URL: https://cir.nii.ac.jp/crid/1362544420996161536. doi:10.1109/cvpr.2001.990943.</p>
      <p>[9] S. Audet, J. Cooperstock, Shadow removal in front projection environments using object tracking, 2007 IEEE Conference on Computer Vision and Pattern Recognition (2007). URL: https://cir.nii.ac.jp/crid/1363388844278556416. doi:10.1109/cvpr.2007.383470.</p>
      <p>[10] M. Flagg, J. Summet, J. Rehg, Improving the speed of virtual rear projection: A gpu-centric architecture, in: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005 - Workshops, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, IEEE Computer Society, United States, 2005. doi:10.1109/CVPR.2005.476.</p>
      <p>[11] S. Uesaka, T. Amano, Cast-Shadow Removal for Cooperative Adaptive Appearance Manipulation, in: H. Uchiyama, J.-M. Normand (Eds.), ICAT-EGVE 2022 - International Conference on Artificial Reality and Telexistence and Eurographics Symposium on Virtual Environments, The Eurographics Association, 2022. doi:10.2312/egve.20221271.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>