Spectral Reflection Prediction by Artificial Neural Network

Oleg B. Milder¹ and Dmitry A. Tarasov¹,²

¹ Ural Federal University, Ekaterinburg, RUSSIA, datarasov@yandex.ru, www.urfu.ru
² Institute of Industrial Ecology UB RAS, Ekaterinburg, RUSSIA

Abstract. Digital image processing requires a significant amount of calculation for characterization and profile making. Moreover, increasing the computing precision does not always lead to better results, and the resulting gamut may cover a smaller part of the color space than it could. Instead of extending the existing methods and color prediction models, we offer a simple technique for spectral reflection prediction using an artificial neural network in Matlab. The proposed method is fast and easy to operate. The experimental verification showed good performance based on minimization of the CIE Lab color difference dE.

Keywords: Gradation trajectories, Profile, Spectral reflection, Image processing.

1 Introduction

In order to create prints with accurately reproduced colors on a given reproduction system, it is essential to specify, during characterization and profile making, the color response that the system provides for a given substrate, ink type, and ink amounts. In general, due to the printing process, the deposited ink surface coverage is larger than the nominal one, resulting in a physical dot gain caused by ink spreading, which depends on the inks, the substrate, and so on [1]. Sometimes, in the absence of a competent management system, these limitations lead to a loss in color.

In any case, the problem of ink management is crucial. To solve it, manufacturers of image processing software and printing systems recommend different criteria, such as visual evaluation of spreading and numerical estimation of the optical density or color coordinates, which are implemented in a variety of color prediction models (CPMs).
CPMs may help image processing software decide which set of inks to choose and how to select and mix them in order to create a given color on a particular substrate. The models need to account for the interactions between dyes and substrates and between light and the halftone print, as well as for Fresnel reflections and light scattering. To date, a great many CPMs have been developed. They are intended to predict the resulting color in print from a set of ink values, as specified by reflectance models or by tristimulus values of the primaries.

Empirical surface models only take into account superpositions of ink halftones, in which the reflected light is assumed to be a function of the effective ink surface coverage. These models do not deal with light propagation within the print and only describe the relationship between the reflected light and the ink surface coverages.

Physically inspired models engage a more detailed analysis of light-print interaction, based on a mathematical prediction of how light propagates within a halftone print and how strongly it is attenuated.

Ink spreading models describe the physical dot gain as the difference between the effective and the nominal surface coverages. They show how much an ink dot spreads out in each ink superposition condition and rely on ink spreading curves that map nominal surface coverages to effective ones.

Spectral reflection prediction models (SRPMs) are helpful in studying the impact of different factors, such as the inks, the substrate, the illumination conditions, and the halftones, on the range of printable colors, and in creating printer characterization profiles for color management [2].

There are more complicated spectral CPMs, which are based on ink spreading and on light propagation probabilities.
One of the most cited is the Kubelka–Munk model, which is widely used to predict the properties of multiple layers of ink overlaid at a given location, given information about each constituent ink's reflectance and opacity [3]:

$$\frac{K(\lambda)}{S(\lambda)} = \frac{\left(1 - R_\infty(\lambda)\right)^2}{2 R_\infty(\lambda)}, \qquad (1)$$

where K is the absorption coefficient, S is the scattering coefficient, R∞ is the reflectance of an infinitely thick sample, and the prediction of K and S from reflectance is made at a given wavelength λ. Formula (1) allows predicting the combined K and S coefficients for multiple inks:

$$K(\lambda) = K_B(\lambda) + \sum_{i=1}^{l} c_i K_i(\lambda), \qquad (2)$$

where B refers to the substrate, l is the number of ink layers, c_i is the concentration, and K_i is the absorption coefficient of the i-th layer. The value S(λ) is computed analogously.

Another well-known CPM is the Neugebauer model, which predicts the CIE XYZ tristimulus values of a color halftone patch as the weighted sum of the tristimulus values of its individual colorants [4]. Since the Neugebauer model does not take into account the lateral propagation of light within the paper and the internal reflections at the paper-air interface, it is considered to be inaccurate.

Today, the most widely applied model is the Yule–Nielsen modified spectral Neugebauer model (YNSN), in which the Yule–Nielsen relationship is applied to the spectral Neugebauer equations [5, 6]:

$$R(\lambda) = \left( \sum_{i=1}^{p} w_i \, P_i(\lambda)^{1/n} \right)^{n}, \qquad (3)$$

where R(λ) is the reflectance of a halftone pattern neighborhood that is optically integrated as it is viewed, w_i is the relative area coverage of the i-th Neugebauer primary, P_i(λ) is the reflectance of that primary, p is the number of primaries, and n is the Yule–Nielsen non-linearity factor that accounts for the optical dot gain.

The Enhanced YNSN accounts for the ink spreading associated with the respective physical dot gains in different conditions (superpositions of 1 to 4 colorants). The model uses multiple ink spreading (tone reproduction) curves to characterize the physical dot gain [7].
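To make formulas (1)–(3) concrete, the following minimal Python sketch evaluates them at a single wavelength. It is only an illustration under stated assumptions: the function names are hypothetical, the inputs are plain numbers rather than measured spectra, and the inverse of formula (1) is the standard algebraic inversion, which is not given in the text above.

```python
import math


def ks_ratio(r_inf: float) -> float:
    """Formula (1): K/S from the reflectance of an infinitely thick sample."""
    if not 0.0 < r_inf <= 1.0:
        raise ValueError("reflectance must lie in (0, 1]")
    return (1.0 - r_inf) ** 2 / (2.0 * r_inf)


def reflectance_from_ks(ks: float) -> float:
    """Algebraic inverse of formula (1) (an addition, not from the paper)."""
    return 1.0 + ks - math.sqrt(ks * ks + 2.0 * ks)


def mix_absorption(k_substrate, concentrations, k_inks):
    """Formula (2) at one wavelength: K = K_B + sum_i c_i * K_i."""
    return k_substrate + sum(c * k for c, k in zip(concentrations, k_inks))


def ynsn_reflectance(weights, primaries, n):
    """Formula (3) at one wavelength: R = (sum_i w_i * P_i**(1/n))**n."""
    return sum(w * p ** (1.0 / n) for w, p in zip(weights, primaries)) ** n


# Illustrative numbers only (not measured data).
print(ks_ratio(0.5))
print(ynsn_reflectance([0.5, 0.5], [0.64, 0.04], n=2.0))
```

With n = 1 the last function reduces to the linear (spectral Neugebauer) weighted sum; larger n lowers the predicted reflectance for the same coverages, which is how the model imitates optical dot gain.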
Current CPMs account for the physical dot gain and are able to predict reflectance spectra as a function of ink surface coverage for 3–4 inks [1, 8]. The criterion for assessing a model's performance is minimization of the difference metric between the measured and predicted reflection spectra for each superposition condition. The most commonly used difference metric is the CIE Lab dE (or ∆E) color difference [9].

All the approaches mentioned have their advantages and drawbacks. The majority of the models are too complicated to be embedded in a real digital image processing workflow without substantial, time-consuming development and adjustment. For instance, the YNSN-based models are highly sensitive to the choice of the parameter n, which is usually fitted by brute force; this is very laborious, with the exception of some approaches [10].

An alternative is an empirical gradation approach. Gradation scales are known as a permanent attribute of contemporary digital image processing systems [11, pp. 88–89]. At the same time, the authors express doubts about the rational use of their features. The main problem is that using gradation curves in the conventional 2D embodiment significantly reduces the quantity and quality of information extracted from them. In [12], 3D gradation trajectories are introduced as a further development of gradation curves. Applying the apparatus of differential geometry to the analysis of gradation trajectories in the 3D CIE Lab space reveals their intrinsic curvature features, which help to improve ink management.

Nevertheless, precise color prediction is still an issue to be resolved. One promising way to build a color prediction model is not a variation of the existing models at all: it is the artificial neural network (ANN) approach. Work on this topic started in the early 1990s. Most researchers focused on applying ANNs to the Kubelka–Munk approach [13–15].
The work [16] also employs the fundamental color stimulus to improve the performance of an ANN-based color prediction system. All these studies have confirmed the great potential of ANN-based techniques in color prediction models.

This work is devoted to further development of the ANN technique applied to color prediction. We work with ANN training algorithms, determine the applicability of the technique to spectral prediction, and examine how preliminary linearization affects the accuracy.

2 Experimental

For the experiment, we use the 4-color (CMYK) wide-format ink-jet printer Mimaki CJV30-160BS. Print mode: 720×720 dpi, variable dot. Substrate: vinyl banner fabric as a weakly absorbent substrate. Measurement tools: the spectrophotometer x-Rite iOne iSis with the x-Rite ProfileMaker package. Charts are generated in the ArgyllCMS package.

The approach is as follows: print a specially developed test chart → measure the CIE Lab coordinates of the patches → build the ANN → predict the spectral reflectance ρ(λ) by the ANN using the test chart patch recipes as ANN inputs → assess the quality of prediction by the color difference formula dE94 [9, 17].

For the ANN development, we use the Matlab 16 package. The ANN is a multilayer perceptron with one hidden layer of 10 neurons. The training techniques are the Levenberg–Marquardt (L-M) method and Bayesian regularization [18]. The trained network is stored in the form of a Matlab function. Further statistical operations are carried out in MS Excel and Statistica 10.

We predict spectral reflectance from the test chart patch recipes. A preliminary experiment stage consists of training the ANN on the training data set (training chart). The training set contains 2448 patches, whose recipes are obtained from the discrete sequence {0; 16.8; 33.3; 50.2; 66.7; 83.1; 100} for each color in all possible combinations (a so-called 'multicube'). For ANN prediction, a test chart with 2496 patches is developed in ArgyllCMS.
The patch recipes take random values so that the patches are evenly distributed in the color space of the ideal CMYK device. This data set acts as the input for the ANN. The ANN output is the predicted spectrum. For each spectrum, the CIE Lab coordinates are calculated. The test chart is then measured and the actual values of the CIE Lab coordinates are established. The color difference between the predicted and measured values for each patch of the test chart is calculated by the dE94 formula.

The preliminary stage revealed that 6–7% of the predicted spectra have negative values of individual spectral components, which is physically impossible. This occurs in cases where the actual value of the reflection coefficient is close to zero. In the absence of output restrictions, nothing prevents the network from producing such values. However, we do not yet have grounds for introducing restrictions.

We study several options for solving this problem. The simplest is to ignore the predictions with negative values and remove them from further consideration. However, this can skew the results of the prediction.

Another option is to increase the sample sizes of both the training and test sets. The sample volumes are artificially increased by measuring the charts three times. The results of the measurements are placed in a single protocol. This option did not bring a significant improvement either, apart from a significant increase in the training time of the network. Similar results are obtained in attempts to use the averaged values for training and prediction.

A further option is the transition from the spectral reflection coefficients ρ_i(λ) to the spectral optical density D_i(λ):

$$D_i(\lambda) = -\log_{10} \rho_i(\lambda), \quad \forall\, i, \lambda, \qquad (4)$$

where i is the patch index and λ is the wavelength, λ = {380, 390, ..., 730} nm.

The final experiment is as follows. The training set consists of the results of triple measurements of the training chart.
Thus, each patch is included in the training set three times with the same composition of predictors but with a statistically different composition of spectral components. An additional benefit of this approach is the ability to train the network on statistically blurred data. In this way, we train the ANN on the spectral density D and also predict its values. After prediction, the spectral density values are converted back into spectral reflectance:

$$\rho_i(\lambda) = 10^{-D_i(\lambda)}, \quad \forall\, i, \lambda. \qquad (5)$$

The prediction quality is assessed by the color difference dE94. The list of experiments performed and their brief descriptions are given in Table 1.

3 Results and discussion

To compare the results obtained in each experiment, the arrays of dE94 values are fitted by different distributions (see Table 2 and Fig. 1). Goodness of fit is assessed by the χ² criterion.

The distribution that fits dE94 best in most cases is lognormal. We use an analytical distribution only to be able to evaluate the median and the 95% quantile. The median characterizes the typical color difference in the test set. The quantile is the upper bound of the color difference with 95% probability. Analysis of the plots in Fig. 1 confirms the validity of such estimates.

We also plot the dependence of dE94 on the total ink parameter (see Fig. 2). The plots allow assessing where the ANN predicts better, in the "lights" or in the "shades".

As can be seen from Fig. 2a, the ANN predicts spectral reflectance poorly. In some cases, dE94 exceeds 6, which is completely unacceptable in real print production. Moreover, the color difference is distributed unevenly with respect to the total ink parameter: the prediction error increases in the "shades". The determination coefficient and the trend in the figure accentuate this increase. Nevertheless, as Table 2 shows, the median color difference is only about 2 and the 95% quantile is less than 5, which is quite good.
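The median and 95% quantile quoted from Table 2 follow directly from the fitted lognormal parameters µ and σ. The short Python sketch below shows the calculation for the Experiment 2 parameters; it is an illustration with hypothetical function names, assuming that µ and σ are the mean and standard deviation of ln(dE94), and it is not the procedure used in Statistica.

```python
import math
from statistics import NormalDist


def lognormal_median(mu: float) -> float:
    """Median of a lognormal distribution: exp(mu)."""
    return math.exp(mu)


def lognormal_quantile(mu: float, sigma: float, p: float = 0.95) -> float:
    """p-quantile of a lognormal distribution: exp(mu + sigma * z_p)."""
    z = NormalDist().inv_cdf(p)  # standard normal quantile, about 1.645 for p = 0.95
    return math.exp(mu + sigma * z)


# Lognormal parameters of Experiment 2 as listed in Table 2.
mu, sigma = 0.7369, 0.4916
print(round(lognormal_median(mu), 2))           # 2.09
print(round(lognormal_quantile(mu, sigma), 2))  # 4.69
```

Both printed values agree with the median and 95% quantile given for Experiment 2 in Table 2.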
However, the presence of negative spectral components in some predictions does not allow us to consider Experiment 2 successful.

In the further experiments, we replace the prediction of the spectral reflectance ρ_i(λ) with that of the spectral density D_i(λ), without changes in the ANN.

Figure 2b shows the results of predictions by the ANN trained on the non-linearized printer sample. Table 2 reveals the excellent results of Experiment 3, where 95% of the color difference values do not exceed about 1.5. Nevertheless, in this case we also observe a dependence of the color difference on the total ink parameter: the "lights" are predicted much worse than the "shades". This might be explained by the assumption that a non-linearized printer oversupplies ink for highly saturated tones. The reason why the network predicts these recipes better is the excessive number of dark patches in the training set.

Figures 2c and 2d show the results of prediction for the ANN trained on data from the linearized printer. The determination coefficients in these cases are significantly lower than the previous ones. This can be interpreted as the absence of any notable dependence of the color difference on the total ink parameter. Thus, the ANN predicts the spectrum of any patch recipe with equal success. The only difference between Figures 2c and 2d is the ANN training algorithm: Experiment 4 uses the Levenberg–Marquardt method, while Experiment 5 applies Bayesian regularization.

Table 1. Summary of the experiments

| Experiment number | Brief experiment description |
|---|---|
| 1 | The test set is printed on a linearized printer and measured three times. 7488 dE94 values are calculated. The dE94 of each patch from the triple average is obtained. The lower bound of prediction accuracy is evaluated. |
| 2 | The test set is printed on a linearized printer and measured three times. The ANN is trained to predict the ρ(λ) value according to the recipe of the patch. The L-M algorithm is applied. 7059 recipes are predicted without negative values of ρ(λ). Direct prediction of ρ(λ) is evaluated. |
| 3 | The test set is printed on a non-linearized printer and measured once. The ANN is trained to predict the spectral D value according to the patch recipes. The L-M algorithm is applied. Indirect prediction of ρ(λ) is evaluated. We compare the prediction results of a linearized and a non-linearized printing system. |
| 4 | The test set is printed on a linearized printer and measured three times. The ANN is trained to predict the spectral D value according to the patch recipes. The L-M algorithm is applied. Indirect prediction of ρ(λ) is evaluated. We compare the prediction results of a linearized and a non-linearized printing system. |
| 5 | The test set is printed on a linearized printer and measured three times. The ANN is trained to predict the spectral D value according to the patch recipes. The Bayesian regularization algorithm is applied. We compare network learning algorithms. |

As can be seen from Table 2, the distribution of the color differences dE94 in Experiment 4 is fitted well by neither the lognormal nor the normal distribution. In this experiment, the correlation with the total ink parameter is not high, but it is 1.4 times higher than in Experiment 5. Moreover, there is an unexplained boundary around dE94 = 3 (see Fig. 2c). Consequently, it can be argued that, in general, across the experiments, the Levenberg–Marquardt training algorithm shows unsatisfactory results. The lowest correlation with the total ink and an even distribution of low dE94 values are obtained in Experiment 5, with Bayesian regularization as the ANN training method.

Fig. 1. Distribution fitting according to Table 1: a) Experiment 1, b) Experiment 2, c) Experiment 3, d) Experiment 4 (lognormal), e) Experiment 4 (normal), f) Experiment 5

Fig. 2. dE94 vs total ink parameter for: a) Experiment 2, b) Experiment 3, c) Experiment 4, d) Experiment 5

Table 2. Statistical processing results

| Experiment number | Type of distribution | χ² | Parameters of the theoretical distribution, µ / σ | Median, d | 95% quantile, d |
|---|---|---|---|---|---|
| 1 | Lognormal | 66 | -3.0275 / 0.4893 | 0.05 | 0.11 |
| 2 | Lognormal | 80 | 0.7369 / 0.4916 | 2.09 | 4.69 |
| 3 | Lognormal | 11 | -0.1349 / 0.3325 | 0.87 | 1.51 |
| 4 | Lognormal | 1304 | 0.3625 / 0.3005 | 1.44 | 2.36 |
| 4 | Normal | 501 | 1.6327 / 0.5570 | 1.63 | 2.55 |
| 5 | Lognormal | 291 | 0.2108 / 0.3142 | 1.23 | 2.07 |

4 Conclusion

We offer a technique for spectral reflection prediction using an artificial neural network. The proposed method is easy to operate and does not involve sophisticated color prediction models. The experimental verification showed its good performance based on minimization of the CIE Lab color difference dE94.

Applying artificial neural networks to the problem of predicting a color from its recipe gives excellent results subject to certain conditions. First, the color reproduction system must be linearized prior to the prediction. Next, we strongly recommend training the network to predict not the spectral reflectance but the spectral density, since negative values may otherwise appear in the ANN prediction.

Additional benefits of our study are the following. We are the first to use the uniformity of the dE distribution over the total ink as a criterion for assessing prediction quality. Our approach solves the problem of Black (K) channel generation automatically, since we use CMYK patches as inputs and outputs, while common color prediction models operate on CMY recipes only, which requires additional effort for the CMY-to-CMYK conversion.

The results obtained are preliminary. Some issues remain unsolved. We expect that the prediction could be even more precise with an appropriately selected training algorithm and sample volume.

References

1. Balasubramanian, R.: Optimization of the spectral Neugebauer model for printer characterization. JEI, 8, 156–166 (1999)
2. Bala, R.: Device characterization. Digital Color Imaging Handbook, ed. G. Sharma (CRC Press, Boca Raton, FL), 269–379 (2003)
3. Kubelka, P., Munk, F.: Ein Beitrag zur Optik der Farbanstriche. Zeitschrift für technische Physik, 12, 593–601 (1931)
4. Neugebauer, H. E. J.: Die theoretischen Grundlagen des Mehrfarbendrucks. Zeitschrift für Wissenschaftliche Photographie Photophysik Photochemie, 36, 36–73 (1937)
5. Yule, J. A. C., Nielsen, W. J.: The penetration of light into paper and its effect on halftone reproductions. Proc. TAGA Conference 1951, 65–76 (1951)
6. Viggiano, J. A. S.: Modeling the color of multi-colored halftones. Proc. TAGA Conference 1990, 44–62 (1990)
7. Hersch, R. D., Crété, F.: Improving the Yule–Nielsen modified spectral Neugebauer model by dot surface coverages depending on the ink superposition conditions. Proc. SPIE 5667, 434–445 (2005)
8. Wyble, D. R., Berns, R. S.: A critical review of spectral models applied to binary color printing. Color Research & Application, 25, 4–19 (2000)
9. Pauli, H.: Proposed extension of the CIE recommendation on "Uniform color spaces, color difference equations, and metric color terms". J. Opt. Soc. Am., 66, 866–867 (1976)
10. Mazauric, S., Hebert, M., Fournel, T.: Revisited Yule–Nielsen model without fitting of the n parameter. J. Opt. Soc. Am., 35(2), 244–255 (2018)
11. Kipphan, H.: Handbook of Print Media. Springer-Verlag, 1207 p. (2001)
12. Milder, O. B., Tarasov, D. A., Titova, M. Yu.: Inkjet Printers Linearization Using 3D Gradation Curves. CEUR Workshop Proceedings, Vol. 1814, 74–83 (2017)
13. Bishop, J. M., Bushnell, M. J., Westland, S.: Application of neural networks to computer recipe prediction. Color Research & Application, 16(1), 3–9 (1991)
14. Wölker, M., Kolk, M., Kettler, W. H., Spehl, J.: Color recipe prediction by artificial neural networks. Die Farbe, 42(1–3), 65–91 (1996)
15. Westland, S., Iovine, L., Bishop, J. M.: Kubelka-Munk or neural networks for computer colorant formulation? Proceedings of SPIE, 4421, 745–748 (2002)
16. Ameri, F., Moradian, S., Amani Tehran, M., Faez, K.: The use of fundamental color stimulus to improve the performance of artificial neural network color match prediction systems. Iranian Journal of Chemistry and Chemical Engineering, 24(4), 53–61 (2005)
17. Hill, B., Roger, Th., Vorhagen, F. W.: Comparative Analysis of the Quantization of Color Spaces on the Basis of the CIELAB Color-Difference Formula. ACM Transactions on Graphics, 16(2), 109–154 (1997)
18. Kayri, M.: Predictive abilities of Bayesian regularization and Levenberg-Marquardt algorithms in artificial neural networks: A comparative empirical study on social data. Mathematical and Computational Applications, 21(2), 21020020 (2016)