<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>B. L. 1974. An iterative technique for the rectification of observed distributions. The Astronomical Journal 79(6):745-754. Lustig</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>[Short paper] EM-algorithm Enpowers Material Science: Application of Inverse Estimation for Small Angle Scattering</article-title>
      </title-group>
      <contrib-group>
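        <contrib contrib-type="author">
          <string-name>Akinori Asahara</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Hidekazu Morita</string-name>
          <xref ref-type="aff" rid="aff1">1</xref>
        </contrib>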
        <contrib contrib-type="author">
          <string-name>Kanta Ono</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
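        <contrib contrib-type="author">
          <string-name>Masao Yano</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Tetsuya Shoji</string-name>
          <xref ref-type="aff" rid="aff4">4</xref>
        </contrib>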
        <contrib contrib-type="author">
          <string-name>Kotaro Saito</string-name>
          <xref ref-type="aff" rid="aff3">3</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Chiharu Mitsumata</string-name>
          <xref ref-type="aff" rid="aff2">2</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>High Energy Accelerator Research Organization</institution>
          ,
          <addr-line>Tsukuba, 305-0801</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff1">
          <label>1</label>
          <institution>Hitachi Ltd., Tokyo</institution>
          ,
          <addr-line>100-8280</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff2">
          <label>2</label>
          <institution>National Institute for Materials Science</institution>
          ,
          <addr-line>Tsukuba, 305-0047</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
        <aff id="aff3">
          <label>3</label>
          <institution>Paul Scherrer Institute</institution>
          ,
          <addr-line>Villigen, 5232</addr-line>
          ,
          <country country="CH">Switzerland</country>
        </aff>
        <aff id="aff4">
          <label>4</label>
          <institution>Toyota Motor Corporation</institution>
          ,
          <addr-line>Toyota, 471-8572</addr-line>
          ,
          <country country="JP">Japan</country>
        </aff>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <volume>1</volume>
      <fpage>23</fpage>
      <lpage>25</lpage>
      <abstract>
        <p>In this short paper, a machine-learning algorithm is applied to improve the analysis of small-angle scattering (SAS) experiments, which are commonly used in material science. In a SAS experiment, a particle beam incident on a material sample is scattered by the sample. The distribution of the scattered beam carries information about the grain-size distribution of the sample material; however, this distribution has to be estimated inversely. Therefore, a stochastic model of the SAS experiment and an expectation-maximization (EM) algorithm to estimate the grain-size distribution in the material sample are proposed. While existing methods require much manual effort, the proposed EM algorithm works automatically. Six simulation-generated datasets and two actually observed datasets were processed with the proposed method for examination. The results show that the proposed EM-based grain-size distribution estimation method is useful for automatically analyzing SAS data.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>Materials Informatics (MI) is an information technology intended to make material development faster, and it has been researched eagerly in recent years (National Institute of Standards and Technology 2019). MI will help material science researchers discover new knowledge.</p>
      <p>One such MI function is a data-mining technique that automatically finds very small features in experimental data. Traditionally, material science researchers carefully inspect experimental data to find small features because these might indicate new knowledge. However, the researchers might take a long time to find such features, or miss them. Therefore, automatic knowledge extraction from experimental data is attracting the attention of researchers.</p>
      <p>
        This study focuses on small-angle scattering (SAS) experiments
        <xref ref-type="bibr" rid="ref5">(Higgins and Benoît 1994)</xref>
        <xref ref-type="bibr" rid="ref1">(Asahara et al. 2019)</xref>,
        which are commonly conducted to observe the microstructures of materials. There are various similar scattering experiments, such as neutron scattering, x-ray scattering, and ion-beam scattering; they differ only in the particles that are scattered. A solution to the problem in SAS can therefore be expected to apply to these experiments as well, which makes the problem an important one to solve.
      </p>
      <p>[Figure 1 panel: SAS pattern, a grid of detection counts on the detector plane.]</p>
      <p>One objective of SAS experiments is to estimate microscale grain-size distributions in material samples. Neutrons detected on a plane during a SAS experiment form a pattern on the plane (called a SAS pattern). Material science researchers with specialized knowledge observe SAS patterns carefully to find grain-size information about the microstructure of the sample material.</p>
      <p>Accordingly, a method to automatically estimate grain-size distributions from SAS pattern data is presented in this paper. Several existing estimation methods are based on function optimization that fits the grain-size distribution to the SAS pattern, which requires much effort by material science researchers to adjust parameters. In contrast, our automatic estimation method is free from such effort because it relies on probabilistic modeling of the SAS experimental process (that is, knowledge of the experimental settings). A maximum likelihood approach based on this stochastic modeling can be taken to estimate the grain-size distribution without heuristic assumptions. In this paper, an expectation-maximization (EM) algorithm applicable to the estimation is presented and examined with simulation data and actual measurement data.</p>
    </sec>
    <sec id="sec-2">
      <title>Problem settings</title>
      <sec id="sec-2-1">
        <title>Small angle scattering</title>
        <p>The experimental instrument setting of SAS is illustrated in Figure 1. In the experiment, a particle beam incident upon the sample interacts with the microstructures therein, and the directions of the particles change due to these interactions. The angle between the straight beam and the direction of the scattered beam depends on the interaction. Finally, detectors arranged on a plane detect the scattered beam. The counts of detection events form a pattern on the plane, called a SAS pattern. A microstructure that causes such a direction change is called a "scattering body."</p>
        <p>[Figure 2: log-log plot of a SAS pattern; horizontal axis: wave number (nm<sup>-1</sup>), vertical axis: scattering amplitude.]</p>
        <p>The particle behavior during the scattering experiment is modeled with a differential equation called the Schrödinger equation. Its solution is a complex function called a wave function, whose squared absolute value corresponds to the probability of detection. Because the distance L between the sample and the plane is large enough, the coordinates <bold>x</bold> = (x, y) on the plane are approximately proportional to the scattering angle θ: |<bold>x</bold>| = L sin θ ≃ Lθ. The probability density function (PDF) P(<bold>x</bold>) of detection therefore corresponds to the probability P(θ) that a particle goes in the direction θ, which is related to the microscopic structures called grains.</p>
        <p>As the simplest setting, imagine a case in which the grains are balls. The intensity I(r, q) of the SAS pattern scattered by balls of radius r is proportional to the following Ĩ(r, q):
          <disp-formula><label>(1)</label><tex-math>I(r, q) \propto \tilde{I}(r, q) = \frac{1}{r^{3}} \left( \frac{\sin qr}{q^{3}} - \frac{r \cos qr}{q^{2}} \right)^{2}.</tex-math></disp-formula>
        The q in the formula is a quantity called the "wave number," which is the frequency of the wave function multiplied by 2π. The frequency of the wave function is three-dimensional because it is derived by the Fourier transformation of the wave function in three-dimensional space. The scattering angle depends on the frequency, so the magnitude q = |<bold>q</bold>| of the component perpendicular to the incident beam ("<bold>q</bold> = (q<sub>x</sub>, q<sub>y</sub>)" in Fig. 1) appears in the formula. Therefore, q indicates a location <bold>x</bold> on the detection plane and is derived from the distance between the incident beam center and that location. That is, the actual SAS intensity corresponding to I(r, q) can be obtained by converting <bold>x</bold> to q. Hereafter the proportionality constant is omitted and I(r, q) denotes the right-hand side of (1).</p>
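        <p>As a minimal sketch of how Eq. (1) can be evaluated numerically, the following Python function computes the ball intensity for a given radius and wave number. The function name <monospace>sphere_intensity</monospace>, the 10 nm radius, and the wave-number grid are illustrative assumptions and are not taken from the original implementation.</p>
        <preformat><![CDATA[
```python
import numpy as np

def sphere_intensity(r, q):
    """Eq. (1): (1/r^3) * (sin(qr)/q^3 - r*cos(qr)/q^2)^2 for radius r and wave number q."""
    r = np.asarray(r, dtype=float)
    q = np.asarray(q, dtype=float)
    amplitude = np.sin(q * r) / q**3 - r * np.cos(q * r) / q**2
    return amplitude**2 / r**3

# Illustrative grid of wave numbers (nm^-1) and an assumed radius of 10 nm.
q = np.logspace(-1, np.log10(5.0), 200)
pattern = sphere_intensity(10.0, q)

# Algebraically identical form with 1/q^3 factored out of the bracket.
factored = (np.sin(q * 10.0) - q * 10.0 * np.cos(q * 10.0)) ** 2 / (10.0**3 * q**6)
```
]]></preformat>
        <p>Both arrays hold the same non-negative intensity values; the second form is convenient when looking for the zeros of the bracket.</p>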
        <p>This formula holds for a uniform grain size r. However, actual grain sizes vary. Because solutions of the Schrödinger equation can be added together, the SAS pattern produced by multiple grain sizes is the weighted sum of I(r, q) over r, where the weight is the grain-size distribution of the material. Accordingly, the scattering pattern S(q) is derived as
          <disp-formula><label>(2)</label><tex-math>S(q) \propto \int f(r)\, I(r, q)\, dr,</tex-math></disp-formula>
        where the grain-size distribution is denoted as f(r).</p>
        <sec id="sec-2-1-1">
          <title>Expert-knowledge-based analysis</title>
          <p>To estimate the grain-size distribution, S(q), which is the integral of f(r) I(r, q), should be decomposed into its I(r, q) components; however, this is difficult. Thus, material science researchers have tried to guess f(r) from clues given by small features latent in the plot of I(r, q), as shown in Fig. 2. The figure presents a log-log plot of a SAS pattern, and its domain is separated into three parts (a), (b), and (c). In (a), that is, q → 0, expanding the trigonometric functions in a power series of q gives
            <disp-formula><label>(3)</label><tex-math>I(r, q) \simeq \frac{1}{r^{3}} \left( \frac{qr}{q^{3}} - \frac{r}{q^{2}} \left( 1 - \frac{1}{2} (qr)^{2} \right) \right)^{2} = \frac{r^{3}}{4},</tex-math></disp-formula>
          so S(q) is independent of q and converges to a constant value. (Keeping the cubic term of the sine expansion changes the constant to r<sup>3</sup>/9 but not this conclusion.) In (b), corresponding to q → ∞, I(r, q) is approximated as
            <disp-formula><label>(4)</label><tex-math>I(r, q) \simeq \frac{1}{r^{3}} \left( \frac{r \cos qr}{q^{2}} \right)^{2}.</tex-math></disp-formula>
          Therefore, S(q) is derived as</p>
          <p>
            <disp-formula><label>(5)</label><tex-math>S(q) \simeq \frac{1}{q^{4}} \int \frac{f(r)}{r} \cos^{2} qr \, dr.</tex-math></disp-formula>
            This behaves as the Fourier transform of f(r)/r, decaying with the fourth power of q.</p>
          <p>Domain (c) is intermediate between (a) and (b). There, I(r, q) is written as
            <disp-formula><label>(6)</label><tex-math>I(r, q) = \frac{1}{r^{3} q^{6}} \left( \sin qr - qr \cos qr \right)^{2},</tex-math></disp-formula>
          which is Eq. (1) with 1/q<sup>3</sup> factored out of the bracket. I(r, q) is always non-negative, and I(r, q) = 0 when sin qr − qr cos qr = 0, that is, when sin qr / cos qr = tan qr = qr. Figure 3 plots each side of this equation. The horizontal axis x of the graph indicates qr. The blue curve represents y = tan x and the orange line represents y = x. Their intersections, indicated by the circles in the figure, correspond to points satisfying tan qr = qr, that is, I(r, q) = 0. The zero points therefore appear almost periodically. Additionally, local maxima of the bracketed term, which satisfy sin x = 0, exist between the zero points. Thus, I(r, q) oscillates, and its frequency depends on r. S(q), which is the sum of the I(r, q), involves oscillations of various phases, so the oscillations gradually cancel as q becomes larger. Hence, only the oscillation in the small-q domain is readable.</p>
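          <p>The zero points of I(r, q) can also be located numerically. The sketch below assumes a plain bisection search (the helper names <monospace>g</monospace> and <monospace>bisect</monospace> are hypothetical) and finds the first three positive solutions of tan x = x as sign changes of sin x − x cos x. Because x = qr, dividing such a solution by the wave number at which a zero is observed gives the corresponding radius.</p>
          <preformat><![CDATA[
```python
import math

def g(x):
    # sin x - x cos x vanishes exactly where tan x = x (for cos x != 0).
    return math.sin(x) - x * math.cos(x)

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo > 0) == (f_mid > 0):
            lo, f_lo = mid, f_mid   # sign change lies in the upper half
        else:
            hi = mid                # sign change lies in the lower half
    return 0.5 * (lo + hi)

# One zero lies in each interval (n*pi, (n + 1/2)*pi) for n = 1, 2, 3, ...
zeros = [bisect(g, n * math.pi, (n + 0.5) * math.pi) for n in range(1, 4)]

# Hypothetical reading: a first zero observed at q = 0.25 nm^-1 implies this radius in nm.
radius_from_first_zero = zeros[0] / 0.25
```
]]></preformat>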
          <p>Material science researchers accordingly look for the oscillation in the (c) domain because it gives implicit hints for understanding f(r). As a result, f(r) can be estimated only roughly. If f(r) were estimated directly, the SAS experiment could give much more information about the sample. Consequently, a method to directly estimate f(r) is highly needed. Thus, a machine-learning-based method is proposed in this paper.</p>
          <p>... and Ingo 2018). However, this approach requires the functional form of f(r), and the true f(r) is generally unknown in actual situations. Material scientists therefore have to assume many kinds of functional forms to find the best estimate. Many trials are required until the best estimate is achieved, leading to a long calculation time.</p>
          <p>To avoid such difficulty, a function having a more general form should be used. One technique using such a function is the Indirect Fourier Transform (IFT) (Otto 1977). For IFT, a sum of multiple stepwise functions χ<sub>n</sub>(r) is used as the general function. The domain of the function is separated into N small partitions r<sub>n</sub> &lt; r &lt; r<sub>n+1</sub> (n = 1, ..., N), and the stepwise function χ<sub>n</sub>(r) returns 1 when r<sub>n</sub> &lt; r &lt; r<sub>n+1</sub> and 0 otherwise. Formula (2) is then approximated as
            <disp-formula><label>(7)</label><tex-math>S(q) \simeq \sum_{n} a_{n} \int \chi_{n}(r)\, I(r, q)\, dr.</tex-math></disp-formula>
          Under this assumption, the integral is decomposed into definite integrals over r<sub>n</sub> &lt; r &lt; r<sub>n+1</sub>. Because the definite integrals can be carried out analytically, S(q) is described as a linear combination of the a<sub>n</sub>. After minimizing the difference between this linear combination and the SAS pattern, the grain-size distribution f(r) is obtained as the sum of a<sub>n</sub> χ<sub>n</sub>(r).</p>
        </sec>
        <sec id="sec-2-1-2">
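          <p>The idea of fitting the coefficients of stepwise functions can be sketched as an ordinary least-squares problem. The example below is only an illustration under stated assumptions: the partition integrals are approximated by a midpoint rule rather than evaluated analytically, no regularization term is included, and the partition edges, the q grid, and all function names are hypothetical.</p>
          <preformat><![CDATA[
```python
import numpy as np

def sphere_intensity(r, q):
    # Ball intensity of Eq. (1).
    return (np.sin(q * r) / q**3 - r * np.cos(q * r) / q**2) ** 2 / r**3

def step_basis_matrix(q, edges, samples_per_bin=50):
    """Column n approximates the integral of I(r, q) over edges[n]..edges[n+1] (midpoint rule)."""
    columns = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        width = (hi - lo) / samples_per_bin
        r_mid = lo + width * (np.arange(samples_per_bin) + 0.5)
        values = sphere_intensity(r_mid[None, :], q[:, None])   # shape (len(q), samples)
        columns.append(values.sum(axis=1) * width)
    return np.stack(columns, axis=1)

q = np.linspace(0.1, 5.0, 300)               # wave numbers in nm^-1 (assumed grid)
edges = np.linspace(1.0, 11.0, 6)            # five partitions of width 2 nm (assumed)
A = step_basis_matrix(q, edges)

a_true = np.array([0.0, 1.0, 3.0, 1.0, 0.0])  # synthetic coefficients
S = A @ a_true                                 # noiseless synthetic SAS pattern
a_hat, *_ = np.linalg.lstsq(A, S, rcond=None)  # unregularized least-squares fit
```
]]></preformat>
          <p>With noiseless data and only a few wide partitions the coefficients are recovered; narrower partitions or noisy counts make the problem ill-conditioned, which is where regularization enters.</p>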
          <p>The resolution of the grain-size distribution is determined by the width of the stepwise functions χ<sub>n</sub> in IFT, as shown above. Therefore, the range covered by each χ<sub>n</sub> should be small to improve the grain-size resolution. However, many a<sub>n</sub> then have to be determined, and the SAS pattern must be highly accurate because a higher-resolution setting makes the estimation error larger. A technique to avoid this problem is to add regularization terms that suppress overfitting. However, the regularization terms have to be adjusted manually. To automate regularization, complicated methods to determine the regularization terms have been proposed, but they are not common yet.</p>
          <p>[Figure 3: plot of y = tan x and y = x against x = qr, with the "sin x = 0" points marked.]</p>
          <p>In this paper, machine-learning algorithms are applied to this problem. Specifically, the SAS experimental process is modeled as a stochastic process with latent variables. A likelihood function derived from the stochastic process is then maximized to fit the SAS pattern. As a result, the grain-size distribution is obtained as the optimal model parameter of the stochastic process. The method requires no assumption if a non-parametric model (that is, a very general stochastic model such as a Gaussian mixture) is applied to the SAS experimental process. An EM algorithm is generally applied to non-parametric models. Similarly, a method using a non-parametric model and an EM algorithm is proposed here.</p>
          <p>
            Such techniques are used in astrophysics (William 1972) (Leon 1974), bioinformatics (Lustig et al. 2008) (Lustig, Donoho, and Pauly 2007), and compressed sensing
            <xref ref-type="bibr" rid="ref4">(Donoho 2006)</xref>.
            However, this kind of approach is not common in scattering experiments. Therefore, in this paper, algorithms suitable for SAS are proposed and examined using simulation data and actual data.
          </p>
        </sec>
      </sec>
    </sec>
    <sec id="sec-3">
      <title>Stochastic process of SAS</title>
      <sec id="sec-3-1">
        <title>Approach</title>
        <p>The process consists of dispersion and observation, which are modeled with two different probabilistic models, as shown in Fig. 5.</p>
        <p>At the first step, dispersion, the incident beam interacts with grains. In Fig. 5, "determine a grain" represents this process. It can be interpreted as a stochastic process in which each particle of the incident beam chooses a scattering body in the sample material. The probability density function is consequently assumed to be proportional to f(r). That is, the dispersion step of N particles is modeled as N iterations of random sampling from f(r).</p>
        <p>The second step, observation, in which the incident beam changes its direction and arrives at a point on the detector plane, is also modeled as a random sampling process, shown as "change direction" in Fig. 5. The scattered particles choose a scattering angle randomly and are detected as a SAS pattern. This choice of angle is stochastic due to the principles of quantum physics. Thus, the probability distribution function is proportional to I(r, q) defined in (1).</p>
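        <p>The two sampling steps can be sketched as a small forward simulation. This is an illustrative sketch, not the authors' code: the function names, the grids, the Gaussian-shaped f(r), and the use of NumPy's random generator are all assumptions made for the example.</p>
        <preformat><![CDATA[
```python
import numpy as np

def sphere_intensity(r, q):
    # Ball intensity of Eq. (1).
    return (np.sin(q * r) / q**3 - r * np.cos(q * r) / q**2) ** 2 / r**3

def simulate_detections(f, r_grid, q_grid, n_particles, rng):
    """Two-step sampling: draw a grain size from f, then a detector bin from I(r, q)."""
    p_r = f / f.sum()                                    # probability of choosing each grain size
    p_q_given_r = sphere_intensity(r_grid[:, None], q_grid[None, :])
    p_q_given_r /= p_q_given_r.sum(axis=1, keepdims=True)   # each row sums to 1
    grains = rng.choice(len(r_grid), size=n_particles, p=p_r)       # dispersion step
    counts = np.zeros(len(q_grid), dtype=np.int64)
    for l, n_l in zip(*np.unique(grains, return_counts=True)):
        counts += rng.multinomial(n_l, p_q_given_r[l])              # observation step
    return counts

rng = np.random.default_rng(0)
r_grid = np.linspace(1.0, 20.0, 20)                      # grain sizes in nm (assumed)
q_grid = np.linspace(0.1, 5.0, 50)                       # wave numbers in nm^-1 (assumed)
f = np.exp(-0.5 * ((r_grid - 10.0) / 2.0) ** 2)          # assumed grain-size distribution
counts = simulate_detections(f, r_grid, q_grid, 1000, rng)
```
]]></preformat>
        <p>The returned array plays the role of a SAS pattern: one detection count per wave-number bin, with the grain that scattered each particle discarded.</p>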
        <p>The entire process of SAS is modeled as the combination of these two stochastic processes. In the entire process, the size of the scattering body interacting with each particle is unobservable. Bayesian statistics is suitable when both latent variables and model parameters are unknown, as they are here. The probability that q is chosen after r is determined is described as the conditional probability P(q | r). Note that P(q | r) ∝ I(r, q), and that the function to be estimated is P(r | q) because only q is determined by the SAS pattern. These are easily connected with Bayes' theorem:
          <disp-formula><label>(8)</label><tex-math>P(r \mid q) = \frac{P(q \mid r)\, P(r)}{P(q)}.</tex-math></disp-formula></p>
        <p>This formula includes two new parts, P(r) and P(q), but they do not cause problems. P(r) is a prior on the choice of grain. It can be set to be uniform when no information about the grain size is given. P(q) is a prior on the wave number. Being independent of the grain size, P(q) is absorbed into the normalization constant of P(r | q). Consequently, P(r | q) equals P(q | r), which is proportional to I(r, q), up to the normalization constant.</p>
        <p>This modeling is straightforward from a machine-learning viewpoint. From the quantum-mechanics viewpoint, however, the incident particles are treated as a wave. In the proposed approach, the model is therefore simplified by changing from the wave-like picture to a particle-like one.</p>
      </sec>
      <sec id="sec-3-2">
        <title>One-particle model</title>
        <p>The scattering process of one particle is formulated precisely below. The first process is to decide the grain causing the scattering. The grain size r is continuous in the domain 0 &lt; r &lt; R. However, as mentioned above, the domain is separated into L small partitions labeled 0, ..., L − 1. Taking the center of each partition as its representative grain size, denoted r<sub>0</sub>, ..., r<sub>L−1</sub> (that is, r<sub>n+1</sub> = r<sub>n</sub> + R/L), we can write the grain-size frequency as f(r<sub>0</sub>), ..., f(r<sub>L−1</sub>). As the stochastic process, a particle randomly chooses a grain size for scattering with probability P(r<sub>l</sub>) ∝ f(r<sub>l</sub>). Accordingly,
          <disp-formula><tex-math>P(r_{l}) = \frac{f(r_{l})}{\sum_{m} f(r_{m})}.</tex-math></disp-formula></p>
        <p>In the second process, the scattering angle is decided. Similarly, the wave-number domain 0 &lt; q &lt; Q is separated into K small partitions labeled 0, ..., K − 1, and the centers of the partitions are denoted q<sub>k</sub>. The probability that the scattered particle is detected at the q<sub>k</sub> detector is therefore described as P(q<sub>k</sub> | r<sub>l</sub>), which is proportional to I(r<sub>l</sub>, q<sub>k</sub>). Although some particles will go outside the detection plane, they are regarded as outside the population to be modeled. Consequently,
          <disp-formula><label>(9)</label><tex-math>P(q_{k} \mid r_{l}) = \frac{I(r_{l}, q_{k})}{\sum_{m} I(r_{l}, q_{m})}.</tex-math></disp-formula>
        For simplicity, P(r<sub>l</sub>) ≡ π<sub>l</sub> and P(q<sub>k</sub> | r<sub>l</sub>) ≡ φ<sub>l,k</sub> hereafter. The probability that a particle is scattered at r<sub>l</sub> and detected in the kth partition is then π<sub>l</sub> φ<sub>l,k</sub>. To estimate the likelihood of the grain-size distribution, we thus need P({π<sub>0</sub>, ..., π<sub>L−1</sub>} | q<sub>k</sub>).</p>
        <p>The grain-size partition in which the particle is actually scattered is not directly observable. Therefore, r<sub>l</sub> should be marginalized as follows:
          <disp-formula><label>(11)</label><tex-math><![CDATA[\begin{aligned}
P(\pi_{0}, \ldots, \pi_{L-1} \mid q_{k}) &= \frac{P(q_{k} \mid \pi_{0}, \ldots, \pi_{L-1})\, P(\pi_{0}, \ldots, \pi_{L-1})}{P(q_{k})} \\
&\propto \sum_{l} P(q_{k} \mid r_{l})\, P(r_{l} \mid \pi_{0}, \ldots, \pi_{L-1}) \\
&= \sum_{l} \pi_{l} \phi_{l,k},
\end{aligned}]]></tex-math></disp-formula>
        where the priors P(q<sub>k</sub>) and P(π<sub>0</sub>, ..., π<sub>L−1</sub>) are regarded as constant parameters. Figure 6 illustrates this calculation. Even after a particle is detected at q<sub>2</sub>, its possible paths are not unique. Therefore, the likelihood of the scattering process involves the sum over all paths.</p>
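        <p>The sum over paths is a matrix-vector product once the probabilities are stored as arrays. The following sketch uses a toy example with two grain sizes and three detector partitions; the numbers, the function names, and the array layout (rows for grain sizes, columns for wave-number partitions) are assumptions chosen for illustration.</p>
        <preformat><![CDATA[
```python
import numpy as np

def marginal_detection_probability(p_r, p_q_given_r):
    """Sum over all grain sizes l of P(r_l) * P(q_k | r_l); returns one value per partition k."""
    return p_r @ p_q_given_r

def grain_posterior(p_r, p_q_given_r):
    """P(r_l | q_k) by Bayes' theorem; each column k sums to 1."""
    joint = p_r[:, None] * p_q_given_r
    return joint / joint.sum(axis=0, keepdims=True)

p_r = np.array([0.2, 0.8])                      # toy grain-size probabilities
p_q_given_r = np.array([[0.5, 0.5, 0.0],        # detection probabilities for grain 0
                        [0.1, 0.3, 0.6]])       # detection probabilities for grain 1
p_q = marginal_detection_probability(p_r, p_q_given_r)
posterior = grain_posterior(p_r, p_q_given_r)
```
]]></preformat>
        <p>In this toy case a detection in the last partition can only come from the second grain size, whereas detections in the first two partitions remain ambiguous.</p>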
      </sec>
      <sec id="sec-3-3">
        <title>N-particles model</title>
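        <p>Although the likelihood of a single detection event is formulated as above, an actual SAS pattern includes many detection events. Because the SAS pattern is a set of counts of detection events, it is denoted as K integers {n<sub>0</sub>, ..., n<sub>K−1</sub>}. The total number of events is N = Σ<sub>k</sub> n<sub>k</sub>. The set {π<sub>0</sub>, ..., π<sub>L−1</sub>} maximizing the total likelihood P(n<sub>0</sub>, ..., n<sub>K−1</sub> | π<sub>0</sub>, ..., π<sub>L−1</sub>) is required, because it indicates the grain-size distribution.</p>
        <p>For simplicity of the calculation, the following logarithmic likelihood is maximized with respect to {π<sub>l</sub>}:
          <disp-formula><tex-math>\log P(n_{0}, \ldots, n_{K-1} \mid \pi_{0}, \ldots, \pi_{L-1}) = \sum_{k} n_{k} \log \sum_{l} \pi_{l} \phi_{l,k} + \mathrm{const}.</tex-math></disp-formula>
        However, because the π<sub>l</sub> are probabilities of the random choice, they are restricted by Σ<sub>l</sub> π<sub>l</sub> = 1. Therefore, the maximization is carried out under this constraint with the Lagrange multiplier method, using
          <disp-formula><label>(13)</label><tex-math>\sum_{k} n_{k} \log \sum_{l} \pi_{l} \phi_{l,k} - \lambda \left( \sum_{l} \pi_{l} - 1 \right),</tex-math></disp-formula>
        where λ is the Lagrange multiplier. This leads to the following L equations, one for each j:
          <disp-formula><label>(14)</label><tex-math>\sum_{k} n_{k} \frac{\phi_{j,k}}{\sum_{l} \pi_{l} \phi_{l,k}} - \lambda = 0.</tex-math></disp-formula>
        After both sides of the equations are multiplied by π<sub>j</sub> and the equations are summed over j,
          <disp-formula><tex-math>\sum_{k} n_{k} \frac{\sum_{j} \pi_{j} \phi_{j,k}}{\sum_{l} \pi_{l} \phi_{l,k}} - \lambda \sum_{j} \pi_{j} = 0,</tex-math></disp-formula>
        so that λ = Σ<sub>k</sub> n<sub>k</sub> = N. Therefore, the equation
          <disp-formula><label>(16)</label><tex-math>\sum_{k} n_{k} \frac{\phi_{j,k}}{\sum_{l} \pi_{l} \phi_{l,k}} = N</tex-math></disp-formula>
        should be solved to obtain {π<sub>l</sub>}.</p>
        <sec id="sec-3-3-1">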
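          <p>
            To solve this problem, an iterative algorithm called the EM algorithm
            <xref ref-type="bibr" rid="ref2">(Bishop 2006)</xref>
            is generally applied (Zhang 1993)
            <xref ref-type="bibr" rid="ref3">(Demoment 1989)</xref>
            (Nagata, Sugita, and Okada 2012). Because (10) and (11) lead to
            <disp-formula><tex-math>P(r_{j} \mid q_{k}) = \frac{\pi_{j} \phi_{j,k}}{\sum_{l} \pi_{l} \phi_{l,k}},</tex-math></disp-formula>
            this expression represents the probability that a particle detected at q<sub>k</sub> was scattered at r<sub>j</sub>. Therefore, the expectation value m<sub>j</sub> of the number of such particles is m<sub>j</sub> = Σ<sub>k</sub> n<sub>k</sub> P(r<sub>j</sub> | q<sub>k</sub>) when n<sub>k</sub> particles are detected at q<sub>k</sub>. Additionally, according to P(r<sub>j</sub>) = π<sub>j</sub>, multiplying (16) by π<sub>j</sub> gives
            <disp-formula><tex-math>\pi_{j} \sum_{k} n_{k} \frac{\phi_{j,k}}{\sum_{l} \pi_{l} \phi_{l,k}} = m_{j} = N \pi_{j}.</tex-math></disp-formula>
            This equation can be separated into an equation that yields the π<sub>l</sub> and one that yields the m<sub>l</sub>:
            <disp-formula><label>(18)</label><tex-math>\pi_{l} = \frac{m_{l}}{N},</tex-math></disp-formula>
            <disp-formula><label>(19)</label><tex-math>m_{l} = \sum_{k} n_{k} \frac{\pi_{l} \phi_{l,k}}{\sum_{j} \pi_{j} \phi_{j,k}}.</tex-math></disp-formula>
            Consequently, the E-step, which obtains the expectation values m<sub>l</sub>, and the M-step, which obtains {π<sub>l</sub>} with the maximal likelihood, are carried out iteratively to derive the solution of equation (16). Algorithm 1 lists the procedures.
          </p>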
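          <p>The alternation of the E-step and the M-step can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions and is not a reproduction of Algorithm 1: the uniform initialization, the guard against empty partitions, the function name, and the array layout (rows for grain sizes, columns for wave-number partitions) are choices made for this example.</p>
          <preformat><![CDATA[
```python
import numpy as np

def em_grain_size(counts, p_q_given_r, n_iter=10000):
    """Estimate P(r_l) from detection counts n_k, given p_q_given_r[l, k] = P(q_k | r_l)."""
    n_total = counts.sum()
    n_sizes = p_q_given_r.shape[0]
    p_r = np.full(n_sizes, 1.0 / n_sizes)               # uniform start (an assumption)
    for _ in range(n_iter):
        joint = p_r[:, None] * p_q_given_r               # P(r_l) * P(q_k | r_l)
        mixture = joint.sum(axis=0)                      # summed over grain sizes
        safe = np.where(mixture > 0, mixture, 1.0)       # avoid 0/0 in unreachable partitions
        expected = (joint / safe) @ counts               # E-step: expected particles per grain size
        p_r = expected / n_total                         # M-step: renormalize by N
    return p_r

# Toy check: counts equal to the expected pattern of a known two-component mixture.
p_q_given_r = np.array([[0.5, 0.5, 0.0],
                        [0.1, 0.3, 0.6]])
counts = np.array([180.0, 340.0, 480.0])                 # 1000 * [0.18, 0.34, 0.48]
estimate = em_grain_size(counts, p_q_given_r, n_iter=500)
```
]]></preformat>
          <p>Running a fixed number of iterations, as here, keeps the processing time bounded; a convergence test on the change of the estimate between iterations would be an alternative stopping rule.</p>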
        </sec>
      </sec>
      <sec id="sec-3-4">
        <title>Experimental settings</title>
        <p>Two different types of experiments were executed to evaluate whether the proposed algorithm automatically estimates a grain-size distribution consistent with the SAS pattern. In the first experiment (Experiment 1), simulation-generated data were processed because the results can be compared with the ground truth. In the second experiment (Experiment 2), actual SAS pattern data with naive samples were processed to assess the practical feasibility of the proposed algorithm.</p>
        <p>The two types of data were processed with the proposed algorithm and, for comparison, with IFT. For the proposed algorithm, 10,000 iterations of the EM algorithm were carried out instead of checking convergence. This is because the processing time available during an experiment is limited, whereas the time needed to reach convergence is unbounded; fixing the number of iterations bounds the processing time.</p>
        <p>The IFT executed in the experiments involves L1 and L2 regularization. The weight parameters of the regularization terms were tuned so that IFT returned reasonable estimation results. This tuning was carried out twice, once for each of Experiments 1 and 2, because the best setting depends on the total number of events in the SAS pattern.</p>
      </sec>
      <sec id="sec-3-5">
        <title>Experiment 1: simulation data</title>
        <p>In Experiment 1, six types of grain-size distributions were defined. Each is one Gamma distribution or the sum of two Gamma distributions, with the most frequent grain size around 10 nm. The grain-size distribution is discretized in steps of 0.2 nm over the domain from 0 to 20 nm (i.e., 100 values), and corresponds to f(r) in (2). S(q) was calculated by evaluating the integral in (2). Because S(q) indicates the probability of detection, the most probable SAS pattern can be generated by multiplying S(q) by the number of detection events. The q of the SAS pattern is also discrete, and its domain is from 0.1 nm<sup>-1</sup> to 5 nm<sup>-1</sup>. For the experiment, the number of detection events was set to 10,000, and the SAS patterns of the grain-size distributions were generated and named Patterns 1-6.</p>
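        <p>The generation of such a simulated pattern can be sketched as follows. The Gamma parameters (shape 26, scale 0.4, which put the mode at 10 nm), the choice of 200 wave-number values, the grid starting at 0.2 nm rather than 0, and the function names are assumptions for illustration; the parameters actually used for Patterns 1-6 are not stated here.</p>
        <preformat><![CDATA[
```python
import numpy as np
from math import lgamma

def gamma_pdf(r, shape, scale):
    """Gamma density evaluated via logarithms for numerical stability."""
    log_pdf = (shape - 1) * np.log(r) - r / scale - lgamma(shape) - shape * np.log(scale)
    return np.exp(log_pdf)

def sphere_intensity(r, q):
    # Ball intensity of Eq. (1).
    return (np.sin(q * r) / q**3 - r * np.cos(q * r) / q**2) ** 2 / r**3

r_grid = 0.2 * np.arange(1, 101)               # 100 grain sizes, 0.2 nm steps up to 20 nm
q_grid = np.linspace(0.1, 5.0, 200)            # wave numbers in nm^-1 (count is assumed)
f = gamma_pdf(r_grid, shape=26.0, scale=0.4)   # mode at (26 - 1) * 0.4 = 10 nm

# Discretized Eq. (2): weighted sum of ball intensities over the grain sizes.
S = f @ sphere_intensity(r_grid[:, None], q_grid[None, :])

n_events = 10_000
expected_counts = n_events * S / S.sum()       # most probable pattern for 10,000 events
```
]]></preformat>
        <p>A sum of two Gamma densities can be passed in place of <monospace>f</monospace> to imitate the two-peak patterns.</p>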
        <p>Figures 7 and 8 show the results. In both figures, (a) plots the SAS pattern as a log-log plot, (b) plots the grain-size distribution estimated by the proposed maximum-likelihood (ML) method, and (c) plots the grain-size distribution estimated by IFT for comparison. The blue lines in (b) and (c) plot the truth, i.e., the original grain-size distribution. In (b), all estimation results are highly similar to the ground truth. In contrast, in (c), the estimation results are generally inaccurate.</p>
        <p>The grain-size distribution of Pattern 1 has a small peak at the foot of a large peak. The two peaks should be estimated separately. The ML results are so accurate that the small peak appears clearly, whereas the small peak in the IFT results is difficult to recognize.</p>
        <p>The grain-size distribution of Pattern 2 also has a small peak, but it is located on the opposite side to that in Pattern 1. The IFT results do not accurately estimate the small peak, whereas the ML results do.</p>
        <p>The grain-size distribution of Pattern 3 has only one peak. The IFT results for this pattern are similar to those for Pattern 2. Both Patterns 2 and 3 have a large peak at a small grain size. A small grain size corresponds to a large wave number through I(r, q). The corresponding features are very small, as shown in Fig. 7 (a), because S(q) in the high-q area decays as q<sup>-4</sup>. Function-fitting-based techniques such as IFT cannot handle such small components, whereas stochastic techniques such as the proposed method take small probabilities into account.</p>
        <p>Pattern 4 has one large peak at an intermediate grain size. Pattern 4 is so simple that the estimation is easy. Indeed, both the ML and IFT results are very accurate. However, the ML results are more accurate than the IFT results.</p>
        <p>Two comparable peaks appear close together in Pattern 5. The IFT results do not detect these two peaks; instead, one peak appears between them. In contrast, the ML results detect both peaks accurately.</p>
        <p>Pattern 6 has three peaks. As with Pattern 5, the IFT results did not extract the three peaks, whereas the ML results did.</p>
        <p>The input SAS patterns in (a) look quite similar to the human eye. Therefore, material scientists have to make an effort to find their differences, which reflect radical changes in the grain-size distribution. According to the results, the proposed method is helpful and reliable. This shows that the SAS experiment can become more useful for observing the microstructures of materials.</p>
        <p>Figure 9 plots the processing time of the pattern estimation. For this experiment, a computer with an Intel(R) Core(TM) i3-4150 CPU (3.50 GHz) and 11 GB of RAM running CentOS was used. The implementation is based on Python 3.6.5, and the numpy library (Oliphant 2006) is used to improve the efficiency of the process.</p>
        <p>The proposed method takes around 1.2 seconds, which is much shorter than the experimental time of SAS (for neutron scattering, around 20 minutes). In comparison, IFT takes around 6.0 seconds, 5 times as long as the proposed method. IFT is not much slower in absolute terms; however, this difference can become important if material science researchers have to conduct many iterations during trial-and-error experiments. This shows that the proposed method is quite useful for SAS data analysis.</p>
        <p>According to the results, the proposed method enables the grain-size distribution to be estimated accurately. IFT makes large errors when the grain size is small, whereas the proposed method works well in such cases. In actual situations, we cannot know in advance whether the grain size of a sample is large enough for IFT to be applicable. Therefore, IFT requires much effort by material scientists, but the proposed method does not. This shows that the proposed method is suitable for automatically processing SAS patterns.</p>
      </sec>
      <sec id="sec-3-6">
        <title>Experiment 2: actual measurements</title>
        <p>In Experiment 2, neutron SAS patterns of a polystyrene ball sample (radius 18 nm) and a silica ball sample (radius 25 nm) were examined. Figure 10 shows the results (panels (a), (b), and (c) are the same as in Experiment 1). The SAS patterns are noisier than those of Experiment 1.</p>
        <p>The most frequent radius in (b) and (c) is around the true radius of the sample. This shows that both the proposed method and IFT can be used. The difference between the ML results and the IFT results is that small peaks appear at integer multiples of the true radius. This is considered to be because clusters of multiple balls are detected.</p>
        <p>The results show that the proposed method is feasible for actual SAS pattern analysis. Moreover, small behaviors inside the material might be observable. This implies that the proposed method can extract information leading to new knowledge.</p>
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Conclusion and Future Works</title>
      <p>An expectation-maximization (EM)-based grain-size distribution estimation method was proposed for automatically analyzing small-angle scattering (SAS) patterns. Experimental results showed that the proposed method can accurately estimate the original grain-size distribution from SAS patterns. Moreover, the proposed method does not require parameter tuning to obtain good results, whereas the existing method (Indirect Fourier Transform) does.</p>
      <p>The stochastic model underlying the proposed method does not assume priors. With priors, however, the estimation might become more accurate, and fewer detection events might be required to estimate the grain size. In addition, non-ball scattering bodies should be taken into account. Such extensions are possible future work.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          <string-name>
            <surname>Asahara</surname>
            ,
            <given-names>A.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Morita</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Mitsumata</surname>
            ,
            <given-names>C.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Ono</surname>
            ,
            <given-names>K.</given-names>
          </string-name>
          ;
          <string-name>
            <surname>Yano</surname>
            ,
            <given-names>M.</given-names>
          </string-name>
          ; and
          <string-name>
            <surname>Shoji</surname>
            ,
            <given-names>T.</given-names>
          </string-name>
          <year>2019</year>
          .
          <article-title>Early-stopping of scattering pattern observation with bayesian modeling</article-title>
          .
          <source>In Proceedings of the AAAI Conference on Artificial Intelligence</source>
          , volume
          <volume>33</volume>
          ,
          <fpage>9410</fpage>
          -
          <lpage>9415</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          <string-name>
            <surname>Bishop</surname>
            ,
            <given-names>C. M.</given-names>
          </string-name>
          <year>2006</year>
          .
          <article-title>Pattern Recognition and Machine Learning</article-title>
          . New York: Springer.
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          <string-name>
            <surname>Demoment</surname>
            ,
            <given-names>G.</given-names>
          </string-name>
          <year>1989</year>
          .
          <article-title>Image reconstruction and restoration: overview of common estimation structures and problems</article-title>
          .
          <source>IEEE Transactions on Acoustics, Speech, and Signal Processing</source>
          <volume>37</volume>
          (
          <issue>12</issue>
          ):
          <fpage>2024</fpage>
          -
          <lpage>2036</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          <string-name>
            <surname>Donoho</surname>
            ,
            <given-names>D. L.</given-names>
          </string-name>
          <year>2006</year>
          .
          <article-title>Compressed sensing</article-title>
          .
          <source>IEEE Transactions on Information Theory</source>
          <volume>52</volume>
          (
          <issue>4</issue>
          ):
          <fpage>1289</fpage>
          -
          <lpage>1306</lpage>
          .
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          <string-name>
            <surname>Higgins</surname>
            ,
            <given-names>J. S.</given-names>
          </string-name>
          , and
          <string-name>
            <surname>Benoît</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          <year>1994</year>
          .
          <article-title>Polymers and neutron scattering</article-title>
          .
          <source>Clarendon press Oxford.</source>
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>