<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>A First-Order Theory of Film Scores for Generation from Lightweight Specifications</article-title>
      </title-group>
      <contrib-group>
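        <contrib contrib-type="author">
          <name>
            <surname>Young</surname>
            <given-names>Halley</given-names>
          </name>
          <xref ref-type="aff" rid="aff1" />
        </contrib>
        <aff id="aff1">
          <label>1</label>
          <institution>Department of Computer Science, University of Pennsylvania</institution>
          ,
          <addr-line>PA 19104</addr-line>
          ,
          <country country="US">USA</country>
        </aff>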
      </contrib-group>
      <abstract>
        <p>This paper proposes a formal theory of the way film scores operate, for the purpose of enabling semiautomatic generation. Among the contributions is a formalization of the entire generation process as a bit-vector-array satisfiability problem, an approach to music generation not taken in many previous papers. The paper also formalizes the idea of "thematic" and "stylistic" time-dependent variables and their inherited constraints in specification-driven generation. To make the result more coherent, the paper formalizes a regular-expression-like grammar of melodic contour. Synthesizing these contributions, the result is a program that takes a lightweight specification of the relevant information in each scene of a film and produces a coherent and appropriate score to accompany it.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>Introduction</title>
      <p>The film industry turns out over $200 billion of films
every year, and a substantial portion of that is spent
creating appealing film scores ((?)). According to
studies by Stuart Fischoff, himself both a film writer and
scholar of media psychology, music scores to films
account for much of our understanding of the emotional
impact of film scenes as well as characterizations of
different characters, locations, and events ((?)). However,
there has not been substantial research on formalizing
the way that film scores produce these effects. While
there has been some research on producing film scores
semi-automatically, these approaches are either
statistical (and suffer the same limitations as most
deep-learning-based music, such as lack of memorable
material or global structure), or do not include a background
theory of film composition, and thus require extensive
manual specification by the composer ((?) (?)). This
study proposes a formal theory of film music, written
in a decidable fragment of first-order logic. The theory
allows for the generation of appropriate film music from
lightweight annotations.</p>
      <p>We propose an algorithm for generating film scores from
lightweight annotations. The input to this algorithm is
a specification. A specification contains an arbitrary
number of lines, each of which contains a list of
variables, a specified duration in seconds, and,
optionally, a description of the scene (used only for
documentation and not by the algorithm). Variables can
be either stylistic variables (defined in the theory of film,
as per the appendix) or thematic variables (defined only
in the universe of the specific film). For instance,
in the following specification of a short manufactured
example, "Jose" (a character) is a thematic variable,
while "flamenco" (a well-defined style of music found
in Spain) is a stylistic variable, as are "suspense" and
"happy":</p>
      <sec id="sec-1-9">
        <list list-type="simple">
          <list-item><p>Jose grew up in Spain. {Jose, Flamenco} 8</p></list-item>
          <list-item><p>Then he moved to New Orleans. {Jose, Zydeco} 10</p></list-item>
          <list-item><p>It was there that he met Sally. {Sally, Zydeco} 4</p></list-item>
          <list-item><p>Sally was the most beautiful person he'd ever met. {Sally, romantic} 6</p></list-item>
          <list-item><p>They got married and moved back to Spain, where they had a child. {Jose, Sally, child, Flamenco} 6</p></list-item>
          <list-item><p>But then an alien invasion came, and infected the child. {child, horror, alien, suspense} 8</p></list-item>
          <list-item><p>In the end, Jose and Sally ended up having to go to space to beg the alien king to save their child. {Jose, Sally, child, alien, suspense} 8</p></list-item>
          <list-item><p>The alien king was touched by their plea, and their child was saved. {suspense, happy, alien, child} 6</p></list-item>
          <list-item><p>They all lived happily ever after. {Sally, Jose, child, happy} 6</p></list-item>
        </list>
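        <p>The output of the algorithm is music which conforms
to this specification, in that, for a sequence of variable
lists <inline-formula><tex-math>S_1, \ldots, S_n</tex-math></inline-formula> and durations
<inline-formula><tex-math>a_1, \ldots, a_n</tex-math></inline-formula>, the variables in
<inline-formula><tex-math>S_i</tex-math></inline-formula> are present in the music from timestamp
<inline-formula><tex-math>\sum_{j=1}^{i-1} a_j</tex-math></inline-formula> to timestamp
<inline-formula><tex-math>\sum_{j=1}^{i} a_j</tex-math></inline-formula>. (A variable being "present" is defined
below.) Furthermore, the generated music is "musically
coherent" (also defined below).</p>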
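        <p>The timing rule above can be made concrete with a small Python sketch. It is only an illustration: the function name <monospace>scene_spans</monospace> and the (start, end) tuple layout are assumptions rather than part of the described system, and the durations are those of the example specification.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from itertools import accumulate

# Durations (in seconds) of the nine scenes in the example specification.
durations = [8, 10, 4, 6, 6, 8, 8, 6, 6]


def scene_spans(durations):
    """Return a (start, end) pair, in seconds, for every scene.

    A scene starts at the sum of all earlier durations and ends at
    that sum plus its own duration (hypothetical helper).
    """
    ends = list(accumulate(durations))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


spans = scene_spans(durations)
for index, (start, end) in enumerate(spans, start=1):
    print(f"scene {index}: {start}s to {end}s")
```
]]></preformat>
        <p>Under these assumptions the nine example scenes cover 62 seconds in total, and the third scene ({Sally, Zydeco}) occupies seconds 18 to 22.</p>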
      </sec>
    </sec>
    <sec id="sec-4">
      <title>Building on SMT solvers - the Set-Theoretic Universe of Film Score Theory</title>
      <p>Generating the basic building blocks of a film score
involves determining a set of values of "mid-level"
musical variables at every moment in time. Some of these
properties are completely independent (rhythmic
density and harmonic progression), while others have
mutual constraints (the number of rhythmic values must
be the same as the number of pitch values). Some
properties depend on the duration of time in which they are
being used (for example, it is not realistic to have a full
Andalusian progression in 2 seconds or less).</p>
      <sec id="sec-4-1">
        <title>Musical Types and Values</title>
        <p>The mid-level universe consists of a set
<inline-formula><tex-math>T = \{t_0, \ldots, t_n\}</tex-math></inline-formula> of
types of mid-level variable, such as "harmonic rhythm"
(the rate at which the chords change), "rhythmic
density" (the average duration of a single note in the
melody), or "has unpitched percussion". These types
are associated with their range of values, which can
be Boolean (e.g. whether or not there is an ostinato),
a bounded integer (e.g. the degree of tension), a
finite set (e.g. the list of possible chord progressions),
or a list of bounded integers of bounded size (e.g. the
melodic contour). In principle this can be extended to
bounded floating-point numbers, but for simplicity
bit-vectors were used. Boolean variables in the SMT logic
are modeled simply as Boolean SMT variables, bounded
integers are modeled as bit-vectors, sets are modeled as
1-hot bit-vector variables, and lists of integers are
modeled as a tuple of a fixed-size array of bit-vectors of
some maximal size <inline-formula><tex-math>k_{\max}</tex-math></inline-formula> and a value
<inline-formula><tex-math>k_{\mathrm{actual}} &lt; k_{\max}</tex-math></inline-formula>
such that all entries with indices greater than
<inline-formula><tex-math>k_{\mathrm{actual}}</tex-math></inline-formula> are ignored.</p>
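        <p>One way to picture these encodings, outside of any SMT solver, is the following Python sketch. The helper names (<monospace>one_hot</monospace>, <monospace>is_one_hot</monospace>, <monospace>pad_list</monospace>, <monospace>decode_list</monospace>) are hypothetical, plain Python integers stand in for bit-vectors, and <monospace>k_actual</monospace> is taken to be the index of the last meaningful entry, following the description above.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def one_hot(index, size):
    """Encode one choice out of `size` options as a 1-hot bit pattern."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    return 1 << index


def is_one_hot(bits):
    """True when exactly one bit of `bits` is set."""
    return bits != 0 and bits & (bits - 1) == 0


def pad_list(values, k_max, fill=0):
    """Encode a non-empty list as (fixed-size array, k_actual).

    k_actual is the index of the last meaningful entry, so it is
    always smaller than k_max; later entries are padding.
    """
    if not 1 <= len(values) <= k_max:
        raise ValueError("list must have between 1 and k_max entries")
    array = list(values) + [fill] * (k_max - len(values))
    return array, len(values) - 1


def decode_list(array, k_actual):
    """Drop the ignored entries (indices greater than k_actual)."""
    return array[: k_actual + 1]


array, k_actual = pad_list([0, 2, 1], k_max=5)
print(array, k_actual, decode_list(array, k_actual))
```
]]></preformat>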
      </sec>
      <sec id="sec-4-2">
        <title>Time-Span Sets and Thematic Sets</title>
        <p>Consider a specification with <italic>m</italic> scenes and <italic>n</italic> total
thematic variables across all of the scenes (<italic>n</italic> can easily be
obtained by counting the number of unique elements
of the <inline-formula><tex-math>S_i</tex-math></inline-formula> that are not designated as "stylistic" by the
theory). Under these assumptions, there will be a set
<inline-formula><tex-math>M_0, \ldots, M_m</tex-math></inline-formula> of sets of mid-level properties corresponding
to each scene. These sets will be total in the sense that
for every musical type <inline-formula><tex-math>t</tex-math></inline-formula> in
<inline-formula><tex-math>T</tex-math></inline-formula>, one possible value of
<inline-formula><tex-math>t</tex-math></inline-formula> will be in
<inline-formula><tex-math>M_i</tex-math></inline-formula>. There will also be a set
<inline-formula><tex-math>N_0, \ldots, N_n</tex-math></inline-formula> of sets of
mid-level properties corresponding to each theme. However,
these sets will not be total: some sets may, for instance,
include one element of the set of possible chord
progressions but no possible value of rhythmic density, while
another may contain a possible valuation of rhythmic
density but not of chord progressions. This is because,
as film music scholar Andrew Powell acknowledges, a
leitmotif (the musical elements that together make up
a "theme", or a memorable gestalt which can appear in
various versions but still be recognizable) can be one
of a variety of musical markers which "serve[s] to
distinguish a character, idea, or symbol", rather than
one or several necessary and sufficient musical
characteristics ((?)).</p>
      </sec>
    </sec>
    <sec id="sec-5">
      <title>Axioms of the Formal Theory</title>
      <p>Axioms of the formal theory relate the specification
of a film score to constraints on its attributes. These
constraints include formal definitions of the high-level
stylistic elements that are described in annotations,
rules regarding the existence of a well-defined
leitmotif when appropriate, some basic rules which establish
musical coherence, and ontological claims which relate
to the logically necessary relationship between various
variables.</p>
      <sec id="sec-5-1">
        <title>Stylistic</title>
        <p>Stylistic axioms tie the stylistic definitions assumed by
annotators to mid-level variables. Unfortunately, there
is a shortage of academic papers on the specific musical
attributes associated with broader words
encompassing ideas such as genre or emotion. Where
possible, I took definitions from academic sources, including
((?) (?)). However, in some instances it was necessary
to simply survey the non-academic resources available
((?)).</p>
        <p>Below are a few of the definitions used:</p>
        <list list-type="order">
          <list-item><p>"Flamenco" style implies at least three of: the use of
a flamenco percussive pattern, the use of castanets
and clapping as percussive instruments, the use of
guitar, the use of Phrygian mode, and the use of an
Andalusian chord progression.</p></list-item>
          <list-item><p>"Horror" implies at least three of: existence of a
repeating ostinato, use of dissonance, use of a
chromatic chord progression or chord transformation, use
of low register.</p></list-item>
          <list-item><p>"Jazz" implies: the use of a stereotypically jazz
percussive pattern, use of dominant seventh chords, and
use of a high level of syncopation.</p></list-item>
          <list-item><p>"Happy" implies: the use of a major scale, and
either the use of a fast tempo or the use of a high
register.</p></list-item>
        </list>
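        <p>The "at least three of" pattern in these definitions amounts to a cardinality check over Boolean mid-level variables. A minimal Python sketch follows; the feature names and function names are hypothetical labels chosen for illustration, not identifiers from the actual theory, and a scene is modeled as a plain dictionary.</p>
        <preformat preformat-type="code"><![CDATA[
```python
# Hypothetical names for the Boolean mid-level variables in the
# "Flamenco" definition.
FLAMENCO_FEATURES = [
    "flamenco_percussive_pattern",
    "castanets_and_clapping",
    "guitar",
    "phrygian_mode",
    "andalusian_progression",
]


def at_least(k, flags):
    """True when at least k of the given Boolean flags hold."""
    return sum(1 for flag in flags if flag) >= k


def meets_flamenco_axiom(scene):
    """Check the 'at least three of' condition for Flamenco."""
    return at_least(3, (scene.get(name, False) for name in FLAMENCO_FEATURES))


def meets_happy_axiom(scene):
    """Major scale, plus either a fast tempo or a high register."""
    return bool(
        scene.get("major_scale", False)
        and (scene.get("fast_tempo", False) or scene.get("high_register", False))
    )


scene = {"guitar": True, "phrygian_mode": True, "andalusian_progression": True}
print(meets_flamenco_axiom(scene))
```
]]></preformat>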
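        <p>Note that in practice, these definitions do not ensure
the desired feeling, and indeed in the examples it can be
difficult to discern exactly what style is being evoked.
However, they can be thought of as probably necessary
conditions, such that
<inline-formula><tex-math>P(\mathrm{style}(m) = x \mid m \models \phi)</tex-math></inline-formula>, where
<inline-formula><tex-math>x</tex-math></inline-formula> is a style and
<inline-formula><tex-math>\phi</tex-math></inline-formula> are its related constraints, is much
higher than
<inline-formula><tex-math>P(\mathrm{style}(m) = x \mid m \not\models \phi)</tex-math></inline-formula>. More research
is necessary to determine other variables which would
increase <inline-formula><tex-math>P(\mathrm{style}(m) = x)</tex-math></inline-formula> for various styles.</p>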
      </sec>
      <sec id="sec-5-2">
        <title>Ontological</title>
        <p>Ontological constraints include constraints which are
inherent to the meaning of the different mid-level
variables. For instance, in order for a scene to have "a
violin playing the accompaniment", it is necessary both
for the scene to be accompanied and for the scene
to contain a violin; in order for a scene to have an
"Andalusian cadence" (a specific pattern defined under
the rules of 12-tone tuning and undefined for tunings
where <inline-formula><tex-math>n \neq 12</tex-math></inline-formula>), it is necessary for the tuning to be
12-tone. To be precise, ontological constraints occur when
there exist two variable assignments,
<inline-formula><tex-math>\hat{\phi}</tex-math></inline-formula> and
<inline-formula><tex-math>\hat{\theta}</tex-math></inline-formula>, such
that there is no possible music satisfying the condition
<inline-formula><tex-math>\hat{\phi} \wedge \neg\hat{\theta}</tex-math></inline-formula>. In an end-to-end system, where the entire
music generation could be described as a single SMT
instance, if <inline-formula><tex-math>\hat{\phi}</tex-math></inline-formula> were constrained to be true then the
system would necessarily return a result satisfying
<inline-formula><tex-math>\hat{\theta}</tex-math></inline-formula>;
however, due to tractability issues discussed elsewhere,
it was necessary to decouple the generation of
"mid-level" and "low-level" variables. Therefore, the system
has no a priori knowledge that the variable describing
"Andalusian cadence" is dependent on the variable
describing "12-tone", and such dependencies must be stated as explicit axioms.</p>
      </sec>
      <sec id="sec-5-3">
        <title>Leitmotivic/Thematic</title>
        <p>Another type of constraint concerns making sure
that leitmotifs are associated with the correct theme.
In a major simplification, we assume that the
specification can be cleanly separated into stylistic and thematic
variables, so, while "the alien" might only occur in
sci-fi scenes, it is not itself assumed a priori to have different
defining properties than "the cowboy" (although the
cowboy may only occur in scenes which are designated
as "Western", and thus will also in effect be associated
with this genre). The thematic variables impose
additional constraints on the mid-level variables associated
with each time span, as a musical element can only
represent a theme if it is present in every scene where the
theme is included in the specification, and not present in
any scene where the theme is not included in the
specification. In addition, themes must be unique, and must
be noticeable. In a simplification, this is expressed as
the following constraint on the thematic sets
<inline-formula><tex-math>N_0, \ldots, N_n</tex-math></inline-formula>:
the sets must be completely disjoint, and each must
contain at least two elements. Thus, for any two given
themes <italic>a</italic> and <italic>b</italic>, either the value of type
<italic>t</italic> associated
with <italic>a</italic> is different from the one associated with
<italic>b</italic>, or <italic>a</italic>
includes a value of type <italic>t</italic> while <italic>b</italic> does not.</p>
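        <p>The thematic constraints described above can be checked on concrete data with a short Python sketch. The representation is an assumption made for illustration: each thematic set is a Python set of (type, value) pairs, each scene is a pair of the theme names in its specification line and the (type, value) pairs present in its music, and the type names and values shown are hypothetical.</p>
        <preformat preformat-type="code"><![CDATA[
```python
from itertools import combinations


def themes_are_valid(themes):
    """Thematic sets must be pairwise disjoint with at least two elements each."""
    if any(len(elements) < 2 for elements in themes.values()):
        return False
    return all(a.isdisjoint(b) for a, b in combinations(themes.values(), 2))


def scenes_respect_themes(scenes, themes):
    """A theme's elements appear exactly in the scenes whose line names the theme."""
    for named_themes, present in scenes:
        for name, elements in themes.items():
            if name in named_themes:
                if not elements <= present:
                    return False  # theme is named but not fully present
            elif elements & present:
                return False  # theme material leaks into another scene
    return True


# Hypothetical thematic sets for two characters.
themes = {
    "Jose": {("chord_progression", "andalusian"), ("harmonic_rhythm", 2)},
    "Sally": {("has_ostinato", True), ("rhythmic_density", 5)},
}
# One scene whose specification line names only Jose.
scenes = [
    (
        {"Jose"},
        {
            ("chord_progression", "andalusian"),
            ("harmonic_rhythm", 2),
            ("has_ostinato", False),
            ("rhythmic_density", 3),
        },
    ),
]
print(themes_are_valid(themes), scenes_respect_themes(scenes, themes))
```
]]></preformat>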
      </sec>
    </sec>
    <sec id="sec-7">
      <title>Axioms of Contour - Creating Coherent Motivic Material</title>
      <p>The axioms regarding melodic contour require their own
section, as they are slightly more complex. As
discussed above, rhythmic and melodic contours are lists
of bounded size of bounded integers. A tuple of a
rhythmic contour and a melodic contour forms a "motivic
contour," and this tuple is one of the variables which can
be assigned to a theme. Furthermore, an ordered list of
motivic contours is assigned to each scene, with the
number of elements roughly correlating to the duration
of each scene. Thus, a scene can contain individual
motivic contours corresponding to multiple themes if the
scene duration is large enough.</p>
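      <p>A rhythmic or pitch contour is a list of numbers
<inline-formula><tex-math>C = c_0, \ldots, c_k</tex-math></inline-formula> such that, if
<inline-formula><tex-math>x \in C</tex-math></inline-formula>, then
<inline-formula><tex-math>\forall\, 0 \le j &lt; x,\ j \in C</tex-math></inline-formula>.
The rhythmic (or pitch) contour is associated with the
following constraint on the rhythmic values
<inline-formula><tex-math>r_0, \ldots, r_k</tex-math></inline-formula>:
for all <inline-formula><tex-math>i, j \le k</tex-math></inline-formula>, if
<inline-formula><tex-math>c_i &lt; c_j</tex-math></inline-formula> then
<inline-formula><tex-math>r_i &lt; r_j</tex-math></inline-formula>, if
<inline-formula><tex-math>c_i &gt; c_j</tex-math></inline-formula> then
<inline-formula><tex-math>r_i &gt; r_j</tex-math></inline-formula>, and if
<inline-formula><tex-math>c_i = c_j</tex-math></inline-formula> then
<inline-formula><tex-math>r_i = r_j</tex-math></inline-formula>. Thus, the contour
restricts the relative size of the different durations
without restricting absolute sizes or even size ratios. This
is a standard definition among modern music theorists
(?).</p>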
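      <p>The two conditions in this definition can be written as small Python predicates. This is a sketch under stated assumptions: the function names <monospace>is_contour</monospace> and <monospace>respects_contour</monospace> are hypothetical, contours are lists of non-negative integers, and the example durations are made up for illustration.</p>
      <preformat preformat-type="code"><![CDATA[
```python
def is_contour(contour):
    """True when every integer from 0 up to the largest value occurs."""
    if not contour:
        return False
    return set(contour) == set(range(max(contour) + 1))


def respects_contour(contour, values):
    """True when `values` are ordered pairwise exactly as `contour` is.

    Only the relative order matters; absolute sizes and ratios are free.
    """
    if len(contour) != len(values):
        return False
    for i in range(len(contour)):
        for j in range(len(contour)):
            same_less = (contour[i] < contour[j]) == (values[i] < values[j])
            same_equal = (contour[i] == contour[j]) == (values[i] == values[j])
            if not (same_less and same_equal):
                return False
    return True


# Example: a short-long-medium rhythmic contour with made-up durations.
print(is_contour([0, 2, 1]))
print(respects_contour([0, 2, 1], [0.25, 1.0, 0.5]))
```
]]></preformat>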
      <sec id="sec-7-1">
        <title>Why Motivic Contours?</title>
        <p>Most accounts of melodic ideas involve specific pitches
and rhythms. Arnold Schoenberg was perhaps the first
to explicitly promote the motivic contour as a core
component of a musical idea; however, uses of
constant contoural structures over changing pitches have
been an element of Western music at least since Bach
((?) (?)). For this algorithmic approach, it is useful
to cleanly divide between contoural structure and the
specific pitch and rhythmic elements so that both can
be constrained and assigned values independently. For
instance, a [0, 2, 1] pitch contour can be associated with
any type of scale or chord, thus increasing the size of
the possibility space by an order of magnitude. It could
be fulfilled by a pitch sequence [C4, G4, E4] (a major
triad, or one particular kind of harmony), or [C4, G4,
D4] (a sus4 triad, or a harmony with a different
emotional valence).</p>
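        <p>To see that both pitch sequences share the same contour, one can rank the pitches, as in the following Python sketch. The function name <monospace>contour_of</monospace> is hypothetical, and the MIDI note numbers assume the common convention in which C4 is 60.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def contour_of(values):
    """Replace each value by its rank among the distinct values used."""
    ranks = {value: rank for rank, value in enumerate(sorted(set(values)))}
    return [ranks[value] for value in values]


# MIDI note numbers, assuming C4 = 60.
C4, D4, E4, G4 = 60, 62, 64, 67

major_triad = [C4, G4, E4]
suspended = [C4, G4, D4]
print(contour_of(major_triad), contour_of(suspended))
```
]]></preformat>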
      </sec>
      <sec id="sec-7-2">
        <title>Grammatical Contours</title>
        <p>I introduce a language for describing valid
grammatical contours. This language is based on prior work
by cognitive and computational musicologists. It can
be used to enumerate a sequence of contoural values
which is more likely to sound "musical" than a
contoural sequence generated by randomly choosing
numbers on a given interval in
<inline-formula><tex-math>\mathbb{Z}</tex-math></inline-formula>. Readers can
subjectively compare tunes generated by the two methods
by going to https://www.seas.upenn.edu/~halleyy/random-and-verified-melodies.</p>
        <p>Note that this grammar assumes that the user wants
the theme to be coherent and uphold the sort of
contoural constraints seen in pre-20th century music. This
is not always the case for avant-garde music, nor for
film music in general. An extension of this work would
eliminate the contoural grammar in very specific
scenarios where doing so would create the appropriate effect.</p>
      </sec>
      <sec id="sec-7-4">
        <title>Musicological Antecedents of the Contoural Grammar</title>
        <p>The development of the contoural grammar draws on
work by authors including Larson, Narmour, Ockelford,
and Meredith, all of whom developed melodic theories
((?) (?) (?) (?)). Central to all of their work (whether
under the label of "inertia" in Larson's physics-based
theory or as "compressibility" in Meredith's
computational theory) is the idea of a necessary degree of
repetition and controlled variation, including several
stereotyped methods of variation. Larson also introduces
other operators such as "gravity", or the idea that after
a leap a pitch should tend to fall down, which will be
incorporated into the contoural grammar.</p>
      </sec>
      <sec id="sec-7-7">
        <title>The Rhythmic and Melodic Contoural Languages as Interpreted Subsets of Regular Grammars</title>
        <p>The contoural languages take the following form: a list
of values (with an optional repetition exponent),
references, and transformations on references, followed by
a list of reference valuations. The reference valuations
are lists of values, with an optional repetition exponent.</p>
      </sec>
      <sec id="sec-7-8">
        <title>Examples of Famous Works in Regex form</title>
        <p>The main theme of Mozart's 25th piano concerto is
one of the most beloved melodies in classical music.
Below is the pitch contour of the theme:</p>
        <preformat>[0, 0, 0, 1, 1, 2, 2, 3, 1, 3, 5,
4, 4, 3, 3, 2, 2, 2, 2, 3, 3, 4, 4,
5, 1, 3, 5, 3, 3, 2, 2, 1]</preformat>
        <p>This can be read as the interpretation of the following
regex:</p>
        <preformat>(012(ui1)(m0)(uu1)2(i1))
((0, 0, 0), (1, 1, 2, 2, 3), (1, 3, 5))</preformat>
        <p>In English, this can be interpreted as "the first pattern
(0,0,0) followed by the second pattern (1,1,2,2,3)
followed by the third pattern (1,3,5) followed by the
second pattern transposed to start at the current value
and inverted (4,4,3,3,2) followed by the first pattern
transposed up two levels followed by the third pattern
followed by the inverted second pattern".</p>
        <p>Similarly, the rhythmic contour of the main theme
of Smetana's Moldau has the following pattern:
This can be read as the interpretation of the following
regex:
In English, this can be interpreted as "the first pattern
(itself a tri-fold repetition of a simple pattern), followed
by a tri-fold repetition of the value 2, followed by the
second pattern, followed by a retrograde of the first
pattern, followed by the second pattern with the third
value augmented."</p>
      </sec>
      <sec id="sec-7-10">
        <title>Necessary Constraints on Melodic Contours - Axioms of "Coherence"</title>
        <p>According to several authorities cited above, coherent
music necessarily involves a substantial (but not
excessive) amount of repetition, and specifically varied repetition.
It is thus necessary to constrain the valuations of the
regex. The following constraints were imposed:</p>
        <list list-type="order">
          <list-item><p>If the melody is of sufficient length (&gt; 6 seconds),
each of the patterns has to either be used in its
original form at least once and then used in some other
form twice, or used twice in its original form.</p></list-item>
          <list-item><p>If the melody is very short (&lt; 4 seconds), only one
pattern can be used, and if it is somewhat short (&lt; 6
seconds), only two patterns can be used.</p></list-item>
        </list>
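        <p>In addition, as per Larson's description of the
consequences of "gravity" and "inertia", there was a
constraint on what can follow a leap (contoural values
<inline-formula><tex-math>x_i</tex-math></inline-formula> and
<inline-formula><tex-math>x_{i+1}</tex-math></inline-formula> such that
<inline-formula><tex-math>|x_{i+1} - x_i| &gt; 4</tex-math></inline-formula>), namely that a
leap has to be followed by a "step"
(<inline-formula><tex-math>|x_{i+2} - x_{i+1}| &lt; 2</tex-math></inline-formula>)
in the opposite direction.</p>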
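        <p>The leap rule can be expressed as a simple check over consecutive contoural values, as in the Python sketch below. The function name <monospace>leaps_resolve</monospace> is hypothetical, and two points are assumptions of this sketch rather than statements from the theory: a leap on the final interval is accepted because nothing follows it, and the resolving step must be a nonzero move.</p>
        <preformat preformat-type="code"><![CDATA[
```python
def leaps_resolve(contour, leap=4, step=2):
    """True when every leap is followed by a step in the opposite direction.

    A leap is a move larger than `leap`; a step is a nonzero move
    smaller than `step` (both thresholds taken from the text).
    """
    for i in range(len(contour) - 2):
        jump = contour[i + 1] - contour[i]
        if abs(jump) <= leap:
            continue  # not a leap, nothing to check
        following = contour[i + 2] - contour[i + 1]
        is_step = 0 < abs(following) < step
        opposite = following * jump < 0
        if not (is_step and opposite):
            return False
    return True


print(leaps_resolve([0, 5, 4]))  # upward leap, then a step down
print(leaps_resolve([0, 5, 6]))  # upward leap, then a step up
```
]]></preformat>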
        <p>Note that the only tested melodies were at most
15 seconds long (about 8 bars, which is typical of
an antecedent-consequent style Classical theme), which
significantly reduced the possible complexity of the
melodies. More research is necessary in order to achieve
coherence across larger time spans.</p>
        <sec id="sec-7-10-1">
          <title>Mid-level materials to notes</title>
          <p>To generate notes from the values (including
contoural values and mid-level variables) output by the
SMT solver in the first pass, further satisfiability and
constrained optimization problems were constructed.
First, generating a rhythm from the rhythmic contour
was framed as constrained optimization: the contoural
constraints had to be respected for
all <italic>i</italic> and <italic>j</italic>
(<inline-formula><tex-math>\mathrm{length}(x_i) &gt; \mathrm{length}(x_j)</tex-math></inline-formula> if and only if the
<italic>i</italic>th contoural value was greater than the <italic>j</italic>th), the total
length of the rhythmic pattern was constrained to be
the specified length of the scene, and a solution
was sought that maximized the sense of meter. After
the rhythm was generated, melody was generated with
constraints maintaining contoural values and definitions
of chord progressions and scales (each pitch modulo 12
has to be either in the respective chord, in the respective
scale and in between two notes less than 3 semitones
apart, or in between two notes less than two semitones
apart). Finally, the accompaniments were chosen to
match the instrumentation, dissonance level, thickness,
spacing, etc. of the other mid-level variables.</p>
        </sec>
      </sec>
      <sec id="sec-7-11">
        <title>Restriction sequences and tractability</title>
        <p>The method of determining first the mid-level variables,
then rhythmic values, then pitch values, and finally
accompaniment and timbral features in a series of
disjoint SMT instances suggests an interesting avenue of
research. In the first implementation, features were to
be generated all at once in a single constrained
optimization instance. However, the search space was
apparently far too large for the optimization to
terminate. Even the difference between separating duration
and pitch generation vs. determining them together
was decisive in determining feasibility (separate runs
proved tractable while joint generation was not). One
could understand the ordering of variables to be
synthesized as a sequence of operations, each of which further
restricts the search space over all possible melodies.</p>
      </sec>
    </sec>
    <sec id="sec-8">
      <title>Empirical Results</title>
      <sec id="sec-8-1">
        <title>Generation from random scores</title>
        <p>A suite of 30 random film specifications was generated
by assigning fixed probability distributions over seeing
a given style over each scene's timespan, joint
probabilities over theme variables, and a fixed distribution of
number of scenes. According to this analysis, 20.0% of
randomly generated film specifications were satisfiable. In
contrast, all three handcrafted synthetic film
specifications and all three handcrafted specifications for existing
films were satisfiable. This discrepancy suggests that
the distribution of styles and thematic material in real
films is non-uniform.</p>
      </sec>
      <sec id="sec-8-2">
        <title>Generation from hand-crafted film specification</title>
        <p>Three stories of lengths 27-62 seconds were
handcrafted for the sake of this research. They were
intended to be realistic, but also erred on the side of
having a large amount of thematic and stylistic variety. It
took an average of 189.6 seconds for the process of
generation. Each of these film specifications had a
satisfying generation. The reader is free to evaluate the
results at https://www.seas.upenn.edu/~halleyy/synthetic-film-score-generation. In particular, it
is worth noting the drastic difference in quality between
the example where a composer manually wrote the piece
but constrained herself to use the generated mid-level
variables (example 1), and where the end-to-end system
was used. This suggests that the mid-level generation
may be more robust than the middle-to-low-level
system.</p>
      </sec>
      <sec id="sec-8-3">
        <title>Generation from lightweight annotation of existing film</title>
        <p>Three short film clips of length 28-62 seconds were
chosen, and specifications were written for each. In
practice, it took less than five minutes to create each
specification, suggesting that specification creation itself is
not a limiting factor. Each of these film specifications
had a satisfying generation. The reader is free to
evaluate these results at https://www.seas.upenn.edu/~halleyy/real-film-score-generation.</p>
      </sec>
    </sec>
    <sec id="sec-9">
      <title>Future Work</title>
      <p>One of the appealing features of this approach is the
lightweight nature of the specifications - it took less
than 5 minutes for the author to write up the
annotations for a real-world 46-second scene, which according
to several internet forums would be viewed by a
professional composer as a task deserving of $375-750 ((?)).
However, the opportunity to expand the specification
could decrease the gap in expressivity between music
generated automatically and music written by
professional composers. For instance, in this approach there
is a clear and simple distinction between stylistic and
thematic variables. However, in a more expressive
language, it would be possible to explicitly associate
certain properties with characters as well as how those
properties change over the course of the film. In
addition, several film theorists have suggested that
leitmotifs can be changed in a very deliberate manner through
the span of the movie so as to suggest character
development, a very important expressive possibility that is
completely absent in this work.</p>
      <p>Long-term structure is notoriously hard in film
music, and in music in general. This approach incorporates
long-term structure in that there are recurring
leitmotifs and in that each individual scene is scored using
a principled musicological approach, but the sense of
continuity between scenes is still significantly less than
one would typically find in most music. An
improvement on this approach would be to include constraints
on the distances between subsequent scenes, although
the nature of these constraints is not obvious.</p>
      <p>Due to the nature of SMT solvers, the valuations of
each mid-level variable are not independent across
executions of this algorithm (even on completely different
scripts). This is a deficit because, as discussed above,
composers ideally would like their material to sound
relatively unique. Furthermore, the algorithm is not
stochastic, as the output is determined by the
heuristics used by the SMT solver. Thus, it is difficult to
obtain a diverse list of possible outputs from a single
specification. The most obvious solution to this is to
use a Uniform-SAT module to maximize the
independence between executions. However, at this time the
number of SAT clauses is too large to apply
Uniform-SAT. Optimizations which either reduce or modularize
the number of SAT clauses could make this approach
feasible, which would drastically increase the appeal of
this algorithm.</p>
    </sec>
    <sec id="sec-10">
      <title>Areas for Collaboration</title>
      <p>As mentioned above, there is not sufficient academic
literature on what makes something sound
"underwater" or "eerie." Collaborating with music experts could
prove useful in developing more precise and accurate
definitions, as could partnering with HCI experts who
work on learning conditional user preferences.</p>
      <p>In particular, collaborating with film composers
could provide much-needed feedback on the approach
as well as benchmarks to compare to and specific advice
on areas for improvement within the algorithm.</p>
      <p>Collaborations with experts in SMT solving could
prove as rewarding as the interactions with film
composers. This is because the approach is fundamentally
limited by what is tractable to compute, and
significant sacrifices were made in the name of efficiency
(namely deciding rhythmic contour/harmonic
progression, rhythm, and pitch as three separate steps). If it
were tractable to produce end-to-end systems, we would
avoid issues such as pitches and rhythms being unable
to fit a given harmonic progression well.</p>
    </sec>
    <sec id="sec-11">
      <title>Conclusion</title>
      <p>In conclusion, this paper proposes a logical theory of
film scoring, as well as a theory for creating and
verifying coherent melodic contours. Empirical studies
suggest that film scores do have significant structure and
that this method may be promising. User studies in the
future could enhance the impact of this research.</p>
    </sec>
  </body>
  <back>
    <app-group>
      <app id="app1">
        <title>List of Stylistic Terms</title>
        <list list-type="order">
          <list-item id="ref1"><p>Suspense (defined by a high tension level, resulting from some combination of having an ostinato, tremolos, dynamic contrast, chromaticism, rising contour, and dissonance)</p></list-item>
          <list-item id="ref2"><p>Relaxed (defined by having a low tension level)</p></list-item>
          <list-item id="ref3"><p>Zydeco (defined by having three of the following: flamenco percussion, guitar, Andalusian cadence, and Phrygian mode)</p></list-item>
          <list-item id="ref4"><p>Americana (defined by using 3 of 4 of major scale, harmonica, I-IV-V progression, and washboard drum pattern, as well as lack of synth-based sounds)</p></list-item>
          <list-item id="ref5"><p>Sci-fi (defined by having two of three of synth-based sounds, ostinatos, and modes of limited transposition)</p></list-item>
          <list-item id="ref6"><p>Jazz (defined by having dominant seventh chords, a jazz-kit-based rhythm, and electric guitar or other stereotypically jazz instruments)</p></list-item>
          <list-item id="ref7"><p>Romance (defined by 3 of 4 of major key, moderate rhythm, high pitch, string instruments)</p></list-item>
          <list-item id="ref8"><p>Happy (defined by 3 of 4 of major key, fast rhythm, high pitch, consonance)</p></list-item>
          <list-item id="ref9"><p>Sad (defined by 3 of 4 of minor key, slow rhythm, low pitch, dissonance)</p></list-item>
          <list-item id="ref10"><p>Underwater (defined by use of marimba or whole-tone scale)</p></list-item>
        </list>
      </app>
    </app-group>
  </back>
</article>