=Paper= {{Paper |id=Vol-1348/maics2013_paper_12 |storemode=property |title=A System Description of P^4: Possible Punctuation Points Parser |pdfUrl=https://ceur-ws.org/Vol-1348/maics2013_paper_12.pdf |volume=Vol-1348 }} ==A System Description of P^4: Possible Punctuation Points Parser == https://ceur-ws.org/Vol-1348/maics2013_paper_12.pdf
                                      A System Description of P^4:
                                    Possible Punctuation Points Parser
                                          Thomas Boehnlein and Jennifer Seitzer


                                                 Department of Computer Science
                                    University of Dayton, 300 College Park, Dayton, OH 45469

                                                                  asked without the need of specifically stating any
                                                                  punctuation marks. However, punctuation marks are
                           Abstract                              necessary for efficient processing of the text as these visual
  We present a Natural Language Understanding (NLU)               and auditory cues that occur in conversations are
  implementation that automatically inserts punctuation           completely removed for the reader of a mobile voice-to-
  marks into a sequence of words to create a group of one or      text transcription. The lack of automatic punctuation
  more syntactically correct sentences.         The software,     insertion in mobile systems presents a usability gap for
  Possible Punctuation Points Parser (P^4) provides the           either the speaker being forced to speak each punctuation
  ability for the user to input a string of words to process,     mark or the reader having to do without the punctuation
  performs the punctuation possibilities, and then provides       marks. The hope for this work is to contribute to the
  several visualizations to illustrate how the software arrived   improvement of mobile voice-to-text systems by
  at its final solution. P^4 uses a chart parsing algorithm       processing the data after the transcription is performed then
  combined with a search algorithm that creates data              inserting new punctuation information.           The system
  visualization structures. A potential application of this       presented here, P^4, is an operable small-scale
  software is to serve as a formidable starting point for         implementation that is a step toward achieving this goal.
  automatic punctuation mark insertion during voice-to-text
  conversion found on many mobile platforms.
                                                                              Overall Architecture of P^4
                                                                  The main facets of this work include:
                      Introduction
Natural language understanding (NLU) manipulates and               the definition and implementation of an efficient parsing
internally represents natural language to perform                   system
algorithmic linguistic operations for the purpose of               the implementation of a novel search algorithm that
understanding. This NLU system is a work in progress                allows for efficient processing of a large search space of
and a result of the first author’s Master’s software project        possible punctuation mark insertions
that automatically punctuates a string of words into
syntactically correct sentences. Semantic analysis and             the implementation of an interface to help visualize the
representation are not addressed here. Nor have any                 process of both parsing and searching while inserting
probabilistic methods based on usage history been utilized          possible punctuation marks
in the system. Instead, a syntactic approach is used by
organizing and identifying the words and their syntactic          Overview of Parsing System
relationships to one another by using a chart parsing
algorithm [1]. Much work has been done in chart parsing           The parser is a chart parser that employs a type of dynamic
[1][2]. Work has also been done in prediction of                  programming. Chart parsers have the benefit of reusing
punctuation in NLU [3][4]. This is a proof of concept of          sub-parses to avoid backtracking [1]. This is accomplished
applying the chart parsing algorithm to automatic                 by doing both a top-down and a bottom-up parse in
punctuation, now to be applied to voice-to-text systems.          parallel. Both time and space complexity is polynomial
                                                                  which is important given the multitude of parses performed
The current interfaces typically provided by voice-to-text        while navigating the punctuation mark search space. The
systems awkwardly require the user to explicitly state            parsing output is then analyzed to find the best parses that
individual punctuation marks during the process of voice-         are complete (since partial parses are part of the output of
to-text conversion. This method is used most notably by           the chart parser). A selection method is then used to rank
Apple’s iOS [5]. Explicitly stating punctuation marks is          the parses and choose the best. The grammar used by the
unnatural because people generally do not interact with           parser is read into the system from a text file and converted
each other verbally in that way. Instead, they rely on            into appropriate data structures for later use. The user is
conversational, visual, and intuitive cues. Speech patterns
                                                                  able to see all of the rules for guidance of valid input into
indicate when a sentence ends or if a question is being
                                                                  P^4 and can modify the rules as needed.
Overview of Searching System                                       and Parse System cyclically interact to create and validate
                                                                   numerous punctuation-possibilities. When the search
The number of permutations of possible punctuation                 system ultimately chooses the best one (“the result”), both
schemes grows at an exponential rate of O(mn) where m is           systems send data to the visualization system for the output
the number of punctuation points available (plus                   of the result and the meta-information of how that result
whitespace) and n is the number of words in the list. For          was achieved.
the word list “good morning mom how are you” using four
(plus one) possible punctuation marks 15625 permutations
of punctuation possibilities exist. If the list grows to
fifteen words, over 30 billion punctuation permutations                           Implementation of P^4
exist.                                                             In this section, implementation details of P^4 are explained
In general conversation, one typically uses a much larger          by describing the constituent classes, data structures, and
vocabulary than six words and requires more than four              algorithms of the three sub-systems: the parser, the search
punctuation marks. Thus, trying to exhaustively generate           engine, and the visualization tool.
all possible combinations quickly becomes prohibitive, and
an efficient search algorithm that prunes much of the
search space is necessary. Fundamentally, the search               Parsing System Implementation
algorithm does this by ignoring the parts of the search            It is the responsibility of the parsing system to analyze the
graph that do not form valid sentences which is                    individual strings (punctuation possibilities) that are
ascertainable by using simple graph accessibility                  generated by the search engine of P^4. It is a chart parser
techniques described later.                                        [1] made up of the classes ParseEdge, ParseGrammar and
                                                                   ParseParser. A chart parser differs from a traditional left-
Overview of Visualization System                                   to-right recursive descent parser by avoiding backtracking
Visualization methods are included in P^4 to gain insight          [2]. That is, it expands all possible production paths
into system operation, to assess the efficiency of the             initially and stores them in the data structure called a chart
algorithms, and to diagnose failure points. NodeXL,                along with meta-information of how the chart was created.
created by the Social Media Research Foundation, was
used for all of the visualization controls [6]. The library is     A chart can be considered a subgraph of the parse space
highly optimized for visualizing complex graphs that may           made up of instances of the data structure called an edge.
need to be organized using a variety of different node             That is, a chart is a collection of edges. In particular, the
layout algorithms. This is important since the punctuation         chart is made up of a vertex before the first word, in-
mark search space is extremely large. In this system, plots        between each word and after the last word. This results in
visualize the navigation of the search space and how the           n+1 vertices for every collection of n words. These are
final result was parsed.                                           used to define the beginning and ending points of each
                                                                   edge. Intuitively, edges are used to define how a
Overview of Main Operating Loop                                    subsection of the sentence was parsed. An example of a
The interaction of the three sub-systems described above is        chart for the sentence “The boy ate candy.” is shown in
shown in Figure 1. After the user enters the ordered               Figure 2.
collection of words to be punctuated, the Search Engine




                               Figure 1 - Data Flow Diagram of Possible Punctuation Points Parser
                                                                The next function, Scanner, uses a bottom up approach.
                                                                Scanner looks at each edge in a chart to see if the
                                                                remaining words can be used to move the first object from
            Figure 2 Chart Vertices and Edges                   the Needs list to the Haves list. If it can be moved, then a
                                                                new edge is created with the new word(s) and added to the
                                                                appropriate edge list. An example of the new edge creation
“The boy” would be a valid edge in the sentence which           process using the scanner function is shown in Figure 4.
begins at V0 and ends at V2.
In addition to storing the starting point, ending point and
the edge goal, the edge also stores two lists: the Haves and
the Needs. The Haves list represents what the parser has
identified so far. Going back to the “The boy” edge, the
goal of the edge is a sentence. The Haves list has a noun
phrase. But in order to be a sentence, which is the goal,
the edge also needs a verb phrase. So the Needs list
includes a verb phrase. If the edge cannot find a verb
phrase in the remaining sentence, it is called an
incomplete edge. Otherwise, the edge is called a complete
edge. Complete edges always have an empty Needs list.
The edges are implemented by the class ParseEdge.

The ParseParser class operates on a list of edges for each
                                                                        Figure 4 - Execution of Scanner Function
vertex. Edges that end at a particular vertex are stored in
the same list. The ParseParser class organizes the edges
like this to facilitate simultaneous top-down and bottom-up     The final function, Extender, also operates in a bottom-up
parsing using its three main functions: Predictor, Scanner,     manner. It takes an edge X and looks at all other edges in
and Extender. These functions create the edges and new          an edge list that ends where X begins. If the goal of X
edges as the parse proceeds.                                    matches with the first item in the Needs list of one of the
                                                                searched edges, a new edge is created by combining the
The first function, Predictor, uses a top-down approach. It     two edges. The new edge starts at the searched edge and
creates new edges from the needs of the current edge. “The      ends at the end of edge X. Lastly, the matching object from
boy” edge needs a verb phrase. Thus, Predictor creates a        the Needs list is moved to the end of the Haves list. An
new edge that starts at the vertex right after the word “boy”   example of the new edge creation process using the
in the sentence and has a goal of a verb phrase. Each new       extender function is shown in Figure 5.
edge contains the right hand side (RHS) of one of the
available rules in the grammar that has verb phrase on the
left hand side (LHS) of a rule. An example of the new edge
creation process using the predictor function is shown in
Figure 3.




     Figure 3 – Execution of Prediction Function                     Figure 5 - Execution of Extender Function
Searching System Implementation                                  like it.” Moreover, in testing, P^4 only took 0.135 seconds
The Searching System creates all feasible punctuation            to find the final solution on a modern notebook computer.
possibilities. It uses five punctuation marks: period,
comma, semicolon, and question mark, plus whitespace             Visualization System
when no punctuation mark is chosen. The rules that govern        The final sub-system of P^4, the visualization system,
the insertion of punctuation marks are found in the class        shows the results of the previously discussed parse and
ParseGrammar.                                                    search sub-systems. It is responsible for creating the
                                                                 visualizations and tabular results that are shown in the user
Because of the combinatorial complexity of considering all       interface of P^4. This includes the Edge Table, Search Plot,
possible punctuation schemes, ParseSearcher only                 and Parse Plot. The actual drawing of the charts is handled
generates the sections of the search space that have the         by NodeXL.
possibility of containing a valid parse. The decision to stop
                                                                 The Edge Table lists all edges generated by the parser’s
looking for more punctuation insertion points is made by
                                                                 parse chart during an execution. The top of the table
ascertaining the nonexistence of an edge in the chart that       contains a list of the parsed words and the vertices that the
begins at the first vertex and ends at the last vertex and has   edges used. Then each edge is written to the table. An
a sentence, sentence phrase, partial sentence, connected         example of edges as presented in the Edge Table are shown
sentence, or sentence with an ending punctuation, as the         in Figure 6. Each edge entry shows the span of indices that
last item in the Haves list.                                     the edge covers, the goal of the edge, the edge’s Haves
                                                                 objects, the edge’s Needs objects, and the words that the
During each new iteration, P^4 takes advantage of the fact       edge covers. The edge’s goal appears first and is separated
that a chart parser can reuse past partial parses. Thus, as it   from the rest of the parts by “-->”. The Haves and Needs
goes down another level in the search space, it passes           objects are separated by “@”. Finally, the words covered
along the previous parse, copying all of the edges from the      by the edge are separated by “-->” if the edge is complete.
previous parse into the current parse. Thus, during this new     These edges show how the grammar was explored to figure
parse, the parser only has to create the new edges added by      out how to parse the collection of words and punctuation
the new word or punctuation mark. This dramatically              marks. In addition to showing how the parse search space
increases the efficiency of the search system.                   was explored, they are also critical to creating the final
                                                                 parsing chart which will be discussed last.
The final task of the search system is deciding which
collection of one or more valid sentences that has been
generated is the best one. In some cases, it is impossible to      (0, 5) S_END --> S Pun_End --> good
pick the best parse without knowing the context of what            morning , mom .
was said. In such cases, one would need the timing                 (5, 6) Question --> Interrogative @ VP NP
information to decide if a pause meant a comma or a
period. For example, “good morning, mom. how are you?”              Figure 6 - Example of a Complete and Incomplete
is as equally valid as “good morning. mom, how are you?”
P^4 makes no attempt to resolve this ambiguity. To choose
the final result, P^4 selects the parse that created the most    The Search Plot depicts the paths traveled while searching
sentences with the least amount of punctuation while still       for possible punctuation points. It is created by first
remaining valid as a best parse. This is done by rewarding       inserting a root node into the tree which is represented as a
the parse with a large increase in rating per sentence found     cyan square at the top of the tree. A new node is added for
and subtracting a small amount for each punctuation point        each decision point that is made while searching for the
found.                                                           punctuation point placements. Each decision node is
                                                                 colored black. All possible decision outcomes are given a
As an example, consider the following ordered sequence of        node: comma (red), period (purple), question mark (blue),
words: “good morning mom how are you I am okay are               semicolon (green) and no punctuation mark (white) as
you okay I got a telescope for my birthday the telescope is      shown in Figure 7. This creates an n-ary tree where there
good and I like it”. This word sequence generates 7.4E18         are n possible search decision choices. The black circle
possible punctuation permutations if created using a brute       represents this decision point in the tree and allows the
force approach. P^4 required only 200 parses while               shape of the tree to remain visually pleasing.
searching by quickly eliminating branches that would not
result in a solution, and found a correct solution of “good
morning, mom. how are you? I am okay. are you okay? I
got a telescope for my birthday. the telescope is good and I
                                                                punctuation marks found in the final solution are placed in
                                                                the leaf node in a right-to-left manner. Once all Have
                                                                objects derived from the ROOT edge have been completely
                                                                matched, the parse tree is complete and can be displayed to
                                                                the user interface. An example of the Parse Plot is shown in
                                                                Figure 9.




             Figure 7 - Example Search Plot


The Parse Plot is built from the complete edges found in          Figure 9 - Example of Parse Plot
the parser’s parse chart. Not all complete edges in the
parser are part of the final chosen parse. In order to
determine which edges belong and which edges do not, the                     Sample Runs and Results
table of complete edges must be recursively searched and
                                                                In this section, a sample run of P^4 is described. The Input
matched using a bottom up approach relative to the Edge
                                                                to Parse text box is located at the top of the window. It is
Table as illustrated in Figure 8.
                                                                where the user will input the series of words that P^4 will
                                                                use. The sample run text is shown in the Input to Parse text
                                                                box in Figure 10.




                                                                  Figure 10 - Input to Parse Text Box


                                                                Since the current grammar is limited in scope, the user
                                                                must know the rules of the grammar in order to create a
                                                                series of words that can be processed by P^4. To find out
                                                                the terminal and non-terminal rules of the grammar, the
                                                                user can press the Show Grammar button located above
       Figure 8 - Creating Parse Plot from Edge Table           Input to Parse. Sections of the grammar window are shown
                                                                in Figures 11 and 12. These rules can be modified by the
                                                                user since they are stored in a simple text file.
The algorithm finds the correct ROOT edge first by
starting at the bottom of the table then recursively expands
up the table. The objects in the Haves list are matched in a
right-to-left manner to the goal of subsequent edges in the
table. As edge goals are matched to objects in the Haves
list, the goal is flagged as used and cannot be used by other
Haves objects as a potential match. In addition, the Haves
object is also flagged once it has a match so it can no
longer be expanded. If a match cannot be found, it must
be a leaf node of the parse tree. The words and
                                                                  Figure 15 - Portion of Edge Table from Sample Run


                                                               In the highlighted example, the middle edge has a goal of
                                                               S_END. It has completed that goal with Question and
                                                               Pun_Ques. The phrase “how are you?” is located between
 Figure 11 – Part of Terminals from Grammar Window             the indices of 5 and 9 as seen at the top of the Edge Table.
                                                               Next, the Search Plot displays how P^4 efficiently
                                                               navigated the search space of possible punctuation
                                                               permutations. The plot is shown in Figure 16.




 Figure 12 – Part of Non-Terminals from Grammar Window


Below the Input to Parse text box is the Optimal Solution
text box. The best possible fully punctuated parse that P^4
could find will be written here after the search is complete
as shown in Figure 13.


                                                                      Figure 16 - Search Plot from Sample Run


Figure 13 - Optimal Solution Text Box                          To the right of the Search Plot is a text box showing the
                                                               amount of time it took for the search to conclude. It allows
                                                               the user to track the increase of time as the parses become
Running the sample input with P^4 generates the three          more complex when adding additional words. The time it
main visualization outputs: Edge Table, Search Plot and        took to search and parse for “good morning mom how are
Parse Plot. The Edge Table control features a table of         you” is shown in Figure 17.
edges and two checkboxes. The top of the table displays
the final solution with numbered nodes. In this case, there
are ten nodes and nine edges to represent the parse chart
after punctuation marks were inserted. Figure 14 shows the
top of the Edge Table for the sample input.
                                                                     Figure 17 - Search Time from Sample Run


                                                               The final output is the Parse Plot. This plot, shown in
 Figure 14 - Top of Edge Table from Sample Run                 Figure 18, displays how the sample input was parsed after
                                                               it was punctuated by P^4. It is made up of a parse tree
                                                               where each node is the LHS of a grammar rule except for
The checkboxes determine whether incomplete or                 the leaf nodes which are terminals. Punctuation marks are
complete edges are shown. Figure 15 displays a portion of      represented by the word in all capital letters since the actual
the Edge Table from running the sample input.                  punctuation marks may be hard to see due to their small
                                                               size. The organization of the graph does not take into
                                                               account the order of the words in the sentences. It only
                                                               ensures that they are on the right ply. Currently, the
                                                               software cannot specify the left-to-right order in which the
graphs are drawn. The tree starts at the root and then is      Future Work
split into sentence phrases which are made up of one or        Currently, the most pressing limitation is the grammar’s
more sentences. The sentence phrases are then defined as       small size which limits input, thus rendering the system’s
sentences with ending punctuation. Those are then defined      ability to handle only toy problems. The grammar needs to
as their respective type of sentence and punctuation mark.     be expanded to allow for a wider range of use cases. In
Phrases such as verb phrase are defined next which is then     addition, as more rules and terminals are added, this will
followed by parts of speech for each of the words. Finally,    impact the parsing time as chart parsers create exhaustive
the words and punctuation marks are inserted to complete       solutions by examining every rule. This may lead to
the parse tree.                                                considering combining probabilistic methods with the chart
                                                               parser.

                                                               Additionally, the ranking algorithm in selection of the final
                                                               result can potentially be improved by incorporating syntax
                                                               usage patterns found in the American English language.
                                                               The final improvement of the parsing itself would be to add
                                                               error handling to the parsing system by either exiting
                                                               gracefully with a partial parse or looking at other methods
                                                               to predict the missing information when a correct parse is
                                                               not possible with the supplied grammar. Concerning the
                                                               interface, the plots could be made more robust in terms of
                                                               organization and presentation of the nodes found in the
                                                               trees generated by P^4. This includes the ability to add
                                                               additional information about the parses that occurred at
                                                               each node through changes in visual characteristics such as
                                                               color, size and location. Finally, individual partial parses
                                                               need to be examined in depth. This could be accomplished
                                                               with additional pop-up windows made available by simply
         Figure 18 - Parse Plot from Sample Run
                                                               selecting one of the nodes in the Search Plot.

As shown in the results from the sample run, P^4 provides
a relatively straightforward interface with a goal of                                  References
creating syntactically correct sentence(s) by adding
punctuation marks to a sequence of words supplied by the       [1] Arnold, D. “Chart Parsing”, Dept. of Language & Linguistics,
                                                               University of Essex.
user.
                                                               [2] Russell & Norwig. AI A Modern Approach. 2013.
                                                               [3] Huang, J., and Zweig, G. “Maximum entropy model
                      Conclusions                              for punctuation annotation from speech”. In Proc. of
                                                               ICSLP, pp. 917-920, 2002.
This work presented a system, Possible Punctuation Points      [4] Kim, J., and Woodland, P. “The use of prosody in
Parser (P^4) that efficiently inserts punctuation marks into   a combined system for punctuation generation and
a string of words to form a syntactically correct sequence     speech recognition”. In Proc. of Eurospeech, 2001.
of one or more sentences. It does this by using a chart        [5] “iPhone User Guide For iOS 6.1 Software”, p. 25, 2013.
parser that simultaneously performs a bottom-up/top-down       [6] NodeXL: Network Overview Discovery and Exploration for
parse with an optimized search algorithm that maximizes        Excel 2007/2010. Social Media Research Foundation. Retrieved
pruning of the search space. In order to make the results as   March 13, 2013 from http://www.smrfoundation.org/nodexl.
easy as possible to be analyzed by the user, P^4 provides
several data visualizations to show the output from the
chart parser, how the punctuation space was searched, and
finally, how the final solution was generated by the
grammar. The main contribution of this work is that it
provides a proof of concept that shows that punctuation
marks can be automatically inserted into voice-to-text
transcriptions produced by mobile devices rather than
requiring the user to explicitly state the necessary
punctuation marks.