=Paper= {{Paper |id=None |storemode=property |title=Mapping of glossary terms from the Flora of North America to the Plant Ontology enhances both resources |pdfUrl=https://ceur-ws.org/Vol-897/poster_6.pdf |volume=Vol-897 |dblpUrl=https://dblp.org/rec/conf/icbo/WallsCMMCSJ12 }} ==Mapping of glossary terms from the Flora of North America to the Plant Ontology enhances both resources== https://ceur-ws.org/Vol-897/poster_6.pdf
Mapping of glossary terms from the Flora of North America to the
           Plant Ontology enhances both resources
      Ramona Walls1,*, Hong Cui2, James A. Macklin3, Chris Mungall4, Laurel Cooper5,
                        Dennis Stevenson1 and Pankaj Jaiswal5
1 New York Botanical Garden, Bronx, New York, USA
2 School of Information Resources and Library Science, University of Arizona, Tucson, Arizona, 85719 USA
3 Research Branch, Agriculture and Agri-Food Canada, Ottawa, Ontario, Canada
4 Lawrence Berkeley National Lab, Berkeley, California, USA
5 Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA



1   INTRODUCTION
   Traditional taxonomic literature can provide a wealth of data, but access to that data is limited by its free-text format. Taxonomic treatments such as the Flora of North America (FNA Editorial Committee 1993) consist of terse descriptions of the characters used to identify taxa, such as:
     “…Leaves usually alternate or opposite, sometimes in basal rosettes, rarely in whorls; rarely stipulate, usually petiolate, sometimes sessile…”
Converting taxonomic descriptions to a computer-readable format makes them available for automatic retrieval and large-scale analyses. Ontologies such as the Plant Ontology (PO) play a central role in automatic annotation by providing semantic meaning for the words in a description. We used automated and manual methods to map terms from the Categorical Glossary for the Flora of North America Project (http://128.2.21.109/fmi/xsl/FNA/home.xsl) to the PO.

2   METHODS
   We extracted terms from the pre-existing categories “structure”, “feature”, and “nominative” of the FNA glossary, which roughly correspond to the PO class plant anatomical entity. An automated mapping to PO release 16 was done using Obol software (Mungall 2004). We manually checked the automated mapping and removed any incorrect matches. The remaining glossary terms were either manually mapped to existing PO terms, classified as inappropriate for the PO, or marked to be added to the PO.

3   RESULTS AND DISCUSSION
   We extracted 839 terms from the FNA glossary, compared to 1080 terms in the plant anatomical entity branch of the PO. Using text matching, Obol mapped 264 FNA terms to 313 existing PO terms or synonyms, including 49 FNA terms that matched more than one PO term or synonym. Most duplicate matches arose because the PO has many synonyms in Spanish that are identical to the English term name. Only 30 Obol matches had to be removed, in cases where the FNA has multiple terms with the same name but separate meanings that should map to separate PO terms. A curator mapped the remaining FNA glossary terms to PO terms, based on the FNA and PO definitions.
   A total of 193 FNA terms mapped to existing PO primary term names and 126 mapped to existing synonyms. 333 FNA terms had the same meaning as existing PO terms and have been added as synonyms to the PO, citing the FNA glossary as the source. 143 unique new terms will be added to the PO, corresponding to 180 FNA glossary terms. 118 FNA terms could not be mapped to PO terms, either because they were too vague (12 terms, e.g., FNA:lamella, which could apply to many different tissue types), because they are subcellular components and belong in the Gene Ontology (5 terms, e.g., FNA:flagella), or because they are better modeled as qualities (93 terms, e.g., FNA:puncta is better treated as the quality punctate).
   The PO is fairly extensive in its coverage of plant anatomical entities, as many of the missing terms are specialized structures found only in a few taxa. The PO benefits from this mapping through increased coverage of plant terminology. Text mining tools such as CharaParser (Cui 2012), which are being developed to mine taxonomic descriptions, can now use the PO more effectively for automated text annotation and, in return, mine more candidate terms from the literature to further enrich the PO. The mapping of FNA IDs to PO IDs is available at http://tinyurl.com/FNAPOmapping.

ACKNOWLEDGEMENTS
NSF-IOS: 0822201 to the Plant Ontology Project, and the Flora of North America Association.

REFERENCES
Cui, H. 2012. CharaParser for fine-grained semantic annotation of organism morphological descriptions. J. of Am. Soc. of Information Science and Technology 63(4). doi:10.1002/asi.22618
FNA Editorial Committee, eds. 1993. Flora of North America North of Mexico. 16+ vols. New York and Oxford.
Mungall, C. J. 2004. Obol: Integrating language and meaning in bio-ontologies. Comparative and Functional Genomics 5(6-7): 509–520. doi:10.1002/cfg.435
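
The text-matching step reported in Results and Discussion can be illustrated with a minimal Python sketch. This is not Obol and does not reproduce its behavior; it only shows exact label matching of glossary terms against ontology primary names and synonyms, and how one glossary term can end up with more than one match when a synonym is spelled like a term name. The function names, the `TOY:` identifiers, and all data below are invented for illustration.

```python
from collections import defaultdict


def normalize(label):
    """Lower-case and collapse whitespace so trivial spelling variants compare equal."""
    return " ".join(label.lower().split())


def build_label_index(ontology):
    """Map each normalized label to the (term_id, kind) pairs that carry it."""
    index = defaultdict(list)
    for term_id, entry in ontology.items():
        index[normalize(entry["name"])].append((term_id, "name"))
        for synonym in entry.get("synonyms", []):
            index[normalize(synonym)].append((term_id, "synonym"))
    return index


def match_glossary(glossary_terms, index):
    """Return (matches, unmatched); matches maps each glossary term to its hits."""
    matches, unmatched = {}, []
    for term in glossary_terms:
        hits = index.get(normalize(term), [])
        if hits:
            matches[term] = list(hits)
        else:
            unmatched.append(term)
    return matches, unmatched


# Toy ontology (assumption: invented IDs and labels, not real PO content).
TOY_ONTOLOGY = {
    "TOY:1": {"name": "leaf", "synonyms": []},
    "TOY:2": {"name": "petiole", "synonyms": ["leaf stalk"]},
    # A synonym spelled exactly like the primary name yields two hits.
    "TOY:3": {"name": "tegmen", "synonyms": ["tegmen"]},
}

# Toy glossary (assumption: invented sample, not the FNA glossary).
TOY_GLOSSARY = ["Leaf", "leaf  stalk", "tegmen", "lamella"]

index = build_label_index(TOY_ONTOLOGY)
matches, unmatched = match_glossary(TOY_GLOSSARY, index)

# Glossary terms with more than one hit need manual review.
multi = [term for term, hits in matches.items() if len(hits) > 1]

for term, hits in matches.items():
    print(term, "->", hits)
print("multiple matches:", multi)
print("unmatched:", unmatched)
```

In a workflow like the one described, terms with multiple matches and unmatched terms would be the ones passed to a curator, who decides whether each maps to an existing term, is inappropriate for the ontology, or should be added as a new term.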



