=Paper=
{{Paper
|id=Vol-2810/paper7
|storemode=property
|title=Supporting Natural History Collections by Connecting Collections
|pdfUrl=https://ceur-ws.org/Vol-2810/paper7.pdf
|volume=Vol-2810
|authors=Constance Rinaldo,Danielle Castronovo,Joseph DeVeer,Diane Rielinger
|dblpUrl=https://dblp.org/rec/conf/colco/RinaldoCDR20
}}
==Supporting Natural History Collections by Connecting Collections==
Supporting Natural History Science by Connecting
Collections
Constance Rinaldo1, Danielle Castronovo2, Joseph deVeer1, Diane Rielinger2
1 Harvard University Ernst Mayr Library and Archives of the Museum of Comparative Zool-
ogy, Cambridge, MA, 02138 USA
farmandcircus@gmail.com, jdeveer@oeb.harvard.edu
2 Harvard University Botany Libraries, Cambridge,
MA 02138
{castronovo,drielinger}@fas.harvard.edu
Abstract. Information held in Libraries and Archives expands scientific knowledge by connecting
specimens to rich data such as observations taken at the time of collection, species descriptions,
and distribution records. Digitization of these resources transports them from the individual
library and archives to the world. However, many of the primary resources are handwritten, lim-
iting their use and reuse due to difficulties in deciphering cursive writing and a lack of machine
readable data. This paper presents three case studies from the Harvard University Herbaria
(HUH) Botany Libraries and the Harvard University Ernst Mayr Library and Archives
(EMLA) of the Museum of Comparative Zoology (MCZ) that utilize crowd-sourcing, detailed
access and discovery tools, and open access platforms to make handwritten materials more accessible
to researchers and to connect content across collections held within and outside
of Harvard University.
Keywords: Transcription; Digital Libraries; Zoology; Ornithology; Botany; Field Notes; Cor-
respondence; crowd-sourcing
Copyright © 2021 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
1 Introduction
Libraries and Archives play an integral role in the understanding of natural history col-
lections by providing context for collecting activities and species identifications. Natu-
ral history literature and archives contain rich data sources such as species descriptions,
distribution records, connections to specimen collections, climate data, ecosystem data,
documentation of changes over time and historical scientific observations from expe-
ditions and surveys. Digitization of this literature is making vast amounts of knowledge
from libraries and archives accessible.
Treasure troves of previously undiscovered data exist in scientists’ personal jour-
nals and field notes that are not in the published record. However, handwriting remains
a challenge as optical character recognition (OCR) fails to properly transcribe cursive
writing. Applications that learn handwriting, such as MONK, can be impractical as the
learning curve is steep and time-consuming, and therefore unsuitable for smaller col-
lections. This paper presents three case studies from the Harvard University Her-
baria Botany Libraries (HUH) and the Ernst Mayr Library and Archives (EMLA) of
the Museum of Comparative Zoology (MCZ), both at Harvard University, and demon-
strates ways in which the libraries are making handwritten materials more accessible to
researchers as well as connecting content across collections within the same institution
and among different institutions. Key to the success of these three projects is the avail-
ability of open access platforms providing services both to human and machine users.
ArchivesSpace provides detailed discovery, direct access to items, and extractable
metadata. The Biodiversity Heritage Library (BHL) provides content from multiple
world-wide institutions accessed in a single platform with full-text searching, species
name recognition, and data extraction tools.
Data sources like the ones we highlight are digital, increasingly plentiful, and
available to be mined, manipulated and analyzed to reveal unexpected linkages and new
perspectives that may ultimately connect information across the full spectrum of the
biodiversity knowledge network.
2 Case Study #1: Connecting and enhancing William Brewster’s
ornithological archives
William Brewster (1851-1919) was an ornithologist who lived in Cambridge, Massa-
chusetts, U.S.A. He was a curator of birds and mammals at the Museum of Comparative
Zoology (MCZ) of Harvard University, and the first president of the Massachusetts
Audubon Society. He was also a key figure in the origin of the American Ornithologists'
Union (Emmet 2007). Brewster collected over forty thousand specimens of birds, nests,
and eggs, primarily from the New England region of the United States. His collection,
now held in the Museum of Comparative Zoology, was considered one of the finest
private collections of North American birds ever assembled at the time (Emmet, 2007;
Henshaw, 1920).
Brewster recorded his scientific and other activities in diaries and field notes
from 1865 at age 14 until his death in 1919. The Ernst Mayr Library and Archives
(EMLA) of the MCZ holds Brewster’s collection of more than 100 volumes of diaries,
journals, and notebooks. These are a rich source of species occurrence data, as well as
data about regional environmental changes. They provide invaluable context for his
specimens housed in the MCZ collection. The archival collection also includes nearly
10,000 pieces of correspondence dating from 1862 to 1919, and about 2000 photo-
graphs. About 94% of the collection has been digitized and over 60% is now in the
Biodiversity Heritage Library (BHL), an online platform that includes digitized mate-
rials from libraries and museums around the world.
In 2014, as part of a collaborative grant-funded project led by the Missouri Bo-
tanical Garden, the EMLA began the process of selecting a tool to transcribe its digit-
ized Brewster materials. Our requirements for the tool were that it be open-source, de-
signed for crowdsourcing, user-friendly, provide library staff with administrative over-
sight of their project including the ability to edit completed transcriptions, be sustaina-
ble, and supply export files compatible with BHL. Tools researched included the Smith-
sonian Transcription Center, FromThePage, DigiVol, Scripto, Transcribr (National Ar-
chives Transcription Project), and T-Pen. Solutions such as Monk, designed to learn
handwriting, were not considered because the technology was very new, and our project
time frame did not allow for training the software. Additionally, Brewster’s writing
frequently employs ornithological shorthand and abbreviated scientific names that
would be challenging for software to interpret. DigiVol was selected over other options
because it was originally designed for the transcription of scientific field notes and
specimen labels. It is highly customizable, providing administrators the ability to create
tutorials and templates specific to their materials. The DigiVol administrative interface
allows project staff to easily track the progress of projects through transcription and
validation. This, and the ability to easily retrieve useful export files, were well developed
in DigiVol in comparison with the other tools at the time. Additionally, there was a
pre-existing community of volunteers, most with an interest in biodiversity, using DigiVol
to transcribe natural history manuscripts when this project began (Mika et al. 2017).
DigiVol was developed by the Australian Museum in collaboration with the
Atlas of Living Australia. Institutions with materials in need of transcription set up an
administrative account, create projects, and upload digitized items at no cost. Volunteer
transcribers create free accounts and choose projects of interest to them. The user inter-
face is intuitive. It consists of a book viewer in which the manuscript page is displayed,
a text box for transcription entry, and a series of fields to record species names found
on the page along with geographic location and date. Thus, species occurrence infor-
mation is collated. Users have the option to save a partially completed page and return
to it later for completion. When a transcriber finishes a page, they submit it for valida-
tion. The project administrator has access to a dashboard listing all the pages of an item
and action required on each page (“transcribe,” “validate,” or “review”) along with a
link to the page. From there, the administrator can review and validate the transcription.
Validation is accomplished by trained library staff who are familiar with the Brewster
materials and conventions prescribed for the project. Validation involves reviewing the
completed transcription, making sure conventions are followed, and looking for any
glaring errors or words the transcriber was unable to decipher. The validator makes any
necessary edits and marks the transcription as valid. When an item has been fully tran-
scribed and validated, the text files may be downloaded. DigiVol developers ask project
administrators to delete the page images of an item when no longer needed in order to
free up server space. All data for each item, including export files, are
retained on DigiVol and may be accessed at any time by the institution in charge of the
project.
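The transcribe, validate, and export sequence described above can be sketched as a small state model. This is only an illustration of the workflow, not DigiVol's implementation: the status labels, the `Page` class, and the function names are assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical status labels modeled on the actions named in the text.
TRANSCRIBE, VALIDATE, VALIDATED = "transcribe", "validate", "validated"


@dataclass
class Page:
    number: int
    status: str = TRANSCRIBE


def submit(page):
    """A volunteer finishes a page; it now awaits validation."""
    if page.status != TRANSCRIBE:
        raise ValueError(f"page {page.number} is not awaiting transcription")
    page.status = VALIDATE


def validate(page):
    """A staff validator reviews the page and marks it valid."""
    if page.status != VALIDATE:
        raise ValueError(f"page {page.number} is not awaiting validation")
    page.status = VALIDATED


def dashboard(pages):
    """Count pages by the action still required, as an administrator would see."""
    return Counter(page.status for page in pages)


def ready_for_export(pages):
    """An item can be exported only when every page has been validated."""
    return bool(pages) and all(page.status == VALIDATED for page in pages)
```

A two-page item with one page validated would show one page under "validated" and one under "transcribe", and would not yet be ready for export.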
During the 2020 COVID-19 quarantine, progress on transcription of Brewster
materials accelerated considerably as all work in DigiVol can be accomplished re-
motely. When it became apparent that the quarantine period would be indefinite, Har-
vard University Library implemented a work share program providing opportunity for
staff in need of remote work to sign up for projects in need of assistance. EMLA posted
the Brewster transcription project as such an opportunity, and hired two staff from other
Harvard libraries to assist with validation. This allowed us to successfully address a
growing backlog of validation work.
In 2018, BHL developers implemented transcription file upload functionality,
enabling import of transcription text files from DigiVol to accompany page images of
manuscripts. With the advent of this functionality, the full text of field notes can be
indexed with the full corpus of published literature in BHL. This results in enhanced
access to ornithological field data, and the potential for extracting large historical data
sets from field notes and other primary source materials such as correspondence. These
data, previously inaccessible except by in-person visits to the EMLA, are now globally
available to biodiversity organizations, aggregators, researchers, educators, and the
public.
2.1 Transcription Conventions
We have designed the transcription conventions for Brewster’s materials to facilitate
full-text indexing and species name recognition in BHL. This is accomplished in large
part by limiting text markup, expanding abbreviated names and other words, and cor-
recting spelling errors in the original manuscript. BHL uses Global Names Architecture
(GNA) for taxonomic name finding across the repository. It is thus especially important
that scientific names be recognized in the transcription text. Taxonomic names are the
key link for the literature in BHL and the individual work of systematists and taxono-
mists. The ability to extract these scientific names enables data connections across the
biodiversity knowledge ecosystem. Even if full pages are not transcribed, if scientific
names are added in the metadata, discovery improves for these taxa, although this is
currently not an automated process. There is no parallel system in BHL for identifying
manuscript names although the full text search may facilitate identification of such
names.
Fig. 1. DigiVol user interface showing a portion of a page from Brewster’s 1900 diary. Below is
the text box into which users type their transcription of the manuscript.
Were the genus name and specific epithet separated by brackets, e.g. “V[ireo]
flavifrons,” they would likely not be recognized by GNA. We try to apply this practice
to proper names as well, e.g. “Mus. Comp. Zool.” is expanded and rendered as
“[Museum of Comparative Zoology]” as can be seen in Figure 1. Brewster made extensive
use of symbols or shorthand in his field notes. For example, the asterisk-like symbols
that can be seen in Fig. 1 mean “in full song,” and encircled numbers mean “in a flock.”
In this example, he saw 2 individuals of Chaetura in a flock. In the transcription any
such shorthand is rendered as text e.g. “Saw or heard Merula 1 [in full song], 3 or 4
juv. [juvenile].” The importance of transcribing symbols as text for indexing purposes
adds an additional layer of complexity when considering use of machine learning for
manuscript OCR.
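The effect of these conventions on name recognition can be illustrated with a short normalization sketch. It is not the project's tooling (validation is done by staff, by hand); the expansion table and function name are assumptions, and the rule simply joins a bracketed completion to the letters before it so that a binomial reads as plain text.

```python
import re

# Hypothetical expansion table; a real project would maintain its own list.
EXPANSIONS = {
    "Mus. Comp. Zool.": "[Museum of Comparative Zoology]",
    "juv.": "juv. [juvenile]",
}

# Editorial brackets that split a single word, e.g. "V[ireo]".
SPLIT_WORD = re.compile(r"(\w)\[(\w+)\]")


def normalize(text):
    """Apply the conventions so that names survive as plain, indexable text."""
    # "V[ireo] flavifrons" becomes "Vireo flavifrons".
    text = SPLIT_WORD.sub(r"\1\2", text)
    # Expand the abbreviations listed in the table (not idempotent: run once).
    for short, full in EXPANSIONS.items():
        text = text.replace(short, full)
    return text
```

Bracketed glosses that stand alone, such as "[in full song]", are left untouched because the pattern only matches brackets attached to a preceding letter.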
2.2 Transcriptions in BHL
When a field notebook has been completely transcribed and validated, transcription
files are exported in csv format from DigiVol, and then uploaded to BHL. In the BHL
book viewer each page from a notebook can then be displayed alongside its completed
transcription. The right-hand pane can be expanded to display the transcription text for
the page on the screen (Fig. 2).
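As a rough sketch of the step between export and upload, the snippet below reads a CSV export and collects one transcription per page. The column names `page_number` and `transcription` are assumptions for illustration; the actual DigiVol export layout and the BHL upload format are not specified here.

```python
import csv
import io


def pages_from_export(csv_text, page_col="page_number", text_col="transcription"):
    """Return {page number: transcription text}, ordered by page number.

    The column names are hypothetical defaults, not a documented schema.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    pages = {}
    for row in reader:
        pages[int(row[page_col])] = row[text_col].strip()
    return dict(sorted(pages.items()))


sample = (
    "page_number,transcription\n"
    '2,"Saw Chaetura pelagica, 2 in a flock"\n'
    "1,Cambridge. Clear and cool.\n"
)
by_page = pages_from_export(sample)
```

Keeping the text keyed by page number preserves the page-by-page alignment that lets a viewer show each image beside its transcription.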
Fig. 2. BHL book viewer with transcription text visible in right-hand pane, and scientific names
on page shown in lower left-hand pane.
For most items in BHL, i.e. published literature, this pane displays text produced by
Optical Character Recognition (OCR). The OCR fails to adequately capture manuscript
text so the original OCR files are replaced with transcription exports from DigiVol.
With the transcription text in place, scientific names on each page are recognized,
indexed, and displayed as shown in Figure 2.
The addition of transcriptions enables full-text searching within field notebooks.
For example, a search for Chaetura pelagica (Chimney Swift) found 6 instances of this
name within this particular field notebook (Fig. 3). A search for this taxon across the
full BHL repository now results in a species bibliography that includes field notes as
well as published sources (Fig. 4).
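A within-volume search of this kind can be approximated locally once transcriptions exist as text. The sketch below is not BHL's search implementation; it assumes a hypothetical dictionary of page texts and returns each hit with a short context snippet, similar in spirit to the results list in Fig. 3.

```python
import re


def find_name(pages, name, width=30):
    """Return (page number, context snippet) for every occurrence of `name`.

    `pages` maps page numbers to transcription text (an assumed structure).
    """
    pattern = re.compile(re.escape(name), re.IGNORECASE)
    hits = []
    for number in sorted(pages):
        text = pages[number]
        for match in pattern.finditer(text):
            start = max(0, match.start() - width)
            end = min(len(text), match.end() + width)
            hits.append((number, text[start:end]))
    return hits
```

The search is a plain case-insensitive string match, so it finds only names written out in full, which is one reason the transcription conventions expand abbreviated names.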
Fig. 3. BHL book viewer showing results of search for Chaetura pelagica in right hand pane.
The results list provides brief context and a link to each name occurrence within the volume.
Fig. 4. Search results for Chaetura pelagica across the full BHL repository. Note that the
results include instances of this species in Brewster’s transcribed Journals as well as the
published literature.
The incorporation of transcriptions, a relatively new function in BHL, enables more
extensive discovery options from within field notes and other handwritten materials.
Field notebooks, and often correspondence, are replete with species occurrence records.
Brewster’s detailed bird observations always include the essential data: taxon, locality,
and date. Field notes often augment and enhance associated specimen labels by
providing more detailed locality information, habitat description, other species noted in
the area, and weather data (Tingley and Beissinger 2009). Brewster’s records
sometimes include failed attempts to find certain birds in a locality, adding weight to
absence reporting. This primary source data can now be integrated with occurrence
records in the published literature, such as species checklists and surveys.
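One way to picture such a record is as a small structure holding taxon, locality, and date, with a flag for absence reports. The mapping below uses Darwin Core term names (`scientificName`, `locality`, `eventDate`, `occurrenceStatus`), the vocabulary commonly used for occurrence data; choosing Darwin Core is an assumption of this sketch, and the class and function names are illustrative only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    taxon: str
    locality: str
    observed_on: date
    present: bool = True  # False records a failed search (an absence report)


def to_darwin_core(obs):
    """Map an observation onto a minimal set of Darwin Core terms."""
    return {
        "scientificName": obs.taxon,
        "locality": obs.locality,
        "eventDate": obs.observed_on.isoformat(),
        "occurrenceStatus": "present" if obs.present else "absent",
    }


# Illustrative values only; the date is a placeholder, not taken from a notebook.
record = to_darwin_core(
    Observation("Chaetura pelagica", "Cambridge, Massachusetts", date(1900, 5, 1), present=False)
)
```

Recording absence explicitly, rather than omitting the entry, is what allows failed searches to be used alongside positive sightings.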
Digitization and full-text indexing of field notes will facilitate cross linking of
specimen documentation. An example is Brewster’s discovery of a previously
undescribed subspecies of Black duck (Anas obscura rubripes). Brewster recorded the
collecting event in his journal entry of October 8, 1889, and published the new form
in 1902 in The Auk (Brewster 1902), both of which are in BHL. The specimen is
included in the collections of the MCZ, and the associated record in MCZbase, the
museum’s specimen database, links to its entry in an accession ledger, its publication
in The Auk, and several photographs of the specimen. Complete specimen
documentation and history is thus accessible via a single discovery platform.
Full-text indexing will also enable cross-referencing across Brewster’s diverse
collection of writings and photographs. For example, Brewster often referred to “the
jungle” when recording bird observations, with no real explanation of what “the jungle”
was. A search for “jungle” in Brewster’s 1898 Journal retrieves an instance of this term
on page 187. This reference can be linked to a digitized photograph of the “jungle.”
The metadata record for this photograph records an inscription on the verso of the
photograph reading: “Cambridge, Jan. 7, 1903, The jungle from the front lawn.” The
“jungle” is thus revealed as a specific area of the grounds at Brewster’s home in
Cambridge, Massachusetts.
The next logical steps are to continue adding transcriptions to BHL as time and
resources allow. The continued development of tools and APIs to mine and export
historical data from BHL as structured species occurrence records and data sets will
enable individuals and biodiversity data aggregators such as the Global Biodiversity
Information Facility (GBIF) to easily harvest this information. The improvement of
digital collections platforms to better support linked open data will facilitate
establishment of connections within and between online collections.
3 Case Study #2: Connecting and enhancing Asa Gray’s
botanical archives
Often called the “Father of American Botany,” Asa Gray (1810-1888) was instrumental
in establishing systematic botany as a field of study at Harvard University and, to some
extent, in the United States. His relationships with European and North American
botanists and collectors enabled him to serve as a central clearing house for the
identification of plants from newly explored areas of North America. He also served as
a link between American and European botanical sciences.
The Harvard Botany Libraries, which are part of the Harvard University Herbaria
(HUH), have several archival collections related to Asa Gray. One of the most
frequently used collections is the Asa Gray correspondence files (AGCorr). This
collection, approximately 1820-1904, includes the correspondence of Asa Gray and
other Gray Herbarium staff. The collection contains letters from several of the most
distinguished European and American scientists of the 19th century, including Charles
Darwin, Joseph Dalton Hooker, George Engelmann, and John Torrey. The Darwin
correspondence contains a letter to Gray establishing Darwin's precedence in
developing a theory of natural selection. The collection contains over 1,000
correspondents and fills a five drawer file cabinet. A separate collection, titled the Asa
Gray papers (AGPapers), contains important travel correspondence including letters
written by Gray to the Torrey family. Much of the correspondence in these collections
contains discussions of specimen identifications and can include accompanying
determinations.
Previously, there were several challenges for researchers interested in Asa Gray
or related botanists to access this material. Correspondence was in multiple collections
in the HUH and all of that correspondence was not easily searchable or digitized.
Archival collections usually contain only one side of a conversation. While the Gray
collections contain some Gray correspondence, they primarily consist of letters sent to
Gray, making it challenging to reconstruct a conversation.
When the AGCorr was first processed in the 1980s, a detailed inventory listing
correspondent, date, and number of letters was created. This was eventually published
online, in a grid format finding aid across twenty webpages (Fig. 5), but that list was
not searchable through the Library’s online catalog. Researchers often found the
collection by searching Google for an individual correspondent’s name or by contacting the
library for assistance, but it was difficult to get an overview of the whole collection.
Also, related correspondence in the AGPapers was not discoverable through this list.
Fig. 5: AGCorr web grid finding aid, not linked to catalog or digitized content
In 2008 Harvard University initiated a project called Open Collections to digitize
expedition material. About sixteen files from the AGCorr were digitized and published
online. Several years later, the Botany Libraries secured funding to digitize all of the
correspondence from the AGCorr and a selection from the AGPapers because of
frequent use and high research value.
To prepare for digitization, all the correspondence was rehoused, number of
letters and pages were counted, and the author name and dates were verified. The grid
finding aid was used to create catalog records for parts of the collection, either by single
sender or letter of the alphabet. Digital content from multiple collections was
sometimes combined into single catalog records and digital files. Persistent links
(URNs) to the digital content in the catalog records improved access to the Gray
correspondence but it was not easily discernible to which collection some content
belonged. Also, digital materials were not linked back to the web finding aid, limiting
discovery and access to the digital files and requiring librarian mediation to connect
users to digital files.
In 2018, the Botany Libraries implemented ArchivesSpace, an archives infor-
mation management application that can produce Encoded Archival Description (EAD)
XML finding aids with links to digital objects. Around that time, staff used the catalog
records and the grid to generate updated EAD finding aids that included correspondent
level description, physical location of materials, and URLs to the digitized content.
Now, all the information about a collection is in one system and the metadata is export-
able and reusable across a variety of platforms.
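To make correspondent-level description concrete, the sketch below builds a simplified EAD-style component with a link to a digital object. The element names (`c`, `did`, `unittitle`, `unitdate`, `dao`) follow EAD, but attribute handling is simplified (the schema version of EAD namespaces its link attributes), and the dates and URL are placeholders, not values from the actual finding aid.

```python
import xml.etree.ElementTree as ET


def correspondent_component(name, dates, link):
    """Build one file-level component describing a correspondent's letters."""
    component = ET.Element("c", level="file")
    did = ET.SubElement(component, "did")
    ET.SubElement(did, "unittitle").text = name
    ET.SubElement(did, "unitdate").text = dates
    # Link to the digitized content; a persistent URN would go here.
    ET.SubElement(did, "dao", href=link)
    return component


# Placeholder values for illustration only.
entry = correspondent_component(
    "Engelmann, George",
    "dates not shown (placeholder)",
    "https://example.org/placeholder-urn",
)
xml_text = ET.tostring(entry, encoding="unicode")
```

Because the description is structured rather than laid out as a web grid, the same metadata can be exported, searched, and reused on other platforms.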
Reuse of the metadata generated a fully searchable finding aid with links (URNs)
to all the digitized content in the collections. The AGCorr finding aid contains over
1,000 persistent identifiers that link the digital images by individual authors (Fig. 6).
While the Asa Gray content had been available digitally prior to this, the new finding
aids provide context and an overarching organization across collections. Now research-
ers can find this correspondence through a Google search as well as through the library
catalog and finding aid portal.
Fig. 6: Detail of AGCorr finding aid with links to digitized content
Refining the finding aid by individual authors also encouraged staff to clarify collection
provenance of the mixed digital files and add links to the finding aid to the catalog
records for all parts.
To expand the reach of these correspondences and connect them with content
from other institutions, the digitized files were uploaded to the BHL. Already present
in the BHL were digitized letters from Asa Gray to George Engelmann held at the Peter
H. Raven Library and Center for Biodiversity Informatics at the Missouri Botanical
Garden. Adding HUH’s letters from Engelmann to Gray on the same platform allows
users to read the back and forth correspondence between these two botanists from
anywhere in the world (see Figs. 7 and 8). Dates are included in the page viewing
window, making it easy to find individual letters. The letters in the HUH
Gray-Engelmann set also include a Table of Contents that links directly to the individual
letters.
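Reading both sides of a correspondence amounts to interleaving two dated lists held by different institutions. A minimal sketch, assuming each letter is a `(date, sender, recipient)` tuple; the years below are placeholders, since the figures give only the day and month.

```python
from datetime import date
from itertools import chain


def conversation(*collections):
    """Merge letters from several collections into one chronological thread."""
    return sorted(chain.from_iterable(collections), key=lambda letter: letter[0])


# Placeholder years; only "April 6th" comes from the figure captions.
engelmann_to_gray = [
    (date(1850, 4, 6), "Engelmann", "Gray"),
    (date(1850, 5, 20), "Engelmann", "Gray"),
]
gray_to_engelmann = [
    (date(1850, 4, 15), "Gray", "Engelmann"),
]
thread = conversation(engelmann_to_gray, gray_to_engelmann)
```

With dates exposed in the page viewer, a reader can follow the same interleaving by hand across the two sets of letters.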
Fig. 7: Gray letter to Engelmann referencing an earlier April 6th letter
Fig. 8: Engelmann letter to Gray dated April 6th
Shortly after HUH’s Asa Gray collection was added to the BHL, correspondence
between Asa Gray and John Torrey was uploaded to the BHL platform. The Torrey
Collection is located at the LuEsther T. Mertz Library at the New York Botanical
Garden. This provides another set of important botanists whose correspondence could
be easily accessed online.
Digital versions of select correspondence from HUH’s Asa Gray collection are
also available on other platforms that are connecting content around the world. The
Darwin Correspondence Project contains over 270 letters between Asa Gray and
Charles Darwin, including the relevant letters from the HUH collection. This site
presents transcripts and footnotes for all the correspondence. The Joseph Hooker
Correspondence Project from the Royal Botanic Gardens, Kew also includes Harvard’s
digital files of Hooker’s correspondence to Asa Gray. Images of the letters are available
on Kew’s website and they are developing transcripts with footnotes for all the letters.
The open access nature of the BHL and its easy-to-use tools for downloading files
allow additional projects to incorporate the HUH’s Asa Gray materials.
An enhancement of the BHL transcription functionality now allows ingestion
from multiple sources, including plain text files and online crowdsourcing websites in
addition to DigiVol. Given the difficulties of reading some handwriting, transcripts can
prove invaluable in understanding these rich primary sources. The Hooker transcripts
will be ingested into the BHL upon completion by Kew. The BHL provides full text
searching and APIs to allow for data mining, increasing the utility of these
correspondences.
What once was a limited system consisting of discovery via a catalog record and
a webpage chart and access only available via in person appointment has transformed
into a robust interconnected system. Users can now use the Harvard library catalog to
find correspondence related to Asa Gray in multiple collections, see the breadth of the
collection via a series statement in those catalog records, and link directly to the finding
aids and the files. The finding aids, completely searchable and crawled by search
engines, provide a wealth of information about the collection and include persistent
links by individual authors directly to the digital files. The digital files themselves are
paginated by date for ease of navigation. Inclusion in external platforms such as the
BHL allows this correspondence to connect to related letters held by other libraries and
archives around the world. These other platforms are crawled by search engines, fully
searchable, and contain added features like transcripts, footnotes, identification of
individual authors and scientific name finding.
4 Case Study #3: Connecting collections within Harvard
Walter Deane (1848-1930) engaged in a number of natural history activities throughout
his life. He served as a founding member of the New England Botanical Club, assisted
with several flora publications in addition to publishing short articles, and was an active
member of the Nuttall Ornithological Club. Deane was an avid collector and worked as
curator for William Brewster's ornithological museum from 1897 to 1907. Brewster’s
papers reside in the MCZ and HUH contains the Walter Deane papers. Although the
collections are housed next door to each other, access previously required a trip to
Cambridge, Massachusetts (USA) and multiple appointments. With digitization and
ingestion into the BHL, it is now possible to view the work of these two researchers
together, including their joint field experiences.
While the bulk of Walter Deane’s papers are in the HUH collections, the
Brewster collection at MCZ includes 37 letters from Deane to Brewster and 40 letters
from Brewster to Deane, all of which are in BHL. Digitization virtually integrates
Deane’s papers from separate repositories. Once all of the Deane-Brewster
correspondence has been digitized and transcribed, a full-text search in BHL could
include results from the complete collections of Deane’s and Brewster’s papers at Harvard.
William Brewster’s publication The Birds of the Cambridge Region of
Massachusetts contains detailed descriptions of bird species as well as references to
observations in the area. Walter Deane and William Brewster walked together
occasionally and the article notes multiple instances when both sighted a particular
species. With the digitization of HUH’s Walter Deane field notebooks and the MCZ’s
Brewster field notebooks, it is now possible on the BHL platform to view the individual
field notes of the joint bird walks referenced in the article. For example, a sighting of a
Red headed woodpecker (Melanerpes erythrocephalus) nest is documented on Jun. 27,
1901 (Brewster, 1906). Both Brewster’s journal entry and Deane’s go into detail about
the vegetation in the area and the nesting site. Brewster provides more detail about the
behavior of the birds (4 page journal entry) while Deane notes the date as June 28 and
includes more of a summary of behavior (1 page journal entry). By viewing multiple
sources, researchers can obtain additional information and different perspectives on the
same event.
Some of Brewster’s journals also contain notes by Deane regarding verification
of information and Deane’s addition of this data into Systematic Notes, a compilation
of bird sightings. Deane also typed up some pages of Brewster’s journals. These typed
selections along with the transcripts available in BHL are fully searchable, including
by scientific name and the names of individuals. Additional days of joint field work can
now be discovered beyond those indicated in The Birds of the Cambridge Region of
Massachusetts, for example June 2, 1901 in Brewster’s journals and Deane’s. As addi-
tional transcripts are posted, more connections will be revealed.
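Once both sets of journals are transcribed, candidate joint field days could be found by comparing entry dates. The sketch below is a hypothetical helper, not an existing BHL feature; it allows a one-day tolerance because the two writers sometimes dated the same outing differently (June 27 versus June 28, 1901, above). The July date is an invented non-matching entry.

```python
from datetime import date, timedelta


def joint_days(brewster_dates, deane_dates, tolerance_days=1):
    """Pair journal dates from the two writers that fall within the tolerance."""
    deane = set(deane_dates)
    matches = []
    for b in sorted(set(brewster_dates)):
        for offset in range(-tolerance_days, tolerance_days + 1):
            d = b + timedelta(days=offset)
            if d in deane:
                matches.append((b, d))
    return matches


brewster = [date(1901, 6, 2), date(1901, 6, 27), date(1901, 7, 15)]
deane = [date(1901, 6, 2), date(1901, 6, 28)]
pairs = joint_days(brewster, deane)
```

Matching dates only suggests a shared outing; the entries themselves still need to be read to confirm that both describe the same event.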
5 Conclusion
These case studies demonstrate various ways collections from multiple repositories can
be digitized and made available on open platforms to researchers around the world. The
MCZ took one researcher’s body of archival work and digitized, transcribed and
connected it. HUH gathered projects done at different times and with different purposes
to develop integrated sets of archival materials that could then be connected to digital
projects at other institutions. While the MCZ and HUH have approached digitization of
archives differently, both approaches resulted in improved accessibility and
connections within Harvard and beyond. Techniques from one project can be applied
to the other, such as crowdsourcing Asa Gray and Walter Deane transcriptions or
developing a finding aid for the William Brewster collection.
BHL and other partnerships have been critical for the digitization, transcription,
and enhancement of metadata in these natural history and botanical library and archives
collections. Much of the success of these efforts is due to collaborative grants, joint
projects and the availability of low cost or free tools. Improved accessibility requires
human effort, time and resources.
Having the capacity to identify article level metadata, add scientific names, dates,
georeferences and transcriptions adds value and enhances discovery for primary users
and, in fact, all users. By incorporating the naming conventions and sources used by
other data providers and aggregators in the biodiversity knowledge ecosystem, such as
the Global Biodiversity Information Facility (GBIF), bridges are built. Articles and
article segments can be directly incorporated into data aggregators such as GBIF and
can contribute to the extended specimen network, which connects knowledge about
museum specimens with the physical objects and their digital facsimiles (see Lendemer
et al. 2019).
Most curation work for the collections described here is currently human mediated;
automation is in the early stages. Interoperability is a key result so that the data
generated can be reused easily in other platforms. For example, transcripts allow scien-
tific name finding in BHL and EAD data from ArchivesSpace can be harvested and
shared. Opportunities to learn from and collaborate with others help expand individual
capacity. An iterative process of trying different tools and techniques ensures that the
result fits the workflow within an organization. Implementation plans for the 2020-
2025 BHL strategic plan include working on ways to more fully automate metadata
enhancement (including transcription) and collection sharing.
The COVID-19 pandemic has highlighted the need for discoverable, searchable,
and openly accessible primary sources to allow for the continuation of research in the
sciences and humanities. This pandemic has eliminated physical access to most primary
source materials in many institutions. Both the MCZ and HUH have been closed to
researchers since March 2020 and, at the time of this writing (September 2020) no date
has yet been set for researchers to re-enter the reading rooms to view physical archival
materials. Currently, staff are mostly working from home on projects that are improving
accessibility of primary source materials. The BHL Secretariat staff at the Smithsonian
Libraries and Archives helped organize and support COVID-19 related telework pro-
jects for BHL partner staff at 10 institutions in three countries (US/UK/Australia) over
the last 7 months. Partner remote work activities centered around enhancing metadata
and curation, beyond the archival work described in this paper. For example, 30,000
articles were made visible by marking beginning and end pages and adding digital ob-
ject identifiers. During this difficult time of remote-only work we have been empowered
to refine, enhance and correct metadata as well as transcribe more of the digitized ma-
terials. These opportunities have resulted in improved discovery and exposure of con-
nections such as the ones cited between Brewster and Deane.
References
1. Brewster, W. An undescribed form of the Black duck (Anas obscura). The Auk 19:183-
188 (1902).
2. Brewster, W. The Birds of the Cambridge Region of Massachusetts. The Club. Cambridge,
Massachusetts (1906).
3. Deane W. Asa Gray. Bull. Torrey Bot. Club. 15(3):59-72. (1888)
4. Emmet, A. William Brewster: brief life of a bird-lover: 1851-1919. Harvard Magazine No-
vember-December (2007).
5. Farlow WG. Memoir of Asa Gray. 1810-1888. Biogr. Mem. Natl. Acad. Sci. U.S.A. 3:161-
175. (1895)
6. Gray A. Autobiography. In: Gray JL. Letters of Asa Gray. Boston (MA): Houghton, Mifflin
and Company. (1894)
7. Henshaw, H. In memoriam: William Brewster. Born July 5, 1851 - Died July 11, 1919.
The Auk 37: 1-23. (1920)
8. Lendemer J, Thiers B, Monfils AK, Zaspel J, Ellwood ER, Bentley A, LeVan K, Bates
J, Jennings D, Contreras D, Lagomarsino L, Mabee P, Ford LS, Guralnick R, Gropp RE,
Revelez M, Cobb N, Seltmann K, Aime MC (2019) The Extended Specimen Network: A
Strategy to Enhance US Biodiversity Collections, Promote Research and Education.
BioScience 70 (1): 23-30. https://doi.org/10.1093/biosci/biz140
9. Mika, K., J. DeVeer, C. Rinaldo. Crowdsourcing natural history archives: tools for extract-
ing transcriptions and data. Biodiversity Informatics. Vol 12 DOI:
https://doi.org/10.17161/bi.v12i0.6646 (2017).
10. MONK: System for word searching in historical and handwritten materials.
11. Robinson, B. L. 1930. Botanical Legacies of Walter Deane. Science. 72: 459.
12. Tingley, M. and S. Beissinger. Detecting range shifts from historical species occurrences:
new perspectives on old data. Trends in Ecology and Evolution 24: 625-633.
https://doi.org/10.1016/j.tree.2009.05.009 (2009).
13. Weatherby, C.A. Walter Deane. Rhodora. 35(411): 69-80. (1933)