<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>BIR</journal-title>
      </journal-title-group>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>Some questions for information science arising from the history and philosophy of science</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Henry Small</string-name>
          <email>hsmall@mapofscience.com</email>
        </contrib>
      </contrib-group>
      <pub-date>
        <year>2020</year>
      </pub-date>
      <volume>14</volume>
      <abstract>
        <p>A companion video is hosted at https://youtu.be/xOpFB0r0WPg.</p>
      </abstract>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title>-</title>
      <p>The occasion of the BIR conference has prompted me to say a few words
about research possibilities in information science that are suggested by my
background in the history and philosophy of science. I do not represent the
constructivist viewpoint and am more traditional in my belief that science is
or should be evidence-based and is not predominantly an interest-driven and
socially constructed activity.</p>
      <p>
        The first research problem information scientists might address is the nature
of scientific discovery. We know that, from an information perspective, discoveries
are often associated with high citation rates after the fact, and that many
discoveries are not recognized as such for many years, although some enjoy immediate
recognition. We do not have a clear understanding of the reason for this
differential. Nor do we understand how major discoveries can be distinguished from
more modest but important research findings and advances. From a prospective
point of view, it has been argued that discoveries are novel associations of facts
or ideas that had not been previously connected. For example, Don Swanson [
        <xref ref-type="bibr" rid="ref5">5</xref>
        ]
proposed the idea of "undiscovered public knowledge", in which different
existing bodies of knowledge are connected; such connections can, to some extent,
be anticipated by finding indirect pathways through the knowledge network. However,
many discoveries involve novel or unanticipated entities or mechanisms. For example, the
hypothesis that CRISPR was a bacterial defense mechanism against invading viruses
was initially arrived at by comparing CRISPR "spacer" sequences against viral
gene libraries and thus was an inductive process [
        <xref ref-type="bibr" rid="ref3">3</xref>
        ]. Other discoveries are more
deductive in nature, for example, predicting some known empirical result from
theory.
      </p>
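      <p>The idea of finding indirect pathways through a knowledge network can be illustrated with a minimal sketch. It assumes that each paper is reduced to a set of terms and that two terms are "connected" when they co-occur in at least one paper; the function name indirect_links, the input format, and the toy terms A, B1, B2 and C are hypothetical and are not taken from Swanson's paper. The sketch lists pairs of terms that never co-occur directly but share at least one intermediate term.</p>
      <preformatted preformat-type="code"><![CDATA[
```python
from collections import defaultdict
from itertools import combinations


def indirect_links(papers):
    """Return term pairs that are linked only through shared intermediates.

    `papers` is a list of term sets, one set per paper (assumed input format).
    The result maps each unconnected pair (a, c) to the set of intermediate
    terms b that co-occur with both a and c.
    """
    # Build the direct co-occurrence network.
    neighbours = defaultdict(set)
    for terms in papers:
        for a, b in combinations(sorted(terms), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)

    # For every intermediate term b, pair up its neighbours and keep the
    # pairs that have no direct link of their own.
    candidates = defaultdict(set)
    for b, linked in list(neighbours.items()):
        for a, c in combinations(sorted(linked), 2):
            if c not in neighbours[a]:
                candidates[(a, c)].add(b)
    return dict(candidates)


# Toy data: A and C never appear in the same paper,
# but both co-occur with B1 and with B2.
papers = [
    {"A", "B1"},
    {"A", "B2"},
    {"B1", "C"},
    {"B2", "C"},
]
hidden = indirect_links(papers)
```
]]></preformatted>
      <p>In this toy network the pair (A, C) is returned with the intermediates B1 and B2. The pair (B1, B2) is returned as well, since the procedure is symmetric and does not know which terms play the role of intermediates. A real study would need some way of ranking or filtering such candidates, which this sketch does not attempt.</p>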
      <p>Another type of "discovery" that needs attention is the invention of new
methods. The importance of methods in contemporary science is revealed by
an analysis of the most cited papers in almost any field of science. It might
be argued that methods are now driving science. We know very little about
how new methods are invented or how old ones are enhanced. Do methods emerge
from basic research or do they represent a separate evolutionary path more akin
to technological developments? Finally, we are interested in the applications of
methods in the conduct of basic research. Obviously, they are a source of evidence
to test theories, but data are also collected for the sake of collecting and stored
in computer databases. Another question is whether theories rely on existing
methods for testing or require the development of new methods.</p>
      <p>
        If we look closely at some historical cases, for example, the discovery of the
neutron in 1932, we see that scientists had initial inklings, prior to the discovery,
that an electrically neutral massive particle existed. This takes us to the next
historical process that needs more research, namely confirmation [
        <xref ref-type="bibr" rid="ref4">4</xref>
        ]. How are
scientific hypotheses, theories, or hunches confirmed or corroborated? Is Bayes's
rule sufficient to account for most historical cases? If so, can the "prior" and
conditional probabilities required by the Bayesian approach be derived from historical
records or statements of scientists? Can informetric and text-based methods be
devised to identify competing or alternative theories? Or are there alternatives
to a probabilistic theory of confirmation such as consilience or coherence of a
knowledge network?
      </p>
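      <p>For reference, Bayes's rule as it is usually applied to confirmation can be written as follows. This is the standard textbook form, not something specific to the historical cases discussed here; the symbols H (a hypothesis) and E (a piece of evidence) are notation assumed for illustration.</p>
      <disp-formula>
        <tex-math><![CDATA[
P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H)}
        ]]></tex-math>
      </disp-formula>
      <p>Here P(H) is the prior, and P(E | H) and P(E | not-H) are the conditional probabilities of the evidence with and without the hypothesis. Provided the prior lies strictly between 0 and 1, the evidence raises the probability of H exactly when P(E | H) exceeds P(E | not-H). The questions above therefore amount to asking whether these three quantities can be estimated, even roughly, from the historical record.</p>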
      <p>
        Implicit in my discussion of the problems above are the application
of methods for clustering and mapping scientific communities and the ability
to delineate structures of leading ideas and concepts at the specialty level [
        <xref ref-type="bibr" rid="ref1">1</xref>
        ].
      </p>
      <p>
        Fortunately, very effective computational methods have been devised for
detecting community structures using bibliographic databases, most notably citation
indexes. More research is needed, however, into how these structures
evolve over time. Do research areas go through a lifecycle, and how do we identify
their starting and ending points? Where do discoveries and methods fit into the
cycle, and can we find evidence of confirmation or disconfirmation occurring? In
what sense do these clusters or communities define what Thomas Kuhn called
"paradigms" [
        <xref ref-type="bibr" rid="ref2">2</xref>
        ]? This construct has remained elusive and undefined. Kuhn also
proposed that science undergoes periodic major upheavals called revolutions. We
should be able to detect such events, even if only retrospectively, given adequate
historical datasets, by studying changes in terminology or cited references. He
also proposed that we should see micro-revolutions at the level of small scientific
communities. Is there any evidence that micro-revolutions are occurring, and how
do they differ from their larger brethren? Answering this will of course require taking a
deep dive into specific specialty communities, for example COVID-19 or CRISPR,
an approach not currently favored in the informetrics community. In my
view, without detailed longitudinal case studies guided by large-scale clustering
or community detection analyses, we will not be able to get to the bottom of
these questions.
      </p>
      <p>Finally, I should mention various approaches to the analysis of scientific texts,
and particularly citation context analyses. I do not think that the analysis of
negative sentiments in citation contexts will be that fruitful because scientists
are reluctant to engage in public criticism of their colleagues in print. More
productive are studies of the degree of concept uncertainty as indicated by hedging.</p>
      <p>However, much broader in scope is the use of what might be called "epistemic
labeling" in scientific contexts. If we look, for example, at highly cited papers we
find not only consensus in the citation contexts on the meaning of cited texts,
but also consistency in their labeling as "discoveries", "advances", "methods",
"reviews", "databases", etc. and the use of other terms that indicate the cited
concepts' epistemic role, which are often expressed in relational terms, such as
"causing", "explaining", "predicting", "confirming", etc. Linguistic methods will
be required to delineate, for example, what is "explained" by what, or what is
"caused" by what. Obviously, such vocabulary analyses dovetail with some of
the questions raised above. Whether the study of such epistemic terms and
relationships will take us closer to understanding the "logic of science", as was long
the objective of empiricist philosophers of science, remains to be seen.</p>
      <p>Copyright © 2020 for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).</p>
      <p>BIR 2020, 14 April 2020, Lisbon, Portugal.</p>
    </sec>
  </body>
  <back>
    <ref-list>
      <ref id="ref1">
        <mixed-citation>
          1.
          <string-name>
            <surname>Klavans</surname>
            ,
            <given-names>R.</given-names>
          </string-name>
          ,
          <string-name>
            <surname>Boyack</surname>
            ,
            <given-names>K.W.</given-names>
          </string-name>
          :
          <article-title>Which type of citation analysis generates the most accurate taxonomy of scientific and technical knowledge?</article-title>
          <source>Journal of the Association for Information Science and Technology</source>
          <volume>68</volume>
          (
          <issue>4</issue>
          ),
          <fpage>984</fpage>
          –
          <lpage>998</lpage>
          (
          <year>2017</year>
          ), doi:10.1002/asi.23734
        </mixed-citation>
      </ref>
      <ref id="ref2">
        <mixed-citation>
          2.
          <string-name>
            <surname>Kuhn</surname>
            ,
            <given-names>T.S.</given-names>
          </string-name>
          :
          <article-title>The structure of scientific revolutions</article-title>
          . The University of Chicago Press, Chicago, 2nd edn. (
          <year>1970</year>
          )
        </mixed-citation>
      </ref>
      <ref id="ref3">
        <mixed-citation>
          3.
          <string-name>
            <surname>Lander</surname>
            ,
            <given-names>E.S.</given-names>
          </string-name>
          :
          <article-title>The heroes of CRISPR</article-title>
          .
          <source>Cell</source>
          <volume>164</volume>
          (
          <issue>1–2</issue>
          ),
          <fpage>18</fpage>
          –
          <lpage>28</lpage>
          (
          <year>2016</year>
          ), doi:10.1016/j.cell.2015.12.041
        </mixed-citation>
      </ref>
      <ref id="ref4">
        <mixed-citation>
          4.
          <string-name>
            <surname>Small</surname>
            ,
            <given-names>H.</given-names>
          </string-name>
          :
          <article-title>Past as prologue: Approaches to the study of confirmation in science</article-title>
          .
          <source>Quantitative Science Studies</source>
          (
          <year>2020</year>
          ), in press, preprint available at https://bit.ly/Small2020QSS
        </mixed-citation>
      </ref>
      <ref id="ref5">
        <mixed-citation>
          5.
          <string-name>
            <surname>Swanson</surname>
            ,
            <given-names>D.R.</given-names>
          </string-name>
          :
          <article-title>Two medical literatures that are logically but not bibliographically connected</article-title>
          .
          <source>Journal of the American Society for Information Science</source>
          <volume>38</volume>
          (
          <issue>4</issue>
          ),
          <fpage>228</fpage>
          –
          <lpage>233</lpage>
          (
          <year>1987</year>
          ), doi:dq8phz
        </mixed-citation>
      </ref>
    </ref-list>
  </back>
</article>