=Paper=
{{Paper
|id=Vol-2314/abstract3
|storemode=property
|title=None
|pdfUrl=https://ceur-ws.org/Vol-2314/abstract3.pdf
|volume=Vol-2314
}}
==None==
Decoding what the sender did not want to transmit.
Information technology and historical data; or something
(Abstract of the invited talk at COMHUM 2018)
Manfred Thaller
Emeritus
University of Cologne
Germany
manfred.thaller@uni-koeln.de
In 1978 I was hired by the then Max Planck Institute for History at Göttingen to support a number of research projects in the field of micro-history by the provision of appropriate IT technologies. The projects planned to use an approach based on “extended family reconstitutions”, even if that precise term was coined only a few years later. A “family reconstitution” is traditionally employed in historical demography. It starts with the marriage registers of a historical community (a village or small city) over at least two hundred years, identifies all brides and grooms of the marriages in the birth and death registers, assigns all children in the birth registers to the marriages of their parents and identifies their death entries. To this network of all demographic relationships within a community, an “extended family reconstitution” adds all mentions of every person in taxation registers, testaments, local court protocols – and basically every other surviving source.

It was clear from the very beginning that such a project would take time – and it was impossible to predict at the time of data entry what part of the source would be needed for analysis years later. The decision was therefore to preserve “all information” contained in the source – even if such information was vague, unclear or contradictory. A short impression of the rough solutions provided to come to grips with these properties of the data will be given.

Handling massive (for the time) databases quickly leads to the understanding that, while one may in the long run understand what information is conveyed by a particular chunk of data in the source, one certainly does not do so immediately. This raises the question of how far the kind of information processing to be supported actually follows the classical paradigm of Shannon, where receivers are able to decode cognitively a message transmitted to them immediately. It gets worse when one hopes to apply the usual model of information science, where a common understanding of the context is supposed to allow such a cognitive understanding.

We propose, therefore, to replace the sender-receiver metaphor in information systems dealing with historical data with an observer metaphor, where observers use observed messages to understand the context in which they have been encoded – the understanding of the observed message itself being a welcome side benefit. If one tries to implement this metaphor determinedly and without compromise, one soon discovers that quite a few technologies of current IT systems become awkward – embedded markup, e.g., loses its charms when a clear-cut separation is required between the (mainly) static representation of the data and the (always) dynamic interpretation of these data, a.k.a. the information assumed provisionally.

While, as just mentioned, a number of technological assumptions become problematic with this new metaphor, one of the most obvious bundles of problems deals with the inherent vagueness and uncertainty of the information derived from the data.

In order of increasing complexity, we will deal in the second half of our presentation with three example problems. For the sake of generality, we will handle these on the level of concepts to be supported by programming languages, not on the level of application systems. While many of the approaches discussed owe much to Zadeh’s concept of Fuzzy Sets, we use fuzziness in a broader sense, therefore leaving it uncapitalized.

1 Fuzzy numbers

In many historical sources – or descriptions of their assumed content – numerical data are not points in a continuum, but ranges, or sets of ranges. This is particularly obvious in the case of temporal information, where the handling of intervals therefore has a long tradition in IT applications for historical sources; it is, however, a more general problem. We will briefly describe what a datatype would look like that can integrate the handling of such data smoothly into existing programming paradigms.

We will use the examples presented earlier from the work of the late seventies and early eighties to show how mathematical developments since then can overcome limitations of the earlier approaches and where major barriers still exist.

2 Fuzzy terms and structures

The greatest successes of computational approaches which are based on alternatives to Boolean logic are visible in the fuzzy control structures of industrial applications described as “computing with words”. The classical examples in this field, such as “the truth value of ‘Lausanne is more or less close to Geneva’ is more or less true”, seem at first sight to be extremely close to the kind of reasoning historians – or, indeed, humanists – frequently employ. We will briefly examine reasons why that kind of approach has, nevertheless, only very rarely been applied in historical research.

We will concentrate, however, on two broader problems.

(a) As it stands, computing with words is currently almost always employed as a fuzzy pocket in an otherwise crisp information system, where the uncertainty of the decision is hidden from the main stream of the program. Overcoming this would require a more general concept of a fuzzy term which could be seamlessly integrated into a program in such a way that it coexists with variables of traditional datatypes.

(b) In the semantic technologies, which are currently making much headway in the Humanities, ontologies organize terms in graphs, that is, in graphs where two nodes are either connected or not connected by an edge. Applying the logic of computing with words, we have to consider graphs where some nodes are connected by edges which connect them with a truth value other than ‘true’ or ‘false’.

3 Fuzzy control structures

The thorniest problem seems at first sight to be the simplest. To support a logic with any kind of truth values other than ‘true’ or ‘false’ is of course no problem, as long as it is restricted to situations where a decision about the combined truth value of a decision problem has to be made. As soon as we intend to employ such a truth value in the parts of a programming language controlling the flow of the program, we encounter quite serious difficulties. We will briefly describe what sort of larger framework a solution would require.

References

Favre-Bull, Bernard (2001). Information und Zusammenhang. Informationsfluß in Prozessen der Wahrnehmung, des Denkens und der Kommunikation. Wien/New York: Springer.

Liu, Sifeng and Yi Lin (2011). Grey Systems. Theory and Practical Applications. No. 68 in Understanding Complex Systems. Berlin/Heidelberg: Springer.

Shannon, Claude E. and Warren Weaver (1949). The Mathematical Theory of Communication. Champaign, IL: University of Illinois Press.

Zadeh, Lotfi A. (2005). Toward a Generalized Theory of Uncertainty (GTU) – an outline. Information Sciences, 172(1–2):1–40. doi:10.1016/j.ins.2005.01.017.

Zadeh, Lotfi A. and Janusz Kacprzyk, eds. (1999a). Computing with Words in Information/Intelligent Systems 1. No. 33 in Studies in Fuzziness and Soft Computing. Heidelberg: Physica. doi:10.1007/978-3-7908-1873-4.

Zadeh, Lotfi A. and Janusz Kacprzyk, eds. (1999b). Computing with Words in Information/Intelligent Systems 2. No. 34 in Studies in Fuzziness and Soft Computing. Heidelberg: Physica. doi:10.1007/978-3-7908-1872-7.

Proceedings of the Workshop on Computational Methods in the Humanities 2018 (COMHUM 2018)