Scene of Crime Information System: Playing at St. Andrews
Bogdan Vrusias, Mariam Tariq, Lee Gillam
Department of Computing
University of Surrey, England
{b.vrusias, m.tariq, l.gillam}@surrey.ac.uk
Abstract
This paper discusses the adaptation of the Scene of Crime Information System developed within an EPSRC-funded project, to the
collection of data within the ImageCLEF track of the Cross Language Evaluation Forum 2003. The adaptations necessary to
participate in this activity are detailed, and initial results are briefly presented.
1. ImageCLEF Collection
ImageCLEF is concerned with the retrieval of images from a specific collection by the captions associated with those
images, and is running in relation to an EPSRC-funded project at Sheffield University (Eurovision, GR/R56778/01). The
image collection consists of 28,133 images from the photographic collection provided by St Andrews University
Library (Clough et al. 2003). Each of the 28,133 images is referred to and annotated by a single text file, and the full set
of annotations is contained within one SGML-based document1. Each annotation comprises identifiers for the text file and
the image files (DOCNO, SMALL_IMG, LARGE_IMG), the caption of the image (HEADLINE), a set of categories that
have been assigned to this image (CATEGORIES), a database record identifier (RECORD_ID) and an unlabelled chunk
of text describing the image.
<DOCNO>stand03_2093/stand03_27914.txt</DOCNO>
<HEADLINE>The Open Championship, St Andrews 1955. Dai Rees and Max Faulkner fishing.</HEADLINE>
<RECORD_ID>GMC-.000007.-.000009.-.000021</RECORD_ID>
Rees and Faulkner fishing. Three men in rowing boat tied up at jetty, one holding two fishing rods, one holding oar.
July 1955 George Middlemass Cowie Fife, Scotland GMC-7-9-21 mb/
<CATEGORIES>[piers and landing stages],[Fife all views],[rowing boats],[golf - general],[golf - British Open],[rowing],[angling],[battlefields],[fresh water fishing],[fishing vessels],[fishing equipment]</CATEGORIES>
<SMALL_IMG>stand03_2093/stand03_27914.jpg</SMALL_IMG>
<LARGE_IMG>stand03_2093/stand03_27914_big.jpg</LARGE_IMG>
The information encoded in the XML is intended for use in the retrieval task. For Task 1 of the track, automatic ad hoc
retrieval, a ranked set of up to 1000 images is to be retrieved; the information is used for other purposes in Task 2,
interactive image retrieval.
The above XML fragment refers to the image in Figure 1, which shows three men in a boat.
Figure 1: Example Image from the ImageCLEF collection
1 Although the file was proclaimed to be XML, a number of non-Unicode characters prevented its parsing. It was
necessary to replace these with their Hex sequences, ensuring full XML-conformance, to use this collection.
From the above example, it is apparent that some of the categories assigned to the images may not be wholly reliable.
While some of the associations are clear:
Term in annotation   Assigned categories
jetty                piers and landing stages
Fife                 Fife all views
rowing (boat)        rowing boats, rowing, fishing vessels
fishing (rods)       angling, fresh water fishing, fishing equipment
others could be associated with information that appears, but is not in the correct context (the combination of “Open
Championship” and “St Andrews” being a candidate for explaining the golfing categories), while the assignment of a
“battlefields” category is harder to explain.
2. Task 1: Automatic Ad Hoc Retrieval
The automatic ad hoc retrieval task aims at the ranked retrieval of up to 1000 images from the Eurovision collection. The
images are to be retrieved in response to a set of pre-formulated queries. The queries comprise 50 topics.
Each topic has an English query, plus a narrative description of the expected result of the query, and the English query
has been translated into 5 other languages: French, German, Spanish, Italian and Dutch. Some queries have more than one
translation for a given language.
The retrieval results are to be assessed by personnel from the University of Sheffield such that they can be evaluated
using the trec_eval program with recall and precision metrics. Similar to TREC, the results will subsequently be
published.
An example topic encoded in XML2 is shown below:
Number: 25
Golf course bunkers
A relevant image will show a picture of a golf course in which a bunker can be clearly identified. The picture
must be a photograph or a postcard, but not a drawing, e.g. a plan of the golf course. A bunker is a sandy hollow
formed by wearing away of the turf, or nowadays an artificial sand-hole with a built-up face. An example relevant
document is [stand03_1714/stand03_7020].
Number: 25
Golfplatz Bunker
Number: 25
Bunkers de terrain de golfe
Number: 25
Un bunker in un percorso di golf
bunkers in un campo di golf
Number: 25
Búnkers en un campo de golf
Pista de golf
Number: 25
Bunkers op een golfbaan
The example shown is for Topic 25, for which Golf course bunkers has been translated once into each of German,
French and Dutch, and twice each for Spanish and Italian. With multiple translations for some languages for the 50
topics, we have the following number of queries for the various languages:
Language  Queries
Spanish   117
English   50
French    51
Italian   103
German    50
Dutch     50
Total     421

2 Similar character issues as reported previously were also fixed for this collection.
These 421 queries are to be made against the 28,133 annotations to retrieve images from the collection.
3. The SoCIS Archetype
The EPSRC-funded Scene of Crime Information System (SoCIS) project was run from October 1999 to March 2003.
The aim of the project was to study the link between images and texts within a specialist domain context. A method has
been outlined for developing an intelligent content-based image retrieval (CBIR) system, which can store and retrieve
images based on the linguistic descriptions of the images. The corpus-based method uses the lexical and semantic
properties of specialist texts for extracting key terms and for discovering the ontological organisation of the terms.
A prototype CBIR system was developed in the Java programming language for demonstrating the efficacy of the
corpus-based method. The system, which is based on a 3-tier architecture of client, server, and database, can be accessed
via a local intranet. SoCIS is an intelligent CBIR system that automatically: (a) labels (and indexes) images by keywords
as well as relational facts extracted from the descriptions provided by domain experts; (b) extracts physical features of an
image; (c) populates a database comprising domain-specific terminology, together with the semantic relationships
between terms, starting from a random selection of collateral texts of the domain; and (d) learns to link image and text by
using neural networks (Ahmad et al., 2002). SoCIS has integrated modules from (a) System Quirk (Ahmad & Rogers,
2001) - a set of tools for building and managing multilingual term bases with the use of powerful text analysis
techniques, and (b) GATE (Cunningham et al., 2002) - a framework and graphical development environment comprising
robust NLP tools. The main advantage that SoCIS can be said to have over other text-based and CBIR systems is its
ability to extract information from both texts and images, to encode this information for indexing, and to build thesauri,
all automatically.
The SoCIS prototype3 was evaluated using images normally used for the training of Scene of Crime Officers (SoCOs)
together with a description provided by the SoCOs as well as other collateral texts like crime scene reports and forensic
science research papers and manuals. The question of (inter) indexer-variability, the variances in the output of different
indexers for the same image, has been explored in the project (Handy & Ahmad, 2003). This study further reinforced the
need for automatic thesauri construction to aid in query expansion (Ahmad et al., 2003a).
4. Adapting SoCIS
SoCIS was specifically targeted at the use of specialist languages – or Languages for Special Purposes (LSP) (Harris,
1988, Ahmad & Rogers, 2001). The system has been built based on the knowledge gathered from Scene of Crime
experts, from the testing and evaluation sessions performed with them, and from a domain-specific text corpus. The
system had to be adapted to deal with multilinguality as well as structured data from a more general domain for the
ImageCLEF collection. SoCIS does not have a translation tool so the translation of the queries from the other languages
to English had to be carried out offline as discussed in section 4.1. A parser had to be written to extract the various fields
containing textual information (in English) about the images from the provided XML document that could be used for
indexing purposes. The indexing module was used to extract single and compound terms from the output of the parser.
The main difficulty we encountered (see section 4.2) was the creation of a terminology dictionary and thesaurus related
to the general domain, which is needed for the automatic indexing and query expansion modules. We decided to use
Wordnet for query expansion purposes but the indexing had to be carried out without using a terminology dictionary to
filter out invalid terms. A new relevance ranking mechanism, which is briefly described in section 4.3, was adopted to
handle the expanded terms retrieved from Wordnet.
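The field extraction step described above can be sketched as follows. This is a minimal illustration in Python, not the Java parser used in SoCIS: the `<DOC>` wrapper element, the flat nesting, the choice of `TEXT_FIELDS` and the function name `extract_fields` are all assumptions made for the example; only the field names come from the collection description.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout: the <DOC> wrapper and the flat nesting are
# assumptions; only the field names are taken from the collection description.
SAMPLE = """<DOC>
<DOCNO>stand03_2093/stand03_27914.txt</DOCNO>
<HEADLINE>The Open Championship, St Andrews 1955. Dai Rees and Max Faulkner fishing.</HEADLINE>
<RECORD_ID>GMC-.000007.-.000009.-.000021</RECORD_ID>
<CATEGORIES>[rowing boats],[golf - general],[angling]</CATEGORIES>
<SMALL_IMG>stand03_2093/stand03_27914.jpg</SMALL_IMG>
</DOC>"""

# Fields assumed to carry English text that is useful for indexing.
TEXT_FIELDS = ("HEADLINE", "CATEGORIES")


def extract_fields(record_xml, fields=TEXT_FIELDS):
    """Return {field name: text} for the requested child elements of one record."""
    root = ET.fromstring(record_xml)
    extracted = {}
    for name in fields:
        node = root.find(name)  # direct children of the record only
        if node is not None and node.text:
            extracted[name] = node.text.strip()
    return extracted


record = extract_fields(SAMPLE)
```

The resulting dictionary is the kind of input an indexing module could consume; identifier fields such as DOCNO are deliberately left out of the text to be indexed.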
4.1. Handling Multilinguality
The first necessary step was the translation of the various queries into English. Without in-house translation software,
we relied upon translation engines found on the Internet. Some attempt was made to exploit Google’s translation tools
for this purpose, but difficulties were encountered. Eventually, Altavista’s Babelfish (http://babelfish.altavista.com/)
was selected as the principal translation engine; since this system does not translate Dutch,
FreeTranslation.com (http://www.freetranslation.com/) was also used.
To translate the queries, Java code was used to wrap definitions of the query syntax used by these sites (with the
HTTP POST command being used in both cases). Each query was posted to the site with its requested translation
language pair, and the HTML result was retrieved. Using the Java JTidy utility, the resulting HTML was converted to
XML (Bray et al., 2000), and XSLT (Clark, 1999) was employed to strip out the end result of the translation.
3 http://www.surrey.ac.uk/socis
The results of translating the various languages for topic number 25 (Golf course bunkers) are shown in the table
below:
Language     Translation into English
German       Golf course shelter
French       Bunkers of ground of gulf
Italian (1)  A bunker in a distance of golf
Italian (2)  bunkers in a golf course
Spanish (1)  B??nkers in a golf course
Spanish (2)  Track of golf
Dutch        Bunkers on a wave job
It is immediately apparent that some of these translations will cause problems for retrieval. The topic identifies the
image stand03_1714/stand03_7020 as being relevant. In the run, this image was located only for English, Italian (2), and
Dutch, at ranks 798, 798 and 45 respectively. The quality of the returned translation will therefore have a significant
impact on the results being returned.
4.2. Synonymy and Morphology
The thesaurus construction module of SoCIS was developed to provide a query expansion facility for the system. There
are general-purpose thesauri or lexicons available such as Wordnet4, which could be used but are inadequate in specialist
domains due to a deficiency in specialized terminology. For example, the two key compound terms ‘forensic science’
and ‘crime scene’ are not present in Wordnet. The method we developed was based on the analysis of a representative
domain-specific text corpus to automatically extract key terms and relationships, which were then used to build the
thesaurus (Ahmad et al., 2003a, Tariq et al., 2003). Since the ImageCLEF collection comprises a wide range of
mainly general topics such as buildings, golfers, animals, boats and so on, to apply our method we would have had to
construct and analyze a corpus representing most of general knowledge, a clearly difficult and impractical task. We
decided that Wordnet could be a possible resource to use for query expansion since its coverage is based on a general
English dictionary.
A program was written to query a Wordnet database to provide a set of synonyms and hyponyms for each of the
query terms. In Wordnet, English nouns, verbs, adjectives and adverbs are ordered into synonym sets (synsets). Each
synset can be said to contain the words that represent a specific concept. The synsets are then linked to each other based
on semantic relations such as antonymy, hyponymy and meronymy. Given a query term, the program returns all the
words in the synset that the particular term is an element of, as well as all the hyponyms of each synset element to a
specified level in the hierarchy. Initially we planned to go down 2 levels in the hierarchy but ended up using just the
synonyms due to system performance issues related to the large number of expanded terms returned, which is discussed
in section 5. Taking the query “Boats on Loch Lomond” as an example, the term ‘boat’ returned 53 expanded words
going down one level in the hierarchy. Some synonyms returned were: travel on water, sauceboat, gravy boat; some
hyponyms returned included motorboat, mail boat, mailboat, gondola, propel by oars, propel by paddles, yacht, and so
on. ‘Loch’ returned one synonym, lough, while ‘Lomond’ was not present since it is a proper noun. The very common
term ‘man’ had 131 expanded words going down one level and 344 expanded words going down two levels with words
such as private, make swollen, belly out, candy striper, Homo erectus, clothes horse, ridicule with a satire, and
gentleman.
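The expansion procedure described above can be sketched with a toy lexicon. The dictionaries below stand in for a Wordnet database and hold only a handful of the words quoted in the text; their structure, and the function names `expand` and `expand_query`, are hypothetical. The sketch also simplifies the real procedure by attaching hyponyms to the query term itself instead of to each element of its synset.

```python
# Toy lexicon standing in for a Wordnet database. The structure and helper
# names are hypothetical; the entries are a small subset of the words quoted
# in the text for 'boat' and 'loch'.
SYNONYMS = {
    "boat": ["sauceboat", "gravy boat", "travel on water"],
    "loch": ["lough"],
}
HYPONYMS = {
    "boat": ["motorboat", "mailboat", "gondola", "yacht"],
}


def expand(term, depth=0):
    """Return the synonyms of `term`, plus hyponyms down to `depth` levels."""
    expanded = list(SYNONYMS.get(term, []))
    frontier = [term]
    for _ in range(depth):
        next_frontier = []
        for word in frontier:
            next_frontier.extend(HYPONYMS.get(word, []))
        expanded.extend(next_frontier)
        frontier = next_frontier
    return expanded


def expand_query(terms, depth=0):
    """Expand every query term; unknown terms (e.g. proper nouns) map to []."""
    return {term: expand(term, depth) for term in terms}


# depth=0 corresponds to the synonyms-only setting eventually used in the run.
expansion = expand_query(["boat", "loch", "lomond"], depth=0)
```

Raising `depth` shows why the number of expanded terms grows quickly: each additional level adds the hyponyms of every word reached at the previous level.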
Some basic morphological analysis was also carried out for each query term to account for the use of variants such as
singular or plural terms as well as the verb or adjective forms. The morphology module uses standard rules (for example
if a word ends with ‘ss’ or ‘h’ then the plural form is usually derived by adding an ‘es’) as well as some common
exceptions (for example the plural of ~man will be ~men). This was also important for the query expansion part since
Wordnet only has singular forms of words as part of the synsets so a plural word used as the query term will return no
results.
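The two rules quoted above, together with the ~man exception, can be sketched as below. The function names and the default "add s" rule are assumptions for illustration, and the rules are deliberately as crude as the description: a word such as "human" would be mishandled by the ~man exception.

```python
def pluralise(word):
    """Derive a plural using the rules quoted in the text (illustrative only)."""
    if word.endswith("man"):
        return word[:-3] + "men"          # exception: ~man -> ~men
    if word.endswith("ss") or word.endswith("h"):
        return word + "es"                # 'ss' or 'h' ending: add 'es'
    return word + "s"                     # assumed default rule


def singularise(word):
    """Reverse the same rules so that a plural query term can be looked up."""
    if word.endswith("men"):
        return word[:-3] + "man"
    if word.endswith("sses") or word.endswith("hes"):
        return word[:-2]
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


query_term = singularise("boats")
```

Reducing "boats" to "boat" before the lookup matters because, as noted above, only singular forms appear in the synsets.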
4.3. Relevance Ranking
Each keyword contributes its frequency in an annotation divided by the total number of terms allocated to that
annotation. This proportion is then multiplied by a weight: 1 for an original keyword, 0.9 for each expanded term
(synonym) returned by WordNet, and 0.1 for words containing substrings of the original keywords. The total ranking
was then given by:
\[ \mathrm{Rank} = \sum_{t} \frac{f_{td} \times w_t}{N_d} \]
where $f_{td}$ is the term frequency of term $t$ in document $d$, $w_t$ is the weight of term $t$ as described
previously, and $N_d$ is the total number of words in document $d$.
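A minimal sketch of this ranking is given below. It is an illustration and not the SoCIS implementation: the function names are hypothetical, and the substring rule is interpreted, as an assumption, to mean that an annotation word containing an original keyword inside it receives the weight 0.1.

```python
from collections import Counter

W_ORIGINAL, W_SYNONYM, W_SUBSTRING = 1.0, 0.9, 0.1


def term_weight(word, keywords, synonyms):
    """Weight w_t of one annotation word, given the query keywords and synonyms."""
    if word in keywords:
        return W_ORIGINAL
    if word in synonyms:
        return W_SYNONYM
    # Assumed reading of the substring rule: the word contains a keyword.
    if any(keyword in word for keyword in keywords):
        return W_SUBSTRING
    return 0.0


def rank(annotation_words, keywords, synonyms):
    """Sum of f_td * w_t over the terms of the annotation, divided by N_d."""
    n_d = len(annotation_words)
    if n_d == 0:
        return 0.0
    counts = Counter(annotation_words)
    weighted = sum(f * term_weight(w, keywords, synonyms) for w, f in counts.items())
    return weighted / n_d


# Made-up five-word annotation: one keyword, one synonym, one substring match.
score = rank(["boat", "moored", "on", "lough", "boathouse"],
             keywords={"boat", "loch"}, synonyms={"lough"})
```

For this made-up annotation the weighted sum is 1 + 0.9 + 0.1 = 2 over 5 words, giving a score of 0.4.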
4 http://www.cogsci.princeton.edu
5. Performance Issues
The main factor to have an effect on the performance of SoCIS was that the system has been designed for the analysis
of free text in specialist domains whereas with the ImageCLEF collection we were dealing with structured texts in a
general domain. This resulted in difficulties for SoCIS when indexing the images – the indices produced were relatively
unreliable due to the different syntactic structure of the ImageCLEF text when compared to free text, which also affected
the ranking. One example here is that the system considered all the category terms given by the ImageCLEF description
in the XML document (since they were enclosed in square brackets) as a single compound term. Also, because
we used Wordnet for query expansion, we encountered problems associated with polysemous words as well as
different word forms (see the example of boat and man in section 4.2). Due to the amount of time it was taking to
process the expanded queries (sometimes reaching up to 300 words, see section 4.2) we had to limit the expansion to just
synonyms of the original query terms. Even so, we had six computers running in parallel to finish the processing, which
was taking approximately 8 hours per language.
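One way of avoiding the single-compound-term problem with the categories is to split the bracketed list before indexing. The sketch below is a suggestion only and was not part of the evaluated system; the function name `split_categories` is hypothetical.

```python
import re


def split_categories(field):
    """Split a '[a],[b],[c]' category string into a list of separate terms."""
    return [category.strip() for category in re.findall(r"\[([^\]]*)\]", field)]


categories = split_categories(
    "[piers and landing stages],[Fife all views],[golf - British Open]"
)
```

Each bracketed category then becomes an index term in its own right, instead of the whole list being treated as one compound term.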
6. Results and Evaluation
Although the combination of features outlined above would require significant efforts to develop as a usable real-world
system (parallelisation and optimisation issues at least), the combination of technologies and techniques presented did
enable participation in the ImageCLEF track. A system that in principle would allow a user to query a collection of
images that have been annotated in English, using a query in one of six languages has been prototyped from this
combination. According to the abstract from the Eurovision project, such a system had not been implemented or
researched. Though far from perfect, the evaluation of the results obtained at this stage is important.
Across all languages, the following sets of results were obtained (missing topics and quantities for that topic are given
in the third column):
Language  Result sets / queries  Missing topics (number of queries)
Spanish   105 / 117              32 (3), 33 (1), 34 (1), 36 (1), 39 (2), 43 (3), 47 (1)
English   48 / 50                40, 46
French    47 / 51                7, 17, 25
Italian   91 / 103               13 (2), 17 (1), 27 (3), 29 (2), 31 (1), 39 (1), 43 (1), 45 (1)
German    43 / 50                4, 7, 13, 27, 40, 46, 48
Dutch     38 / 50                5, 7, 13, 17, 18, 20, 27, 29, 36, 39, 40, 43
Total     372 / 421
From a selection of topics, we should evaluate where the exemplar image is ranked and the relevance of the top 10
images retrieved to the query.
Topic  Caption                                             Exemplar
7      Home guard on parade during World War II            stand03_1955/stand03_24985
14     Boats on Loch Lomond                                stand03_1346/stand03_15600
21     Animals by the photographer Lady Henrietta Gilmour  stand03_1955/stand03_5603
28     Pictures of golfers in the nineteenth century       stand03_2036/stand03_7549
35     The mountain Ben Nevis                              stand03_1643/stand03_4692
42     University buildings                                stand03_1853/stand03_21431
Topic  Language and Rank
7      Not found
14     Not found
21     Dutch [884], English [408], Spanish [408, 274, 884], French [408], German [764], Italian [408, 700]
28     Italian [179]
35     French [886], Italian [361]
42     Dutch [971]
For this selection of 6 topics, the exemplar image is found by the English query only for topic 21. This is an initially
disappointing result. We consider, first, the top image retrieved for each of these topics.
Topic    Top image                    Caption
7 (En)   stand03_1749/stand03_22144   Littlehampton. The Parade.
14 (En)  stand03_1502/stand03_16737   The Castle, Loch an Eilein
21 (En)  stand03_1675/stand03_22740   Engraving of a painting of a Biblical scene, [Noah, family and the Ark at Mount Ararat].
28 (En)  stand03_1714/stand03_7540    Old Tom Morris, golfer, St Andrews. (ca 1900)
35 (En)  stand03_1851/stand03_7899    Trossachs. Loch Achray, Trossachs Church and Ben An or Binnein (Ben A 'an).
42 (En)  stand03_1590/stand03_28349   Samuel Messieux, refugee from Paris and teacher of French at Madras College [South Street], St Andrews.

7 (Fr)   No results
14 (Fr)  stand03_1853/stand03_12134   Boat of Garten.
21 (Fr)  stand03_1675/stand03_22740   Engraving of a painting of a Biblical scene, [Noah, family and the Ark at Mount Ararat].
28 (Fr)  stand03_2046/stand03_13818   Kingsbarns. Old Grave Stone, Kingsbarns Churchyard.
35 (Fr)  stand03_1851/stand03_7899    Trossachs. Loch Achray, Trossachs Church and Ben An or Binnein (Ben A 'an).
42 (Fr)  stand03_1590/stand03_28349   Samuel Messieux, refugee from Paris and teacher of French at Madras College [South Street], St Andrews.

7 (De)   No results
14 (De)  stand03_1857/stand03_9586    View of ship at sea.
21 (De)  stand03_1675/stand03_22740   Engraving of a painting of a Biblical scene, [Noah, family and the Ark at Mount Ararat].
28 (De)  stand03_2046/stand03_13818   Kingsbarns. Old Grave Stone, Kingsbarns Churchyard.
35 (De)  stand03_1851/stand03_7899    Trossachs. Loch Achray, Trossachs Church and Ben An or Binnein (Ben A 'an).
42 (De)  stand03_2054/stand03_18895   Motherwell. Town Hall.

7 (It)   stand03_1587/stand03_28525   [Walker family?] Untitled portrait of a man.
14 (It)  stand03_1502/stand03_16737   The Castle, Loch an Eilein
21 (It)  stand03_1675/stand03_22740   Engraving of a painting of a Biblical scene, [Noah, family and the Ark at Mount Ararat].
28 (It)  stand03_2046/stand03_13818   Kingsbarns. Old Grave Stone, Kingsbarns Churchyard.
35 (It)  stand03_1778/stand03_4502    Launch X.
42 (It)  stand03_2054/stand03_18895   Motherwell. Town Hall.

7 (Es)   stand03_1587/stand03_7524    Man in theatrical costume. [St Andrews ?].
14 (Es)  stand03_1853/stand03_12134   Boat of Garten.
21 (Es)  stand03_1675/stand03_22740   Engraving of a painting of a Biblical scene, [Noah, family and the Ark at Mount Ararat].
28 (Es)  stand03_2046/stand03_13818   Kingsbarns. Old Grave Stone, Kingsbarns Churchyard.
35 (Es)  stand03_2092/stand03_14170   Lochgilphead. Crinan Canal at
42 (Es)  stand03_1590/stand03_28349   Samuel Messieux, refugee from Paris and teacher of French at Madras College [South Street], St Andrews.

7 (Nl)   stand03_1587/stand03_7524    No results
14 (Nl)  stand03_1502/stand03_16737   The Castle, Loch an Eilein
21 (Nl)  stand03_1974/stand03_11773   Brompton Oratory. Altar of Our Lady of Good Counsel.
28 (Nl)  stand03_2046/stand03_13818   Kingsbarns. Old Grave Stone, Kingsbarns Churchyard.
35 (Nl)  stand03_1853/stand03_21295   Fettercairn. Cairn o' Mount and Clatterin' Brig
42 (Nl)  stand03_2054/stand03_18895   Motherwell. Town Hall.
These tables of results show some interesting features. For topic 7, 3 of the queries returned no results, while those
that did return results each have a different first result. For topic 14, 5 of the 6 results refer to just 2 images. For
topic 21, all but the Dutch result refer to the same image. For topic 28, all but the English result refer to the same
image; however, judging by the caption, the English result is the best. For topic 35, one image is referred to in 3
results. For topic 42, 2 images are equally referred to.
For Topic 14, the top 5 results have been taken once for each language, and the similarity matrix between these
results is as follows:
En Fr De Es It Nl Total
16737 2 1 1 2 4
12134 1 1 1 3
14211 3 3 2 3
22301 4 4 3 3
16430 5 5 5 3
29031 3 3 2
16014 4 4 2
12138 5 5 2
16009 2 1
9586 3 1
9587 4 1
4618 5 1
13150 1 1
16833 2 1
5702 2 1
20573 4 1
The top 5 results show degrees of similarity between the English, Italian and Dutch results, with German and Spanish
showing similarities, and French showing the most marked behavioural difference. The top 5 images have captions as
follows:
The Castle, Loch an Eilein
Boat of Garten.
Dunkeld. Loch of Craiglush and Creag nam Mial (Creagnam Hill).
Linlithgow Palace and Loch, from the air.
Bearsden. St Germain's Loch.
It would appear that a number of lochs have been discovered in response to this query, but not Loch Lomond with any
boats on it! Indeed, none of the 16 results above makes mention of Lomond.
From this, it is apparent that although similar behaviour is achieved for certain language translations, the end result of
retrieval is not correctly weighted. The initial concern that translation would have a significant bearing on retrieval is
perhaps now not so relevant as the retrieval itself.
Taking a list of the exemplar images for retrieval, the ranking (where it exists) of that image within the 1000 results
for each language was considered. For each language, if the exemplar image was retrieved within the first 1000, this was
counted. If it was retrieved within the top 100 results, this was also noted. The following table presents the results
obtained.
Language  Exemplars retrieved  Queries  High  Low  Ave     In top 100
Nl        20                   50       17    971  319.55  7
De        21                   50       11    973  309.05  11
En        28                   50       8     798  257.96  12
Fr        33                   51       11    995  337.52  11
It        42                   103      1     967  353.14  14
Es        51                   117      7     884  314.52  19
For two queries in Italian, both for Topic 19, the exemplar image was retrieved in first place. This is certainly a result
of interest given the analysis of other results in this paper. In the above table, the first column gives the language
code, the second the number of exemplar images retrieved in the 1000 results, and the third the number of queries; the
fourth and fifth show the highest and lowest ranking of the exemplars, and the sixth shows the average
ranking. Column 7 shows the number of exemplars occurring in the first 100 retrieved results. This set of results tends
to indicate that there is some value to the approach taken here, but how that compares to other approaches remains to be
seen.
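The per-language figures in the table can be derived from the exemplar ranks as sketched below. The function name and the input ranks are hypothetical; the ranks are made up for illustration and are not taken from the run.

```python
def exemplar_summary(ranks, cutoff=100):
    """Summarise exemplar ranks for one language.

    `ranks` holds one entry per query: the rank of the exemplar image within
    the retrieved results, or None when the exemplar was not retrieved.
    """
    found = [r for r in ranks if r is not None]
    summary = {"retrieved": len(found), "queries": len(ranks)}
    if found:
        summary["high"] = min(found)   # best (numerically smallest) rank
        summary["low"] = max(found)    # worst rank
        summary["average"] = sum(found) / len(found)
        summary["in_top"] = sum(1 for r in found if r <= cutoff)
    return summary


# Made-up ranks for five queries, one of which missed the exemplar.
summary = exemplar_summary([12, 250, None, 900, 75])
```

Note that the "highest" ranking is the numerically smallest value, which is why a first-place retrieval appears as 1 in the High column.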
7. A Note on Text and Image Retrieval
Increasingly, images are being indexed and retrieved by both their visual content and by related texts such as captions
that describe the image (Srihari, 1995, Srihari et al., 2000, Paek et al., 1999, Barnard & Forsyth, 2001). Image
descriptors extracted directly from image data (colour, texture and shape) tend to capture little of an image’s semantic
content (Squire et al., 2000, Eakins, 2002) – hence there is a need to extract information about the image content from
collateral texts (Smeulders et al., 2000, Gillam et al., 2002, Salway & Frehen, 2002, Ahmad et al., 2003a).
8. Future Work
Numerous improvements suggest themselves. For example, if the system could be grid-enabled then the different
processing modules, as well as instances of the same module, could be run as a service, in parallel, which would
significantly improve the processing time. The ranking mechanism needs to be further refined and tuned by carrying out
more trial runs. To improve the query expansion, one suggestion is to use part-of-speech information from the
query sentence to filter out some of the irrelevant expanded terms returned from Wordnet. For example, in the query
“Boats on Loch Lomond”, the term boat is being used in the noun and not the verb form, so the synonyms (propel by oars,
propel by paddles) related to the verb form of boat would not be retrieved. Otherwise, an attempt could be made to
analyze the British National Corpus5, which might perhaps yield only the more frequently used term associations.
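The part-of-speech filter suggested above can be sketched as follows. The data structure and the function name are hypothetical, and the part of speech of the query term is assumed to have been determined already (for example by a tagger); the tagged expansions for boat follow the noun and verb senses mentioned in the text.

```python
# Toy data: (expanded term, part of speech) pairs for 'boat'. The structure is
# hypothetical; the noun/verb split follows the example discussed in the text.
EXPANSIONS = {
    "boat": [
        ("sauceboat", "noun"),
        ("gravy boat", "noun"),
        ("propel by oars", "verb"),
        ("propel by paddles", "verb"),
    ],
}


def expand_with_pos(term, pos):
    """Keep only the expansions whose part of speech matches the query usage."""
    return [word for word, tag in EXPANSIONS.get(term, []) if tag == pos]


# In "Boats on Loch Lomond" the term is used as a noun.
noun_terms = expand_with_pos("boat", "noun")
```

The filter removes the verb-sense expansions, although noun senses unrelated to the query (such as sauceboat) would still remain.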
Since the system deals with image retrieval, we are investigating methods of effectively combining text-based with
image-based retrieval techniques. The physical features of an image such as colour, texture, and shape can be extracted
and used in combination with the text features. This technique when incorporated into a system that learns how to index,
would result in a significant improvement in performance (Ahmad et al., 2003b). We are also investigating the creation
of multimedia thesauri, based on Picard’s initial work (Picard, 1995). The premise here is that since specialist texts can
be said to be a reflection of the ontological commitment of domain experts, specialist images may also reflect some form
of ontological commitment on the part of the expert. Also, objects depicted in specialist images often represent the same
concepts that are represented by lexical units in texts. The method discussed in Ahmad et al. (2002) could help in
establishing the link between an image and text.
9. Acknowledgements
This work was partially funded by the European Union through the Generic Information-based Decision Assistant
(GIDA: IST-2000-31123) project and the EPSRC through the Scene of Crime Information System (SOCIS:
GR/M89041/01) project.
References
Ahmad, K., Vrusias, B., Tariq, M. (2002) “Co-operative neural networks and integrated classification”. In Proceedings of
the International Joint Conference on Neural Networks. Hawaii, USA, May 2002. IEEE Press.
Ahmad, K. and Rogers, M., (2001) “Corpus linguistics and terminology extraction”. In Wright S.E. and Budin G. (eds.)
Handbook of Terminology Management, Vol. 2, Amsterdam/Philadelphia: Benjamins, pp. 725-760.
Ahmad, K., Tariq, M., Vrusias, B. and Handy C. (2003a) “Corpus-Based Thesaurus Construction for Image Retrieval in
Specialist Domains”. In F. Sebastiani (ed.): Proceedings of the 25th European Conference on Information Retrieval
Research, ECIR-03, Pisa, Italy, LNCS 2633. Heidelberg: Springer-Verlag, pp. 502-510.
Ahmad, K., Casey, M., Vrusias, B., and Saragiotis, P. (2003b) “Combining Multiple Modes of Information Using
Unsupervised Neural Classifiers”. In: Windeatt, T. and Roli, F. (eds.), Proceedings of Multiple Classifier Systems 4th
Int. Workshop, Guildford, UK, June 11-13, 2003, LNCS 2709. Heidelberg: Springer-Verlag, pp. 236-245.
Barnard, K., and Forsyth, D. (2001) “Learning the Semantics of Words and Pictures”. International Conference on
Computer Vision, Vol 2, pp 408-415.
Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E. (eds.), (2000). “Extensible Markup Language (XML) 1.0
(Second Edition)”. W3C Recommendation. http://www.w3.org/TR/REC-xml
Clark, J. (ed.), (1999). “XSL Transformations (XSLT)”, Version 1.0. W3C Recommendation. http://www.w3.org/TR/xslt
Clough, P., Sanderson, S., Reid, N. (2003). “The Eurovision St Andrews Photographic Collection (ESTA)”.
http://ir.shef.ac.uk/imageclef/guide.pdf (February 2003)
Cunningham, H., Maynard, D., Bontcheva, K., Tablan, V. (2002) “GATE: A framework and graphical development
environment for robust NLP tools and applications”. In Proceedings of the 40th Anniversary Meeting of the
Association for Computational Linguistics, 2002.
5 http://www.hcu.ox.ac.uk/BNC
Eakins, J.P., (2002) ‘Towards intelligent image retrieval’. Pattern Recognition. Vol 35, pp 3-14.
Gillam, L., Ahmad, K. and Salway, S. (2002) Digital Heritage and the use of Terminology. In Proceedings of 6th
International Conference Terminology and Knowledge Engineering (TKE) 2002 ISBN 2-7261-1217-X
Handy, C.J. and Ahmad, K. (2003) “Indexer Variability in Visual Domains”. To appear in: Proceedings of the 13th LSP
Conference.
Harris, Z.S., (1988) “Language and Information”. In: Nevin, B. (ed.) Computational Linguistics, Vol. 14, No.4,
Columbia University Press, New York, pp. 87-90.
Paek, S., Sable, C. L., Hatzivassiloglou, V., Jaimes, A., Schiffman, B.H., Chang, S-F., and McKeown, K. R. (1999)
“Integration of visual and text based approaches for the content labelling and classification of Photographs” ACM
SIGIR'99 Workshop on Multimedia Indexing and Retrieval, Berkeley, California, USA.
Picard, R.W., (1995) “Towards a Visual Thesaurus”. In: Ian Ruthven (ed.) Springer Verlag Workshops in Computing,
MIRO 95, Glasgow, Scotland.
Salway and Frehen (2002), “Words for Pictures: analysing a corpus of art texts”. In Proceedings of 6th International
Conference Terminology and Knowledge Engineering (TKE) 2002 ISBN 2-7261-1217-X
Smeulders, A.W.M., Worring, M., Santini, S., Gupta, A., and Jain, R. (2000) “Content-Based Image Retrieval at the End
of the early Years”. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 12, IEEE Press,
pp. 1349-1380.
Squire, McG.D., Muller, W., Muller, H., Pun, T. (2000) “Content-Based Query of Image databases: Inspirations from
Text Retrieval”. Pattern Recognition Letters, Vol. 21. No. 13-14. Elsevier Science, Netherlands, pp. 1193-1198.
Srihari R.K., (1995) “Use of Collateral Text in Understanding Photos”. Artificial Intelligence Review (Special Issue on
Integrating Language and Vision), Vol. 8, pp. 409-430.
Srihari, R.K. and Zhang, Z. (2000) “Show&Tell: a Semi-Automated Image Annotation System”. IEEE Multimedia,
Vol.7, No. 3, pp. 61-71.
Tariq, M., Manumaisupat, P., Al-Sayed, R. and Ahmad, K., (2003) “Experiments in Ontology Construction from
Specialist Texts”. To appear in: Proceedings of EUROLAN Workshop: Ontologies and Information Extraction,
Bucharest, Romania, July 28 -August 08.