HILT: High-Level Thesaurus Project Proposal
Contents
1. Introduction, Background, and Overview
2. Direct Contribution to the DNR
3. Relevance to the DNER and the RDN
4. Expected Impact
5. Regional, National and International Importance to Researchers
6. Description of Collection, Strengths, Size
7. Purpose and Outline Description of the Project
8. Lead Institution, Partners, Relationship to Core Institutional Objectives
9. Project Management Proposals, Mechanisms for Self-monitoring, Self-evaluation
10. External Evaluation
11. Deliverables
12. Start and Finish Dates
13. Milestones; Schedule
14. Standards
15. Person Weeks required for the Work
16. Total Estimated Cost and Contribution from RSLP
17. Dissemination Strategy
18. Proposed Exit Strategy
19. Statement of Policy on Access
20. Statement on Minimum Standards for Bibliographic Records
21. Biographies of Key Individuals
22. Key Institutional Contacts
1. Introduction, Background, and Overview
Introduction
This
full bid can be considered either as a SCONE extension proposal
or as relating to Collaborative Collection Management Projects
priority E.1.2.5 of the RSLP call (Annexe B): 'Establishment
and evaluation of the costs and benefits of different approaches
to the co-ordination of collection management and service delivery
activities' or E 1.3, which covers projects in areas other
than those specified by RSLP. It was invited by RSLP and JISC
subsequent to RSLP discussing the possibility of joint funding
with JISC after the HILT expression of interest had been accepted
by the RSLP Steering Group. The full bid has been re-written
subsequent to discussions with Ronald Milne and Chris Rusbridge
which suggested re-focusing the bid, widening the participants
to include archives community participants, and removing the
demonstrator proposed in the expression of interest (amongst
other things). The proposal comes from a partnership comprising
the Centre for Digital Library Research, UKOLN, OCLC, The MDA,
The National Council on Archives, NGfL (Scotland), SUfI and
the Scottish Library and Information Council (SLIC). Other relevant
organisations will be represented through the Steering Group.
Full details of the proposal are presented below (see sections 1 (Overview),
7 and 11 in particular). In general outline, however, the project
aims to study and report on the problem of cross-searching and
browsing by subject across a range of communities, services,
and service or resource types (libraries, museums, archives,
the DNR, clumps, the DNER, the RDN, bibliographic databases,
numeric data, and others). What exactly is the nature and structure
of the problem? Who needs to solve it? What does current expert
opinion say about the issue? What are the requirements of a
solution? Can a common subject scheme be found that will meet
all or nearly all of these requirements? If not, is a UK high-level
thesaurus that is integrated with one or more international
schemes the answer? The group would wish to look at these and
any other options that may exist once the exact nature of the
problem itself has been defined and documented.
Background
From
the SCONE perspective, the HILT proposal arises originally out
of the SCONE deliverable to map the Conspectus subject scheme
used in CAIRNS and SCONE to other UK schemes but goes far beyond
this. It was found to be necessary to go beyond the SCONE mapping
because other schemes in use (e.g. in the other clumps projects
and the Research Assessment Exercise (RAE)) were one-level schemes
and did not have the depth required either to map to the Conspectus
scheme - a three-level hierarchy - or to provide useful navigation
within and beyond CAIRNS. An alternative UK-oriented scheme
with similar depth - or a means of mapping to such a scheme
- is required and one aim of the HILT proposal is to address
this requirement. Earlier work done by SCONE, CAIRNS and NGfL
(Scotland) looking at the Conspectus scheme, the all-subjects
schools scheme used by NGfL (Scotland) and various other schemes
will help inform the specifics of this particular requirement.
This work had support from CAIRNS, SCONE, National Library of
Scotland (NLS), NGfL (Scotland), SLAINTE, the other UK clumps
projects (RIDING, M25 and Music Libraries Online), and the cross-sectoral
group represented by the Scottish Library and Information Council's
(SLIC) Advisory Group on Interoperability and Access (SAGIA)
which covers not only Higher Education, but also Further Education
(Glasgow Telecolleges Network), Public libraries, NGfL (Scotland),
the Scottish Cultural Resources Access Network (SCRAN) and the
Scottish University for Industry (SUfI). The HILT expression
of interest which preceded this full bid was a development of
this early work, and was further informed by discussions at
the MODELS 11 workshop on terminology held in Bath on January
11th of this year. One outcome of the workshop was
agreement that a HILT expression of interest should be submitted
to RSLP. MODELS 11 included representatives from a wide range
of organisations. In addition to those in the original SCONE
group described above, these included the UK Office for Library
and Information Networking (UKOLN), the Museum Documentation
Association (MDA), the National Preservation Office (NPO), the
British Library, English Heritage, the Science Museum, British
Educational Communications and Technology Agency (BECTA), Natural
History Museum, National Museums of Scotland, the Higher Education
Funding Council for England (HEFCE), and the Library and Information
Commission (LIC).
Overview of HILT proposal
This
full bid is informed by the earlier work described above, but
has been shaped in the main by the subsequent discussions with
RSLP and JISC, and by the comments of the external assessors
on the original expression of interest.
As
indicated earlier, the HILT proposal comes from a partnership
comprising the Centre for Digital Library Research, OCLC, UKOLN,
The MDA, The National Council on Archives, NGfL (Scotland),
SUfI and the Scottish Library and Information Council (SLIC).
All have a common interest in facilitating user searching and
browsing by subject, whether this be within a single service
(e.g. SLIC), across a group of similar services (e.g. the RDN),
or across a group of services that span sectors, domains, regions,
professions, languages, time periods with differing terminologies,
countries, or a mixed subset of these (one example covering
some of these would be a clumps project such as CAIRNS). The
aim of the HILT proposal is to determine how this requirement
to offer users subject searching and browsing - or, more commonly,
cross-searching and browsing - can best be met, when the various
communities, services and initiatives who have the need (HE,
FE, public libraries, Museums, The Archives Community, NGfL,
UfI, the RDN, the DNER, the Clumps projects, and others) usually
have different requirements, take different approaches, and,
more often than not, use different subject schemes. More specifically,
the HILT project aims to:
- Thoroughly research, determine and document the exact nature of the
  problem in detail, focusing on UK requirements across the various
  communities, services and initiatives, but setting the study firmly
  (and necessarily) in the context of international requirements and
  standards:
  - Surveying and reviewing both the literature and expert opinion
  - Identifying all key communities, services and initiatives with an
    interest in resolving it
  - Determining what perspective on the problem the various communities
    have and what they see their users' requirements as being (Note:
    it was agreed with RSLP and JISC that this was the only practical
    way of taking user needs into account at this stage)
  - Determining which subject schemes or thesauri are used by the
    stakeholder groups and also which other schemes exist that might
    solve the problem
  - Identifying relevant organisational, inter-organisational and
    'political' issues
  - Identifying non-terminological and non-technical barriers to the
    adoption of any given solution (e.g. barriers to uptake by
    stakeholders or their cataloguers, difficulty and cost of
    retroconversion of legacy metadata, etc.)
- Analyse the data obtained in this exercise, and discuss the results
  with the various communities, with a view to determining:
  - The exact nature of the subject terminologies problem itself
  - The structural requirements of any solutions (including not only
    terminological relationship requirements but also other elements
    such as ease and cost of maintenance)
  - Other requirements (organisational, non-terminological and
    non-technical barriers etc.)
  - User and machine interface issues
  - Requirements in respect of integrating with subject terminologies
    and thesauri focused on specific subject areas (and other, similar,
    'narrow-focus' in-depth schemes)
- Use this information to reach a consensus within the project as to
  whether:
  - There is an existing universal subject scheme, thesaurus, or other
    solution that meets the requirements (or nearly so)
  - It would be possible to adapt one or more existing schemes,
    thesauri, or other solutions to solve the problem
  - It is necessary (and possible) to create a subject scheme, or
    thesaurus, or other subject organisation and indexing system to
    solve the problem
  - It is impossible to solve the problem and, if so, why, and what
    the implications of this are for users
- Attempt to reach a similar consensus within the group of stakeholders
  generally, both at a MODELS series workshop and through other methods
- Contribute to and co-operate with an external evaluation of the
  project
- Make a final report and recommendations to RSLP, JISC, the various
  stakeholders, and the national and international community generally.
2. Direct Contribution to the DNR
A
successful project outcome will help make the DNR more accessible
by proposing a means by which cross-searching and cross-browsing
by subject can be improved, either by the identification of a
common subject scheme that could be accepted and used by different
sectors and domains, different regions of the UK, and major services
and initiatives (RDN, Archives Hub, DNER, clumps projects, SCRAN)
and that is internationally recognised and used, or by the specification
of a means by which different schemes required by different communities
and services may be mapped to such a common scheme, potentially
producing both human and machine-readable outputs.
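As a purely illustrative sketch of the second option, the fragment below shows one way a mapping from community schemes to a common high-level scheme could be held in machine-readable form and used to expand a single high-level heading into the local terms each service would be searched with. The scheme names, terms and headings are invented for the example and are not drawn from any real scheme; the function names are likewise hypothetical, and the project has not committed to this or any other design.

```python
from collections import defaultdict

# Hypothetical crosswalk: (local scheme, local term) -> common high-level heading.
# Scheme names, terms and headings are invented for illustration only.
CROSSWALK = {
    ("library_scheme", "Painting, Scottish"): "Visual arts",
    ("museum_scheme", "paintings"): "Visual arts",
    ("archive_scheme", "Shipbuilding records"): "Industry",
    ("library_scheme", "Shipbuilding"): "Industry",
}


def build_reverse_index(crosswalk):
    """Group local terms under each common heading, per local scheme."""
    index = defaultdict(lambda: defaultdict(list))
    for (scheme, term), heading in crosswalk.items():
        index[heading][scheme].append(term)
    # Convert to plain dicts with sorted term lists for stable output.
    return {
        heading: {scheme: sorted(terms) for scheme, terms in schemes.items()}
        for heading, schemes in index.items()
    }


def expand_query(heading, reverse_index):
    """Return, per local scheme, the terms a cross-search would use."""
    return reverse_index.get(heading, {})


reverse_index = build_reverse_index(CROSSWALK)
print(expand_query("Visual arts", reverse_index))
```

The same table serves both outputs mentioned above: printed per heading it gives a human-readable concordance, while the reverse index is the form a cross-searching service could consult. A heading with no mapped terms simply returns an empty result, which is one way unmapped areas of a scheme would show up.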
3. Relevance to the DNER and the RDN
The DNER and the RDN both have a requirement for high-level subject
searching and browsing and have not yet made a decision on how
to approach the issue. The outcome of HILT will therefore be of
interest to them. If a suitable solution were available as a result
of HILT they would adopt this rather than fund an additional study
of their own.
4. Expected Impact
All
disciplines are covered by this proposal, so there is no particular
impact on any given discipline. A positive outcome would benefit
all disciplines equally and would also offer major benefits to
researchers, teachers and students whose work is multi-disciplinary.
5. Regional, National and International Importance to Researchers
Since
the aim is to find a solution that will be of benefit in the context
of the whole DNR (and therefore the DNER), it is safe to assume
that the collections to which it is proposed HILT be applied contain
a significant range of materials of regional, national or international
value.
6. Description of Collection, Strengths, Size
The
aim is to provide a common subject scheme that will apply across
the whole of the distributed national resource (DNR). In so far
as this question is relevant to the proposal, therefore, the collection
in question is the DNR and has its strength and size.
7. Purpose and Outline Description of the Project
The
purpose of the project is to study and report on the problem of
cross-searching and browsing by subject across a range of communities,
services, and service or resource types (Libraries, Museums, Archives,
the DNR, clumps, the DNER, the RDN, bibliographic databases, numeric
data, and others) - to research the problem, analyse and document
its exact nature in detail, determine whether it can be solved
and, if so, how, and attempt to reach a consensus on the issue
across the various communities, services and initiatives identified
by the project as stakeholders.
This
would be done by:
- Surveying
the literature for publications on the issue and for UK and
International perspectives on it; discussing the topic with
internationally active groups (OCLC, UNESCO and others). This
would look at issues relating to controlled vocabularies generally
and the project report would contain a preamble relating
to this. However, the intention is that the project should
focus mainly on the issue of subject terminologies
- Identifying
all key UK groups, communities and services with an interest
in finding a solution to this problem; creating mechanisms
for informing these stakeholders on project progress and obtaining
ongoing input from them (web-site, e-list, possibly a newsletter
if any participants will have difficulties accessing the web-site
and the e-list)
- Discussing
various project elements and approaches with the external
evaluator to agree a way forward
- Discussing
the problem with each of the stakeholders with a view to establishing
and documenting the details of their perspective on the problem
and related issues - the subject schemes they use (e.g. LCSH,
UNESCO thesaurus, Cornucopia, DDC), the requirements of their
users, the elements of the problem from their viewpoints,
their needs as regards integration of any UK solution with
international schemes, their views on which, if any, existing
universal schemes they see as potentially offering a solution
or partial solution to either the question of international
integration, or the cross-searching or browsing problem, or
both.
- On
the basis of this survey, and a follow up search for relevant
publications and projects, widening the review of the literature
to form a more complete view of available knowledge and opinion
on the issue
- Making
information on the main universal subject schemes identified
in this process available to the various communities (archives,
museums, libraries etc.) and ascertaining the views of each
on the merits and demerits of the various schemes.
- Conducting
a survey of literature, projects, organisations, and individuals
to determine current views on best practice in respect of
both user and machine oriented interfaces both to thesauri
and to subject terminologies generally
- Organising,
evaluating and analysing the data from the above data gathering
exercises, identifying a range of possible approaches to the
various issues raised, and compiling a first draft of a report
on the project, together with draft recommendations on the
best approach to the problem. This would look at the structural
requirements of any solution proposed, including maintenance
issues, community control issues, and issues related to the
needed inter-relationships between subject terms in any thesaurus
- Making
the report widely available for comment for a period leading
up to a major workshop in the MODELS series, this to include
breakout sessions on key issues
- Producing
a new draft of the report based on workshop outcomes and disseminating
widely for comment
- Producing
a penultimate draft of the report
- Bringing
in the external evaluator at this stage to evaluate the project
and the report
- Producing
a final report and recommendations, incorporating the external
evaluation report itself and, if appropriate, any changes
it proposes
All of the above would, of course, be monitored by a Steering Group
with representation of all major stakeholders.
8. Lead Institution, Partners, Relationship to Core Institutional Objectives
The
proposal is led by the Centre for Digital Library Research at
Strathclyde University. The other partners are UKOLN, The MDA,
The National Council on Archives, OCLC, NGfL (Scotland), SUfI,
and the Scottish Library and Information Council (SLIC). Letters
of support are provided; two will arrive late and will be forwarded
when received. A letter from the external evaluation consultant is also
provided.
9. Project Management Proposals, Mechanisms for Self-monitoring, Self-evaluation
Day-to-day management will be the responsibility of the project staff
and the Project Director. This Project Team will report to a Project
Management Group consisting of the team and representatives from each
of the participating institutions, perhaps enhanced by a few experts
in sectoral terminologies. In addition, there will be a Project
Steering Group representing stakeholder groups. To ensure the needs of
the whole community are met, the Project Steering Group would have at
least one, and sometimes two, representatives from each of the
communities involved (Museums, Archives, HE, FE, public libraries,
NGfL, SUfI and so on). Consideration will also be given to creating a
larger working group that would echo the high-level Steering Group in
terms of membership but would work more directly and more frequently
with the Project Steering Group. Regional requirements in Wales and
Ireland will also be taken into account, as will possible future
requirements to build in multi-lingual capacity. Self-monitoring and
self-evaluation mechanisms will be built into this structure and will
be agreed in detail with the Project Steering Group. However, there
will also be an external evaluation undertaken by a consultant with
appropriate experience in the field.
10. External Evaluation
An
external evaluation will be carried out towards the end of the
project. This has been itemised separately in the costs section.
Leonard Will is proposed as the consultant and it is estimated
that a total of 12 days of consultancy will be required.
11. Deliverables
The
project deliverables for HILT will be:
- Initial
survey and review of the UK and international literature on
the issue of high level subject terminologies and their inter-relationships,
and of expert UK and international opinion on problems, solutions
and strategies related to the topic. [Deliverable 1]
- A
comprehensive list of all major UK stakeholders [Deliverable
2]
- The
development and implementation of a dissemination and feedback
strategy for the project, the aim of which would be to inform
these stakeholders on project progress and obtain ongoing
input from them (web-site, e-list, possibly a newsletter if
any participants will have difficulties accessing the web-site
and the e-list) [Deliverable 3]
- A
report on each stakeholder’s perspective on the problem and
related issues - the subject schemes they use (including those
related to more specific subject areas), the elements of the
problem from their viewpoints, their needs as regards integration
of any UK solution with international schemes, their views
on which, if any, existing universal schemes they see as potentially
offering a solution or partial solution to either the question
of international integration, or the cross-searching or browsing
problem, or both. [Deliverable 4]
- Extended
version of deliverable 1 based on additional research informed
by the results of deliverable 4 and offering a more complete
view of available knowledge and opinion on the issue [Deliverable
5]
- Report
on the views of the various stakeholders on the merits and
demerits of the various high level subject schemes detailed
in deliverable 5. [Deliverable 6]
- Report
of a survey of literature, projects, organisations, and individuals
to determine current views on best practice in respect of
both user and machine oriented interfaces both to thesauri
and to subject terminologies generally [Deliverable 7]
- Draft
report on the findings of the project to date, organising,
evaluating and analysing the data, identifying a range of
possible approaches to the various issues raised, and making
draft recommendations on the best approach to the problem
(or whether it is resolvable at all). [Deliverable 8]
- Awareness
raising in the stakeholder communities through dissemination
of the draft report and a request for feedback and comment
[Deliverable 9]
- A
major workshop in the MODELS series, this to include breakout
sessions on key issues identified in the draft report [Deliverable
10]
- Wide
dissemination of workshop outcomes in the form of a new draft
of the report [Deliverable 11]
- Penultimate
draft of the report [Deliverable 12]
- Report
of the external evaluator [Deliverable 13]
- Final
report and recommendations, incorporating the external evaluation
report itself and, if appropriate, any changes it proposes
[Deliverable 14]
12. Start and Finish Dates
It
is proposed that the project be carried out over twelve months.
The start date will depend on when the funds are made available.
Possible dates are August 1st 2000 to July 31st 2001.
13. Milestones; Schedule
The HILT project will last 12 months.

| Activity | Month |
|---|---|
| Project set-up, staff in place, committees in place | 1 |
| Initial survey and review [Deliverable 1] | 1-3 |
| List of UK stakeholders [Deliverable 2] | 1-2 |
| Dissemination and feedback strategy (includes web-site) [Deliverable 3] | 1-4 |
| Stakeholders' perspectives report [Deliverable 4] | 2-6 |
| Extended survey and review [Deliverable 5] | 6-7 |
| Stakeholders' report on the merits and demerits of the various high level subject schemes [Deliverable 6] | 5-7 |
| Report on best practice: user and machine oriented interfaces [Deliverable 7] | 5-7 |
| Draft report, organising, evaluating, analysing data, and making draft recommendations [Deliverable 8] | 5-7 |
| Awareness raising through dissemination of draft report [Deliverable 9] | 8 |
| MODELS Workshop [Deliverable 10] | 9 |
| Wide dissemination of workshop outcomes in the form of a new draft of the report [Deliverable 11] | 9-11 |
| Penultimate draft of the report [Deliverable 12] | 9-10 |
| External evaluation and report [Deliverable 13] | 2, 10-11 |
| Final report and recommendations, incorporating the external evaluation report itself and, if appropriate, any changes it proposes [Deliverable 14] | 11-12 |
HILT Schedule
[Schedule chart: the activities listed under Milestones, from project set-up through Deliverable 14, plotted against twelve monthly columns (S, O, N, D, J, F, M, A, M, J, J, A) and a final 'Post Project' column. The cell shading that marked each activity's months is not reproduced here.]
14. Standards
HILT will aim to adhere to appropriate standards wherever possible.
The project is aware of the British Standard guide to establishment
and development of monolingual thesauri (BS 5723:1987, ISO 2788:1986)
and the British Standard guide to establishment and development
of multilingual thesauri (BS 6723:1985, ISO 5964:1985) and
will consult these and other appropriate works. In addition, it
will aim to build UK requirements around terminologies recognised
and used internationally (e.g. DDC, LCSH, UNESCO). In all other
matters, the eLib Standards Guidelines (http://www.ukoln.ac.uk/services/elib/papers/other/standards/version2)
will be consulted.
15. Person Weeks required for the Work
It is estimated that the work will occupy a Grade 1A researcher at
CDLR (or equivalent) and a 0.25 Grade 1A researcher at UKOLN, that
MDA and NCA will each require £5,000 to cover the cost of working
with the project, and that £5,000 will be required to conduct the
external evaluation. The CDLR and UKOLN posts will last for one year
each.
16. Total Estimated Cost and Contribution from RSLP
The total project cost is £71,955 over one year; the RSLP contribution
would be £67,955.
17. Dissemination Strategy
Dissemination
of information would be via the HILT web-site (and the CDLR, MDA,
NCA and UKOLN web-sites), postings to appropriate e-mail lists,
papers and news items submitted to professional publications and
presentations at seminars and conferences. Key progress reports
would be sent to all relevant organisations and institutions in
the United Kingdom.
The
dissemination strategy will be proactive:
- Bulletins
will be sent on project aims, progress, and outcomes to all
key e-mail discussion lists relevant to stakeholders and potential
stakeholders. Each will have brief details of the news item,
together with a specific URL (if appropriate) to further details
on the project web-site. Each e-mail will also have the general
URL of the project web-site
- Similar
news items and project reports will be disseminated by other
means - through news items and papers in journals read by
stakeholders and presentations given at seminars and conferences
relevant to stakeholders
- Stakeholders,
their discussion lists, journals, newsletters and meetings
will be identified early in the project so that the above
strategy can be implemented.
- Towards
the end of the project, brief 'glossy' leaflets describing
project outcomes will be published and sent to all relevant
UK institutions. A web-based copy will also be set up and
wider dissemination ensured through providing its URL in e-mail
bulletins to appropriate lists.
18. Proposed Exit Strategy
HILT
is essentially a study and an exit strategy is not appropriate
(agreed with R. Milne). However, the project will aim to address
how a proposed solution to the problem it addresses might be supported
in the long term and will aim as far as possible to find a solution
that fits in with existing maintained schemes in order to minimise
any maintenance and development costs. It cannot be determined
at this stage whether this will be achievable.
19. Statement of Policy on Access
HILT
participants agree to meet the requirements of the RSLP Steering
Group on access where this is relevant to them and to the project.
20. Statement on Minimum Standards for Bibliographic Records
HILT
participants agree to meet the requirements of the RSLP Steering
Group with regard to minimum standards for bibliographic records
where this is relevant to them and to the project.
21. Biographies of Key Individuals
Include:
- Dennis Nicholson (CDLR and CAIRNS/SCONE)
- Rachel Heery (UKOLN)
- Matthew Stiff (MDA, Cornucopia)
- Nick Kingsley (National Council on Archives)
- Leonard Will (external consultant - see associated letter)
- Paul Miller (Interoperability Focus) - Steering Group and advice
- Gordon Dunsire (CAIRNS and SLAINTE)
- Crawford Revie (Department of Information Science, University of
  Strathclyde). Over the past 2-3 years Crawford has been involved in
  a SHEFC funded project to create a 'MultiBrowser' to navigate data
  collected from the RAE exercises, and has set up a social
  science-based thesaural interface with the CSPP to survey
  statistical data. He is currently supervising a PhD student in the
  area of thesaurus-based query expansion within the agricultural
  domain, and has recently presented a workshop on "Thesauri on the
  Web" to the UN’s Food and Agriculture Organisation at their
  headquarters in Rome.
22. Key Institutional Contacts
Senior institutional officer: Derek Law, University of Strathclyde
Project Director: Dennis Nicholson, University of Strathclyde