Citation: McCulloch, E. & Macgregor, G. (2008). Analysis of equivalence mapping for terminology services. Journal of Information Science, 34(1), pp. 70-92.
Rationale: Terminology mapping is evident in a number of KOS interoperability approaches. It involves imposing equivalence, conceptual and hierarchical relationships between terms in different schemes. One problem inherent in the terminology mapping process is accurately characterizing the type of mapping match between terms. This paper examines various terminology mapping match types and assesses the suitability of Chaplan’s 19 match types [8] as forming the basis of a generic suite of equivalence matches to be used by services employing terminology mapping.
Research Questions: 1) To what extent can Chaplan’s match types form the basis of a generic suite of match types to be used by terminology services? 2) Can such a large number of match types be collapsed into a smaller number, possibly reflecting alternative approaches?
Methodology: To test the validity of Chaplan’s match types, four terminologies were selected for mapping to DDC: LCSH, MeSH, UNESCO Thesaurus and AAT. XML copies of the terminologies were imported into a database and 50 terms from each terminology were randomly selected. The extracted terms were then mapped to DDC notation by both authors. The authors then categorized the mappings in accordance with Chaplan’s 19 match types.
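The sampling step can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `sample_terms`, the fixed seed, and the placeholder term lists are assumptions made for illustration, whereas the study drew its terms from XML copies loaded into a database.

```python
import random


def sample_terms(terms, k=50, seed=None):
    """Draw k distinct terms uniformly at random, without replacement."""
    rng = random.Random(seed)  # a seed makes the sample reproducible
    return rng.sample(list(terms), k)


# Hypothetical stand-in vocabularies (placeholder strings, not real terms).
terminologies = {
    name: [f"{name}-term-{i}" for i in range(1000)]
    for name in ("LCSH", "MeSH", "UNESCO", "AAT")
}

# 50 terms from each of the four terminologies.
samples = {
    name: sample_terms(terms, k=50, seed=42)
    for name, terms in terminologies.items()
}
total = sum(len(sample) for sample in samples.values())
```

With four terminologies and 50 terms each, `total` is 200, which matches the 200 codes referred to in the findings.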
Findings: The mean level of agreement between authors across all schemes was 164 (82%). It was found that the level of agreement between authors was higher for discipline-specific schemes such as AAT and MeSH and lower for more generic schemes like LCSH and UNESCO. Nine of Chaplan’s match types were deemed valid for the purpose of expressing equivalence relationships from terms in AAT, LCSH, MeSH and UNESCO, to DDC. Exact matches, concept matches and narrower term matches were the three most frequently assigned match codes, and were the only three to prove valid across all four schemes investigated. Between them, they accounted for 178 of 200 (89%) codes assigned.
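The percentages above are consistent with simple percent agreement over the 200 sampled terms. The sketch below shows one way such a figure could be computed. The function name `percent_agreement` and the toy match-type labels are illustrative assumptions, and reading 164 as the number of agreed mappings out of 200 is an interpretation of the reported figure rather than something the summary states explicitly.

```python
def percent_agreement(codes_a, codes_b):
    """Return (number of agreements, percentage) for two raters' codes."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both raters must code the same number of items")
    if not codes_a:
        raise ValueError("at least one coded item is required")
    agreed = sum(a == b for a, b in zip(codes_a, codes_b))
    return agreed, 100.0 * agreed / len(codes_a)


# Toy data with made-up labels (not Chaplan's actual match codes).
rater_a = ["exact", "concept", "narrower", "exact", "broader"]
rater_b = ["exact", "concept", "exact", "exact", "broader"]
agreed, pct = percent_agreement(rater_a, rater_b)

# Arithmetic check of the reported proportions, assuming a base of 200.
overall_agreement_pct = 100.0 * 164 / 200
top_three_share_pct = 100.0 * 178 / 200
```

Percent agreement does not correct for agreement expected by chance; a chance-corrected statistic would be a different technique and is not shown here.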
Conclusion: It is considered likely that the nine match types verified from Chaplan’s set could be further reduced, provided they are sufficiently well defined, to form a set closer to that proposed by the set theory-based SKOS Core Mapping Vocabulary Specification (MVS) model.