Selected results of a comparative study of four ontology visualization methods for information retrieval tasks

Citation
Katifori, A., Torou, E., Vassilakis, C., Lepouras, G., Halatsis, C. Selected results of a comparative study of four ontology visualization methods for information retrieval tasks. Research Challenges in Information Science, 2008: 133–140.

Summary
This paper reports a user study evaluating the performance of four ontology visualizations on information retrieval tasks.

The authors chose four visualization methods: Protégé Class Browser, Jambalaya, TGViz, and OntoViz.

(1)    Class Browser is a simple visualization technique, representing is-a inheritance relationships through the indented-list paradigm, with subclasses appearing below their superclasses and indented to the right.
(2)    Jambalaya employs nested nodes to denote the is-a type relationships among classes. Node nesting is also used for instance-of relationships, thus a class node contains both its subclasses and its instances.
(3)    TGViz Tab (TouchGraph Visualization Tab) depicts the ontology using a spring-layout technique. In this technique, nodes (classes) repel one another, whereas the edges (links) attract them, so that nodes that are semantically similar are placed close to one another.
(4)    OntoViz renders the ontology as a two-dimensional graph using a vertical tree layout where parent/child relationships are derived from the is-a links within the ontology.
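
The spring-layout idea described for TGViz in item (3) can be illustrated with a small sketch. This is not TGViz's or TouchGraph's actual algorithm; it is a generic force-directed relaxation, and the function name `spring_layout`, the force formulas, the parameter values, and the toy class names are all assumptions made for illustration.

```python
import math
import random

def spring_layout(nodes, edges, iterations=300, k=1.0, start_step=0.1, seed=0):
    """Return {node: (x, y)} after a simple force-directed relaxation."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    for i in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        # Every pair of nodes repels; the push weakens with distance.
        for idx, a in enumerate(nodes):
            for b in nodes[idx + 1:]:
                dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = k * k / dist
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist
        # Every edge pulls its two endpoints together; the pull grows with distance.
        for a, b in edges:
            dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist * dist / k
            force[a][0] -= pull * dx / dist
            force[a][1] -= pull * dy / dist
            force[b][0] += pull * dx / dist
            force[b][1] += pull * dy / dist
        # Move each node along its net force, capped by a step that shrinks over time.
        step = start_step * (1.0 - i / iterations)
        for n in nodes:
            fx, fy = force[n]
            mag = math.hypot(fx, fy) or 1e-9
            move = min(mag, step)
            pos[n][0] += move * fx / mag
            pos[n][1] += move * fy / mag
    return {n: tuple(p) for n, p in pos.items()}

# Toy is-a hierarchy (hypothetical class names, not from the study).
NODES = ["Thing", "Person", "Student", "Place"]
EDGES = [("Thing", "Person"), ("Person", "Student"), ("Thing", "Place")]

layout = spring_layout(NODES, EDGES)
for name, (x, y) in layout.items():
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

In this sketch, directly linked classes such as Person and Student should settle closer together than classes that are several links apart, which is the effect the summary attributes to the spring layout.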

Users were assigned nine information retrieval tasks to carry out on these visualization systems. Afterwards, they filled in a two-part questionnaire. In the first part, users gave their opinion on various characteristics and on the perceived ease of use and usefulness of each visualization. In the second part, they ranked the four visualizations comparatively (1st to 4th).

The paper compares the visualization methods in terms of correct answer percentages, comparative measured times, and user comments. It concludes that Class Browser was the overall “winner” in all respects and that OntoViz had serious interaction problems. Future work includes more focused study of individual visualization features, as well as the creation of a visualization for entity evolution.

Posted in Applications, Displays, Evaluation, Ontology, Systems

ThManager: An open source tool for creating and visualizing SKOS

Lacasta, J., Nogueras-Iso, J., Lopez-Pellicer, F. J., Muro-Medrano, P. R., & Zarazaga-Soria, F. J. (2007). ThManager: An open source tool for creating and visualizing SKOS. Information Technology and Libraries, 26(3), 39-50.

In this paper, the authors present the design and development of ThManager, an open source thesaurus management system. They first review the state of the art in thesaurus tools, criticizing existing tools for supporting incompatible thesaurus interchange formats and for being parts of larger systems that are hard to reuse or to integrate with other information management tools. Building on recent efforts in thesaurus standardization, they propose SKOS as the most adequate representation model for storing thesauruses.

System Architecture

Three-layer structure: thesauruses are stored in a concept repository, and their metadata descriptions in a metadata repository.

Persistence model

Thesauruses are modeled with SKOS and then stored in a binary format, transformed using the Jena API.

Thesaurus Interrelation

An interrelation function relates a thesaurus to an upper-level lexical database using a heuristic voting algorithm.

Graphics depicting the GUI are presented, and the article closes with performance testing and discussion.

Posted in Systems, Vocabularies

Information is Beautiful

There are plenty of good examples and inspirations of how information can be displayed. Here are the three most popular collections of information visualization samples. Many are beautiful, but not many are meaningful to the viewer. Could KOS structures help to make the displays more meaningful? That’s what we want to explore in this project. What are some of the ideas, styles, formats, or mapping algorithms that we can learn from these examples in order to create meaningful concept displays? Your thoughts or comments are welcome.

1. Information is Beautiful: http://www.informationisbeautiful.net

2. Visual Complexity: http://www.visualcomplexity.com/

3. Information Aesthetics: http://infosthetics.com

Posted in Displays

Analysis of equivalence mapping for terminology services

Citation: McCulloch, E., & Macgregor, G. (2008). Analysis of equivalence mapping for terminology services. Journal of Information Science, 34(1), 70-92.

Rationale: Terminology mapping is evident in a number of KOS interoperability approaches. It involves imposing equivalence, conceptual and hierarchical relationships between terms in different schemes. One problem inherent in the terminology mapping process is accurately characterizing the type of mapping match between terms. This paper examines various terminology mapping match types and assesses the suitability of Chaplan’s 19 match types [8] as the basis of a generic suite of equivalence matches to be used by services employing terminology mapping.

Research Questions: 1) To what extent can Chaplan’s match types form the basis of a generic suite of match types to be used by terminology services? 2) Can such a large number of match types be collapsed into a smaller number, possibly reflecting alternative approaches?

Methodology: To test the validity of Chaplan’s match types, four terminologies were selected for mapping to DDC: LCSH, MeSH, UNESCO Thesaurus and AAT. XML copies of the terminologies were imported into a database and 50 terms from each terminology were randomly selected. The extracted terms were then mapped to DDC notation by both authors. The authors then categorized the mappings in accordance with Chaplan’s 19 match types.

Findings: The mean level of agreement between authors across all schemes was 164 (82%). It was found that the level of agreement between authors was higher for discipline-specific schemes such as AAT and MeSH and lower for more generic schemes like LCSH and UNESCO. Nine of Chaplan’s match types were deemed valid for the purpose of expressing equivalence relationships from terms in AAT, LCSH, MeSH and UNESCO, to DDC. Exact matches, concept matches and narrower term matches were the three most frequently assigned match codes, and were the only three to prove valid across all four schemes investigated. Between them, they accounted for 178 of 200 (89%) codes assigned.
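
The percentages in the findings can be checked with a few lines of arithmetic. The sketch below assumes that the 164 agreements are counted out of the same 200 sampled terms (4 terminologies × 50 terms) that the 178 codes are counted against; the summary states the denominator only for the second figure. The helper name `percent` is an illustrative assumption.

```python
TERMS_PER_SCHEME = 50
SCHEMES = ["AAT", "LCSH", "MeSH", "UNESCO"]

# 4 terminologies x 50 randomly selected terms each.
total_terms = TERMS_PER_SCHEME * len(SCHEMES)

def percent(count, total):
    """Share of `count` in `total`, rounded to a whole percentage."""
    return round(100 * count / total)

agreement = percent(164, total_terms)      # inter-author agreement (assumed out of 200)
top_three_codes = percent(178, total_terms)  # exact + concept + narrower term matches

print(total_terms, agreement, top_three_codes)
```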

Conclusion: It is considered likely that the nine match types verified from Chaplan’s set could be further reduced, provided they are sufficiently well defined, to form a set closer to that proposed by the set theory-based SKOS Core Mapping Vocabulary Specification (MVS) model.

Posted in Vocabularies

Visual Interfaces for Semantic Information Retrieval and Browsing

Börner, K. (2002). Visual Interfaces for Semantic Information Retrieval and Browsing. In Geroimenko, V., & Chen, C. (Eds.), Visualizing the Semantic Web: XML-based Internet and Information Visualization (Chapter 7, pp. 99-115). Springer Verlag.

Summary

The paper describes several approaches to semantic information retrieval and browsing, including linguistic approaches, ontology approaches, and latent semantic analysis. It also reviews research on visual interfaces for digital libraries and information workspaces, focusing mainly on literature related to two-dimensional maps and three-dimensional information spaces. Finally, it explains the inner workings of the LVis Digital Library Visualizer.

Key points

1. Visual interfaces do not provide the user-centred cues that help in remembering locations.

2. How the LVis Digital Library Visualizer works

(1) The LVis Digital Library Visualizer uses latent semantic analysis to extract the semantic similarity of documents, applies clustering techniques and a so-called utility measure to select the partition that best reveals the semantic structure of a document set, and visualizes the result on a two-dimensional desktop screen or in a three-dimensional virtual reality environment.

(2) The LVis Digital Library Visualizer implements two prototype interfaces to ease browsing through retrieval results of citation data and image data.

Posted in Displays

Qualitative Evaluation of Thesaurus-Based Retrieval

Citation:

Blocks, D., Binding, C., Cunliffe, D., & Tudhope, D. (2002). Qualitative Evaluation of Thesaurus-Based Retrieval. In Research and Advanced Technology for Digital Libraries, Lecture Notes in Computer Science (Vol. 2458, pp. 75-107). Springer Berlin / Heidelberg.

Goals of Research:

  • To analyze at a micro level the user’s interaction with interface elements and reasoning in order to illuminate tacit sources of problems and inform iterative interface design decisions.

Research Questions:

  • What are the problems at different search stages if we want to integrate the thesaurus into the search process?
  • What are the pros and cons of the FACET information retrieval interface?

Methodology:

  • Qualitative research method
  • Pilot study with eight participants
  • Data gathered included transcripts of think-aloud sessions, screen capture movie files, user action logs and participant observer notes
  • User reasoning and interaction sequences were studied to gain better insight into the benefits and problems of the FACET interface
  • Data about incidents during the search process were analyzed to identify the possible reasons behind problems

Summary:

FACET is an experimental system that investigates the possibilities of term expansion in faceted thesauri based on measures of semantic closeness. The Getty Art and Architecture Thesaurus was used for this purpose. FACET calculates a match value between a query term and an index term depending on traversal cost. This in turn feeds into a matching function that produces ranked results, including partial matches, from a multi-term query (p. 348). This study is a formative evaluation of a prototype thesaurus-based retrieval system. Qualitative methods were used to study user search behavior. Two prototypes were used in one-hour sessions, in which participants completed four search tasks.
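
To make the idea of a match value that depends on traversal cost more concrete, here is a minimal sketch. It is not FACET's actual matching function: the relationship costs, the linear fall-off, the threshold, the function names, and the toy thesaurus are all hypothetical, chosen only to show how a cheaper path between two terms can translate into a higher match value.

```python
import heapq

# Hypothetical cost of traversing each relationship type (not FACET's real values).
COST = {"BT": 1.0, "NT": 1.0, "RT": 2.0}

# Toy thesaurus (illustrative terms): term -> list of (relationship, neighbouring term).
THESAURUS = {
    "furniture": [("NT", "seating")],
    "seating": [("BT", "furniture"), ("NT", "chairs"), ("NT", "sofas")],
    "chairs": [("BT", "seating"), ("NT", "armchairs"), ("RT", "sofas")],
    "sofas": [("BT", "seating"), ("RT", "chairs")],
    "armchairs": [("BT", "chairs")],
}

def traversal_cost(source, target):
    """Cheapest total cost of moving from source to target (Dijkstra's algorithm)."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, term = heapq.heappop(queue)
        if term == target:
            return cost
        if cost > best.get(term, float("inf")):
            continue  # stale queue entry
        for rel, neighbour in THESAURUS.get(term, []):
            new_cost = cost + COST[rel]
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return float("inf")  # target not reachable

def match_value(query_term, index_term, threshold=4.0):
    """Assumed linear fall-off: 1.0 for identical terms, 0.0 at or beyond the threshold."""
    return max(0.0, 1.0 - traversal_cost(query_term, index_term) / threshold)

for index_term in ["chairs", "armchairs", "furniture", "tables"]:
    print(index_term, match_value("chairs", index_term))
```

Under these assumptions an index term one step away from the query term still counts as a partial match, which is the kind of ranked, partially matching result the summary describes.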

Findings:

  • Four major types of incident were observed: window switching, incidents in browsing behavior, breaking a task down into concepts, and query reformulation
  • The analysis raised important issues, such as the allocation of search functionality to sub-windows, the appropriate role of thesaurus browsing in the search process, and the formation of faceted queries and query reformulation. Some lower-level interface issues were also noticed

Future Work:

  • “An integrated Query Builder tool, which combines searching/browsing with query formulation and maintains the top level facets, [to] better reflect the search process and the thesaurus as a source of terms for the query.” (p. 360)

Important References:

  • Tudhope D., C. Binding, D. Blocks and D. Cunliffe. Compound descriptors in context: a matching function for classifications and thesauri, Proceedings Joint Conference on Digital Libraries (JCDL’02), Portland, ACM Press, 2002.
Posted in Applications, Evaluation

Supporting user tasks through visualisation of light-weight ontologies

Fluit, C., Sabou, M., & Van Harmelen, F. (2003). Supporting user tasks through visualisation of light-weight ontologies. In Handbook on Ontologies in Information Systems (pp. 415–434). Springer-Verlag.

This paper explores how to use visualization in the ontology-based Semantic Web. In the authors' view, two features matter when visualizing ontology-based Semantic Web data: the ontology should be lightweight, that is, a taxonomy with few cross-taxonomical links and logical relations between the classes; and the number of instances should be far larger than the number of classes.

They first review existing visualization work against two criteria: how it supports the ontological life-cycle stages and which ontological aspects of the visualized data are exploited. They observe that while many existing tools provide “schema level” visualizations, few offer instance-level visualizations; there is a “clear lack of visualization techniques that (a) display a simple schema with instances and (b) scale to a large number of instances”.

Cluster Map, a map with large spheres representing classes, connected by directed links indicating hierarchies and balloon-shaped edges overlapping instances, is then proposed as their technique for visualizing lightweight ontologies. Two real-world application scenarios of Cluster Map follow: one in a construction information portal, the other in peer-to-peer networks.

The authors further analyze how Cluster Map can be used for a variety of tasks, including analysis, monitoring, and querying. For analysis, they consider scenarios resulting from different combinations of dataset, ontology, and classification. For querying, they discuss applications of Cluster Map in the four stages of a search task, namely query formulation, initiation of action, review of results, and refinement. Most of the task scenarios come with an illustrative Cluster Map graph.

The article is well written and does not go into much technical detail. The analysis should give useful inspiration to those who design visualizations for ontologies.

Posted in Displays, Systems

Ontology visualization methods – a survey

Citation
Akrivi Katifori, Constantin Halatsis, George Lepouras, Costas Vassilakis, Eugenia Giannopoulou. Ontology visualization methods – a survey. ACM Computing Surveys (CSUR), 2007(39):10:1-10:39

Summary
This paper gives an overview of ontology visualization techniques. It attempts to summarize the existing literature on ontology visualization, to catalogue the characteristics of existing methods comprehensively, and to record their strengths and weaknesses in relation to user tasks.

1.       Elements visualized include Classes, Instances, Taxonomy, Partial Views, Multiple Inheritance, Role Relations and Properties

2.       Visualization Types:
(1)    Indented List
The taxonomy of the ontology is represented as a tree, as in Figure 1.

Figure 1. The Protégé class browser
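
As a rough illustration of the indented-list paradigm, the sketch below prints a small taxonomy with each subclass on its own line, indented one level further than its superclass. The function name `indented_list` and the toy class names are assumptions for illustration; this is not how the Protégé class browser is implemented.

```python
def indented_list(taxonomy, root, depth=0, indent="  "):
    """Return the lines of an indented-list view, subclasses below their superclass."""
    lines = [indent * depth + root]
    for child in taxonomy.get(root, []):
        lines.extend(indented_list(taxonomy, child, depth + 1, indent))
    return lines

# Toy is-a taxonomy (hypothetical): class -> list of direct subclasses.
TAXONOMY = {
    "Thing": ["Person", "Place"],
    "Person": ["Student", "Teacher"],
}

view = indented_list(TAXONOMY, "Thing")
print("\n".join(view))
```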

(2)    Node-link and tree
Represents ontologies as a set of interconnected nodes, presenting the taxonomy with a top-down or left-to-right layout, as in Figure 2 and Figure 3.

Figure 2 Protégé OntoViz visualization

Figure 3 OntoSphere visualization

(a) Root Focus view: visualizes role relations among some upper-level classes; (b) TreeFocus view: visualizes tree structures of subtrees of a selected class

(3)    Zoomable Visualization
This category contains all the methods that present the nodes in the lower levels of the hierarchy nested inside their parents and smaller than their parents. These techniques allow the user to zoom in on the child nodes in order to enlarge them, making them the current viewing level (Figure 4).

Figure 4 Grokker visualization of the results of a Web search on “ontology visualization.”

(4)    Space-filling
Space filling techniques are based on the concept of using the whole of the screen space by subdividing the space available for a node among its children. The size of each subdivision corresponds to a property of the node assigned to it—its size, number of contained nodes, and so on (Figure 5)

Figure 5 Treemap with path to instance “Toronto Raptors” highlighted
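
The subdivision idea behind space-filling techniques can be sketched with a simple slice-and-dice treemap. This is only one of several treemap layouts and is not taken from the survey; the sizing property (number of leaves), the function names, and the toy hierarchy are assumptions for illustration.

```python
def leaf_count(tree):
    """Number of leaves under a node; a node is (name, [children])."""
    _, children = tree
    return 1 if not children else sum(leaf_count(c) for c in children)

def slice_and_dice(tree, x, y, w, h, depth=0, out=None):
    """Give every node a rectangle (x, y, width, height) inside its parent's rectangle."""
    if out is None:
        out = {}
    name, children = tree
    out[name] = (x, y, w, h)
    total = leaf_count(tree)
    offset = 0.0  # fraction of the parent's extent already handed out
    for child in children:
        share = leaf_count(child) / total
        if depth % 2 == 0:
            # Even depth: split the parent's width among the children.
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            # Odd depth: split the parent's height among the children.
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out

# Toy hierarchy (hypothetical names): three leaves under one class, one under another.
TREE = ("Sport", [
    ("Basketball", [("Team A", []), ("Team B", []), ("Team C", [])]),
    ("Hockey", [("Team D", [])]),
])

rects = slice_and_dice(TREE, 0.0, 0.0, 100.0, 100.0)
for name, rect in rects.items():
    print(name, rect)
```

Here the area of each rectangle is proportional to the number of leaves it contains; another node property could be used as the size instead.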

(5)    Focus + context or distortion
This group of techniques is based on the notion of distorting the view of the presented graph in order to combine context and focus. The node on focus is usually the central one and the rest of the nodes are presented around it, reduced in size until they reach a point that they are no longer visible. Usually a hyperbolic equation is used to this end. The user has to focus on a specific node, in order to enlarge it (Figure 6 and Figure 7).

Figure 6 Protégé TGVizTab

Figure 7 Hyperbolic Tree

(6)    3D Information landscapes
A very common metaphor used in VR environments for document management is the landscape metaphor, where documents are placed on a plane as color- and size-coded 3D objects (Figure 8).

Figure 8 Harmony Information Landscape

3.       Advantages and disadvantages for each type

4.       Task Support
(1)    Overview. Gain an overview of the entire collection.
(2)    Zoom. Zoom in on items of interest. When zooming, it is important that global context can be retained.
(3)    Filter. Filter out uninteresting items.
(4)    Details-on-demand. Select an item or group and get details when needed.
(5)    Relate. View relationships among items.
(6)    History. Keep a history of actions to support undo, replay, and progressive refinement.
(7)    Extract. Allow extraction of subcollections and query parameters.
Not all tasks may be supported by visualization.

5.       Conclusion-Future Work
Some ontology management tools already provide combinations of visualization methods. Protégé [Protégé Project, http://protege.stanford.edu], for example, includes several visualization plugins that are coupled with the Protégé indented list Class Browser.

Visualizations should be coupled with effective search tools or querying mechanisms. Browsing is not enough for tasks related to locating a specific class or instance, especially for big ontologies.

Posted in Displays, Ontology