CSE-291: Ontologies in Data Integration, Spring 2003
Logistics
Overview
Ontologies play an increasingly important role in data and information
integration, as evidenced by new workshops, special tracks at
conferences, new journals, and activities such as the "Semantic Web".
Within computer science, ontologies have traditionally been studied in
the context of AI and logic-based knowledge representation. The
database community has also (re-)discovered this topic and is now
increasingly active in this area. Application areas range from the
business world (Enterprise Information Integration) to information
integration for scientific data (e.g., Gene Ontology, Unified
Medical Language System, ...). The latter is the focus of this
course. In particular, we will study ontologies from different
perspectives, and will address questions such as:
- How are existing ontologies used in the various
application domains, e.g., bioinformatics?
- What formalisms (graphs, logics, ...) are used (or not used...) for
the representation of ontologies?
- Given some formalism, what problem does an ontology
solve? What does it do for you and/or the data integration
problem? And what can we, as computer scientists, do to an
ontology (i.e., when we study ontologies in their own right)?
The seminar will include introductory presentations by the instructor,
possibly guest lectures by other faculty, and presentations by
students based on the literature or practical exercises.
More specifically, we will
- read and discuss relevant articles from the literature
- take a look at specific ontologies and ontology-like structures
(e.g., GO, UMLS, ...)
- take a look at specific "ontology tools" such as Protege
2000, FACT, ...
Students will have the choice between "theory" and "practice" studies,
where the former is typically a presentation of one or more selected
articles, and the latter is a "modeling experiment" that involves
applying an ontology tool or formalism to a specific domain (e.g.,
ecology, geosciences, biology, ...) or problem.
Grading
Grading will be based primarily on the quality of the report and
presentation that each student prepares and gives during one of the
last meetings of the class. The presentation will be based on one or
more papers, and may include a system demonstration (for the
"hands-on" topics).
The difference in the number of units is reflected in the level of
effort: e.g., for one unit, a typical presentation and report will
cover 1-2 papers (for a theory topic), or a modeling exercise with a
single tool (for hands-on topics). For four units, a comparative study
with 3-4 papers is typical (or a very detailed analysis of 1-2
papers), or a comparative modeling exercise with several tools (for
hands-on topics).
Presentations will be ca. 45 minutes (independent of units taken).
Comparative studies can be conducted in teams of two students and
presented jointly (30-45 minutes per student).
Topics and References
Schedule
- April 4: Course overview, data/information integration,
mediator architecture, what is an ontology?, examples, ... [slides.ppt]
- April 11: Ontologies (cont'd), example
ontologies, introduction to topic maps, first-order logic
primer, introduction to description logic [slides.ppt]
- April 18: Guest lectures by ...
- April 25:
  - First-order logic revisited, formal
    ontology: how to capture some of the meaning (e.g., of
    "on-ness") through logic (ontological commitment, modal
    logics) [slides.ppt]
  - Guest lecture by Gully Burns on the NeuroScholar project [slides.ppt]
- May 2: Introduction to description logics, and reasoning in FO
(tableaux calculus examples) [slides.ppt]
- May 9: Tableaux calculus II, reasoning about concepts with
LeanTAP, definitorial cycles [slides.ppt]
- May 16 (9am-10am!):
  - Guest lecture by Douglas Greer: Ontology Driven
    Architectures in Bioinformatics [slides.ppt]
- May 23: Student presentations
  - Guilian Wang (Biological taxonomies)
  - Dayou Zhou (Model management, query rewriting)
  - Efrat Jaeger (FACT reasoner)
- May 30: Student presentations
  - Tulika Agrawal (Semantic Web languages: OWL, RDF)
  - Hrishikesh Gupta (Formal Concept Analysis)
  - Kai Lin (GeoreferenceOnline)
- June 6: Student presentations
  - Chien-Yi Hou (Ontologies in data integration)
  - Michael Hoang (Semantic Web tools)
  - Jonathan Ultis (Model management)
Bertram Ludaescher
Last modified: Tue Jun 17 13:24:21 PDT 2003