Automatic extraction of ontology from technical journals

Document Type

Conference Proceeding

Publication Date

1-1-2013

Abstract

Much of a business or government organization's "knowledge" resides in unstructured technical documents such as project summary reports, case studies, research and development reports, technical conference papers, and similar sources. Many companies wish to extract and organize the knowledge in these documents so that it can be incorporated into organizational decision making faster and more broadly, for example through automated Decision Support Systems (DSS) and knowledge-based expert systems. Unfortunately, the vast quantity of documents and the limited availability of personnel with expertise in knowledge extraction and knowledge structuring have limited the usefulness of the knowledge held in unstructured documents. In this research, we focus on extracting knowledge into a structured ontology from technical project reports and applied research reports in industry. In this paper, we specifically address the problem of identifying the problem domain of a technical paper and the key attributes (characteristics) that scope where the methodology it discusses can or should be used. A combination of rule-based and machine learning classifiers is used. The test bed is an annotated corpus developed from technical text in the offshore oil and gas industries, drawn from the Offshore Technology Conference (OTC), held annually in Houston, USA.

Publication Source (Journal or Book title)

IIE Annual Conference and Expo 2013

First Page

224

Last Page

230
