Information Extraction Based Multiple-Category Document Classification for the Global Legal Information Network

Richard D. Holowczak, Nabil R. Adam

This paper describes a prototype application of an information extraction (IE) based document classification system in the international law domain. IE is used to determine whether the set of concepts that defines a class is present in a document. The syntactic and semantic constraints that must be satisfied to make this determination are derived automatically from a training corpus. A collection of IE systems is arranged in a classification hierarchy, and each novel document is guided down the hierarchy based on the results from the previous level. Experimental results for a research prototype are given on a subset of the Global Legal Information Network domain.
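
The hierarchical routing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `ClassNode`, `keyword_detector`, and `classify`, the category labels, and the example hierarchy are all hypothetical, and a simple keyword check stands in for a trained IE system whose constraints would be learned from a corpus. The sketch assumes a document is passed to a child class only when the parent's detector fires, and that a document may end up in several categories.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in for a trained IE system: returns True when the
# concepts for a class are judged to be present in the document text.
Detector = Callable[[str], bool]


def keyword_detector(*keywords: str) -> Detector:
    """Toy detector (assumption): fires when every keyword occurs in the text."""
    def detect(text: str) -> bool:
        lowered = text.lower()
        return all(k in lowered for k in keywords)
    return detect


@dataclass
class ClassNode:
    """One class in the hierarchy, paired with its own detector."""
    name: str
    detector: Detector
    children: List["ClassNode"] = field(default_factory=list)


def classify(node: ClassNode, text: str) -> List[str]:
    """Return the paths of all classes whose detectors fire, top-down.

    A child is consulted only if its parent fired, so the result at one
    level decides whether the document moves to the next level.
    """
    if not node.detector(text):
        return []
    paths = [node.name]
    for child in node.children:
        paths.extend(f"{node.name}/{p}" for p in classify(child, text))
    return paths


# Illustrative hierarchy with made-up categories.
hierarchy = ClassNode(
    "law",
    keyword_detector("law"),
    [
        ClassNode("trade", keyword_detector("tariff")),
        ClassNode("environment", keyword_detector("pollution")),
    ],
)

if __name__ == "__main__":
    doc = "A law setting a tariff on imports and limiting pollution."
    for path in classify(hierarchy, doc):
        print(path)
```

Returning every matching path, rather than a single best leaf, is one simple way to model multiple-category assignment; a real system would replace the keyword check with the learned syntactic and semantic constraints.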

This page is copyrighted by AAAI. All rights reserved.