BLENDER: Cross-lingual Cross-document
Cross-media Information Extraction and Fusion Lab

 
 



Overview

One of the initial goals of Information Extraction (IE) was to create a database of relations and events from the entire input corpus and to allow further logical reasoning over that database. The artificial constraint that extraction should be done independently for each document was introduced in part to simplify the task and its evaluation. What people expect to obtain from IE is not thousands of tables (summaries of facts); a concise multi-document summary created through information aggregation and reasoning would be much more useful. Fortunately, many events are reported multiple times, in different forms, both within the same document and across topically related and continuing documents (i.e. a collection of documents sharing participants in potential events). For military and economic purposes in particular, an automatic system that can predict events from existing articles would be far more helpful than a shallow summary of events that have already happened. The goal of the BLENDER lab is therefore to explore cross-lingual and cross-document information processing:

  • Information Aggregation: gather together IE results from a set of relevant documents to produce compact databases
  • Information Correction: correct wrong information and recover missed information
  • Information Prediction: predict events which are likely to happen in the future with probabilities
Ultimately, we would like to build an information prediction system that can perform essential logical reasoning over natural language and can be applied to more general domains and to a multilingual environment.
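
As a rough illustration of the Information Aggregation step, the sketch below merges event mentions extracted from several documents into one compact table. It is only a minimal sketch under stated assumptions: the `EventMention` schema, the `aggregate` function, and the noisy-OR rule for combining confidences are hypothetical choices made for this example, not a description of the lab's actual system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EventMention:
    """One event extracted from a single document (hypothetical schema)."""
    doc_id: str
    event_type: str
    participants: frozenset
    confidence: float  # extractor confidence in [0, 1]


def aggregate(mentions):
    """Merge mentions that share an event type and participant set.

    Returns a dict mapping (event_type, participants) to a record with the
    supporting documents and a combined confidence.
    """
    groups = defaultdict(list)
    for m in mentions:
        groups[(m.event_type, m.participants)].append(m)

    database = {}
    for key, group in groups.items():
        # Assumption: noisy-OR combination, which treats every mention as
        # independent evidence for the same underlying event.
        miss = 1.0
        for m in group:
            miss *= 1.0 - m.confidence
        database[key] = {
            "docs": sorted({m.doc_id for m in group}),
            "confidence": 1.0 - miss,
        }
    return database


mentions = [
    EventMention("d1", "Attack", frozenset({"A", "B"}), 0.5),
    EventMention("d2", "Attack", frozenset({"A", "B"}), 0.5),
    EventMention("d2", "Meet", frozenset({"A", "C"}), 0.8),
]
database = aggregate(mentions)
```

Here the two `Attack` mentions from different documents collapse into a single entry supported by both sources, while the `Meet` mention stays separate. Exact matching on participant sets is the simplest possible grouping rule; a real cross-document system would also need entity and event coreference resolution before mentions could be compared this way.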

© Copyright Heng Ji 2008, All Rights Reserved