6V8 - Production of my Mind

Document Processing and Visualization Techniques

 
Here is a state-of-the-art report on text mining and visualization that I wrote with M. Rajman and M. Vesely at EPFL, in the context of NEMIS.
State of the art for NEMIS | 5 May 2004, by Mortimer

Several Networks of Excellence have been set up in the framework of the European FP5 research program. Among these Networks of Excellence, the NEMIS project focuses on the field of Text Mining.

Within this field, document processing and visualization was identified as one of the key topics, and the WG1 working group was created within the NEMIS project to carry out a detailed survey of the techniques associated with the text mining process and to identify the relevant research topics in related research areas.

In this document we present the results of this comprehensive survey. The report includes a description of the current state-of-the-art and practice, a roadmap for follow-up research in the identified areas, and recommendations for anticipated technological development in the domain of text mining.

In the part dedicated to document processing, the discussion focuses on research topics in natural language processing and information retrieval. More precisely, the work covers the tasks related to data selection, filtering and cleaning, morphological normalization and parsing, document representation and similarity computation, and various aspects of data analysis, all of which have been developed and successfully used in data mining.
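As a rough illustration of what "document representation and similarity computation" can look like in practice, here is a minimal sketch that is not taken from the report: it assumes a plain bag-of-words representation with raw term counts and uses cosine similarity to compare documents. The function names (`tokenize`, `term_frequencies`, `cosine_similarity`) and the sample sentences are made up for this example, and the regex tokenizer is only a crude stand-in for real morphological normalization.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep only runs of ASCII letters (a crude normalization)."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(text):
    """Represent a document as a sparse vector of raw term counts."""
    return Counter(tokenize(text))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    # Only terms present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Hypothetical sample documents, for illustration only.
    docs = [
        "Text mining extracts knowledge from documents.",
        "Document visualization helps explore text mining results.",
        "The weather in York is often rainy.",
    ]
    vectors = [term_frequencies(d) for d in docs]
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            score = cosine_similarity(vectors[i], vectors[j])
            print(f"doc {i} vs doc {j}: {score:.3f}")
```

Each distinct term becomes one dimension of the vector, so even a small collection quickly yields a representation with thousands of dimensions; this is the high-dimensionality issue that the visualization part of the survey is concerned with.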

In the part dedicated to visualization, the study focuses on the issue of high dimensionality in document representation. The high-dimensional representations produced at the various stages of the text mining process are usually not well suited to a simple and easily exploitable presentation of text mining results, which require specific interpretation techniques tightly connected to the task of document summarization. In addition, the study has identified a clear need for the development of a unified methodology in the field of visualization.

Publicly available here:

State of the Art, Evaluation and Recommendations regarding "Document Processing and Visualization Techniques".

Date of online publication: 5 May 2004
last-update: 29 December 2004

Some Rights Reserved: All rights reserved license, (c) 2007 Pierre Andrews
 
 

© Pierre Andrews, York, UK