Lexical Acquisition and Linguistic Resources
One of the major bottlenecks in the development of NLP applications is the lack of linguistic resources, which are essential for building portable and robust systems with broad coverage. Moreover, language processing requires a substantial amount of basic research, much of which depends on the availability of large-scale linguistic resources.
Linguistic resources should have broad coverage, contain detailed and accurate information, and be kept up to date to account for language change. On the one hand, their manual construction is very expensive and requires highly trained staff. On the other hand, automatic techniques for lexical acquisition do not guarantee the precision and accuracy of the information acquired. The challenge in our approach to lexical acquisition is to combine automatic techniques with manual control in order to obtain the best trade-off between the reliability of the resources and the human effort needed.
The lexical acquisition methodologies we have developed are both dictionary-based and corpus-based. They are currently being applied to acquire Italian and English lexical information but, given appropriate sources, they can also be used for other languages.
The results obtained so far from the different automatic lexical acquisition tasks have led to the creation of a number of multilingual, large-coverage linguistic resources, which are widely used by the TEXTEC NLP systems at ITC-irst (e.g. Question Answering and Word Sense Disambiguation).
Maintainer: bentivo
itc.it
Last modified: Fri Oct 24 16:43:28 MET DST 2003