Machine Readable Dictionary

The emergence of digital technology has made it possible to store the linguistic information found in a print dictionary, such as pronunciation, definition, and part of speech, on a CD-ROM or in an application program, so that users no longer have to search for target words alphabetically. A user simply types a word into the query box, and the program automatically finds it and displays the result. This kind of dictionary is often referred to as an electronic dictionary. It may come as software for a PC, or as an application for portable devices or smartphones. It is important to note that both paper and electronic dictionaries are aimed primarily at language learning.

 

By contrast, a Machine Readable Dictionary (MRD) is aimed at natural language processing (NLP). Some MRDs can be used to support language learning, but most of them serve computational purposes such as part-of-speech tagging, text summarization, and opinion mining. Users of MRDs are usually specialists in informatics, computer science, or computational linguistics. In essence, an MRD is also composed of entries and their linguistic information. However, the lexical data are structured differently because they are meant to be used in NLP tasks. Therefore, unlike the two previous kinds of dictionary, the MRD format might not be friendly to the human eye, but it is friendly enough for a computer. Consider the following illustration (Paumier, 2000:41):

 

The above illustration is an extract from an MRD used in Unitex, a corpus-processing program. Each entry line in a Unitex MRD carries syntactic and semantic information. An MRD is also often referred to as a lexical resource; one of the most widely known in computational linguistics is the English WordNet. This resource is used to perform various NLP tasks. One of the most interesting features of WordNet is that it records relations between words, such as synonymy and antonymy, or hyponymy and hypernymy. Visual Thesaurus (http://www.visualthesaurus.com) is an application that uses the English WordNet.
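
As a rough sketch of how a program might consume Unitex-style entry lines, the Python example below parses a simplified layout of the form `form,lemma.CODE+semantic:inflection` and uses the result for a naive part-of-speech lookup. The layout is modelled on the DELAF style of Unitex dictionaries, but the sample lines, the `Entry` class, and the `parse_entry` and `tag` helpers are all hypothetical names introduced for illustration, and escape characters and other details of the real format are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    form: str    # inflected form as it appears in running text
    lemma: str   # canonical (dictionary) form
    pos: str     # grammatical code, e.g. "N" or "V"
    semantic: list = field(default_factory=list)    # optional semantic codes
    inflection: list = field(default_factory=list)  # optional inflectional codes


def parse_entry(line: str) -> Entry:
    """Parse one 'form,lemma.CODE+sem:infl' line (assumed layout, no escapes)."""
    form, rest = line.strip().split(",", 1)
    lemma, codes = rest.split(".", 1)
    grammar, *inflection = codes.split(":")
    pos, *semantic = grammar.split("+")
    # An empty lemma field is taken to mean "same as the inflected form".
    return Entry(form, lemma or form, pos, semantic, inflection)


# Hypothetical sample lines, invented for this sketch only.
SAMPLE = """\
apples,apple.N+Conc:p
apple,.N+Conc:s
ate,eat.V:I
"""

# Index entries by inflected form; one form may have several readings.
lexicon = {}
for raw in SAMPLE.splitlines():
    entry = parse_entry(raw)
    lexicon.setdefault(entry.form, []).append(entry)


def tag(tokens):
    """Return (token, code) pairs; unknown words get the placeholder 'UNK'."""
    return [(t, lexicon[t][0].pos if t in lexicon else "UNK") for t in tokens]


print(tag(["ate", "apples", "quickly"]))
```

The point of the sketch is that a line which looks cryptic to a human reader splits cleanly into fields a program can index, which is what makes dictionary-based tagging straightforward. A real tagger would also have to resolve forms with more than one reading rather than taking the first one.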

 

However, the usefulness of an MRD depends on its users, objectives, and programs. A particular MRD may serve one aim well, but for a different purpose another MRD can be more efficient. This is why MRDs are designed in different formats and contain different information.
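
To illustrate why the format follows the purpose, the Python sketch below stores words as a small graph of named relations, in the spirit of a WordNet-like resource, rather than as flat entry lines. The `RELATIONS` data, the relation names, and the `related` and `hypernym_chain` helpers are hypothetical and are not taken from WordNet itself.

```python
# Hypothetical mini-resource: each word maps relation names to related words.
RELATIONS = {
    "dog": {"hypernym": ["canine"], "synonym": ["hound"]},
    "canine": {"hypernym": ["mammal"]},
    "mammal": {"hypernym": ["animal"]},
    "hot": {"antonym": ["cold"], "synonym": ["warm"]},
}


def related(word, relation):
    """Return the words directly linked to `word` by `relation`."""
    return RELATIONS.get(word, {}).get(relation, [])


def hypernym_chain(word):
    """Follow the first hypernym link repeatedly, from specific to general."""
    chain = []
    seen = {word}  # guards against cycles in the data
    while True:
        parents = related(word, "hypernym")
        if not parents or parents[0] in seen:
            break
        word = parents[0]
        seen.add(word)
        chain.append(word)
    return chain


print(hypernym_chain("dog"))
print(related("hot", "antonym"))
```

A structure like this answers questions such as "what is a dog a kind of?" directly, whereas a flat list of forms and grammatical codes is better suited to tagging running text. Neither layout is better in general, which is the reason different MRDs coexist.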

  • References
       ● Mitkov, R. (ed.). 2004. The Oxford Handbook of Computational Linguistics. Oxford: Oxford University Press.
       ● Miller, G. A. 1995. WordNet: A Lexical Database for English. Communications of the ACM 38(11): 39-41.
       ● Paumier, S. 2003. Unitex Manual. Université Paris-Est-Marne-la-Vallée, France.