An experiment in translation selection and word sense discrimination


A.Michiels, English Department, University of Liège,
3, Pl.Cockerill, B-4000 Liège Belgium
mailto:amichiels@ulg.ac.be
http://engdep1.philo.ulg.ac.be/michiels


Warning: This paper describes an older version of defi.

ABSTRACT

This paper calls on lexical data bases derived from machine-readable dictionaries (both monolingual and bilingual English-to-French, as well as a subset of the WordNet package) to address the problem of automatically selecting the correct translation of a source item in context. The selection procedure is embodied in a Prolog program which assigns a weight to each of the proposed translations. In some cases the translation is selected on the basis of a match in field and/or semantic codes between two items: the item found in the text as argument of the predicate to be translated, and the metalinguistic slot filler recorded for that argument position in the Prolog clause that the bilingual dictionary associates with the predicate. In such cases a by-product of the selection procedure is the disambiguation of the source item in argument position.

A small corpus is used as test bed. Raw results are given and examined in detail, followed by a general assessment.


INTRODUCTION

The experiment reported on in this paper explores how far we can go towards solving the twin problems of translation selection and word sense assignment by using fully automatic procedures (embodied in a Prolog program) making use of the formalised information contained in the metalinguistic apparatus of three computerized dictionaries, namely LDOCE (The Longman Dictionary of Contemporary English, 1978 edition), RC (the English-French section of Le Robert and Collins, 1978 edition) and OH (the English-French section of the Oxford-Hachette, 1994), as well as a subset of the WordNet package. The Prolog program that carries out the task of partial word sense assignment and target selection calls on nearly half a million dictionary clauses, about half of which come from WordNet, the others being culled from the three machine-readable dictionaries just mentioned. The research reported on here is ongoing within the nationally-funded DEFI (Désambiguïsation par filtre) project, whose aim is, on the basis of online bilingual dictionaries, to help the reader understand a text in a foreign language by ranking the information given by the online dictionary, so that the most relevant information is presented to him first.

For the purposes of this experiment, translation selection means selecting the appropriate French target translation (as recorded in RC) for verbal predicates decorated with their deep arguments yielded by a syntactic parser such as Horatio (cf. Michiels 1995).

Word sense assignment concerns the source item, in this experiment the English heads of the deep grammatical arguments associated with the predicate. It means pointing to the appropriate reading in the monolingual dictionary, namely LDOCE.

To illustrate: we shall be looking at parsed predicate-argument structures such as predicate(give_up, arg(obj(job))), and we shall endeavour to select the appropriate French translation of give_up on the basis of its context (namely that it takes job as object) and the information concerning give_up in RC, job in LDOCE and the fillers of the metalinguistic slot opened by give_up for its object, in LDOCE, RC and OH. A by-product will be a partial disambiguation of the item job on the basis of its occurrence as the object of the predicate give_up.

We begin by looking at some essential characteristics of bilingual and monolingual dictionaries, in order to set the scene and avoid likely misunderstandings as to the scope of our experiment.

We move on to a very brief description of the metalinguistic apparatus of our three dictionaries, concentrating on the bits we make use of in this experiment.

We will not say anything about the transformation of our dictionaries into lexical data bases (LDOCE and RC), nor about the retrieval programs, written partly in Clipper and partly in Awk, that cull the relevant information from the dictionaries and massage it into Prolog clauses, so that it can be used by the Prolog program responsible for translation selection and reading assignment, defi. We give the main lines of the algorithm, which the Prolog code embodies in a very straightforward fashion, so that understanding the program (downloadable from our WWW site) should not present any problem to a linguist accustomed to using Prolog as a prototyping tool.

We do not describe the parser (Horatio, also downloadable from our WWW site) whose results we take as given in this experiment, because it is fully described elsewhere. It should be noted that the target selection tool is not tied to the parser: any parser able to recover deep syntactic relations will do, and a surface-oriented parser would already go some way towards providing the syntactic information needed by defi.

The experiment uses a very small corpus (an extract from John Le Carré's The Little Drummer Girl) which is given in the Appendix. We also give the translations to be found in the published French translation, so that they can be compared with those that the program written on the basis of the selector presented here puts forward.

Finally we go through the results (the proposed translations and reading assignments) and assess them on an individual basis first, and then globally in the conclusions. In so doing we take care not to overgeneralize from this small-scale experiment. Obtaining more data is not a problem, but assessing such data is likely to require considerable time and effort as it can only be carried out by a trained linguist/translator.


ESSENTIAL CHARACTERISTICS OF A BILINGUAL DICTIONARY

A number of fictions / oversimplifications / simplifications (depending on how charitable we wish to be) are shared by bilingual and monolingual dictionaries. The most important of these is the following:

For bilingual dictionaries this hypothesis entails that a source lexical item is associated with a list of target lexical items regarded as potential translations. In the simplest case the list is reduced to a single item. This view is word-centred and ignores the rephrasing process (which can span a phrase, a clause, a sentence, even a whole paragraph) that is often necessary to achieve high-quality translation.

It should be stressed that a bilingual dictionary offers a target-oriented division of the semantic space covered by the source lexical item. Monosemy and polysemy are projected from the target to the source language, a process which can only have practical justification. The noun DENT (in its non-figurative uses) is regarded as monosemic in English monolingual lexicography (witness LDOCE and COBUILD), but is polysemic for RC, because the hyperonym of ENTAILLE and BOSSE, whatever it is, is not the translation one would like to give for the item DENT.

This target orientation makes it impossible to automatically associate the source side of a bilingual dictionary with a monolingual dictionary. In our experiment, word sense assignment does not mean assigning a target word sense on the basis of the selected translation of the source, but disambiguating elements of the source language that are made use of in selecting the appropriate target. To go back to our example: in looking for the appropriate translation of give up in predicate(give_up,arg(obj(job))), we only reduce the possible readings of job with respect to the monolingual description of that word in LDOCE.

Similarly, we should keep in mind that bilingual dictionaries give translations in the form of target lexical items, not target lexical items under the appropriate reading(s). Selecting a translation is therefore far from the whole story. The target must be disambiguated if we wish to use information associated with it (field, grammatical and thesauric potential, etc.).


ESSENTIAL CHARACTERISTICS OF A MONOLINGUAL DICTIONARY

The basic property to remember here is that the granularity of the dictionary is essentially dependent on its size. Since the division of the word into listable meanings is a fiction, it should not come as a surprise that the fiction does not always yield the same list.

Similarly, the principles governing the relationships that are exhibited among the various readings are partly a matter of following established practices (e.g. organization according to historical evidence or frequency based on attested utterances, or according to the tradition enforced by previous lexicographical work), but they do leave some room for the lexicographer's idiosyncratic 'intuition'. Kilgarriff has interesting comments on these points (see Kilgarriff 1992).


THE BILINGUAL DICTIONARIES (RC AND OH)

THE METALINGUISTIC APPARATUS OF RC

In the printed dictionary the metalinguistic apparatus (dubbed 'general indicating material' in the prefatory matter) appears in italic. It can be subdivided into the following types:

The metalinguistic apparatus of OH is not basically different. It also includes typical subjects and objects, partial definitions and field codes. We have collected the metalinguistic information from both dictionaries into a set of Prolog clauses so as to be able to measure the distance between two items in terms of the number of metalinguistic slots that they share (see below).


THE MONOLINGUAL DICTIONARY (LDOCE)

THE METALINGUISTIC APPARATUS OF LDOCE

The metalinguistic apparatus of LDOCE is described elsewhere (see inter alia Michiels 1982 and Boguraev and Briscoe 1989). Here we concentrate on the items we use.

Field codes

They are made up of a selection from the Merriam codes and of codes for subdividing the fields that the Merriam codes refer to. The LDOCE 4-byte field can be filled in any of the following ways:

Semantic codes

The LDOCE semantic space is a ten-byte field, which is associated with each reading of a given lexical item. Nouns are coded according to their inherent semantic properties (e.g. CROWD is marked +Collective), whereas adjectives and verbs, on top of inherent features, are also assigned semantic codes to take care of the selectional restrictions that they impose on their arguments (both adjectives and verbs) or on the nouns that they modify (adjectives).

Since we use LDOCE only for nouns (the heads of the arguments associated with the predicates for which we are trying to find appropriate translations), we restrict our attention to byte 5, which is the one used to house inherent semantic properties.

The LDOCE semantic codes are organized into a hierarchy. We implement feature inheritance by moving up the hierarchy in Defi, the Prolog program responsible for translation selection and word sense assignment.
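By way of illustration, moving up such a hierarchy can be sketched in a few lines of Prolog. This is a minimal sketch and not the Defi code. The isa/2 links, including the code male, are invented for the example and do not reproduce the actual LDOCE hierarchy. The labels sem(5) and sem1up(2) are taken from the raw results given later in the paper; reading them as "exact match" and "match one level up" is an assumption.

```prolog
% Hypothetical fragment of a semantic-code hierarchy: isa(Child, Parent).
% These links are invented for illustration; they are NOT the LDOCE hierarchy.
isa(male, human).
isa(human, animate).
isa(animal, animate).
isa(animate, concr).
isa(movble, solid).
isa(notmov, solid).
isa(solid, concr).

% subsumes(?General, +Specific, -Steps): General is reached from Specific
% by moving Steps levels up the hierarchy (0 means the codes are identical).
subsumes(Code, Code, 0).
subsumes(General, Specific, Steps) :-
    isa(Specific, Parent),
    subsumes(General, Parent, Steps0),
    Steps is Steps0 + 1.

% sem_match(+NounCode, +SlotCode, -Label): the noun's code matches the code
% of the slot filler either exactly or after moving one level up.
% The interpretation of the two labels is an assumption (see lead-in).
sem_match(Code, Code, sem(5)).
sem_match(NounCode, SlotCode, sem1up(2)) :-
    isa(NounCode, SlotCode).
```

With these toy facts, the query subsumes(concr, human, N) binds N to 2, and sem_match(male, human, L) binds L to sem1up(2).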


PROLOG CLAUSES

NOUN CLAUSES

The pattern of a Prolog noun clause is the following:

noun([headword],p(FieldCode,SemanticCode),Entrykey).

The field code, semantic code and entry key all come from the relational data bases built to house the information contained in LDOCE and make it more easily accessible for NLP purposes.

Here are a few sample noun clauses (there are 37747 noun clauses, covering all nouns in LDOCE):

noun([$trace element$],p(size,solid),'TAXR1').
noun([$tracer$],p(mizb,movble),'TAXS1').
noun([$tracery$],p(ac,abstr),'TAXT1').
noun([$trachea$],p(mdza,0),'TAXU1').
noun([$trachoma$],p(mdzd,abstr),'TAXV1').
noun([$tracing$],p(0,movble),'TAXW1').
noun([$tracing paper$],p(pp,solid),'TAXX1').
noun([$track$],p(0,0),'TAXY10').
noun([$track$],p(0,abstr),'TAXY1').
noun([$track$],p(0,notmov),'TAXY2').
noun([$track$],p(bdmi,movble),'TAXY6').
noun([$track$],p(re,abstr),'TAXY8').
noun([$track$],p(rr,notmov),'TAXY4').
noun([$track$],p(sp,notmov),'TAXY7').
noun([$tracker$],p(0,human),'TAXZ1').
noun([$tracker$],p(mpra,0),'TAXZ3').
noun([$tracker$],p(re,movble),'TAXZ2').
noun([$tracking station$],p(siao,notmov),'TAYA1').
noun([$tracksuit$],p(clsp,movble),'TAYC1').

The last clause in this sample set indicates that the noun tracksuit is coded as belonging to the fields of clothing (CL) and sports (SP). It is assigned the semantic feature movable and the entry key TAYC1. The meaning assigned to this reading in LDOCE is not mentioned in the clause, but can easily be recovered from the data base housing the LDOCE definitions (so that we can check whether our attempts at word sense selection are correct).
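To show how such clauses can be consulted, here is a small sketch that retrieves the readings of a noun. The facts are copied from the sample above, except that the headword is written as a plain atom rather than a $...$ string so that the sketch loads in a standard Prolog such as SWI-Prolog. The predicate names readings/2 and keys_with_sem/3 are hypothetical and are not part of defi.

```prolog
% noun([Headword], p(FieldCode, SemanticCode), EntryKey)
% Facts copied from the sample above, with atoms in place of $...$ strings.
noun([track], p(0, 0), 'TAXY10').
noun([track], p(0, abstr), 'TAXY1').
noun([track], p(0, notmov), 'TAXY2').
noun([track], p(bdmi, movble), 'TAXY6').
noun([track], p(re, abstr), 'TAXY8').
noun([track], p(rr, notmov), 'TAXY4').
noun([track], p(sp, notmov), 'TAXY7').
noun([tracksuit], p(clsp, movble), 'TAYC1').

% readings(+Headword, -Readings): all Key-Codes pairs recorded for Headword,
% in the order in which the clauses are stored.
readings(Headword, Readings) :-
    findall(Key-Codes, noun([Headword], Codes, Key), Readings).

% keys_with_sem(+Headword, +Sem, -Keys): entry keys of the readings of
% Headword that carry the semantic code Sem.
keys_with_sem(Headword, Sem, Keys) :-
    findall(Key, noun([Headword], p(_, Sem), Key), Keys).
```

For instance, keys_with_sem(track, notmov, Keys) collects the three readings of track coded as not movable (TAXY2, TAXY4 and TAXY7).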


VERB CLAUSES (TRANS CLAUSES)

trans('SourcePredicate','Translation','Pos', Marker, 'RestrictionString', ListOfCodes).

Marker is either pdef (partial definition), obj (deep object), subj (deep subject) or field (RC field specification). ListOfCodes is either the string nocodes or a list of p structures of the following type: p(LdoceFieldCode, LdoceSemanticCode).

Here are the trans clauses for the phrasal verb give up (there are 53316 trans clauses):

trans('give up','vouer','vt sep',pdef,'devote',nocodes).
trans('give up','consacrer','vt sep',pdef,'devote',nocodes).
trans('give up','abandonner','vt sep',pdef,'renounce',nocodes).
trans('give up','abandonner','vt sep',pdef,'part with',nocodes).
trans('give up','abandonner','vt sep',obj,'friend',[p(0,0),p(0,human)]).
trans('give up','abandonner','vt sep', obj, 'interest', [p(0,0), p(0,abstr), p('bzzc',abstr), p('eczs',abstr)]).
trans('give up','délaisser','vt sep',pdef,'renounce',nocodes).
trans('give up','délaisser','vt sep',pdef,'part with',nocodes).
trans('give up','délaisser','vt sep',obj,'friend',[p(0,0),p(0,human)]).
trans('give up','délaisser','vt sep', obj, 'interest', [p(0,0), p(0,abstr), p('bzzc',abstr), p('eczs',abstr)]).
trans('give up','céder','vt sep',pdef,'renounce',nocodes).
trans('give up','céder','vt sep',pdef,'part with',nocodes).
trans('give up','céder','vt sep',obj,'seat', [p(0,0), p(0,movble), p('eqzd',abstr), p('pl',abstr)]).
trans('give up','céder','vt sep', obj,'place', [p(0,0),p(0,abstr), p('gbzr',abstr), p('mh',abstr)]).
trans('give up','abandonner','vt sep',pdef,'renounce',nocodes).
trans('give up','abandonner','vt sep',pdef,'part with',nocodes).
trans('give up','abandonner','vt sep',obj,'habit',[p(0,abstr),p('clrl',movble)]).
trans('give up','abandonner','vt sep',obj,'idea',[p(0,0),p(0,abstr)]).
trans('give up','renoncer à','vt sep',pdef,'renounce',nocodes).
trans('give up','renoncer à','vt sep',pdef,'part with',nocodes).
trans('give up','renoncer à','vt sep',obj,'habit',[p(0,abstr),p('clrl',movble)]).
trans('give up','renoncer à','vt sep',obj,'idea',[p(0,0),p(0,abstr)]).
trans('give up','quitter','vt sep',pdef,'renounce',nocodes).
trans('give up','quitter','vt sep',pdef,'part with',nocodes).
trans('give up','quitter','vt sep',obj,'job',[p(0,0),p(0,abstr),p(0,movble)]).
trans('give up','démissionner de','vt sep',pdef,'renounce',nocodes).
trans('give up','démissionner de','vt sep',pdef,'part with',nocodes).
trans('give up','démissionner de','vt sep',obj,'appointment',[p(0,abstr)]).
trans('give up','se retirer de','vt sep',pdef,'renounce',nocodes).
trans('give up','se retirer de','vt sep',pdef,'part with',nocodes).
trans('give up','se retirer de','vt sep',obj,'business', [p(0,0),p(0,abstr), p('bzzc',abstr), p('th',abstr)]).
trans('give up','cesser','vt sep',pdef,'renounce',nocodes).
trans('give up','cesser','vt sep',pdef,'part with',nocodes).
trans('give up','cesser','vt sep',obj,'subscription',[p(0,abstr)]).
trans('give up','livrer (<to> à)','vt sep',pdef,'deliver',nocodes).
trans('give up','livrer (<to> à)','vt sep',pdef,'hand over',nocodes).
trans('give up','livrer (<to> à)','vt sep',obj,'prisoner',[p(0,0),p('sozc',human)]).
trans('give up','se démettre de','vt sep',pdef,'deliver',nocodes).
trans('give up','se démettre de','vt sep',pdef,'hand over',nocodes).
trans('give up','se démettre de','vt sep',obj, 'authority', [p(0,abstr), p(0,concr), p(0,movble), p('pl',collec)]).
trans('give up','rendre','vt sep',pdef,'deliver',nocodes).
trans('give up','rendre','vt sep',pdef,'hand over',nocodes).
trans('give up','rendre','vt sep',obj,'key', [p(0,0),p(0,abstr), p(0,movble), p('bo',0), p('goqk',notmov), p('mu',abstr)]).
trans('give up','condamner','vt sep',pdef,'abandon hope for',nocodes).
trans('give up','condamner','vt sep',obj,'patient',[p('md',human)]).
trans('give up','ne plus attendre','vt sep',pdef,'abandon hope for',nocodes).
trans('give up','ne plus attendre','vt sep',obj,'visitor',[p(0,human),p('ozzo',animal)]).
trans('give up','ne plus espérer voir','vt sep',pdef,'abandon hope for',nocodes).
trans('give up','ne plus espérer voir','vt sep',obj,'visitor', [p(0,human), p('ozzo',animal)]).
trans('give up','renoncer à (résoudre)','vt sep',pdef,'abandon hope for',nocodes).
trans('give up','renoncer à (résoudre)','vt sep',obj,'problem', [p(0,abstr), p(0,human), p('af',0),p('ltzd',abstr)]).
trans('give up','renoncer à (résoudre)','vt sep',obj,'riddle', [p(0,0), p(0,abstr), p('hh',movble), p('lt',abstr)]).

The last trans clause in this sample set tells us that give_up is to be translated as renoncer à (résoudre) when it is a separable phrasal verb (give sth up / give up sth) and its deep object head is riddle, or a word that is near to it, collocationally or semantically. A good deal of the work carried out by defi is precisely devoted to measuring the distance between the object we have in text and the prototypical slot filler (here riddle). The list of LDOCE field and semantic codes associated with the prototypical noun is important information in computing that distance and is therefore included in the clause. The relevant word sense ("a difficult and amusing question to which one must guess the answer") is coded as pertaining to the field of literature (LT) and bearing the semantic feature Abstract (abstr). The HH field code (household) and the movble (movable) semantic feature refer to another reading (or rather another lexeme) riddle ("a large SIEVE, as used for separating earth from stones in the garden"), but unfortunately we have no means of automatically rejecting it as inappropriate, because the slot fillers are words, not word senses.


COLL CLAUSES

Coll clauses are derived from both RC and OH. The pattern is the following:

coll(MetalinguisticItem, DictionarySource,HeadWord,Type)

The first argument is the filler of a metalinguistic slot. The second argument begins with oh or rc, followed by the identification number of the headword entry in the dictionary. The third argument is the headword governing the slot (e.g. a verb governing an object slot). The last argument is one of:

We have extracted the RC metalinguistic specifications from the edited version of the metalinguistic apparatus of the English-French RC prepared by Dr. Thierry Fontenelle in the framework of his recent doctoral dissertation (Fontenelle 1995).

Below are listed the coll clauses for the metalinguistic slot filler advice (there are 118700 coll clauses):

coll($advice$,oh105796,$what's on offer in the catalogue?$,post).
coll($advice$,oh106888,$on-the-spot$,pre).
coll($advice$,oh117263,$poor$,pre).
coll($advice$,oh119629,$priceless$,pre).
coll($advice$,oh1204,$to get one's act together$,post).
coll($advice$,oh120618,$proffer$,post).
coll($advice$,oh126061,$receive$,post).
coll($advice$,oh127218,$her refusal to accept$,post).
coll($advice$,oh127632,$reject$,post).
coll($advice$,oh134071,$sagacious$,pre).
coll($advice$,oh136044,$scorn$,post).
coll($advice$,oh137625,$seek$,post).
coll($advice$,oh146584,$solid$,pre).
coll($advice$,oh147832,$to be sparing with$,post).
coll($advice$,oh148121,$specialist$,pre).
coll($advice$,oh148852,$spite$,post).
coll($advice$,oh149423,$spout$,post).
coll($advice$,oh149799,$spurn$,post).
coll($advice$,oh153271,$straight$,pre).
coll($advice$,oh168290,$to turn to sb for$,post).
coll($advice$,oh169433,$unbias<v>(s)</v>ed$,pre).
coll($advice$,oh170764,$unhelpful$,pre).
coll($advice$,oh171317,$unpalatable$,pre).
coll($advice$,oh173237,$valuable$,pre).
coll($advice$,oh173273,$value$,post).
coll($advice$,oh174905,$volunteer$,post).
coll($advice$,oh177904,$well-meaning$,pre).
coll($advice$,oh180177,$wise$,pre).
coll($advice$,oh28748,$to come to sb for$,post).
coll($advice$,oh30867,$confidential$,pre).
coll($advice$,oh37899,$to fall on deaf ears$,pre).
coll($advice$,oh39321,$defy$,post).
coll($advice$,oh42670,$dish out$,post).
coll($advice$,oh42732,$disinterested$,pre).
coll($advice$,oh42906,$dispense$,post).
coll($advice$,oh43925,$do without$,post).
coll($advice$,oh52491,$expert$,pre).
coll($advice$,oh59471,$follow out$,post).
coll($advice$,oh61163,$to be free with$,post).
coll($advice$,oh65077,$give$,post).
coll($advice$,oh70056,$hand out$,post).
coll($advice$,oh72445,$heed$,post).
coll($advice$,oh72755,$helpful$,pre).
coll($advice$,oh74283,$to sound hollow$,pre).
coll($advice$,oh76875,$ignore$,post).
coll($advice$,oh77329,$impartial$,pre).
coll($advice$,oh78041,$inappropriate$,pre).
coll($advice$,oh81445,$invaluable$,pre).
coll($advice$,oh85283,$ladle$,post).
coll($advice$,rc1068,$act on$,obj).
coll($advice$,rc106869,$sound$,mod).
coll($advice$,rc124078,$unhelpful$,mod).
coll($advice$,rc125160,$unsound$,mod).
coll($advice$,rc126067,$useless$,mod).
coll($advice$,rc126075,$uselessness$,cplt).
coll($advice$,rc126345,$valuable$,mod).
coll($advice$,rc131946,$worthless$,mod).
coll($advice$,rc131948,$worthlessness$,cplt).
coll($advice$,rc38534,$evil$,mod).
coll($advice$,rc43373,$flout$,obj).
coll($advice$,rc45243,$friendly$,mod).
coll($advice$,rc53136,$helpful$,mod).
coll($advice$,rc5506,$attend to$,obj).
coll($advice$,rc61847,$kindly$,mod).
coll($advice$,rc62445,$ladle out$,obj).
coll($advice$,rc73267,$neglect$,obj).
coll($advice$,rc97915,$scorn$,obj).
coll($advice$,rc98673,$seasonable$,mod).

This last clause indicates that advice is one of the fillers of the metalinguistic slot opened up for the preferred modified element of the adjective seasonable (entry 98673 in RC).


SOURCE PARSED STRUCTURES

As already stated, Defi expects to work on predicate/argument structures such as are produced by a deep syntactic parser. To illustrate: from the natural language string a printed booklet setting out his rights in English (see Appendix), the parser should work out that the predicate is set out, and has a subject whose head is booklet and an object whose head is right.

In this experiment we will restrict our attention to predicates and their deep objects. Defi will expect parses such as the following, which make up our tiny corpus:

[clutch, word, obj].
[forge, letter, obj].
['give up', boyfriend, obj].
['give up', job, obj].
[pretend, indifference, obj].
[raise, lighting, obj].
[resume, work, obj].
['set out', right, obj].
['sweep out', cell, obj].
[toss, head, obj].
[wear, charm, obj].


TRANSLATION SELECTION AND MEANING ASSIGNMENT

To sum up: the lexicon of the system is made up of trans clauses (verb information) and noun clauses (noun information). As explained above, trans clauses are derived from both RC and LDOCE, while noun clauses are derived from LDOCE alone. Defi also calls on the coll clauses to measure the collocational similarity of two nouns. Coll clauses are derived from RC and OH.

Defi makes use of semantic features and field codes. The semantic features are organized into a hierarchy with inheritance. They are a subset of the LDOCE semantic features, viz. those that are considered most important for disambiguation purposes. As to the field codes, they come from both LDOCE and RC. Defi records equivalences between the codes from the two dictionaries.

Defi assigns a weight to each of the French translations that it puts forward for a given English predicate. The weight is a numeric value; the highest value wins, i.e. is assigned to the preferred translation. The weighting procedure is purely heuristic and is open to revision.

The algorithm used by Defi is a matching algorithm. A number of cases ought to be distinguished:

  1. match between the head of the argument in the textual data and the metalinguistic slot of the RC-derived trans clause. We are looking at an argument which is textually recorded in the metalinguistic slot of the translating dictionary. The weight is the highest one assigned by the matching algorithm, i.e. 200;
  2. match of LDOCE field code for the textual head, as retrieved from the noun clause, with RC field code in trans clause. Weight: 15;
  3. the syntactic type of the argument (in this experiment, obj) is matched by the POS (here vt, transitive verbs). Weight: 1;
  4. no match at all; the entry is simply recorded in RC. Weight is 0.
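Cases 1, 3 and 4 can be sketched as follows; case 2 is left out because no field-marked trans clause is shown in this paper. This is an illustrative sketch and not the Defi code. The predicate names simple_match/5 and matches/4 are hypothetical, the clutch clause is invented (only its translation, saisir, comes from the raw results), and testing the POS for strict equality with vt is an assumption.

```prolog
% trans(SourcePredicate, Translation, Pos, Marker, Restriction, Codes)
% Three clauses copied from the give up sample above.
trans('give up', quitter, 'vt sep', obj, job, [p(0,0), p(0,abstr), p(0,movble)]).
trans('give up', abandonner, 'vt sep', obj, idea, [p(0,0), p(0,abstr)]).
trans('give up', abandonner, 'vt sep', pdef, renounce, nocodes).
% Hypothetical clause: POS, marker and restriction string are invented.
trans(clutch, saisir, vt, pdef, grasp, nocodes).

% simple_match(+Pred, +Head, +ArgType, -Translation, -Match)
% Case 1: the textual head is itself recorded as the slot filler.
simple_match(Pred, Head, ArgType, Translation, headmatch(200)) :-
    trans(Pred, Translation, _, ArgType, Head, _).
% Case 3: an object argument is matched by a transitive POS label.
% Strict equality with vt is an assumption of this sketch.
simple_match(Pred, _, obj, Translation, vframematch(1)) :-
    trans(Pred, Translation, vt, _, _, _).
% Case 4: the translation is simply recorded in the dictionary.
simple_match(Pred, _, _, Translation, available(0)) :-
    trans(Pred, Translation, _, _, _, _).

% matches(+Pred, +Head, +ArgType, -Matches): sorted, duplicate-free list of
% Translation-Match pairs; fails if the predicate has no trans clause.
matches(Pred, Head, ArgType, Matches) :-
    setof(T-M, simple_match(Pred, Head, ArgType, T, M), Matches).
```

The query matches('give up', job, obj, Ms) yields abandonner-available(0), quitter-available(0) and quitter-headmatch(200), which mirrors the way quitter is singled out for give_up job in the raw results.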

Besides, there are three more complex cases.

A. Match in the LDOCE codes

This is a match in LDOCE codes between the head of the argument in the parsed text and one of the items filling the metalinguistic slot associated with the argument in the trans clause. Recall that we can call on noun clauses to retrieve the features of the deep argument and on trans clauses to retrieve the requirements put forward by the verb. For each item in the metalinguistic slot of the RC entry, we retrieve the list of corresponding LDOCE codes. We look for a match between these codes and the ones recorded for the argument noun. The weight is computed according to the quality of the match, measured by the following field code matching algorithm:

B. Metalinguistic slot sharing

The head noun in the parsed structure is taken together with each of the fillers for the slot corresponding to the argument in the trans clause. For each such pair, a collocational weight is computed on the basis of the number of headwords under which the two items are found together as slot fillers of a given slot. A weight of 4 is assigned to each cooccurrence of the two items within a given slot. The rationale for taking this measure of collocational similarity into account is to be found in Montemagni et al. 1996 and fits into the process of 'paradigm extension'. Montemagni et al. 1996 use the measure (computed on the basis of the Collins Italian-English dictionary) to select the appropriate word sense in the monolingual Garzanti dictionary and report promising first results. Their paper provided the inspiration for this part of the defi algorithm.
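The computation of this collocational weight can be sketched as follows. The sketch is a simplified illustration and not the Defi code. The three advice clauses are copied from the sample above (with atoms in place of $...$ strings), the two clauses for warning are invented for the example, and the predicate names shared_slot/4 and metal_weight/4 are hypothetical. Identifying a "given slot" with a dictionary entry plus a slot type is an assumption.

```prolog
% coll(SlotFiller, Source, Headword, Type)
% Three clauses copied from the advice sample above.
coll(advice, rc43373, flout, obj).
coll(advice, rc73267, neglect, obj).
coll(advice, rc97915, scorn, obj).
% Hypothetical clauses for a second noun, invented for this sketch.
coll(warning, rc73267, neglect, obj).
coll(warning, rc97915, scorn, obj).

% shared_slot(+A, +B, +Type, -Source): A and B both fill the slot of kind
% Type under the same dictionary entry Source.
shared_slot(A, B, Type, Source) :-
    coll(A, Source, _, Type),
    coll(B, Source, _, Type).

% metal_weight(+A, +B, +Type, -Weight): 4 points for each entry under which
% A and B cooccur as fillers of the same slot; fails if they never do.
metal_weight(A, B, Type, Weight) :-
    findall(S, shared_slot(A, B, Type, S), Sources0),
    sort(Sources0, Sources),            % sort/2 also removes duplicates
    length(Sources, N),
    N > 0,
    Weight is 4 * N.
```

With these facts advice and warning share two object slots, so metal_weight(advice, warning, obj, W) gives W = 8, the same value as appears in metalmatch(boyfriend,friend,obj,8) in the raw results.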

C. WordNet Link

The head noun in the parsed data is paired in turn with each element filling the relevant slot in the trans clause. The pair is submitted to an algorithm which computes the tightness of the link between the two items in the subset of the WordNet package used in defi:

Care has been taken to ensure that the various types of hyponymy represented in WordNet are all taken into account and allowed to mix in the establishment of hyponymy chains:

Finally, we regard two elements as linked in WordNet if they are both linkable to a common third item. Here too, we set limits on the length of the walk in the various types of hyponymy.

The weight assigned to a WordNet link can vary from 3 to 100.
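The idea of a bounded walk towards a common third item can be illustrated with the sketch below. It is not the Defi code. The hyp/2 links are invented and are not taken from WordNet, a single relation stands in for the various types of hyponymy, and the predicate names up/4 and linked/4 are hypothetical. The sketch returns a path length only; how path lengths are mapped onto the weights from 3 to 100 is not reconstructed here.

```prolog
% Hypothetical hypernym links, hyp(Word, Hypernym); NOT taken from WordNet.
hyp(job, occupation).
hyp(appointment, occupation).
hyp(occupation, activity).
hyp(riddle, problem).

% up(+Word, +MaxSteps, ?Ancestor, -Steps): Ancestor is reached from Word in
% Steps upward moves, with 0 =< Steps =< MaxSteps.
up(Word, _, Word, 0).
up(Word, Max, Ancestor, Steps) :-
    Max > 0,
    hyp(Word, Parent),
    Max1 is Max - 1,
    up(Parent, Max1, Ancestor, Steps0),
    Steps is Steps0 + 1.

% linked(+A, +B, +MaxSteps, -Distance): A and B both reach some common item
% within MaxSteps moves each; Distance is the shortest combined path length.
% Fails if no common item is found within the limit.
linked(A, B, Max, Distance) :-
    findall(D,
            ( up(A, Max, Common, DA),
              up(B, Max, Common, DB),
              D is DA + DB ),
            Ds),
    sort(Ds, [Distance|_]).             % smallest distance comes first
```

Here linked(job, appointment, 2, D) succeeds with D = 2 (both nouns reach occupation in one step), whereas linked(job, riddle, 2, D) fails because no common item is reachable within the limit.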


COMMENTED RESULTS

We give complete, unedited raw results and provide some comments on how successful defi has proved to be in each case.

Structure: res(Translation,Weight,LdoceHeadArgKey)

The Weight argument gives the numeric value, but also the source of the match. In case of a lack of match, the weight is zero and the translation is marked as simply available (but not selected): available(0). The other cases are the following:

In the case of matches made on the basis of information culled from LDOCE, the defi algorithm also supplies the LDOCE entrykey, which makes it possible to see which word sense of the arg noun triggered the match, and thereby to disambiguate the noun in arg position in the parsed data.
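The following sketch shows how the preferred translation can be read off a set of res terms. It is an illustration only and not the Defi code. The facts are a subset of the forge letter results listed further on, with translations and keys written as Prolog atoms. The predicate names weight/2, score/2 and selected/1 are hypothetical, and scoring a translation by its highest single match (rather than, say, by the sum of its matches) is an assumption.

```prolog
:- use_module(library(lists)).          % member/2, max_list/2 (SWI-Prolog)

% res(Translation, Weight, LdoceHeadArgKey): subset of the forge letter output.
res(contrefaire, available(0), nokey).
res(contrefaire, codematch(sem(5)), 'L@UM1').
res(contrefaire, vframematch(1), nokey).
res('faire un faux de', codematch(sem(5)), 'L@UM1').
res('faire un faux de', vframematch(1), nokey).
res('faire un faux de', metalmatch(letter, document, obj, 28), nokey).
res(falsifier, available(0), nokey).
res(falsifier, vframematch(1), nokey).

% weight(+Match, -W): numeric value carried by a match term.
weight(available(W), W).
weight(vframematch(W), W).
weight(headmatch(W), W).
weight(codematch(Inner), W) :- Inner =.. [_, W].   % sem(5), sem1up(2), ...
weight(metalmatch(_, _, _, W), W).
weight(wordnet(_, _, W), W).

% score(?Translation, -Score): highest single match weight for Translation.
% Using the maximum is an assumption of this sketch.
score(Translation, Score) :-
    setof(W, M^K^(res(Translation, M, K), weight(M, W)), Ws),
    max_list(Ws, Score).

% selected(-Translations): all translations that reach the top score.
selected(Translations) :-
    findall(S-T, score(T, S), Pairs),
    findall(S, member(S-_, Pairs), Scores),
    max_list(Scores, Top),
    findall(T, member(Top-T, Pairs), Translations).
```

On this subset selected(Ts) returns the single translation faire un faux de, whose metalmatch weight of 28 outranks the code and frame matches of the other candidates.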

Predicate argument pair: clutch word obj

res(agripper,available(0),nokey)
res(agripper,vframematch(1),nokey)
res(empoigner,available(0),nokey)
res(empoigner,vframematch(1),nokey)
res(s- agripper à,available(0),nokey)
res(saisir,available(0),nokey)
res(saisir,vframematch(1),nokey)
res(se cramponner à,available(0),nokey)
res(se cramponner à,vframematch(1),nokey)
res(se raccrocher à,available(0),nokey)
res(se saisir de,available(0),nokey)
res(se saisir de,vframematch(1),nokey)
res(serrer fort,available(0),nokey)
res(serrer fort,vframematch(1),nokey)
res(étreindre,available(0),nokey)
res(étreindre,vframematch(1),nokey)

Number of proposed translations: 9
Human translation: étreindre (la lettre ! ! !)
Number of selected translations: 7
Basis for selection: Part of Speech; the translations associated with the intransitive use are rejected, those with the transitive use selected
Selection assessment: OK
Word sense discrimination: none
Comments: there is no collocational link between clutch and word, so it is to be expected that only a grammatical property should be the basis for selection
General assessment: POOR


Predicate argument pair: forge letter obj

res(contrefaire,available(0),nokey)
res(contrefaire,codematch(sem(5)),L@UM1)
res(contrefaire,codematch(sem(5)),L@UM3)
res(contrefaire,vframematch(1),nokey)
res(fabriquer,available(0),nokey)
res(fabriquer,codematch(sem(5)),L@UM3)
res(fabriquer,vframematch(1),nokey)
res(faire un faux de,available(0),nokey)
res(faire un faux de,codematch(sem(5)),L@UM1)
res(faire un faux de,codematch(sem(5)),L@UM3)
res(faire un faux de,vframematch(1),nokey)
res(faire un faux de,metalmatch(letter,document,obj,28),nokey)
res(falsifier,available(0),nokey)
res(falsifier,vframematch(1),nokey)
res(foncer,available(0),nokey)
res(forger,available(0),nokey)
res(forger,codematch(sem(5)),L@UM1)
res(forger,codematch(sem(5)),L@UM3)
res(forger,vframematch(1),nokey)
res(inventer,available(0),nokey)
res(inventer,codematch(sem(5)),L@UM3)
res(inventer,vframematch(1),nokey)
res(maquiller,available(0),nokey)
res(maquiller,vframematch(1),nokey)

Number of proposed translations: 8
Human translation: contrefaire (l'écriture ! ! !)
Number of selected translations: 1 (faire un faux de)
Basis for selection: sharing of metalinguistic slots between letter (in the text) and document (the metalinguistic slot filler)
Selection assessment: Excellent; faire un faux de is probably the best translation available; the human translator selected contrefaire only because he rephrased the object (écriture instead of lettre)
Word sense discrimination: 2 senses of letter are selected (out of 4), the right one being one of the two (" a written or printed message sent usu. in an envelope ", L@UM1)
General assessment: EXCELLENT


Predicate argument pair: give_up boyfriend obj

res(abandonner,available(0),nokey)
res(abandonner,codematch(sem1up(2)),BBNC1)
res(abandonner,metalmatch(boyfriend,friend,obj,8),nokey)
res(cesser,available(0),nokey)
res(condamner,available(0),nokey)
res(condamner,codematch(sem1up(2)),BBNC1)
res(consacrer,available(0),nokey)
res(céder,available(0),nokey)
res(délaisser,available(0),nokey)
res(délaisser,codematch(sem1up(2)),BBNC1)
res(délaisser,metalmatch(boyfriend,friend,obj,8),nokey)
res(démissionner de,available(0),nokey)
res(livrer (<to> à),available(0),nokey)
res(livrer (<to> à),codematch(sem1up(2)),BBNC1)
res(ne plus attendre,available(0),nokey)
res(ne plus attendre,codematch(sem1up(2)),BBNC1)
res(ne plus espérer voir,available(0),nokey)
res(ne plus espérer voir,codematch(sem1up(2)),BBNC1)
res(quitter,available(0),nokey)
res(quitter,metalmatch(boyfriend,job,obj,4),nokey)
res(rendre,available(0),nokey)
res(renoncer à,available(0),nokey)
res(renoncer à (résoudre),available(0),nokey)
res(renoncer à (résoudre),codematch(sem1up(2)),BBNC1)
res(se démettre de,available(0),nokey)
res(se retirer de,available(0),nokey)
res(vouer,available(0),nokey)

Number of proposed translations: 17
Human translation: quitter (ami + travail de bureau)
Number of selected translations: 2 (abandonner / délaisser). The two proposed translations are adequate; the human translator preferred a more general translation that would fit both objects (a person as well as an activity)
Basis for selection: metalinguistic slot sharing
Selection assessment: OK
Word sense discrimination: does not apply (boyfriend is monosemic in LDOCE)
General assessment: EXCELLENT


Predicate argument pair: give_up job obj

res(abandonner,available(0),nokey)
res(abandonner,codematch(sem(5)),J@FZ1)
res(abandonner,codematch(sem(5)),J@FZ5)
res(abandonner,metalmatch(job,idea,obj,8),nokey)
res(cesser,available(0),nokey)
res(cesser,codematch(sem(5)),J@FZ1)
res(condamner,available(0),nokey)
res(consacrer,available(0),nokey)
res(céder,available(0),nokey)
res(céder,codematch(sem(5)),J@FZ1)
res(céder,codematch(sem(5)),J@FZ5)
res(céder,wordnet(job,place,10),nokey)
res(céder,metalmatch(job,seat,obj,8),nokey)
res(délaisser,available(0),nokey)
res(délaisser,codematch(sem(5)),J@FZ1)
res(démissionner de,available(0),nokey)
res(démissionner de,codematch(sem(5)),J@FZ1)
res(démissionner de,wordnet(job,appointment,10),nokey)
res(livrer (<to> à),available(0),nokey)
res(ne plus attendre,available(0),nokey)
res(ne plus espérer voir,available(0),nokey)
res(quitter,available(0),nokey)
res(quitter,headmatch(200),nokey)
res(rendre,available(0),nokey)
res(rendre,codematch(sem(5)),J@FZ1)
res(rendre,codematch(sem(5)),J@FZ5)
res(rendre,metalmatch(job,key,obj,4),nokey)
res(renoncer à,available(0),nokey)
res(renoncer à,codematch(sem(5)),J@FZ1)
res(renoncer à,codematch(sem(5)),J@FZ5)
res(renoncer à,metalmatch(job,idea,obj,8),nokey)
res(renoncer à (résoudre),available(0),nokey)
res(renoncer à (résoudre),codematch(sem(5)),J@FZ1)
res(renoncer à (résoudre),codematch(sem(5)),J@FZ5)
res(renoncer à (résoudre),wordnet(job,problem,100),nokey)
res(se démettre de,available(0),nokey)
res(se démettre de,codematch(sem(5)),J@FZ1)
res(se démettre de,codematch(sem(5)),J@FZ5)
res(se retirer de,available(0),nokey)
res(se retirer de,codematch(sem(5)),J@FZ1)
res(se retirer de,wordnet(job,business,30),nokey)
res(vouer,available(0),nokey)

Number of proposed translations: 17
Human translation: quitter (see preceding entry)
Number of selected translations: 1 (quitter)
Basis for selection: the head of the arg (job) occurs in the metalinguistic slot of the trans clause
Selection assessment: OK
Word sense discrimination: though the match in LDOCE codes is not relevant to the selection, it should be noted that only the first of the two word senses of job (J@FZ1: " a piece of work that has been or must be done ") is acceptable (although not really the one we need: employment)
Comments: the decisive score is based on a string match (headmatch), and therefore does not contribute to word sense assignment
General assessment: EXCELLENT
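
A minimal sketch of what a head match amounts to is given below, in Python rather than defi's Prolog. The slot contents are hypothetical: they are inferred from the metalmatch records shown for give_up (job for quitter, idea for abandonner, seat for céder, key for rendre). The weight of 200 is the one printed in the headmatch record; the function and table names are assumptions.

```python
HEADMATCH_WEIGHT = 200  # value shown in the headmatch(200) records

# Hypothetical object-slot contents, inferred from the metalmatch records above.
OBJ_SLOTS = {
    "quitter": ["job"],
    "abandonner": ["idea"],
    "céder": ["seat"],
    "rendre": ["key"],
}


def headmatch(head, translation, slots=OBJ_SLOTS):
    """Return the head-match weight if the head noun literally occurs in the
    metalinguistic slot of the translation, and 0 otherwise.
    No LDOCE sense key is involved: the comparison is on strings only."""
    return HEADMATCH_WEIGHT if head in slots.get(translation, []) else 0


print(headmatch("job", "quitter"))
print(headmatch("job", "rendre"))
```

Because the comparison never consults a dictionary sense, a head match can settle the choice of translation without saying anything about which reading of job is meant.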


Predicate argument pair: pretend indifference obj

res(avoir des prétentions à qch,available(0),nokey)
res(faire semblant,available(0),nokey)
res(faire semblant (<to> <do> de faire, <that> que),available(0),nokey)
res(faire semblant (<to> <do> de faire, <that> que),vframematch(1),nokey)
res(feindre,available(0),nokey)
res(feindre,vframematch(1),nokey)
res(feindre,metalmatch(indifference,ignorance,obj,8),nokey)
res(prétendre (<that> que),available(0),nokey)
res(prétendre (<that> que),vframematch(1),nokey)
res(prétendre à qch,available(0),nokey)
res(simuler,available(0),nokey)
res(simuler,vframematch(1),nokey)
res(simuler,metalmatch(indifference,ignorance,obj,8),nokey)

Number of proposed translations: 7
Human translation: feindre (avec une indifférence feinte)
Number of selected translations: 2 (feindre / simuler)
Basis for selection: metalinguistic slot sharing
Selection assessment: Excellent (simuler is nearly as good as feindre)
Word sense discrimination: none
General assessment: GOOD


Predicate argument pair: raise lighting obj

res(augmenter,available(0),nokey)
res(augmenter,codematch(sem(5)),LA@U1)
res(augmenter,vframematch(1),nokey)
res(avez-vous réussi à entrer en contact avec quelqu- un par (la) radio?,available(0),nokey)
res(avez-vous réussi à entrer en contact avec quelqu- un par (la) radio?,vframematch(1),nokey)
res(avez-vous réussi à toucher quelqu- un par (la) radio?,available(0),nokey)
res(avez-vous réussi à toucher quelqu- un par (la) radio?,vframematch(1),nokey)
res(bâtir,available(0),nokey)
res(bâtir,codematch(sem(5)),LA@U1)
res(bâtir,vframematch(1),nokey)
res(construire,available(0),nokey)
res(construire,codematch(sem(5)),LA@U1)
res(construire,vframematch(1),nokey)
res(cultiver,available(0),nokey)
res(cultiver,vframematch(1),nokey)
res(enchérir,available(0),nokey)
res(enchérir,vframematch(1),nokey)
res(faire apparaître,available(0),nokey)
res(faire apparaître,codematch(sem(5)),LA@U1)
res(faire apparaître,vframematch(1),nokey)
res(faire monter,available(0),nokey)
res(faire monter,vframematch(1),nokey)
res(faire pousser,available(0),nokey)
res(faire pousser,vframematch(1),nokey)
res(faire une annonce supérieure,available(0),nokey)
res(faire une annonce supérieure,vframematch(1),nokey)
res(faire une mise supérieure,available(0),nokey)
res(faire une mise supérieure,vframematch(1),nokey)
res(lever,available(0),nokey)
res(lever,codematch(sem(5)),LA@U1)
res(lever,vframematch(1),nokey)
res(majorer,available(0),nokey)
res(majorer,codematch(sem(5)),LA@U1)
res(majorer,vframematch(1),nokey)
res(monter,available(0),nokey)
res(monter,vframematch(1),nokey)
res(provoquer,available(0),nokey)
res(provoquer,codematch(sem(5)),LA@U1)
res(provoquer,vframematch(1),nokey)
res(rassembler,available(0),nokey)
res(rassembler,codematch(sem(5)),LA@U1)
res(rassembler,vframematch(1),nokey)
res(relancer,available(0),nokey)
res(relancer,vframematch(1),nokey)
res(relever (<Admin>),available(0),nokey)
res(relever (<Admin>),vframematch(1),nokey)
res(réunir,available(0),nokey)
res(réunir,codematch(sem(5)),LA@U1)
res(réunir,vframematch(1),nokey)
res(se procurer,available(0),nokey)
res(se procurer,codematch(sem(5)),LA@U1)
res(se procurer,vframematch(1),nokey)
res(soulever,available(0),nokey)
res(soulever,codematch(sem(5)),LA@U1)
res(soulever,vframematch(1),nokey)
res(édifier,available(0),nokey)
res(édifier,codematch(sem(5)),LA@U1)
res(édifier,vframematch(1),nokey)
res(élever,available(0),nokey)
res(élever,codematch(sem(5)),LA@U1)
res(élever,vframematch(1),nokey)
res(ériger,available(0),nokey)
res(ériger,codematch(sem(5)),LA@U1)
res(ériger,vframematch(1),nokey)
res(évoquer,available(0),nokey)
res(évoquer,codematch(sem(5)),LA@U1)
res(évoquer,vframematch(1),nokey)

Number of proposed translations: 26
Human translation: firent un peu de lumière (note that Yanuka is said to have been lying in the pitch dark; in French we cannot use a verb whose meaning is broadly increase if the starting point is zero; therefore the translators had to resort to the rephrasing process that we alluded to above)
Number of selected translations: 10
Basis for selection: match in the LDOCE semantic codes; syntactic frame match
Selection assessment: VERY POOR (all translations are wrong)
Word sense discrimination: poor (the second reading of lighting in LDOCE is more appropriate than the first, which is the selected one)
Comments: in our lexical resources, lighting is not related to temperature, standard or level, which are listed in the RC metalinguistic slot. Anyway, even if it had been, we would have faire monter or élever as translations, and these are inadequate in context
General assessment: VERY POOR


Predicate argument pair: resume work obj

res(recommencer,available(0),nokey)
res(recommencer,codematch(sem(5)),WAOA1)
res(recommencer,codematch(sem(5)),WAOA5)
res(recommencer,vframematch(1),nokey)
res(recommencer,wordnet(work,activity,30),nokey)
res(recommencer,metalmatch(work,activity,obj,40),nokey)
res(reprendre,available(0),nokey)
res(reprendre,codematch(sem(5)),WAOA1)
res(reprendre,codematch(sem(5)),WAOA5)
res(reprendre,vframematch(1),nokey)
res(reprendre,wordnet(work,activity,30),nokey)
res(reprendre,metalmatch(work,activity,obj,40),nokey)

Number of proposed translations: 2
Human translation: reprendre
Number of selected translations: 2
Basis for selection: sharing of metalinguistic slots; link in WordNet
Selection assessment: OK, although no selection had to be performed
Word sense discrimination: the first reading of work is selected and is OK; the fifth is also selected and is inappropriate
Comments: note that work does occur as a selectional restriction on resume, but as subject of the intransitive reading (as in work should resume), not as object
General assessment: GOOD


Predicate argument pair: set_out right obj

res(disposer,available(0),nokey)
res(exposer,available(0),nokey)
res(exposer,codematch(sem(5)),RBDN1)
res(exposer,codematch(sem(5)),RBDP1)
res(exposer,wordnet(right,idea,9),nokey)
res(il a cherché à expliquer pourquoi cela s- était produit,available(0),nokey)
res(il s- est proposé d- expliquer pourquoi cela s- était produit,available(0),nokey)
res(partir (<for> pour, <from> de, <in> <search> <of> à la recherche de),available(0),nokey)
res(présenter,available(0),nokey)
res(présenter,codematch(sem(5)),RBDN1)
res(présenter,codematch(sem(5)),RBDP1)
res(présenter,wordnet(right,idea,9),nokey)
res(se mettre en route (<for> pour),available(0),nokey)

Number of proposed translations: 7
Human translation: indiquer (exposer is probably better !)
Number of selected translations: 2 (exposer / présenter)
Basis for selection: match in the semantic codes and link in WordNet
Selection assessment: OK
Word sense discrimination: LDOCE has an entry for rights, to which defi had no access because of the depluralization performed on the head of the object noun phrase (right instead of rights); if we do not perform depluralization, we do get the right LDOCE entry (RBEH1).
General assessment: GOOD
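
The effect of depluralization on dictionary lookup can be illustrated with a small Python sketch. It is not something the paper says defi does: it shows one possible lookup order, assumed here for illustration, in which the surface form is tried before the depluralized form. The index holds only the LDOCE keys mentioned in this entry, and depluralize is a deliberately crude stand-in for a real morphological step.

```python
# Only the LDOCE keys that appear in this entry; the real entries hold more senses.
LDOCE_KEYS = {
    "right": ["RBDN1", "RBDP1"],
    "rights": ["RBEH1"],
}


def depluralize(noun):
    """Crude stand-in: strip a final -s. Real depluralization is more involved."""
    return noun[:-1] if len(noun) > 1 and noun.endswith("s") else noun


def lookup(surface, index=LDOCE_KEYS):
    """Collect keys for the surface form first, then for the depluralized form."""
    forms = [surface]
    base = depluralize(surface)
    if base != surface:
        forms.append(base)
    keys = []
    for form in forms:
        keys.extend(index.get(form, []))
    return keys


print(lookup("rights"))
print(lookup("right"))
```

With this order the plural entry RBEH1 is reached when the text has rights, while the readings of right remain available as a fallback.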


Predicate argument pair: sweep_out cell obj

res(balayer,available(0),nokey)
res(balayer,codematch(sem(5)),CA@J1a)
res(balayer,codematch(sem(5)),CA@J2)
res(balayer,wordnet(cell,room,30),nokey)

Number of proposed translations: 1
Human translation: balayer
Number of selected translations: 1
Basis for selection: WordNet link between cell and room, which is given as selectional restriction
Selection assessment: no selection needed
Word sense discrimination: the right sense of cell is selected (cell as room, not part of living organism or group of people)
General assessment: EXCELLENT


Predicate argument pair: toss head obj

res(démonter,available(0),nokey)
res(démonter,vframematch(1),nokey)
res(désarçonner,available(0),nokey)
res(désarçonner,vframematch(1),nokey)
res(faire sauter,available(0),nokey)
res(faire sauter,vframematch(1),nokey)
res(jeter (<to> à),available(0),nokey)
res(jeter (<to> à),codematch(field(5)),H@QG1a)
res(jeter (<to> à),codematch(sem(5)),H@QG15)
res(jeter (<to> à),codematch(sem(5)),H@QG2)
res(jeter (<to> à),codematch(sem(5)),H@QG7)
res(jeter (<to> à),vframematch(1),nokey)
res(lancer,available(0),nokey)
res(lancer,codematch(field(5)),H@QG1a)
res(lancer,codematch(sem(5)),H@QG15)
res(lancer,codematch(sem(5)),H@QG2)
res(lancer,codematch(sem(5)),H@QG7)
res(lancer,vframematch(1),nokey)
res(projeter en l- air,available(0),nokey)
res(projeter en l- air,vframematch(1),nokey)
res(rejeter en arrière,available(0),nokey)
res(rejeter en arrière,headmatch(200),nokey)
res(rejeter en arrière,vframematch(1),nokey)
res(rejeter en arrière,metalmatch(head,mane,obj,4),nokey)
res(s- agiter,available(0),nokey)
res(se balancer,available(0),nokey)
res(tanguer,available(0),nokey)

Number of proposed translations: 10
Human translation: secouer (translates shake rather than toss)
Number of selected translations: 1
Basis for selection: direct string match between the head of the arg and the selectional restriction
Selection assessment: OK
Word sense discrimination: does not apply, as the four selected senses of head are put on the same footing; it should be noted that the relevant word sense is among the four (H@QG1a: " part of the body... "), but it does not stand out
General assessment: EXCELLENT


Predicate argument pair: wear charm obj

res(afficher,available(0),nokey)
res(afficher,codematch(sem(5)),CAIU1)
res(afficher,codematch(sem(5)),CAIU4)
res(afficher,vframematch(1),nokey)
res(arborer,available(0),nokey)
res(arborer,codematch(sem(5)),CAIU1)
res(arborer,codematch(sem(5)),CAIU4)
res(arborer,vframematch(1),nokey)
res(avoir,available(0),nokey)
res(avoir,codematch(sem(5)),CAIU1)
res(avoir,codematch(sem(5)),CAIU4)
res(avoir,vframematch(1),nokey)
res(creuser peu à peu,available(0),nokey)
res(creuser peu à peu,codematch(sem(5)),CAIU1)
res(creuser peu à peu,codematch(sem(5)),CAIU4)
res(creuser peu à peu,vframematch(1),nokey)
res(faire de l- usage,available(0),nokey)
res(porter,available(0),nokey)
res(porter,codematch(sem(5)),CAIU1)
res(porter,codematch(sem(5)),CAIU2)
res(porter,codematch(sem(5)),CAIU3)
res(porter,codematch(sem(5)),CAIU4)
res(porter,vframematch(1),nokey)
res(résister à l- usure,available(0),nokey)
res(s- user,available(0),nokey)
res(tirer à sa fin,available(0),nokey)
res(tolérer,available(0),nokey)
res(tolérer,vframematch(1),nokey)
res(user,available(0),nokey)
res(user,codematch(field_and_sem(8)),CAIU3)
res(user,codematch(sem(5)),CAIU1)
res(user,codematch(sem(5)),CAIU2)
res(user,codematch(sem(5)),CAIU4)
res(user,vframematch(1),nokey)

Number of proposed translations: 11
Human translation: qui pendait à son cou (rephrasing)
Number of selected translations: 1
Basis for selection: match of LDOCE codes
Selection assessment: WRONG (user ! ! !)
Word sense discrimination: the preferred reading of charm (CAIU3) is selected (" a small ornament worn on a chain (charm bracelet) round the wrist ")
Comments: charm is rightly coded as jw (jewelry) under the relevant reading. Unfortunately, wear translated as porter has watch as nearest selectional restriction and watch has hr (horology), not jw, as field code. Note that stone, which has jw under one of its readings (precious stone), is associated with wear as user, not porter
General assessment: VERY POOR (but word sense discrimination in the head of the arg is good)
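
How a preferred reading can fall out of the code-match records, even when the selected translation is wrong, can be sketched as follows. The Python below is an illustration under an assumption, not defi's own procedure: for a given translation, the LDOCE keys attached to its codematch records are ranked by weight, and the top-weighted keys count as the preferred readings. The tuples are transcribed from the records above; the function name is hypothetical.

```python
from collections import defaultdict

# (translation, evidence kind, weight, LDOCE key) for "wear charm obj",
# transcribed from the porter and user records above.
RECORDS = [
    ("porter", "codematch", 5, "CAIU1"),
    ("porter", "codematch", 5, "CAIU2"),
    ("porter", "codematch", 5, "CAIU3"),
    ("porter", "codematch", 5, "CAIU4"),
    ("porter", "vframematch", 1, "nokey"),
    ("user", "codematch", 8, "CAIU3"),
    ("user", "codematch", 5, "CAIU1"),
    ("user", "codematch", 5, "CAIU2"),
    ("user", "codematch", 5, "CAIU4"),
    ("user", "vframematch", 1, "nokey"),
]


def preferred_senses(records, translation):
    """Assumption: the readings with the heaviest code match for the given
    translation are the preferred ones; ties are all returned."""
    weights = defaultdict(int)
    for trans, kind, weight, key in records:
        if trans == translation and kind == "codematch" and key != "nokey":
            weights[key] = max(weights[key], weight)
    if not weights:
        return []
    top = max(weights.values())
    return sorted(key for key, weight in weights.items() if weight == top)


print(preferred_senses(RECORDS, "user"))
print(preferred_senses(RECORDS, "porter"))
```

For user the field-and-semantic match of weight 8 singles out CAIU3, whereas for porter the four readings all carry weight 5 and none stands out.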


CONCLUSIONS

Neither black nor white, but white rather than black: the results are mixed, ranging from excellent to very poor, but they are more often good than bad. The interesting question is what causes the failures.

The results are understandably poor in cases where the link between the predicate and its argument is also felt by the human reader to be weak, collocationally and/or semantically. In such cases the human translator will often be found to rephrase, thereby providing a stronger link (as is the case in clutch word).

In some other cases where the results are poor (as in wear charm and raise lighting) we believe that the responsibility lies at least as much with the coding as with the algorithm that defi implements. LDOCE field codes and semantic codes are sometimes assigned rather erratically. RC selectional restrictions are illustrative rather than descriptive: their purpose is to give the human user a flavour of the way the word is used, rather than to describe its context.

This being said, it remains true that the reasons for failure are primarily to be found in the immense complexity of the task, once it is attempted in its full scope, and not under the restrictive conditions operative in a micro-world. This is why we decided to work with the full range of our available lexical data and to start from an undoctored and unrestricted piece of natural language as testbed.

Appendix: The source text (extract from Le Carré 1984, p.210)

(the highlighted strings are used to build up our small corpus)

Taking their leave, the interrogators handed Yanuka a printed booklet setting out his rights in English and, with a wink and a pat on the shoulder, a couple of bars of Swiss chocolate. And they called him by his first name, Salim. For an hour, from the adjoining room, they watched him by infra-red light as he lay in the pitch dark, weeping and tossing his head. Then they raised the lighting and barged in cheerfully, calling out, “ Look what we've got for you; come on, wake up, Salim, it's morning. ” It was a letter, addressed to him by name. Postmark Beirut, sent care of the Red Cross, and stamped “ Prison Censor Approved. ” From his beloved sister Fatmeh, who had given him the gilded charm to wear round his neck. Schwili had forged it, Miss Bach had compiled it, Leon's chameleon talent had supplied the authentic pulse of Fatmeh's censorious affection. Their models were the letters Yanuka had received from her while he was under surveillance. Fatmeh sent her love and hoped Salim would show courage when his time came. By “ time ” she seemed to mean the dreaded interrogation. She had decided to give up her boyfriend and her office job, she said, and resume her relief work in Sidon because she could no longer bear to be so far from the border of her beloved Palestine while Yanuka was in such desperate straits. She admired him; she always would; Leon swore to it. To the grave and beyond, Fatmeh would love her gallant, heroic brother; Leon had seen to it. Yanuka accepted the letter with pretended indifference, but when they left him alone again, he fell into a pious crouch, with his head turned nobly sideways and upward like a martyr waiting for the sword, while he clutched Fatmeh's words to his cheek.

“ I demand paper, ” he told the guards, with panache, when they returned to sweep out his cell an hour later.

HUMAN TRANSLATIONS

By Natalie Zimmermann and Lorris Murail (Laffont, 1983, pp. 196-197)

... indiquant ses droits ...
... secouer la tête ...
... firent un peu de lumière ...
... le porte-bonheur doré qui pendait à son cou ...
... avait contrefait l'écriture ...
... quitter son ami et son travail de bureau ...
... reprendre ... son service d'aide ...
... avec une indifférence feinte ...
... en étreignant la lettre de Fatmeh sur sa joue ...
... balayer sa cellule ...


REFERENCES

Corpus

Le Carré, John, The Little Drummer Girl, Pan Books Edition, 1984 (first edition: Hodder and Stoughton, London, 1983)
Le Carré, John, La petite fille au tambour, translated by Natalie Zimmermann and Lorris Murail, Robert Laffont, Paris, 1983

Dictionaries and lexical data bases

COBUILD = Hanks, P. (ed.), Collins Cobuild English Language Dictionary, 1987
LDOCE = P.Procter (ed.), Longman Dictionary of Contemporary English, Longman, London, 1978
OH = M.-H. Corréard and V.Grundy (eds), Le Dictionnaire Hachette-Oxford, OUP and Hachette, 1994
RC = Beryl T. Atkins et al. (eds), Collins Robert French-English, English-French Dictionary, 1978
WordNet = WordNet Prolog package, downloadable from the Princeton University WWW site

Other publications

Aho, A.V., Kernighan, B.W., Weinberger, P.J., The AWK Programming Language, Addison-Wesley, Reading, Mass., 1988
Boguraev, B. and Briscoe, T. (eds), Computational Lexicography for Natural Language Processing, Longman, London and New York, 1989
Fontenelle, Th., Turning a Bilingual Dictionary into a Lexical-semantic Database, Unpublished PhD thesis, University of Liège, Liège, 1995
Kilgarriff, A., Polysemy, PhD thesis, Cognitive Science Research Paper, Number 261, University of Sussex, Brighton, 1992
Marcus, C., Prolog Programming, Addison-Wesley, 1986
Michiels, A., Exploiting a Large Dictionary Data Base, Unpublished PhD thesis, University of Liège, Liège, 1982
Michiels, A., Horatio, a Middle-sized NLP Application in Prolog, L3, Liège, 1995
Miller, G. A., ed. ‘WordNet: An On-Line Lexical Database’. International Journal of Lexicography, Volume 3, Number 4, 1990.
Montemagni, S., Federici, S. and Pirrelli, V., ‘Example-based Word Sense Disambiguation: a Paradigm-driven Approach’, Euralex’96 Proceedings, Göteborg University, 1996, pp. 151-160.
Neff, M. & McCord, M., "Acquiring lexical data from machine-readable dictionary resources for machine translation", in Proceedings of the 3rd International Conference on Theoretical and Methodological Issues in Machine Translation of Natural Language, University of Texas at Austin, 1990, pp.85-90.

