Debugging information in defi results


At the user’s request, defi reports how it arrived at the final weight it assigns to a given match. For multi-word units (mwus), defi goes through the mwu (or example) as recorded in dic and shows how each item was matched against the textual chunk. Consider the following sentence from the standard defi test suite:

 

[4] George has been working his fingers to the bone.

 

265 - 321534, efm, [bone], to work one's fingers to the bone, tr(s'user au travail # se crever à la tâche {coll}), [m(c1,vac,to,0),m(c6,dic(work),txt(working),morph(0),syn(3),30),m(c17,dic(oneßs),txt(his),morph(5),syn(3),5),m(c5,dictxt(fingers),morph(6),syn(5),50),m(c5,dictxt(to),morph(2),syn(5),25),m(c5,dictxt(the),morph(9),syn(0),25),m(c5,dictxt(bone),morph(7),syn(3),50),m(c4,vac,punct,0)]

 

The same result, with comments:

 

265                                                    % Global weight
 - 321534,                                             % Identification number in dic
efm,                                                   % Origin (efm: merged dictionary)
[bone],                                                % Selected item list
 to work one's fingers to the bone,                    % mwu as recorded in dic
 tr(s'user au travail # se crever à la tâche {coll}),  % Translation
[                                                      % List of matches, one per item in the recorded mwu or dictionary example
m(c1,vac,to,0),                                        % to is matched vacuously, i.e. without weight assignment
m(c6,dic(work),txt(working),morph(0),syn(3),30),       % work matched against working, reaping weight 3 on match of syntactic features
m(c17,dic(oneßs),txt(his),morph(5),syn(3),5),          % one’s matched against his, reaping weight 5 on match of morphological features, 3 on syntax
m(c5,dictxt(fingers),morph(6),syn(5),50),              % fingers matched against fingers
m(c5,dictxt(to),morph(2),syn(5),25),                   % to matched against to
m(c5,dictxt(the),morph(9),syn(0),25),                  % the matched against the
m(c5,dictxt(bone),morph(7),syn(3),50),                 % bone matched against bone
m(c4,vac,punct,0)                                      % punctuation matched vacuously
]

 

The first argument of each m-structure identifies the type of match, which lets the defi programmer see which predicate led to success. For instance, c1 refers to the vacuous matching of the infinitive particle to at the beginning of an mwu functioning as a vp. The corresponding Prolog code is as follows:

 

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% TO starting a VP node in a dictionary entry
% no weight assigned
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% CASE 1

smatch(_,vp,
           Hlist,
           [w(0,1,text(to,T1),L,M,S)|RDicwlist],
           RDicwlist,
           DicNplist,
           Twlist,
           Twlist,
           TNplist,
           Idnum,
           0,
           1,
           Onenter,
           Onenter,
           [],
           to,
           [0],
           m(c1,vac,to,0),          % Debugging info
           0) :- !.

 

 

The last argument of each m-structure is the global weight assigned to that individual match, independently of the marks reaped for congruence in morphosyntactic features. This weight is higher when both wordform and lexeme match (50 for fingers/fingers as opposed to 30 for work/working), and lower when the matched item is a toolword (25 for the/the as opposed to 50 for bone/bone).
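As a minimal sketch of how the per-item weights can be read off a match list, the last argument of each m-structure can be extracted whatever its arity and then summed. The predicate names match_weight/2, match_weights_total/2 and example_matches/1 are illustrative and are not part of defi. For sentence [4] the per-item weights add up to 185; how the global weight of 265 is derived from these and other figures is not specified here.

```prolog
:- use_module(library(lists)).

% match_weight(+M, -W): W is the last argument of an m-structure,
% whatever its arity (4 for vacuous matches, 5 for dictxt, 6 for dic/txt).
match_weight(M, W) :-
    functor(M, m, Arity),
    arg(Arity, M, W).

% match_weights_total(+Matches, -Total): sum of the per-item weights.
match_weights_total(Matches, Total) :-
    findall(W, ( member(M, Matches), match_weight(M, W) ), Ws),
    sum_list(Ws, Total).

% The match list of sentence [4], copied from the result shown earlier.
example_matches([
    m(c1,vac,to,0),
    m(c6,dic(work),txt(working),morph(0),syn(3),30),
    m(c17,dic('oneßs'),txt(his),morph(5),syn(3),5),
    m(c5,dictxt(fingers),morph(6),syn(5),50),
    m(c5,dictxt(to),morph(2),syn(5),25),
    m(c5,dictxt(the),morph(9),syn(0),25),
    m(c5,dictxt(bone),morph(7),syn(3),50),
    m(c4,vac,punct,0)
]).
```

The query `?- example_matches(Ms), match_weights_total(Ms, T).` binds T to 185.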

 

The marks for congruence in morphological and syntactic features are based on the enhanced parses obtained for both the mwu in the dictionary (the parser and enhancer were run once on the whole dictionary, dic) and the textual chunk (parser and enhancer are run as part of the preprocessing). The congruence is meaningful only because the same parser and enhancer are applied to both the dictionary and the user’s text.
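The figures in this document are consistent with one simple reading of these marks: each feature in an enhanced parse carries a weight (the third argument of the m- and s-terms in the parses reproduced later in this document), and the mark is the sum of the weights of the features that the dictionary item and the textual item share. The sketch below implements that reading. It is an assumption, not defi's confirmed procedure, and the predicate names are illustrative.

```prolog
:- use_module(library(lists)).

% morph_mark(+DicMorph, +TxtMorph, -Mark): sum of the weights of the
% morphological features present in both lists (assumed scoring rule).
morph_mark(DicMorph, TxtMorph, Mark) :-
    findall(W,
            ( member(m(Feat, Val, W), DicMorph),
              memberchk(m(Feat, Val, _), TxtMorph) ),
            Ws),
    sum_list(Ws, Mark).

% syn_mark(+DicSyn, +TxtSyn, -Mark): same idea for syntactic features.
syn_mark(DicSyn, TxtSyn, Mark) :-
    findall(W,
            ( member(s(Feat, Val, W, _), DicSyn),
              memberchk(s(Feat, Val, _, _), TxtSyn) ),
            Ws),
    sum_list(Ws, Mark).

% Feature lists copied from the parse of sentence [60]:
% the definite article, and a singular noun (day).
article_morph([m(type,def,3),m(pos,det,2),m(type,central,0),
               m(type,art,3),m(num,sg_or_pl,1)]).
article_syn([s(type,dn,0,r)]).
singular_noun_morph([m(pos,n,5),m(case,nom,0),m(num,sg,2)]).
```

Matching the article's feature lists against themselves gives a morphological mark of 9 and a syntactic mark of 0, the values recorded as morph(9) and syn(0) for the/the in sentence [4]. The singular noun list matched against itself gives 7, the value recorded as morph(7) for bone/bone.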

 

 

For single-word lexemes, the debugging information concerns properties of the selected word and of its environment:

Part of speech
Syntactic environment (envir-structure)
Label matching
Indicator matching
Collocate matching

 

 

Consider the following text chunk from the standard defi test suite:

 

 [56] Taking up clubs and spears and bows, they crouch and advance, ghostly, out of the Stone Age (which they inhabited until the day before yesterday) and into a clicking horde of tourists from a cruise ship that docked on the coast this morning.

Clw = docked

 

40                       % Global weight
 - 81733,                % Identification number in sdic
efm,                     % Origin
dock,                    % Selected word
mettre {qch} à quai,     % Selected translation
[                        % Debugging info list
pos(20),                 % Match in part of speech (here, verb)
vb(15),                  % Verb bonus
label(naut,4),           % Label matching (identity: Nautical)
label(naut,mil,1)        % Label matching (partial, through the association between Nautical and Military)
]
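In this example the scores in the debugging list add up to the global weight: 20 + 15 + 4 + 1 = 40. The sketch below performs that addition. The predicate names info_score/2 and info_total/2 are illustrative, and the text does not state that a plain sum holds in general.

```prolog
:- use_module(library(lists)).

% info_score(+Item, -Score): score carried by one element of the
% debugging list of a single-word result.
info_score(pos(S), S).             % part-of-speech match
info_score(vb(S), S).              % verb bonus
info_score(label(_, S), S).        % identical labels
info_score(label(_, _, S), S).     % associated labels

% info_total(+Infos, -Total): sum of the scores of the recognised items;
% items of any other shape are ignored.
info_total(Infos, Total) :-
    findall(S, ( member(I, Infos), info_score(I, S) ), Ss),
    sum_list(Ss, Total).
```

The query `?- info_total([pos(20),vb(15),label(naut,4),label(naut,mil,1)], T).` binds T to 40.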

 

 

The i-structures record indicator matching (see Indicator matching in DEFI). An i1-structure records a match between the indicator and the list of lemmas in the textual chunk. An i2-structure records a match between the indicator and the collocate lists associated with the collocate bearer; this presupposes that the selected item stands in the right syntactic relation to the collocate bearer.

Indicator matching, like collocate matching, calls on Roget’s Thesaurus (rg), WordNet (wn) and the database of metalinguistic information derived from the merged bilingual dictionary (mt).

The debugging info list ends with a list of elements found in the definitions and examples of the monolingual entries consulted during indicator matching. Sentence 60 in the standard DEFI test suite provides an example of indicator matching:

 

[60] Of all the communications that Bernard Sands received on the day of his triumph the one which gave him the greatest satisfaction was the Treasury's final confirmation of official financial backing.

Clw = backing

 

35 - 39806, efm, backing, soutien {m},
[
pos(15),vb(0),                              % POS match and verb bonus
label(pol,admin,1),label(pol,4),            % Label matching
i1(support,backing,rg(0),mt(0),wn(5)),      % i1-structure
i2(support,strike,rg(0),mt(2),wn(0)),       % i2-structures (the indicator is support)
i2(support,backing,rg(0),mt(0),wn(5)),
i2(support,problem,rg(0),mt(4),wn(0)),
i2(support,service,rg(0),mt(2),wn(0)),
institution                                 % Element shared by the monolingual entries looked up in indicator matching
]
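To see at a glance which resource produced each indicator match, the i-structures can be filtered for non-zero scores. The following is a minimal sketch; resource_hit/2, resource_hits/2 and the hit/4 term are hypothetical names, and how these scores enter the global weight of 35 is not specified here.

```prolog
:- use_module(library(lists)).

% resource_hit(+Item, -Hit): Hit is hit(Kind, Word, Resource, Score) for
% each resource (rg, mt, wn) with a non-zero score in an i1- or i2-structure.
% Items of any other shape (pos, vb, label, bare atoms) simply fail.
resource_hit(Item, hit(Kind, Word, Resource, Score)) :-
    Item =.. [Kind, _Indicator, Word | Resources],
    memberchk(Kind, [i1, i2]),
    member(R, Resources),
    R =.. [Resource, Score],
    Score > 0.

% resource_hits(+Infos, -Hits): all hits in a debugging info list.
resource_hits(Infos, Hits) :-
    findall(Hit, ( member(I, Infos), resource_hit(I, Hit) ), Hits).

% Debugging info list of the result for backing in sentence [60].
backing_infos([
    pos(15), vb(0),
    label(pol,admin,1), label(pol,4),
    i1(support,backing,rg(0),mt(0),wn(5)),
    i2(support,strike,rg(0),mt(2),wn(0)),
    i2(support,backing,rg(0),mt(0),wn(5)),
    i2(support,problem,rg(0),mt(4),wn(0)),
    i2(support,service,rg(0),mt(2),wn(0)),
    institution
]).
```

For this list, the hits are wn for both backing matches and mt for strike, problem and service; Roget's Thesaurus contributes nothing.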

 

Sdic lists the following entries for backing (the relevant label, pol, and indicator, support, both belong to the first entry, 39806):

 

sdic(39806,'backing','n',sc([nil]),oc([nil]),ind(['support']),lab(['fin','fig','pol']),env([nil]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($soutien {m}$),rat(2,14),gl(nil),efm).

sdic(39807,'backing','n',sc([nil]),oc([nil]),ind([nil]),lab(['mus']),env([nil]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($accompagnement {m}$),rat(2,14),gl(nil),efm).

sdic(39808,'backing','n',sc([nil]),oc([nil]),ind(['to stiffen']),lab(['lit','gen']),env([nil]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($support {m}, renforcement {m}$),rat(2,14),gl(nil),efm).

sdic(39809,'backing','n',sc([nil]),oc([nil]),ind(['reverse layer']),lab([nil]),env([nil]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($revêtement {m} intérieur$),rat(1,14),gl(nil),ohef).

sdic(39810,'backing','modif',sc([nil]),oc([nil]),ind([nil]),lab(['mus']),env([nil]),sst([nil]),head(['singer','group']),sf([nil]),st(nil),xr([nil]),gt(nil),tr($d'accompagnement$),rat(1,14),gl(nil),ohef).

sdic(39811,'backing','n',sc([nil]),oc(['book']),ind([nil]),lab(['lit']),env([nil]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($endossure {f}$),rat(1,14),gl(nil),rcef).

sdic(39812,'backing','n',sc([nil]),oc(['picture']),ind([nil]),lab(['lit']),env([nil]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($entoilage {m}$),rat(1,14),gl(nil),rcef).

sdic(39813,'backing','n',sc([nil]),oc([nil]),ind([nil]),lab(['betting']),env([nil]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($paris {mpl}$),rat(1,14),gl(nil),rcef).

sdic(39814,'backing','n',sc([nil]),oc(['horse','cart']),ind(['movement']),lab([nil]),env([nil]),sst([nil]),head([nil]),sf(['move']),st(nil),xr([nil]),gt(nil),tr($recul {m}$),rat(1,14),gl(nil),rcef).

sdic(39815,'backing','n',sc([nil]),oc(['boat']),ind(['movement']),lab([nil]),env([nil]),sst([nil]),head([nil]),sf(['move']),st(nil),xr([nil]),gt(nil),tr($nage {f} à culer$),rat(1,14),gl(nil),rcef).

sdic(39816,'backing','n',sc([nil]),oc(['wind']),ind(['movement']),lab([nil]),env([nil]),sst([nil]),head([nil]),sf(['move']),st(nil),xr([nil]),gt(nil),tr($changement {m} de direction en sens inverse des aiguilles d'une montre$),rat(1,14),gl(nil),rcef).

 

The sdic entry for financial lists backing among its collocates, together with service and problem, whose match with the indicator support is also attempted and recorded in the i2-structures:

 

sdic(96773,'financial','adj',sc(['adviser','backing','institution','problem','service']),oc([nil]),ind([nil]),lab(['gen']),env([nil]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($financier/-ière$),rat(2,3),gl(nil),efm).

 

 

 

The link between financial and backing is provided by the parse of the textual chunk, alongside the link between official and backing (see the cadj relations near the end of the term):

 

txt(s,['backing'],[w(0,1,text('of',u),lem('of',u),morph([m(pos,prep,2)]),syn([s(type,nom_of,3,l)])),w(1,2,text('all',l),lem('all',l),morph([m(type,quant,2),m(pos,det,2),m(type,pre,0),m(num,sg_or_pl,1)]),syn([s(type,qn,3,r)])),w(2,3,text('the',l),lem('the',l),morph([m(type,def,3),m(pos,det,2),m(type,central,0),m(type,art,3),m(num,sg_or_pl,1)]),syn([s(type,dn,0,r)])),w(3,4,text('communications',l),lem('communication',l),morph([m(pos,n,5),m(case,nom,0),m(num,pl,1)]),syn([s(type,p,3,l)])),w(4,5,text('that',l),lem('that',l),morph([m(type,nonmod,0),m(type,clb,2),m(type,rel,2),m(pos,pron,2),m(num,sg_or_pl,1)]),syn([s(type,p,3,l)])),w(5,6,text('bernard',u),lem('bernard',u),morph([m(type,proper,3),m(pos,n,5),m(case,nom,0),m(num,sg,2)]),syn([s(type,nn,3,r)])),w(6,7,text('sands',u),lem('sand',u),morph([m(type,u,3),m(pos,n,5),m(case,nom,0),m(num,pl,1)]),syn([s(func,subj,5,_)])),w(7,8,text('received',l),lem('receive',l),morph([m(pos,v,5),m(tense,past,1),m(type,finite,2)]),syn([s(type,main,3,f)])),w(7,8,text('received',l),lem('receive',l),morph([m(pos,edform,2)]),syn([s(type,nom_fmainv,3,l)])),w(8,9,text('on',l),lem('on',l),morph([m(pos,prep,2)]),syn([s(func,adv,5,_)])),w(9,10,text('the',l),lem('the',l),morph([m(type,def,3),m(pos,det,2),m(type,central,0),m(type,art,3),m(num,sg_or_pl,1)]),syn([s(type,dn,0,r)])),w(10,11,text('day',l),lem('day',l),morph([m(pos,n,5),m(case,nom,0),m(num,sg,2)]),syn([s(type,p,3,l)])),w(11,12,text('of',l),lem('of',l),morph([m(pos,prep,2)]),syn([s(type,nom_of,3,l)])),w(12,13,text('his',l),lem('he',l),morph([m(pos,pron,2),m(type,pers,2),m(gender,masc,1),m(case,gen,3),m(num,sg3,1)]),syn([s(type,gn,3,r)])),w(13,14,text('triumph',l),lem('triumph',l),morph([m(pos,n,5),m(case,nom,0),m(num,sg,2)]),syn([s(type,p,3,l)])),w(14,15,text('the',l),lem('the',l),morph([m(type,def,3),m(pos,det,2),m(type,central,0),m(type,art,3),m(num,sg_or_pl,1)]),syn([s(type,dn,0,r)])),w(15,16,text('one',l),lem('one',l),morph([m(pos,pron,2),m(case,nom,0),m(num,sg,2)]),syn([s(func,subj,5
,_)])),w(16,17,text('which',l),lem('which',l),morph([m(type,nonmod,0),m(type,rel,2),m(pos,pron,2),m(type,wh,3),m(case,nom,0),m(num,sg_or_pl,1)]),syn([s(func,subj,5,_)])),w(17,18,text('gave',l),lem('give',l),morph([m(pos,v,5),m(tense,past,1),m(type,finite,2)]),syn([s(type,main,3,f)])),w(18,19,text('him',l),lem('he',l),morph([m(type,nonmod,0),m(pos,pron,2),m(type,pers,2),m(gender,masc,1),m(case,acc,3),m(num,sg3,1)]),syn([s(func,i_obj,5,_)])),w(19,20,text('the',l),lem('the',l),morph([m(type,def,3),m(pos,det,2),m(type,central,0),m(type,art,3),m(num,sg_or_pl,1)]),syn([s(type,dn,0,r)])),w(20,21,text('greatest',l),lem('great',l),morph([m(pos,adj,5),m(degree,sup,2)]),syn([s(type,an,3,r)])),w(21,22,text('satisfaction',l),lem('satisfaction',l),morph([m(type,u,3),m(pos,n,5),m(case,nom,0),m(num,sg,2)]),syn([s(func,subj,5,_),s(func,obj,5,_)])),w(22,23,text('was',l),lem('be',l),morph([m(pos,v,5),m(tense,past,1),m(num,sg1_3,1),m(type,finite,2)]),syn([s(type,main,3,f)])),w(23,24,text('the',l),lem('the',l),morph([m(type,def,3),m(pos,det,2),m(type,central,0),m(type,art,3),m(num,sg_or_pl,1)]),syn([s(type,dn,0,r)])),w(24,25,text('treasuryßs',u),lem('treasury',u),morph([m(pos,n,5),m(case,gen,3),m(num,sg,2)]),syn([s(type,gn,3,r)])),w(25,26,text('final',l),lem('final',l),morph([m(pos,adj,5),m(degree,abs,0)]),syn([s(type,an,3,r)])),w(26,27,text('confirmation',l),lem('confirmation',l),morph([m(pos,n,5),m(case,nom,0),m(num,sg,2)]),syn([s(func,subj_cplt,5,_)])),w(27,28,text('of',l),lem('of',l),morph([m(pos,prep,2)]),syn([s(type,nom_of,3,l)])),w(28,29,text('official',l),lem('official',l),morph([m(type,nominal,3),m(pos,adj,5),m(degree,abs,0)]),syn([s(type,an,3,r)])),w(29,30,text('financial',l),lem('financial',l),morph([m(pos,adj,5),m(degree,abs,0)]),syn([s(type,an,3,r)])),w(30,31,text('backing',l),lem('backing',l),morph([m(pos,ingform,2)]),syn([s(type,p,3,l)])),w(30,31,text('backing',l),lem('back',l),morph([m(pos,ingform,2)]),syn([s(type,p,3,l)])),punct(31,32,unkn)],[np(1,5,c(3,4)),np(5,7,c(6,7
)),np(9,11,c(10,11)),np(12,14,c(13,14)),np(14,16,c(15,16)),np(16,17,c(16,17)),np(18,19,c(18,19)),np(19,22,c(21,22)),np(23,27,c(26,27)),np(28,31,c(30,31)),np(1,4,c(3,4)),np(4,5,c(4,5)),np(1,7,c(3,4)),np(12,16,c(13,14)),np(14,17,c(15,16)),np(18,22,c(18,19)),np(4,7,c(4,5)),np(9,14,c(10,11)),np(9,16,c(10,11)),np(23,31,c(26,27))],[cprep(0,'of','communication'),cnn('bernard','sand'),csubj('sand','receive'),cprep(8,'on','day'),cdobj('day','receive'),cprep(11,'of','triumph'),ccplt('triumph','day'),csubj('one','give'),csubj('which','give'),ciobj('he','give'),cadj('great','satisfaction'),csubj('satisfaction','be'),ccplt('treasury','confirmation'),cadj('final','confirmation'),cprep(27,'of','backing'),cadj('official','backing'),cadj('financial','backing')],neg(0),passive(0),s,$Of all the communications that Bernard Sands received on the day of his triumph the one which gave him the greatest satisfaction was the Treasury's final confirmation of official financial backing.$)
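A minimal sketch of how the cadj relations in such a parse can be used to find the collocate bearers of a noun follows. The predicates sc_list/2, adjectival_modifiers/3 and collocate_bearers/3 are hypothetical; sc_list/2 is a simplified stand-in for the sc(...) field of an sdic entry, and only the entry for financial is reproduced because it is the only one given in this document.

```prolog
:- use_module(library(lists)).

% sc_list(?Word, ?Collocates): simplified stand-in for the sc(...) field of
% an sdic entry; the single fact is copied from entry 96773 (financial).
sc_list(financial, [adviser, backing, institution, problem, service]).

% adjectival_modifiers(+Relations, +Noun, -Adjs): adjectives linked to Noun
% by a cadj relation in the parse.
adjectival_modifiers(Relations, Noun, Adjs) :-
    findall(Adj, member(cadj(Adj, Noun), Relations), Adjs).

% collocate_bearers(+Relations, +Noun, -Bearers): modifiers of Noun whose
% collocate list mentions Noun.
collocate_bearers(Relations, Noun, Bearers) :-
    adjectival_modifiers(Relations, Noun, Adjs),
    findall(Adj,
            ( member(Adj, Adjs),
              sc_list(Adj, Collocates),
              memberchk(Noun, Collocates) ),
            Bearers).

% Last five relations in the parse of sentence [60].
example_relations([
    ccplt('treasury','confirmation'),
    cadj('final','confirmation'),
    cprep(27,'of','backing'),
    cadj('official','backing'),
    cadj('financial','backing')
]).
```

With these relations, adjectival_modifiers/3 returns official and financial for backing, and collocate_bearers/3 returns only financial, since no sc list for official is given here.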

 

 

 

Collocate matching is illustrated by the following text chunk from the COBUILD-derived test bed:

 

[27] He examined the cut and applied a plaster.

Clw = applied

 

105 - 34904, efm, apply, appliquer, mettre,
[
pos(20),vb(5),
coll(glue,plaster,mt(4),rg(30),wn(0),tot(34)),       % collocate match (glue as collocate against plaster as textual syntactic object)
coll(paint,plaster,mt(20),rg(60),wn(0),tot(80)),     % idem with collocate paint
coll(ointment,plaster,mt(0),rg(50),wn(0),tot(50))    % idem with collocate ointment
]

85 - 34909, ohef, apply, appliquer,
[
pos(20),vb(5),
coll(bandage,plaster,mt(0),rg(60),wn(0),tot(60))     % idem with collocate bandage
]
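In both results above, each tot value equals mt + rg + wn, and the global weight equals the pos score plus the verb bonus plus the highest collocate total (20 + 5 + 80 = 105 and 20 + 5 + 60 = 85). The sketch below checks this reading. It is inferred from these two results only, not stated by the text, and the predicate names are illustrative.

```prolog
:- use_module(library(lists)).

% coll_consistent(+Coll): the recorded total equals mt + rg + wn.
coll_consistent(coll(_Collocate, _Word, mt(M), rg(R), wn(W), tot(T))) :-
    T =:= M + R + W.

% best_coll_total(+Infos, -Best): highest tot among the coll-structures.
% Fails if the list contains no coll-structure.
best_coll_total(Infos, Best) :-
    findall(T, member(coll(_, _, _, _, _, tot(T)), Infos), Ts),
    max_list(Ts, Best).

% result_weight(+Infos, -Weight): pos + vb + best collocate total
% (assumed combination, consistent with the two results shown).
result_weight(Infos, Weight) :-
    memberchk(pos(P), Infos),
    memberchk(vb(V), Infos),
    best_coll_total(Infos, Best),
    Weight is P + V + Best.

% Debugging info lists of the two results for apply.
infos(34904, [pos(20), vb(5),
              coll(glue,plaster,mt(4),rg(30),wn(0),tot(34)),
              coll(paint,plaster,mt(20),rg(60),wn(0),tot(80)),
              coll(ointment,plaster,mt(0),rg(50),wn(0),tot(50))]).
infos(34909, [pos(20), vb(5),
              coll(bandage,plaster,mt(0),rg(60),wn(0),tot(60))]).
```

The query `?- infos(Id, Is), result_weight(Is, W).` yields W = 105 for 34904 and W = 85 for 34909.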

 

The relevant sdic entries are the following:

 

sdic(34903,'apply','vtr',sc([nil]),oc(['logic','theory','rule','standard','method','penalty','technology','heat','law']),ind(['use']),lab([nil]),env([e('to','à')]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($appliquer$),rat(7,17),gl(nil),efm).

 

sdic(34904,'apply','vtr',sc([nil]),oc(['glue','make-up','paint','ointment','dressing']),ind(['spread']),lab([nil]),env([e('to','sur')]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($appliquer, mettre$),rat(2,17),gl(nil),efm).

 

sdic(34905,'apply','vtr',sc([nil]),oc(['friction','pressure']),ind([nil]),lab([nil]),env([e('to','sur')]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($exercer$),rat(1,17),gl(nil),ohef).

sdic(34906,'apply','vtr',sc([nil]),oc(['label','term']),ind(['give']),lab([nil]),env([e('to','à')]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($appliquer$),rat(7,17),gl(nil),ohef).

sdic(34907,'apply','vtr',sc([nil]),oc(['sticker']),ind(['affix']),lab([nil]),env([e('to','sur')]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($apposer$),rat(1,17),gl(nil),ohef).

sdic(34908,'apply','vtr',sc([nil]),oc(['decoration']),ind([nil]),lab([nil]),env([e('to','sur')]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($disposer$),rat(1,17),gl(nil),ohef).

 

sdic(34909,'apply','vtr',sc([nil]),oc(['bandage','sequins','sequin']),ind([nil]),lab([nil]),env([e('to','sur')]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($appliquer$),rat(7,17),gl(nil),ohef).

 

sdic(34910,'apply','vi',sc([nil]),oc([nil]),ind(['request']),lab([nil]),env([nil]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($faire une demande$),rat(1,17),gl(nil),ohef).

sdic(34911,'apply','vi',sc([nil]),oc([nil]),ind(['seek work']),lab([nil]),env([nil]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($poser sa candidature$),rat(1,17),gl(nil),ohef).

sdic(34912,'apply','vi',sc([nil]),oc([nil]),ind(['seek entry','to college']),lab([nil]),env([e('to','à')]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($faire une demande d'inscription$),rat(1,17),gl(nil),ohef).

sdic(34913,'apply','vi',sc([nil]),oc([nil]),ind(['seek entry','to club, society']),lab([nil]),env([e('to','à')]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($faire une demande d'adhésion$),rat(1,17),gl(nil),ohef).

sdic(34914,'apply','vi',sc(['definition','term']),oc([nil]),ind(['be valid']),lab([nil]),env([e('to','à')]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($s'appliquer$),rat(1,17),gl(nil),ohef).

sdic(34915,'apply','vi',sc(['ban','rule','penalty']),oc([nil]),ind([nil]),lab([nil]),env([nil]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($être en vigueur$),rat(1,17),gl(nil),ohef).

sdic(34916,'apply','vtr',sc([nil]),oc(['theory']),ind([nil]),lab([nil]),env([e('to','à')]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($appliquer, mettre en pratique / en application$),rat(1,17),gl(nil),rcef).

sdic(34917,'apply','vi',sc([nil]),oc([nil]),ind([nil]),lab([nil]),env([e('to sb for sth','à qn pour obtenir qch')]),sst([nil]),head([nil]),sf([nil]),st(nil),xr([nil]),gt(nil),tr($s'adresser, avoir recours$),rat(1,17),gl(nil),rcef).

 

The link between apply and plaster is provided by the parse (the cdobj relation at the end of the term):

 

txt(s,['applied'],[w(0,1,text('he',u),lem('he',u),morph([m(type,nonmod,0),m(pos,pron,2),m(type,pers,2),m(gender,masc,1),m(case,nom,0),m(num,sg3,1),m(func,subj,3)]),syn([s(func,subj,5,_)])),w(1,2,text('examined',l),lem('examine',l),morph([m(pos,v,5),m(tense,past,1),m(type,finite,2)]),syn([s(type,main,3,f)])),w(2,3,text('the',l),lem('the',l),morph([m(type,def,3),m(pos,det,2),m(type,central,0),m(type,art,3),m(num,sg_or_pl,1)]),syn([s(type,dn,0,r)])),w(3,4,text('cut',l),lem('cut',l),morph([m(pos,n,5),m(case,nom,0),m(num,sg,2)]),syn([s(func,obj,5,_)])),w(4,5,text('and',l),lem('and',l),morph([m(pos,cconj,2)]),syn([s(type,cc,0,_)])),w(5,6,text('applied',l),lem('apply',l),morph([m(pos,v,5),m(tense,past,1),m(type,finite,2)]),syn([s(type,main,3,f)])),w(6,7,text('a',l),lem('a',l),morph([m(type,indef,3),m(pos,det,2),m(type,central,0),m(type,art,3),m(num,sg,2)]),syn([s(type,dn,0,r)])),w(7,8,text('plaster',l),lem('plaster',l),morph([m(pos,n,5),m(case,nom,0),m(num,sg,2)]),syn([s(func,obj,5,_)])),punct(8,9,unkn)],[np(0,1,c(0,1)),np(2,4,c(3,4)),np(6,8,c(7,8))],[csubj('he','examine'),cdobj('cut','examine'),cdobj('plaster','apply')],neg(0),passive(0),s,$He examined the cut and applied a plaster.$).
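As a minimal sketch of the step that precedes collocate matching, the direct object of the selected verb can be looked up in the relation list and paired with each collocate of a dictionary entry. The predicate candidate_pairs/4 is a hypothetical name; the collocates are the oc list of entry 34904.

```prolog
:- use_module(library(lists)).

% candidate_pairs(+Relations, +Verb, +Collocates, -Pairs): pairs each
% dictionary collocate with each textual direct object of Verb.
candidate_pairs(Relations, Verb, Collocates, Pairs) :-
    findall(Collocate-Object,
            ( member(cdobj(Object, Verb), Relations),
              member(Collocate, Collocates) ),
            Pairs).

% Relation list of the parse of sentence [27].
apply_relations([csubj('he','examine'),
                 cdobj('cut','examine'),
                 cdobj('plaster','apply')]).
```

With the oc list ['glue','make-up','paint','ointment','dressing'], the query pairs each of the five collocates with plaster. The result list for entry 34904 shows coll-structures for only three of these pairs (glue, paint and ointment).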

 

