Ambiguity Management in Deep Grammar Engineering.ppt
Ambiguity Management in Deep Grammar Engineering
Tracy Holloway King

Ambiguity: bug or feature?
- Bug in computer programming languages
- Feature in natural language
  - People are good at resolving ambiguity in context
  - Ambiguity is consequently often unperceived, even though thousand-fold ambiguities are common: “Readjust paper holding clip”
  - Ambiguity promotes conciseness
- Computers can't resolve ambiguity like humans
- If we are going to build large-scale, linguistically sophisticated grammars, we need ways to handle ambiguity

Talk Outline
- Sources of ambiguity
- Grammar engineering approaches
  - Shallow markup
  - (Dis)preference marks
- Stochastic disambiguation
- Efficiency in ambiguity management

Sources of Ambiguity
- Phonetic: “I scream” or “ice cream”
- Tokenization: “I like Jan.” - |Jan|. or |Jan.|. (abbreviation of January)
- Morphological: “walks” - plural noun or 3sg verb; “untieable knot” - un(tieable) or (untie)able
- Lexical: “bank” - river bank or financial institution
- Syntactic: “The turkeys are ready to eat.” - fattened or hungry
- Semantic: “Two boys ate fifteen pizzas.” - 15 each or 15 total
- Pragmatic: “Sue won. Ed gave her a good luck charm.” - cause or result

PP Attachment
A classic example of syntactic ambiguity
- PP adjuncts can attach to VPs and NPs
- Strings of PPs in the VP are ambiguous
  - I see the girl with the telescope. (the PP modifies either the seeing or the girl)
- Ambiguities proliferate exponentially
  - I see the girl with the telescope in the park.
- The syntax has no way to determine the attachment, even if humans can.

Coverage entails ambiguity
- I fell in the park. (the PP must attach to the VP)
- + I know the girl in the park. (the PP must attach to the NP)
- I see the girl in the park. (a grammar that covers both gives this sentence both analyses)

Ambiguity can be explosive
- If alternatives multiply within or across components
- Components: Tokenize, Morphology, Syntax, Semantics, Discourse

Ambiguity figures
- Deep grammars are massively ambiguous
- Example: 700 sentences from section 23 of the WSJ
  - average # of words: 19.6
  - average # of optimal parses: 684
    - for 1-10 word sentences: 3.8
    - for 11-20 word sentences: 25.2
    - for 50-60 word sentences: 12,888

Managing Ambiguity
- Grammar engineering approaches
  - Trim early with shallow markup
  - (Dis)preference marks on rules
- Choose the most probable parse for applications that need a single input
- Use packing to parse and manipulate the ambiguities efficiently

Talk Outline
- Sources of ambiguity
- Grammar engineering approaches
  - Shallow markup
  - (Dis)preference marks
- Stochastic disambiguation
- Efficiency in ambiguity management

Shallow markup
- Part of speech marking as filter
  - I saw her duck/VB.
  - accuracy of tagger (very good for English)
  - can use partial tagging (verbs and nouns)
- Named entities
  - Goldman, Sachs & Co. bought IBM.
  - good for proper names and times
  - hard to parse internal structure
- Fall back technique if fail
  - slows parsing
  - accuracy vs. speed

Example shallow markup: Named entities
- Allow tokenizer to accept marked up input:
  - parse: Mr. Thejskt Thejs arrived.
  - tokenized string: Mr. Thejskt Thejs TB +NEperson Mr(TB). TB Thejskt TB Thejs
- Add lexical entries and rules for NE tags

Resulting C-structure

Resulting F-structure

Results for shallow markup
Kaplan and King 2003

(Dis)preference marks (OT marks)
- Want to (dis)prefer certain constructions
  - prefer: use when possible
  - disprefer: do not use unless there is no other analysis
- Implementation
  - Put marks in rules and lexical entries
  - Rank those marks
    - ranking can be different for different grammars/corpora
  - Use most preferred parse(s)
    - can use as a two-pass system for robust parsing

Ungrammatical input
- Real world text contains ungrammatical input
- Deep grammars tend to only cover grammatical output
- Common errors can be coded in the rules
  - may want to know that an error occurred (e.g., provide feedback in CALL grammars)
- Disprefer parses of ungrammatical structures
  - tools for grammar writer to rank rules
  - two+ pass system
    - standard rules
    - rules for known ungrammatical constructions
    - default fall back rules

Sample ungrammatical structures
- Mismatched subject-verb agreement
    Verb3Sg = SUBJ PERS = 3
              SUBJ NUM = sg
            | BadVAgr
- Missing copula
    VPcop = Vcop: ^=!
          | e: (^ PRED)=NullBe MissingCopularVerb
            NP: (^ XCOMP)=!
          | AP: (^ XCOMP)=!
          |

Dispreferred grammatical structures
- Prefer subcategorized infinitives to adverbials
  - I want it.
  - I finished up (in order) to leave.
  - I want it to leave.
    VP --> V
           (NP: (^ OBJ)=!)
           (VPinf: (^ XCOMP)=! +InfSubcat
           | ! $ (^ ADJUNCT) InfAdjunct ).
- Post-copular gerunds
  - He is a boy.
  - (His) going is difficult.
  - He is going.

OT Mark summary
- Use (dis)preference marks to (dis)prefer constructions or words
- Allows inclusion of marginal/ungrammatical constructions
- Issues:
  - Only works with ambiguities with known preferences (not PP attachment)
  - Hard to determine ranking for many marks
  - Two-pass parsing can be slow

Talk Outline
- Sources of ambiguity
- Grammar engineering approaches
  - Shallow markup
  - (Dis)preference marks
- Stochastic disambiguation
- Efficiency in ambiguity management

Packing & Pruning in XLE
- XLE produces (too) many candidates
  - All valid (with respect to grammar and OT marks)
  - Not all equally likely
  - Some applications require a single best parse or at most just a handful (n best)
- Grammar writer can't specify correct choices
  - Many implicit properties of words and structures with unclear significance

Pruning in XLE
- Appeal to probability model to cho
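
The PP attachment slide says that ambiguities proliferate exponentially as PPs are added. A minimal sketch of why, under a simplified model that is an assumption of this example and not taken from the slides: each new PP may attach to the verb, the object noun, or the noun of an earlier PP, provided attachments do not cross. The function names `count_attachments` and `catalan` are hypothetical. Under this model the counts follow the Catalan numbers.

```python
from math import comb


def count_attachments(num_pps, open_sites=2):
    """Count non-crossing attachment patterns for `num_pps` trailing PPs.

    `open_sites` is the number of heads still available on the right edge.
    It starts at 2: the verb and the object noun.
    """
    if num_pps == 0:
        return 1
    total = 0
    for site in range(open_sites):
        # Attaching to `site` closes off every head to its right, and the
        # noun inside the new PP becomes one more available head.
        total += count_attachments(num_pps - 1, site + 2)
    return total


def catalan(n):
    """n-th Catalan number, used here only as a cross-check."""
    return comb(2 * n, n) // (n + 1)


for k in range(1, 6):
    print(k, count_attachments(k))
# 1 2
# 2 5
# 3 14
# 4 42
# 5 132
```

One PP gives 2 readings and two PPs give 5, so a grammar that licenses both VP and NP attachment cannot avoid this growth through syntax alone.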
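
The (dis)preference slides describe three steps: put marks in rules and lexical entries, rank the marks, and keep the most preferred parse(s). The sketch below is a toy model of the ranking step only, not XLE's implementation. The mark names `BadVAgr`, `InfSubcat` and `InfAdjunct` come from the slides; the ranking order, the parse labels, and the functions `profile` and `most_preferred` are assumptions made for illustration.

```python
from collections import Counter

# Assumed ranking, most important mark first.
# -1 marks a dispreference mark, +1 a preference mark.
RANKING = [("BadVAgr", -1), ("InfAdjunct", -1), ("InfSubcat", +1)]


def profile(marks, ranking=RANKING):
    """Turn a parse's marks into a tuple where smaller means more preferred."""
    counts = Counter(marks)
    # A dispreference mark contributes its count; a preference mark
    # contributes the negated count, so more of it sorts earlier.
    return tuple(-polarity * counts[mark] for mark, polarity in ranking)


def most_preferred(parses, ranking=RANKING):
    """Return the names of all parses that share the best profile."""
    profiles = {name: profile(marks, ranking) for name, marks in parses.items()}
    best = min(profiles.values())  # lexicographic comparison follows the ranking
    return [name for name, p in profiles.items() if p == best]


# "I want it to leave." with the two analyses named on the slide.
parses = {
    "xcomp_reading": ["InfSubcat"],
    "adjunct_reading": ["InfAdjunct"],
}
print(most_preferred(parses))  # ['xcomp_reading']

# A dispreferred parse still surfaces when it is the only analysis.
print(most_preferred({"agreement_error": ["BadVAgr"]}))  # ['agreement_error']
```

Because comparison is lexicographic, a higher-ranked mark always outweighs any number of lower-ranked ones, and parses with identical profiles are all returned. That is why this approach leaves ambiguities without a known preference, such as PP attachment, unresolved.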