A Machine Learning Approach to Coreference Resolution of Noun .ppt
A Machine Learning Approach to Coreference Resolution of Noun Phrases
By W.M. Soon, H.T. Ng, D.C.Y. Lim
Presented by Iman Sen

Outline
Introduction
Process Overview
Pipeline Process to find Markables
Feature Selection
The Decision Tree
Results for MUC-6, MUC-7 & error analysis
Conclusions

Introduction
Coreference for general noun phrases from unrestricted text.
Learns using the decision tree method from a small annotated corpus.
First learning-based system that performed comparably with the best non-learning systems.

Process Overview
Markables are the union of all the noun phrases, named entities and nested noun phrases found.
Find markables using a pipeline of NLP modules.
Form feature vectors for appropriate pairs of markables. These are the training examples.
Train the decision tree classifier on these examples.
For testing, determine pairs of markables in the test document and present them to the classifier. Stop after the first successful coreference.

Pipelined NLP modules
Free Text
Tokenization & Sentence Segmentation
Morphological Processing
POS tagger: standard HMM-based tagger.
NP Identification: HMM-based, uses POS tags from the previous module.
Named Entity Recognition: HMM-based, recognizes organization, person, location, date, time, money, percent.
Nested Noun Phrase Extraction: 2 kinds, prenominals such as ((wage) reduction) and possessive NPs such as ((his) dog).
Semantic Class Determination: more on this in a bit!
Markables

Determining the Markables for training
Sentence 1
1. (Eastern Airlines)a2 executives notified (union)e1 leaders that the carrier wishes to discuss selective ((wage)c2 reductions)d2 on (Feb. 3)b2.
2. ((Eastern Airlines)5 executives)6 notified ((union)7 leaders)8 that (the carrier)9 wishes to discuss (selective (wage)10 reductions)11 on (Feb. 3)12.
Sentence 2
1. ((Union)e2 representatives who could be reached)f1 said (they)f2 hadn't decided whether (they)f3 would respond.
2. ((Union)13 representatives)14 who could be reached said (they)15 hadn't decided whether (they)16 would respond.
The first version of each sentence is the manual coreference annotation; the second is the result of the pipeline modules.
The letters in the first version denote coreference chains.
We make up pairs (i, j) as training examples.
We take only those NPs in a coreference chain where the NP boundaries match (shown in blue).

Determining the markables for training (continued)
In general, if a1, a2, a3 is a correctly identified coreference chain, then make up (a1, a2) and (a2, a3) as +ve examples. For every NP e found in between, say, a2 and a3, make up a -ve example (e, a3).
A feature vector is then generated for each pair.

Markables for testing
For testing, every antecedent i before j is tried.
Start with the immediately preceding i and go backwards.
Stop when you find the first +ve coreference.
For nested NPs, we avoid the current markable. For example, in ((his) daughter), we do not try to see if "his" corefers with "his daughter".

Feature Selection
The authors selected the following 12 features:
Distance Feature (DIST): 0 if i and j are in the same sentence, 1 if they are one sentence apart, and so on.
i-Pronoun Feature (I_PRONOUN): true or false. Returns true if i in (i, j) is a pronoun.
j-Pronoun Feature (J_PRONOUN): tests if j in (i, j) is a pronoun.
String Match Feature (STR_MATCH): true or false. Removes articles and demonstrative pronouns (such as "that", "those", etc.) and tests for a match.
Definite NP Feature (DEF_NP): true if j starts with "the", else false.
Demonstrative Noun Phrase Feature (DEM_NP): true if j starts with "this", "that", "these" or "those", else false.
Number Agreement Feature (NUMBER): true or false. The morphological root is used to determine whether a noun is singular or plural (if it is not a pronoun).

Feature Selection (continued)
Semantic Class Agreement Feature (SEMCLASS): true, false or unknown. Classes are "male, female, person, organization, location, date, time, money, percent, object". The class is decided by the semantic module (which picks the 1st sense from WordNet), and the feature is true if the two classes are the same or one is a child of the other. For example, male and female are persons; the others are objects. If either is unknown, c
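
The training-pair construction and the closest-first search at test time, as described on the slides above, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Markable` class, its token-offset spans, `is_nested`, `training_pairs`, `resolve` and the `classify` callback are hypothetical names, and a toy string-match function stands in for the trained decision tree.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Markable:
    text: str
    start: int                   # token offset, inclusive (assumed representation)
    end: int                     # token offset, exclusive
    chain: Optional[str] = None  # gold chain label, None if not in a chain


def is_nested(a: Markable, b: Markable) -> bool:
    """True if one markable's span contains the other's."""
    return (a.start <= b.start and b.end <= a.end) or \
           (b.start <= a.start and a.end <= b.end)


def training_pairs(doc: List[Markable]) -> List[Tuple[Markable, Markable, bool]]:
    """Adjacent chain members give +ve pairs; markables in between give -ve pairs."""
    last_seen: Dict[str, int] = {}  # chain label -> index of its latest member
    examples = []
    for j, m in enumerate(doc):
        if m.chain is None:
            continue
        i = last_seen.get(m.chain)
        if i is not None:
            examples.append((doc[i], m, True))       # e.g. (a2, a3)
            for e in doc[i + 1:j]:
                examples.append((e, m, False))       # e.g. (e, a3)
        last_seen[m.chain] = j
    return examples


def resolve(doc: List[Markable],
            classify: Callable[[Markable, Markable], bool]) -> Dict[int, int]:
    """For each j, try antecedents from nearest to farthest; stop at the first +ve."""
    links: Dict[int, int] = {}  # index of j -> index of chosen antecedent i
    for j in range(len(doc)):
        for i in range(j - 1, -1, -1):
            if is_nested(doc[i], doc[j]):
                continue  # e.g. do not pair "his" with "his daughter"
            if classify(doc[i], doc[j]):
                links[j] = i
                break
    return links


# Toy data mirroring the a1, a2, a3 chain on the slide.
train_doc = [
    Markable("a1", 0, 1, "a"),
    Markable("x", 2, 3),
    Markable("a2", 4, 5, "a"),
    Markable("e", 6, 7),
    Markable("a3", 8, 9, "a"),
]
for i, j, label in training_pairs(train_doc):
    print(i.text, j.text, "+ve" if label else "-ve")

# Stand-in for the decision tree: exact, case-insensitive string match.
def toy_classifier(i: Markable, j: Markable) -> bool:
    return i.text.lower() == j.text.lower()

test_doc = [
    Markable("Union", 0, 1),
    Markable("Union representatives", 0, 2),
    Markable("they", 6, 7),
    Markable("they", 9, 10),
]
print(resolve(test_doc, toy_classifier))
```

In a fuller version, `classify` would build the feature vector for (i, j) and pass it to the trained tree; the search loop itself would not change.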