Advanced databases: Inferring new knowledge from data(bases)
Berendt: Advanced databases, winter term 2007/08, http://www.cs.kuleuven.be/berendt/teaching/2007w/adb/

Advanced databases
Inferring new knowledge from data(bases): Knowledge Discovery in Databases
Bettina Berendt
Katholieke Universiteit Leuven, Department of Computer Science
http://www.cs.kuleuven.be/berendt/teaching/2007w/adb/
Last update: 15 November 2007

Agenda
- Motivation: Application examples
- The process of knowledge discovery
- Origins and context
- Major issues in knowledge discovery
- A short overview of key techniques

What is the impact of genetically modified organisms?

Is our school system good for immigrants and/or children from poor backgrounds?

What are the effects of teaching in English at universities?

What makes people happy?

What do men and women like?

Is this a man or a woman? [figure label: "clicked on"]

Primary Tasks of Data Mining
- Classification: finding the description of several predefined classes and classifying a data item into one of them.
- Regression: mapping a data item to a real-valued prediction variable.
- Clustering: identifying a finite set of categories or clusters to describe the data.
- Summarization: finding a compact description for a subset of data.
- Dependency modeling: finding a model which describes significant dependencies between variables.
- Deviation and change detection: discovering the most significant changes in the data.

"Data mining" and "knowledge discovery"
- (informal definition): data mining is about discovering knowledge in (huge amounts of) data
- Therefore, it is clearer to speak about "knowledge discovery in data(bases)"

Recall: Data, information, and knowledge
Data represents a fact or statement of event without relation to other things.
- Ex: It is raining.
Information embodies the understanding of a relationship of some sort, possibly cause and effect.
- Ex: The temperature dropped 15 degrees and then it started raining.
Knowledge represents a pattern that connects and generally provides a high level of predictability as to what is described or what will happen next.
- Ex: If the humidity is very high and the temperature drops substantially, the atmosphere is often unlikely to be able to hold the moisture, so it rains.
(This is from knowledge-management theory. If you want to know about wisdom, check the Web page: G. Bellinger, D. Castro, & A. Mills: Data, Information, Knowledge, and Wisdom. http://www.systems-thinking.org/dikw/dikw.htm)

Why Data Mining?
The Explosive Growth of Data: from terabytes to petabytes
- Data collection and data availability
  - Automated data collection tools, database systems, Web, computerized society
- Major sources of abundant data
  - Business: Web, e-commerce, transactions, stocks, ...
  - Science: Remote sensing, bioinformatics, scientific simulation, ...
  - Society and everyone: news, digital cameras, ...
We are drowning in data, but starving for knowledge!
"Necessity is the mother of invention"
Data mining: automated analysis of massive data sets

Background: Evolution of Database Technology
1960s:
- Data collection, database creation, IMS and network DBMS
1970s:
- Relational data model, relational DBMS implementation
1980s:
- RDBMS, advanced data models (extended-relational, OO, deductive, etc.)
- Application-oriented DBMS (spatial, scientific, engineering, etc.)
1990s:
- Data mining, data warehousing, multimedia databases, and Web databases
2000s:
- Stream data management and mining
- Data mining and its applications
- Web technology (XML, data integration) and global information systems

The KDD process
"The non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data" - Fayyad, Piatetsky-Shapiro, Smyth (1996)
- non-trivial process: Multiple process
- valid: Justified patterns/models
- novel: Previously unknown
- useful: Can be used
- understandable: by human and machine

The process part of knowledge discovery
CRISP-DM: CRoss Industry Standard Process for Data Mining
A data mining process model that describes commonly used approaches that expert data miners use to tackle problems.

Knowledge discovery, machine learning, data mining
- Knowledge discovery = the whole process
- Machine learning: the application of induction algorithms and other algorithms that can be said to "learn"; = "modeling" phase
- Data mining
  - sometimes = KD, sometimes = ML

The KDD Process
- Data organized by function
- Create/select target database
- Select sampling technique and sample data
- Supply missing values
- Normali
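
The first preprocessing steps named on the KDD Process slide (create/select target data, select a sampling technique and sample, supply missing values) can be sketched in Python as follows. This is a minimal illustration and not part of the original slides: the toy records, the column names, the function names, the choice of simple random sampling, and the choice of mean imputation are all assumptions made for brevity.

```python
import random
from statistics import mean

# Hypothetical toy records (not from the slides); None marks a missing value.
records = [
    {"age": 34, "income": 52000.0},
    {"age": 29, "income": None},
    {"age": None, "income": 61000.0},
    {"age": 45, "income": 58000.0},
    {"age": 52, "income": None},
    {"age": 38, "income": 47000.0},
]

def select_target(rows, columns):
    """Create/select target data: keep only the columns of interest."""
    return [{c: row[c] for c in columns} for row in rows]

def sample_rows(rows, k, seed=0):
    """Sample data: simple random sampling without replacement (assumed technique)."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(rows, k)

def supply_missing(rows, columns):
    """Supply missing values: replace None with the column mean (assumed strategy)."""
    filled = [dict(row) for row in rows]  # copy so the input is left untouched
    for c in columns:
        observed = [row[c] for row in rows if row[c] is not None]
        if not observed:
            continue  # no observed values to impute from
        fill = mean(observed)
        for row in filled:
            if row[c] is None:
                row[c] = fill
    return filled

target = select_target(records, ["age", "income"])
sample = sample_rows(target, k=4)
clean = supply_missing(sample, ["age", "income"])
```

Mean imputation is only one option; which strategy is appropriate depends on the data and on the later modeling phase.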