Data warehouse is an integrated repository derived from .ppt
Basics

A data warehouse is an integrated repository derived from multiple distributed source databases.
- Created by replicating or transforming source data into a new representation.
- Source data can come from web databases or regular databases (relational, files, etc.).
- Warehouse creation involves reading, cleaning, aggregating, and storing data.
- Warehouse data is used for strategic analysis, decision making, and market-research types of applications.
- Open access to third-party users.

Examples:
- Human genome databases.
- A drug-drug interactions database created by thousands of doctors in hundreds of hospitals.
- Stock prices, analyst research.
- Teaching material (slides, exercises, exams, examples).
- Census data or similar statistics collected by government.

Ideas for Security
- Replication
- Aggregation and Generalization
- Exaggeration and Mutilation
- Anonymity
- User Profiles, Access Permissions

Anonymity
- User privacy and warehouse data privacy.
- The user does not know the source of the data.
- The warehouse system does not store the results, or even the access path, for the query.
- Separation of the storage system and the audit query system*.
- Non-intrusive auditing and monitoring.
- Distribution of query processing, logs, and auditing activity.
- Secure multi-party computation.
- Mental poker (card distribution).

One can divulge information to a third party without revealing where it came from and without necessarily revealing that the system has done so.

* Research project of Atallah and Prabhakar at Purdue.

Witness (Permission Inference)
- A user can execute query Q if there is an equivalent query Q′ for which the user has permission.
- Security is on the result, not on the computation.
- Create views over mutually suspicious organizations by filtering out sensitive data.

Similarity Depends on Application
- Two objects might be similar to a K-12 student, but not to a scientist.
- The 1999 and 1995 annual reports of the CS department might be similar to a graduate school applicant, but not to a faculty applicant.

Similarity Based Replication*
- Distinct functions are used to determine how similar two objects are (Distinct Preserving Transformations).

Some definitions:
- Precision: the fraction of retrieved data that is needed (relevant) for the user query.
- False positive: an object that is retrieved because it is similar to the data needed by the query, but is not actually needed.
- False negative: an object that is needed by the query, but is not retrieved.

* Bhargava/Annamalia, Defining Data Equivalence, IDPT, 1996

Acc
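
The definitions of precision, false positive, and false negative given above can be illustrated with a small sketch. This is only an illustration under stated assumptions: the function name `retrieval_quality`, the object names in the usage example, and the convention of reporting a precision of 0.0 when nothing is retrieved are hypothetical and do not come from the slides.

```python
def retrieval_quality(retrieved, needed):
    """Compare what a query retrieved with what the user actually needed.

    Returns (precision, false_positives, false_negatives), following the
    definitions in the slides:
      - precision: fraction of retrieved objects that are needed
      - false positive: retrieved but not needed
      - false negative: needed but not retrieved
    """
    retrieved = set(retrieved)
    needed = set(needed)

    false_positives = retrieved - needed
    false_negatives = needed - retrieved

    # Assumption: precision is reported as 0.0 when nothing was retrieved,
    # since the fraction is undefined in that case.
    if retrieved:
        precision = len(retrieved & needed) / len(retrieved)
    else:
        precision = 0.0

    return precision, false_positives, false_negatives


if __name__ == "__main__":
    # Hypothetical object names, loosely echoing the annual-report example.
    retrieved = ["report_1999", "report_1995", "brochure"]
    needed = ["report_1999", "report_1995", "faculty_list"]
    precision, fp, fn = retrieval_quality(retrieved, needed)
    print(precision, fp, fn)
```

Sets are used because the definitions only depend on membership (retrieved or not, needed or not), not on order or multiplicity.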