    The ADEPTDigital Library Architecture.ppt

    • Resource ID: 373037       File size: 121 KB        Pages: 42
    • Format: PPT        Download cost: 2000 points

    The ADEPT Digital Library Architecture
    James Frew & Greg Janée
    Alexandria Digital Library Project
    University of California, Santa Barbara

    ADEPT Introduction
      • Alexandria Digital Earth ProtoType (ADEPT) is:
          • Distributed digital library for geo-referenced information
          • Services supporting DL federation and interoperation
          • Large geospatial collections
      • Goal: an Internet “library” layer
          • Organization
          • Persistence
          • Accessibility
          • Scalability
              • Lots of collections
              • Big collections, small collections
              • Heterogeneous contents

    Outline
      • Core library architecture
      • Metadata interoperability
      • Other features
          • Query translation
          • Collection discovery

    ADEPT Core Library Architecture

    Architectural Elements
      • Item
          • structured descriptions (“reports”)
          • contents (optional)
      • Library
          • set of collections
          • client (public) services
      • Collection
          • set of items
          • library (internal) services
      • = a distributed catalog system

    The Big Picture
      [Diagram; labels: client, proxy, library (middleware server) ×2, collection ×3, item ×4, ADEPT]

    Role of the Middleware
      [Diagram; labels: client, logical view, collection ×2, collection discovery service, thesaurus/vocabulary]

    Middleware Interfaces
      Clients
          Configuration -> collection-id
          Collection(collection-id) -> report
          Query(query) -> query-id
          Results(query-id) -> holding-id
          Metadata(collection-id, holding-id, view) -> report
      Libraries
          Collection -> report
          Query(query, accumulator) -> query-thread
          Metadata(holding-id, view) -> report
      Collections

    Metadata Reports
      • Collection: metadata that applies to the entire collection
      • Bucket: item's bucket metadata
      • Scan: brief (“1-line”) subset of the bucket report
      • Full: all the item's metadata, in whatever format
      • Browse: URL(s) of reduced-resolution graphics
      • Access: URL(s) of content (if available)

    A Complete System: The Boxology
      [Diagram; labels: CLIENT, MIDDLEWARE, SERVER, web browser, SDLIP proxy, other clients, web intermediary / XML/HTML converter, HTTP transport, RMI transport, HTTP, XML, query translator, paradigm library, configuration files, Python scripts, generic DB driver, JDBC, RDBMS, proxy driver, group driver, Z39.50 driver, thesauri, OR]

    Metadata Interoperability

    ADEPT's Interoperability Problem
      • Distributed, heterogeneous collections
          • locally, autonomously created and managed
      • Minimize impact on collection providers
          • allow use of native metadata
      • Uniform client services
          • common high-level interface across collections
          • discover and exploit collection-specific interfaces
      • Assumptions
          • items have metadata
          • items have sufficient, “good” metadata
          • i.e., this is a metadata interoperability problem

    Bucket Concept
      • Abstract metadata category
      • Strongly typed
      • Well-defined search semantics
          • query terms
          • query operators
      • Explicitly mapped from source metadata
          • (FGDC, 1.3, “Time period of content”, “2001-09-08”)
      • Bucket-level search uniform across all collections
          • e.g.: search all collections for items whose Originator bucket contains the phrase “geological survey”

    Bucket Properties
      • name: Coverage date
      • semantic definition: The time period to which the item is relevant.
      • data type (strictly observed): calendar date or range of calendar dates
      • syntactic representation (strictly observed): ISO 8601

    What is a bucket? (2/2)
      • Source metadata is mapped to buckets
          • buckets hold not just simple values: “2001-09-08”
          • but rather, explicit representations of mappings: (FGDC, 1.3, “Time period of content”, “2001-09-08”)
          • may have multiple values per bucket
      • Bucket definition includes search semantics
          • query terms: ISO 8601 date range
          • query operators: contains, overlaps, is-contained-in
      • Some semantics are fuzzy to accommodate multiple implementations

    Collection-level aggregation
      • Collection-level metadata describes
          • buckets supported by the collection
          • item-level metadata mappings
          • statistical overviews
              • item counts
              • spatiotemporal coverage histograms
      • Example (de-XML-ized): in collection foo, the Originator bucket is supported and the following item fields are mapped to it:
          • (FGDC, 1.1/8.1, “Citation/Originator”): 973 items
          • (USGS DOQ, PRODUCER, “Producer”): 973 items
          • (DC, Creator, “Creator”): 1249 items
          • unknown: 6 items

    Searching collections
      • Bucket-level
          • uniform across all collections
          • example: search all collections for items whose Originator bucket contains the phrase “geological survey”
      • Field-level
          • collection-specific, but discovery and invocation mechanisms are uniform
          • functionally equivalent to searching the entire bucket plus an additional constraint
          • example: search collection foo for items whose FGDC 1.1/8.1 field within the Originator bucket contains the phrase

    Bucket types (1/7)
      • 6 bucket types: spatial, temporal, hierarchical, textual, qualified textual, numeric
      • Type captures the portion of the bucket definition that has functional implications
          • data type & syntactic representation
          • query terms
          • query operators
      • Complete bucket definition
          • name
          • semantic definition
          • bucket type

    Bucket types (2/7)
      • Spatial
          • data type: any of several types of geometric regions defined in WGS84 latitude/longitude coordinates
          • syntax: defined by ADEPT
          • query terms: WGS84 box or polygon
          • operators: contains, overlaps, is-contained-in
          • example query:

    Bucket types (3/7)
      • Temporal
          • data type: calendar date or range of calendar dates
          • syntax: ISO 8601
          • query term: range of calendar dates
          • operators: contains, overlaps, is-contained-in
          • example query:

    Bucket types (4/7)
      • Hierarchical
          • data type: term drawn from a controlled vocabulary (thesaurus, etc.)
          • one-to-one relationship between hierarchical buckets and vocabularies
          • query term: vocabulary term
          • operator: is-a
          • example query:

    Bucket types (5/7)
      • Textual
          • data type: text
          • query term: text
          • operators: contains-all-words, contains-any-words, contains-phrase
          • example query:

    Bucket types (6/7)
      • Qualified textual
          • data type: text with optional associated namespace
          • query term: same
          • query operator: matches
          • example query:

    Bucket types (7/7)
      • Numeric
          • data type: real number
          • query term: real number
          • query operators: standard relational operators
          • example query:

    Bucket types vs. buckets
      • Bucket types are defined architecturally
      • Buckets in use are defined by collections and items
          • need standard buckets, defined conventionally, to support cross-collection uniformity
      • ADL core buckets
          • simple; universal; easily useful
      • Bucket descriptions in the following slides:
          • bucket type
          • semantic definition
          • effective treatment of multiple values in searching
          • comparison to Dublin Core

    ADL core buckets (1/6)
      • Subject-related text
      • Title
      • Assigned term
      • Originator
      • Geographic location
      • Coverage date
      • Object type
      • Feature type
      • Format
      • Identifier

    ADL core buckets (2/6)
      • Subject-related text
          • type: textual
          • description: text indicative of item's subject
              • not necessarily from controlled vocabularies
              • superset of Title and Assigned term
          • multiple values: concatenated
          • compare: DC.Subject
      • Title
          • type: textual
          • description: item's title
              • subset of Subject-related text
          • multiple values: concatenated
          • compare: DC.Title

    ADL core buckets (3/6)
      • Assigned term
          • type: textual
          • description: subject-related terms from controlled vocabularies
              • subset of Subject-related text
          • multiple values: concatenated
          • compare: qualified DC.Subject
      • Originator
          • type: textual
          • description: names of entities related to item's origin
          • multiple values: concatenated
          • compare: DC.Creator + DC.Publisher

    ADL core buckets (4/6)
      • Geographic location
          • type: spatial
          • description: subset of Earth's surface related to item
          • multiple values: union
          • compare: DC.Coverage.Spatial
      • Coverage date
          • type: temporal
          • description: calendar dates related to item
          • multiple values: union
          • compare: DC.Coverage.Temporal

    ADL core buckets (5/6)
      • Object type
          • type: hierarchical
          • vocabulary: ADL Object Type Thesaurus (image, map, thesis, sound recording, etc.)
          • multiple values: unioned
          • compare: DC.Type
      • Feature type
          • type: hierarchical
          • vocabulary: ADL Feature Type Thesaurus (river, mountain, park, city, etc.)
          • multiple values: unioned
          • compare: none

    ADL core buckets (6/6)
      • Format
          • type: hierarchical
          • vocabulary: ADL Object Format Thesaurus (loosely based on MIME)
          • multiple values: union
          • compare: DC.Format
      • Identifier
          • type: qualified textual
          • description: unique identifiers for item
          • multiple values: treated separately
          • compare: DC.Identifier

    Bucket Summary
      • Strongly typed, abstract metadata category, with defined search semantics, to which source metadata is mapped.
      • Supports discovery/search across distributed, heterogeneous collections that use metadata structures of their choosing.
      • Supports “drill-down” searching of item-level metadata elements.

    Challenges
      • Metadata is like life: refuses to follow the rules
          • unknown semantics
          • inconsistent typing/syntax
          • unknown or unidentifiable sources
          • poor/inconsistent quality
          • proliferation of overlapping vocabularies
      • Reality check: Dublin Core won
          • adapt buckets to qualified Dublin Core
          • incorporate fallback mechanism or polymorphism
              • e.g., treat fields as thesauri/controlled vocabularies or as text

    Query Translation

    ADEPT query language
      • Domain
          • collection of items
          • item = (unique ID, field, ...)
          • field = (name, value)
          • bucket = (name, union or concatenation of fields)
      • Queries
          • atomic constraint: (attribute name, operator, target)
              • return items that have at least 1 value for the attribute, for which at least one value matches the target
          • arbitrary Boolean combinations
              • AND, OR, AND NOT

    The problem
      • Algorithmically translate ADEPT queries to SQL
          • accommodate all possible SQL implementations
          • configurable by mere mortals
      • generate “reasonable” SQL
      • make up for DB deficiencies
          • stupid things like order of tables & conditions matter
          • incorporate optimizer hints and directives

    Approach
      • Python-based translator
          • 1500 lines
      • Extensible
          • “paradigms” describe atomic translation techniques
          • 15 paradigms
          • Each paradigm 100 lines (50 Python code, 20 assertions, 30 documentation)
          • Intrinsic, etc.
      • Configuration file describes:
          • buckets, fields, paradigms, paradigm configuration
          • Boolean override rules
          • misc: external identifier table, optimizer clauses

    Collection Discovery

    The problem
      • Distributed queries: necessary evil
          • necessary to achieve scalability, performance, autonomy
          • introduce scalability, performance, and reliability problems
      • Amelioration strategies
          • increase server performance/reliability
              • replication, DIENST connectivity regions
          • turn into offline problem
              • Web search engines, OAI harvesting model
          • identify relevant collections to query (ADEPT)
              • analogous to Web search engine

    Approach
      • Build on collection-level metadata
          • spatial & temporal density histograms
          • item counts per collection categorization schemes
      • Upload periodically to central server
      • Use Euler histograms to support range queries

    Challenges
      • Relevance not necessarily Boolean
          • worldwide, petabyte, 1cm resolution database vs. world map drawn on napkin
          • weight by resolution or minimum feature size
          • but sometimes you want the napkin
      • The JOIN problem
          • statistics computed independently
      • Text overviews
          • STARTS?

    That's All, Folks
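The Collection Discovery approach above mentions Euler histograms for range queries but gives no detail. The following is a minimal one-dimensional sketch of the general technique, assuming a temporal density histogram over equal-width cells (for example, years). The class name `EulerHistogram1D` and its methods are hypothetical and are not taken from ADEPT. The idea is to keep one counter per cell and one per interior cell boundary; an item spanning k cells adds k to the cell counters and k - 1 to the boundary counters, so cells minus boundaries inside any query range counts each overlapping item exactly once.

```python
class EulerHistogram1D:
    """Toy Euler histogram over a row of equal-width cells (hypothetical API)."""

    def __init__(self, n_cells: int) -> None:
        self.faces = [0] * n_cells        # one counter per cell
        self.edges = [0] * (n_cells - 1)  # edges[i] sits between cell i and cell i + 1

    def add(self, first: int, last: int) -> None:
        """Record one item that covers cells first..last (inclusive)."""
        for cell in range(first, last + 1):
            self.faces[cell] += 1
        for edge in range(first, last):   # boundaries strictly inside the item
            self.edges[edge] += 1

    def count_overlapping(self, first: int, last: int) -> int:
        """Number of recorded items that overlap query cells first..last."""
        faces = sum(self.faces[first:last + 1])
        edges = sum(self.edges[first:last])  # boundaries strictly inside the query
        return faces - edges


hist = EulerHistogram1D(n_cells=10)
hist.add(0, 3)   # item covering cells 0-3
hist.add(2, 2)   # item covering a single cell
hist.add(5, 9)   # item covering cells 5-9

naive = sum(hist.faces[2:5])            # 3: double-counts the first item
exact = hist.count_overlapping(2, 4)    # 2: each overlapping item counted once
print(naive, exact)
```

A plain density histogram would report the `naive` figure, which grows with item extent; subtracting the boundary counters removes that double counting. A central discovery server could, under this assumption, rank collections by `count_overlapping` for the query range without contacting them.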

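The bucket slides above say that a bucket holds explicit mappings such as (FGDC, 1.3, “Time period of content”, “2001-09-08”) rather than bare values, and that temporal buckets are searched with contains, overlaps, and is-contained-in. The sketch below illustrates one way to model that. `MappedValue`, `DateRange`, and `bucket_matches` are hypothetical names, the `start/end` range syntax is a small assumed subset of ISO 8601, and reading each operator as "item value OP query term" is an assumption, since the slides do not fix the direction.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DateRange:
    start: date
    end: date  # inclusive

    @classmethod
    def parse(cls, text: str) -> "DateRange":
        """Parse 'YYYY-MM-DD' or 'YYYY-MM-DD/YYYY-MM-DD'."""
        first, _, second = text.partition("/")
        start = date.fromisoformat(first)
        return cls(start, date.fromisoformat(second) if second else start)


@dataclass(frozen=True)
class MappedValue:
    """One bucket entry: a value plus the source field it was mapped from."""
    standard: str    # e.g. "FGDC"
    field_id: str    # e.g. "1.3"
    field_name: str
    value: str


def contains(item: DateRange, query: DateRange) -> bool:
    return item.start <= query.start and query.end <= item.end


def overlaps(item: DateRange, query: DateRange) -> bool:
    return item.start <= query.end and query.start <= item.end


def is_contained_in(item: DateRange, query: DateRange) -> bool:
    return contains(query, item)


OPERATORS = {
    "contains": contains,
    "overlaps": overlaps,
    "is-contained-in": is_contained_in,
}


def bucket_matches(values, operator, query_text, source=None):
    """True if at least one mapped value satisfies the operator.

    `source` optionally restricts the test to one (standard, field_id)
    pair, which corresponds to the field-level search in the slides.
    """
    query = DateRange.parse(query_text)
    test = OPERATORS[operator]
    return any(
        test(DateRange.parse(v.value), query)
        for v in values
        if source is None or (v.standard, v.field_id) == source
    )


coverage = [MappedValue("FGDC", "1.3", "Time period of content", "2001-09-08")]
print(bucket_matches(coverage, "is-contained-in", "2001-01-01/2001-12-31"))  # True
print(bucket_matches(coverage, "contains", "2001-01-01/2001-12-31"))         # False
```

Keeping the source field alongside each value is what lets the same function serve both bucket-level search (`source=None`) and the field-level "drill-down" search.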

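The Query Translation slides above describe atomic constraints of the form (attribute name, operator, target), combined with AND, OR, and AND NOT, and translated to SQL by pluggable "paradigms". The following toy sketch shows the shape of such a translator. It is not the ADEPT translator: the `Constraint` and `BoolOp` classes, the `phrase_paradigm` function, and the `items` / `bucket_values` tables are all hypothetical, and only the contains-phrase operator is covered.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Constraint:
    attribute: str   # bucket name
    operator: str
    target: str


@dataclass
class BoolOp:
    kind: str        # "AND", "OR", or "AND NOT"
    left: object
    right: object


def phrase_paradigm(c: Constraint):
    """Toy paradigm: contains-phrase becomes a LIKE test on a value table."""
    sql = ("id IN (SELECT item_id FROM bucket_values "
           "WHERE bucket = ? AND value LIKE ?)")
    return sql, [c.attribute, f"%{c.target}%"]


PARADIGMS = {"contains-phrase": phrase_paradigm}


def translate(query):
    """Return (WHERE-clause SQL, parameter list) for a query tree."""
    if isinstance(query, Constraint):
        return PARADIGMS[query.operator](query)
    if query.kind not in ("AND", "OR", "AND NOT"):
        raise ValueError(f"unsupported Boolean operator: {query.kind}")
    left_sql, left_params = translate(query.left)
    right_sql, right_params = translate(query.right)
    return f"({left_sql}) {query.kind} ({right_sql})", left_params + right_params


# Hypothetical schema and data, used only to exercise the translator.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY);
    CREATE TABLE bucket_values (item_id INTEGER, bucket TEXT, value TEXT);
""")
db.executemany("INSERT INTO items VALUES (?)", [(1,), (2,), (3,)])
db.executemany("INSERT INTO bucket_values VALUES (?, ?, ?)", [
    (1, "Originator", "U.S. Geological Survey"),
    (1, "Title", "Goleta quadrangle"),
    (2, "Originator", "California Geological Survey"),
    (2, "Title", "Santa Barbara quadrangle"),
    (3, "Originator", "NASA"),
])

query = BoolOp(
    "AND NOT",
    Constraint("Originator", "contains-phrase", "geological survey"),
    Constraint("Title", "contains-phrase", "Santa Barbara"),
)
where, params = translate(query)
ids = [row[0] for row in db.execute(
    f"SELECT id FROM items WHERE {where} ORDER BY id", params)]
print(ids)  # [1]
```

The subquery form matches the slide's semantics of "at least one value matches the target" for multi-valued attributes. Targets travel as bound parameters rather than being spliced into the SQL text, and the Boolean operator is checked against a fixed list before it is formatted into the statement.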