NISO RP-2005-03

NISO Metasearch Initiative
Search and Retrieval Citation Level Data Elements

A Recommended Practice of the National Information Standards Organization
Standards Committee BC / Task Group 3
Version 1.0
September 13, 2005

Published by the National Information Standards Organization, Bethesda, MD
© 2005 NISO

Summary

The NISO Metasearch Initiative, Task Group 3/SubGroup 3, on Required Citation Metadata has discussed the issues around citation metadata and its relation to metasearch. Citation references were devised in a paper world, assuming page numbers and enveloping journals and publishers. But searchers will use metasearch engines to search, find, and retrieve individual articles. A number of outstanding issues must be addressed to allow smooth and seamless metasearching across multiple resources.

The Google Scholar approach is to access the full-text content of all available journals and provide a heterogeneous data store. Researchers, however, need to fine-tune their searches with relevant metadata so as not to be swamped by irrelevant references. Our proposed approach is simply to make the format and content of citation metadata consistent.

Issues

Inconsistent Citation Styles

The reference styles for citations tend to differ according to discipline. There are tens, if not hundreds, of styles. As an example, one vendor has seventeen citation formats across twelve databases. The ISO and NISO standards are not in themselves a sufficient guide to all the variations. From the Dublin Core Metadata Initiative Citation Working Group, we get the following list of variations:

- The order of elements (especially elements such as initials)
- The mandatoriness of elements (e.g. many chemistry styles leave out the article title, but biology and medicine wouldn't)
- The punctuation between the elements
- Capitalization, e.g. of titles: some styles use "title case" (i.e. initial capitals for all main words) and some use "sentence case" (i.e. initial capitals for first word and proper nouns and adjectives only)
- Acceptable abbreviations (especially regarding journal title abbreviations, but also element indicators such as "chapter/chap/ch", "editor(s)/edited by/ed(s)", "edition/edn/ed")
- Character formatting (i.e. what goes in italic, bold, etc.)

Refer to http://epub.mimas.ac.uk/DC/citstyles.html for more discussion and a list of citation styles.

One of the reasons behind this plethora of styles is that data vendors purchase data from different publishers, each using potentially different styles.

Complex Technology Required

Because the citation formats returned by vendors vary so widely, metasearch engines must choose how to parse each citation. With "random" fields, even the parsed results are unreliable and inconsistent, often producing bad OpenURLs, which can make it difficult for users to get to the full text or the article as originally published.

Vendor Branding

Vendors and publishers want to maintain their branding and identity in results sent to users, even after the results have been massaged by a metasearch engine. Either a vendor produces a proprietary OpenURL that will only point back to its own sources, or the vendor's or publisher's reference is lost from the metadata. The vendor wants more exposure, renewed subscriptions, and possibly pay-per-view of full text.

Mapping of Metadata

One issue that causes confusion and difficulty in de-duping records is the process by which multiple metadata items get placed into databases. A typical scenario goes as follows: the primary publisher creates a human-readable citation field; the human-readable citation field is dumped into a single database field; and the record in the database is sold to an aggregator. Since many different formats may be managed by one citation aggregator, it is difficult to tell which format was used for each citation. When the record is searched, it may be displayed as created by the publisher and not the authors.

Requirements

The requirements to enable effective and seamless metasearch across multiple databases and resource types are surprisingly simple. There are basically two audiences for the results of a metasearch: a metasearch engine and the end user. The combined minimum requirements end up being as follows.

Minimum metadata to allow a metasearch engine to compare results from multiple resources:

- Unambiguous metadata
- Enough to be able to:
  - Sort/Merge/Dedupe (OpenURL)
  - Display (Brief/Full): the minimum for the user
  - Produce OpenURL/Link
  - Ranking: need searched fields: Subject/Description/Abstract

To create a "Brief" Record, you need, at a minimum:

- Genre: what "type" of item is it?
- Creator: who created the original article?
- Title: how is this article referred to?
- ID: what ID(s), such as PII, SICI, DOI, etc., is this article known by?
- Context: what enveloping publication or proceeding, etc., is this article found in?

To create a "Full Display" Record, and to enable ranking and full-text analysis of the metadata, you need:

- Subject: for cataloged subject headings
- Description: some text describing the item

Proposed Solution

A detailed table describing the minimum data elements needed for citation metadata follows this summary; an XML version of the table is available on the NISO Metasearch Initiative website (http://www.niso.org/committees/MS_initiative.html). This set is taken extensively from Dublin Core 0.1, qualified for citations from the citation working group; however, it adds the descriptive components needed for "Full Display" and for text analysis done by metasearch engines.

A quick overview follows. As expected, it closely matches the Requirements listed above.

- "genre" element that describes WHAT kind of object we have
- an "authors" field, as in OpenURL
- "titles" field that has Journal Title and Article Title
- "dates" field that has the date of publication, and other chronological information if present
- "context" field that gives volume, issue, pages, etc.
- "citationID" for ISBN, ISSN, SICI, etc.
- "publisher" field, if available
- "fulltextURI" to point to the full text, if available

For full display information, add the following. (If the information is requested by a metasearch server that is doing independent ranking of results, then this information is highly recommended to aid in the ranking of results.)

- "description", as in Dublin Core, for description or abstract
- "subject", as in Dublin Core, for subject headings
- "vendorData" to include, in "free form" with a schema pointer, whatever else vendors want to add (this allows vendors to preserve branding)

Links

For comparison and related links, here are other, similar standards, and a few discussions of interest:

- Dublin Core Metadata Initiative Citation Working Group
  http://dublincore.org/groups/citation/
- Guidelines for Encoding Bibliographic Citation Information in Dublin Core Metadata
  http://www.dublincore.org/documents/dc-citation-guidelines/
- IMS Resource List Interoperability (RLI) Information Model, e-Learning metadata
  http://www.imsglobal.org/rli/rliv1p0/imsrli_infov1p0.html
- Digital Objects Requirements: Metadata, California Digital Library
  http://www.cdlib.org:8081/inside/diglib/guidelin
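
Illustrative sketch

The element set in the Proposed Solution, together with the "Brief" Record minimum and the Sort/Merge/Dedupe requirement, can be pictured as a small data structure. The following Python sketch is only an illustration under stated assumptions and is not part of the Recommended Practice. The class name Citation, the helper functions, the dictionary keys such as "article" and "doi", and the de-duplication policy (prefer an article-level identifier, otherwise fall back to a normalized article title) are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Citation:
    """Hypothetical container mirroring the elements in the overview above."""
    genre: str                                   # WHAT kind of object, e.g. "article"
    authors: List[str]                           # creators of the item
    titles: Dict[str, str]                       # assumed keys: "article", "journal"
    dates: Dict[str, str] = field(default_factory=dict)
    context: Dict[str, str] = field(default_factory=dict)       # volume, issue, pages
    citation_ids: Dict[str, str] = field(default_factory=dict)  # assumed keys: "doi", "pii", "sici", "issn"
    publisher: Optional[str] = None
    fulltext_uri: Optional[str] = None
    description: Optional[str] = None            # full-display element
    subjects: List[str] = field(default_factory=list)           # full-display element
    vendor_data: Optional[str] = None            # free-form vendor content


def is_brief_record(c: Citation) -> bool:
    """Check the five "Brief" Record needs: genre, creator, title, ID, context."""
    return bool(c.genre and c.authors and c.titles and c.citation_ids and c.context)


def dedupe_key(c: Citation) -> Tuple[str, str]:
    """Assumed policy: use an article-level identifier when one is present.

    ISSN and ISBN are skipped because they identify the enclosing
    publication rather than an individual article.
    """
    for scheme in ("doi", "pii", "sici"):
        value = c.citation_ids.get(scheme)
        if value:
            return (scheme, value.strip().lower())
    # Fallback: article title, lower-cased with whitespace collapsed.
    title = c.titles.get("article", "")
    return ("title", " ".join(title.lower().split()))


def dedupe(records: List[Citation]) -> List[Citation]:
    """Keep the first record seen for each de-duplication key."""
    seen: Dict[Tuple[str, str], Citation] = {}
    for record in records:
        seen.setdefault(dedupe_key(record), record)
    return list(seen.values())
```

With records built from two vendors' results, dedupe(records) would return one record per distinct key. Two records that format the author and title differently but carry the same DOI collapse into one, which is the situation the "Unambiguous metadata" requirement is meant to make routine. A production system would likely need a more careful fallback than title matching alone.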