Collection of SANS standards in electronic format (PDF)

1. Copyright
This standard is available to staff members of companies that have subscribed to the complete collection of SANS standards in accordance with a formal copyright agreement. This document may reside on a CENTRAL FILE SERVER or INTRANET SYSTEM only. Unless specific permission has been granted, this document MAY NOT be sent or given to staff members from other companies or organizations. Doing so would constitute a VIOLATION of SABS copyright rules.

2. Indemnity
The South African Bureau of Standards accepts no liability for any
damage whatsoever that may result from the use of this material or the information contained therein, irrespective of the cause and quantum thereof.

ICS 01.140.20
ISBN 0-626-17253-5
SANS 5964:2005 Edition 1
ISO 5964:1985 Edition 1

SOUTH AFRICAN NATIONAL STANDARD
Documentation - Guidelines for the establishment and development of multilingual thesauri

This national standard is the identical implementation of ISO 5964:1985 and is adopted with the permission of the International Organization for Standardization.

Published by Standards South Africa
1 dr lategan road groenkloof, private bag x191 pretoria 0001
tel: 012 428 7911 fax: 012 344 1568 international code +27 12
www.stansa.co.za

Table of changes
Change No.    Date    Scope

Abstract
This standard should be used in conjunction with SANS 2788, and regarded as an extension of the scope of the monolingual guidelines. These guidelines are restricted in scope to the problems of multilingualism which can arise during the construction of a “conventional” thesaurus, i.e. a thesaurus displaying terms selected from more than one natural language, these terms then constituting the vocabulary of a controlled indexing language. Multilingual thesauri are relatively recent developments in the field of documentation, and it is inevitable, therefore, that the present guidelines should display certain limitations.

Keywords
documentation, information retrieval, multilingual, subject indexing, thesauri.

National foreword
This South African standard was approved by National Committee StanSA TC 46, Information and documentation, in accordance with procedures of Standards South Africa, in compliance with annex 3 of the WTO/TBT agreement.

International Standard
INTERNATIONAL ORGANIZATION FOR STANDARDIZATION
МЕЖДУНАРОДНАЯ ОРГАНИЗАЦИЯ ПО СТАНДАРТИЗАЦИИ
ORGANISATION INTERNATIONALE DE NORMALISATION

Documentation - Guidelines for the establishment and development of multilingual thesauri
Documentation - Principes directeurs pour l'établissement et le développement de thesaurus multilingues

First edition - 1985-02-15
UDC 025.48
Ref. No. ISO 5964-1985 (E)
Descriptors: documentation, subject indexing, information retrieval, thesauri, multilingual thesauri, preparation, rules (instructions).
Price based on 61 pages

Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work.
Draft International Standards adopted by the technical committees are circulated to the member bodies for approval before their acceptance as International Standards by the ISO Council. They are approved in accordance with ISO procedures requiring at least 75 % approval by the member bodies voting.
International Standard ISO 5964 was prepared by Technical Committee ISO/TC 46, Documentation.

© International Organization for Standardization, 1985
Printed in Switzerland

Contents
0 Introduction
1 Scope and field of application
2 References
3 Definitions
4 General
5 Abbreviations and symbols
6 Vocabulary control
7 The establishment of a multilingual thesaurus: general problems
8 The establishment of a multilingual thesaurus: management decisions
9 The establishment of a multilingual thesaurus: language problems
10 Establishing equivalent terms in different languages
11 Other language problems
12 Relationships between terms in a multilingual thesaurus
13 Display of terms and relationships
14 Form and contents of a multilingual thesaurus
15 Organization of work
Annexes
A Symbolization of thesaural relationships
B Examples of displays

INTERNATIONAL STANDARD ISO 5964-1985 (E)
Documentation - Guidelines for the establishment and development of multilingual thesauri

0 Introduction
A trend towards the international exchange of information, fully supported by the UNISIST* programme of UNESCO, and exemplified by systems like the International Information System for the Agricultural Sciences and Technology (AGRIS) and the International Nuclear Information System (INIS), clearly calls for a higher commitment to multilingual cooperation. Information systems are expanding across language boundaries, leading to a notable increase in the provision of indexing and retrieval tools which are either language-independent (the Broad System of Ordering), or multilingual. Aids of this kind are essential if retrieval of documents indexed in more than one language is not to depend on the acquisition and use of a single, dominant language. Indexers or searchers should, where possible, be able to work in their mother tongues, or at least
in a language with which they are already familiar.
Within this context it is considered that multilingual thesauri have a significant part to play in improving the bibliographic control of literature on a global scale. The standardization of procedures for the construction of a multilingual thesaurus is seen as a primary step in achieving compatibility between thesauri produced by indexing agencies using terms selected from different natural languages. The recording of these procedures will also enable indexers engaged in this task to benefit from the experience of others, and to work in a
logical and consistent fashion, using recommended practices which have been established in the course of discussions at an international level.

1 Scope and field of application
1.1 The guidelines given in this International Standard should be used in conjunction with ISO 2788, and regarded as an extension of the scope of the monolingual guidelines. It is considered that the majority of procedures and recommendations contained in ISO 2788 are equally valid for a multilingual thesaurus. This applies particularly to general procedures concerning, for example, the forms of terms, the basic thesaural relationships, and management operations such as evaluation and maintenance. Except when it appears to be necessary, the procedures described in ISO 2788 are not repeated here, and it is therefore essential to refer to both of these International Standards when constructing a multilingual thesaurus.
1.2 These guidelines are restricted in scope to the problems of multilingualism which can arise during the construction of a “conventional” thesaurus, i.e. a thesaurus displaying terms selected from more than one natural language, these terms then constituting the vocabulary of a controlled
indexing language. Throughout this International Standard, a distinction is made between preferred terms and non-preferred terms (see definitions in clause 3). These guidelines are not applicable to indexing languages in which concepts are expressed entirely as symbols (for example mathematical equations or chemical formulae), nor to systems which are based on the automatic analysis and searching of free text. It is considered, however, that a well-constructed multilingual thesaurus can play a significant part in improving retrieval from a free-text system which covers documents in more than
one language.
1.3 Multilingual thesauri are relatively recent developments in the field of documentation, and it is inevitable, therefore, that the present guidelines should display certain limitations.
a) The examples used to illustrate problems encountered in the establishment of term equivalences have been drawn largely from the fields of science (including the social sciences) and technology. As far as possible, however, examples were chosen which illustrate general problems and procedures, i.e. those which should apply in any field of knowledge.
b) It is realized that the procedures described in these guidelines may not be entirely appropriate for all languages. The examples have been selected, for entirely pragmatic reasons, from three of the major languages, i.e. English, French and German, but this does not imply that these languages are regarded as dominant in the field of documentation. As far as possible the procedures considered here, together with their accompanying examples, relate to problems which may be encountered in any language.

* Intergovernmental Programme for Co-operation in the field of scientific and technological information.

2 References
ISO/R 639, Symbols for languages, countries and authorities.
ISO 1086, Documentation - Title-leaves of a book.
ISO 2788, Documentation - Guidelines for the establishment and development of monolingual thesauri.

3 Definitions
For the purposes of this International Standard, the following definitions apply:
3.1 coined term: A neologism especially created in a target language to express a concept which is denoted by an existing and recognized term in a source language, but which has not previously been expressed in the target language.
3.2 compound term: An indexing term (see 3.8) which can be factored morphologically into separate components, each of which could be expressed, or re-expressed, as a noun that is capable of serving independently as an indexing term.
NOTE - The parts of the great majority of compound terms can be distinguished as follows:
a) the focus or head, i.e. the
noun component which identifies the general class of concepts to which the term as a whole refers;
b) the difference or modifier, i.e. one or more further components which serve to narrow the extension of the focus by specifying one of its subclasses.
In French, English and similar languages, compound terms usually consist of separate words, whereas the same concept would frequently be expressed by a single word in German and some other languages.
Examples:
a) English SYSTEMS ANALYSIS = German SYSTEMANALYSE
b) French PONT EN BETON = German BETONBRÜCKE
In example (a) the English word “analysis” and the German component “analyse” both represent foci, and the modifying differences are represented by “Systems” (English) and “System” (German). Despite these surface structural differences, however, the terms “Systems analysis” and “Systemanalyse” are both regarded as compound terms for the
purposes of this International Standard.
3.3 dominant language: An exchange language (see 3.5) which is also used for indexing and retrieval in systems which, for policy reasons, do not give equal status to all the languages in the system. Every concept recognized in the system must necessarily be
represented by a preferred term in the dominant language. In some cases, however, an equivalent expression may be lacking in one or more of the other languages. These other languages are then known as secondary languages.
3.4 equal status: Languages in a multilingual thesaurus have equal status when every preferred term in one language is matched by an equivalent preferred term in all other languages.
3.5 exchange language: The language used as a medium for data exchange in those multilingual systems which, as a matter of policy, decide to use terms selected from only one language for this purpose. The exchange language may also be used for indexing and/or retrieval, and the multilingual thesaurus then functions principally as a means for translating the local languages of indexers and enquirers into, or out of, the exchange language. The different languages in such a system would still be
recognized as having equal status (see 3.4) if equivalents are established reciprocally between the preferred terms in the exchange language and the preferred terms in all other languages.
3.6 feedback: The act of changing the form or structure of a term in a source language in order to achieve an easier or a more useful solution to a problem encountered in a target language.
Example: Let us assume that a German thesaurus is used as a source language and contains the term “Lehrerbildungsgesetz”. Direct translation of this term into English or French would call for a complicated paraphrase,
“Law of education of teachers”, or “Loi sur la formation des enseignants”. Neither of these phrases would be regarded as a satisfactory indexing term. A shorter expression, which is closer to the German construction, can be achieved in English, i.e. “Teacher education law”, but this cannot be done in French.
Feedback would operate if, in response to these problems, the original German compound term is factored into its separate components, each expressed as a noun, i.e. “Bildung”, “Gesetz” and “Lehrer”, and if these are henceforth accepted as indexing terms in German and
assigned to documents dealing with this subject. Translation into English and French could then be carried out on this new and simpler basis, i.e.
Gesetz = Law = Loi
Lehrer = Teachers = Enseignant
Bildung = Education = Formation
The German compound term “Lehrerbildungsgesetz” may still be retained
in the German thesaurus if it is likely to be sought by users, but its status would be changed to that of a non-preferred term, and the user would be redirected to the combination of separate nouns which represents this complex concept, for example:
Lehrerbildungsgesetz
BS LEHRER + BILDUNG + GESETZ
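The redirection in the example above can be illustrated with a minimal Python sketch. It is not part of the standard: the dictionary layout, the language codes "en" and "fr", and the function names resolve and translate are all assumptions made for illustration. Only the terms themselves come from the example.

```python
# Hypothetical sketch: data layout and function names are illustrative only.

# Preferred German terms obtained by factoring the compound term,
# each with its equivalent in the other languages (taken from the example).
EQUIVALENTS = {
    "LEHRER": {"en": "TEACHERS", "fr": "ENSEIGNANT"},
    "BILDUNG": {"en": "EDUCATION", "fr": "FORMATION"},
    "GESETZ": {"en": "LAW", "fr": "LOI"},
}

# Non-preferred German terms and the combination of preferred terms
# the user is redirected to (the "BS" reference in the example).
REDIRECTS = {
    "LEHRERBILDUNGSGESETZ": ["LEHRER", "BILDUNG", "GESETZ"],
}


def resolve(term):
    """Return the preferred German term(s) to be used for `term`."""
    key = term.upper()
    if key in REDIRECTS:
        return list(REDIRECTS[key])  # copy, so callers cannot alter the table
    if key in EQUIVALENTS:
        return [key]
    raise KeyError(f"{term!r} is not in the thesaurus")


def translate(term, language):
    """Map a German term to the preferred term(s) of another language."""
    return [EQUIVALENTS[t][language] for t in resolve(term)]


print(translate("Lehrerbildungsgesetz", "en"))  # ['TEACHERS', 'EDUCATION', 'LAW']
print(translate("Lehrerbildungsgesetz", "fr"))  # ['ENSEIGNANT', 'FORMATION', 'LOI']
```

Keeping the redirect table separate from the equivalence table mirrors the change of status described above: the compound term stays searchable, but only the factored nouns carry equivalents in the other languages.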
3.7 indexing language: A controlled set of terms selected from natural language to represent, in summary form, the subjects of documents.
NOTE - In a post-coordinate system these terms are used as “keywords” for retrieval purposes, usually without attempting to indicate their syntactical relationships. Syntactical relationships can be indicated in various ways in a pre-coordinated index, for example by printing terms in entries in an order which suggests their relative roles, and so allows the user to perceive the subject as a whole. Despite these differences, however, both kinds of system can be based on controlled vocabularies of terms displayed and organized in a thesaurus.
3.8 indexing term: The representation of a concept, preferably in the form of a noun or noun phrase.
NOTE - An indexing term can consist of more than one word, and is then kn a term is designated