INCITS/ISO/IEC 24824-1:2007[2010] (ISO/IEC 24824-1:2007, IDT)

Information technology - Generic applications of ASN.1: Fast infoset

Reaffirmed as INCITS/ISO/IEC 24824-1:2007 [R2015]

PDF disclaimer

This PDF file may contain embedded typefaces. In accordance with Adobe's licensing policy, this file may be printed or viewed but shall not be edited unless the typefaces which are embedded are licensed to and installed on the computer performing the editing. In downloading this file, parties accept therein the responsibility of not infringing Adobe's licensing policy. The ISO Central Secretariat accepts no liability in this area.

Adobe is a trademark of Adobe Systems Incorporated.

Details of the software products used to create this PDF file can be found in the General Info relative to the file; the PDF-creation parameters were optimized for printing. Every care has been taken to ensure that the file is suitable for use by ISO member bodies. In the unlikely event that a problem relating to it is found, please inform the Central Secretariat at the address given below.

Adopted by INCITS (InterNational Committee for Information Technology Standards) as an American National Standard.
Date of ANSI Approval: 11/17/2010
Published by American National Standards Institute, 25 West 43rd Street, New York, New York 10036

Copyright © 2010 by Information Technology Industry Council (ITI). All rights reserved.

These materials are subject to copyright claims of International Standardization Organization (ISO), International Electrotechnical Commission (IEC), American National Standards Institute (ANSI), and Information Technology Industry Council (ITI). Not for resale. No part of this publication may be reproduced in any form, including an electronic retrieval system,
without the prior written permission of ITI. All requests pertaining to this standard should be submitted to ITI, 1101 K Street NW, Suite 610, Washington DC 20005.

Printed in the United States of America

© ITIC 2010. All rights reserved.

ISO/IEC 24824-1:2007(E)
© ISO/IEC 2007. All rights reserved.

CONTENTS

1 Scope
2 Normative references
    2.1 Identical Recommendations | International Standards
    2.2 Additional references
3 Definitions
    3.1 ASN.1 terms
    3.2 ECN terms
    3.3 ISO/IEC 10646 terms
    3.4 Additional definitions
4 Abbreviations
5 Notation
6 Principles of vocabulary table construction and use
7 ASN.1 type definitions
    7.1 General
    7.2 The Document type
    7.3 The Element type
    7.4 The Attribute type
    7.5 The ProcessingInstruction type
    7.6 The UnexpandedEntityReference type
    7.7 The CharacterChunk type
    7.8 The Comment type
    7.9 The DocumentTypeDeclaration type
    7.10 The UnparsedEntity type
    7.11 The Notation type
    7.12 The NamespaceAttribute type
    7.13 The IdentifyingStringOrIndex type
    7.14 The NonIdentifyingStringOrIndex type
    7.15 The NameSurrogate type
    7.16 The QualifiedNameOrIndex type
    7.17 The EncodedCharacterString type
8 Construction and processing of a fast infoset document
    8.1 Conceptual ordering of components of an abstract value of the Document type
    8.2 The restricted alphabet table
    8.3 The encoding algorithm table
    8.4 The dynamic string tables
    8.5 The dynamic name tables and name surrogates
9 Built-in restricted alphabets
    9.1 The “numeric” restricted alphabet
    9.2 The “date and time” restricted alphabet
10 Built-in encoding algorithms
    10.1 General
    10.2 The “hexadecimal” encoding algorithm
    10.3 The “base64” encoding algorithm
    10.4 The “short” encoding algorithm
    10.5 The “int” encoding algorithm
    10.6 The “long” encoding algorithm
    10.7 The “boolean” encoding algorithm
    10.8 The “float” encoding algorithm
    10.9 The “double” encoding algorithm
    10.10 The “uuid” encoding algorithm
    10.11 The “cdata” encoding algorithm
11 Restrictions on the supported XML infosets and other simplifications
12 Bit-level encoding of the Document type
Annex A ASN.1 module and ECN modules for fast infoset documents
    A.1 ASN.1 module definition
    A.2 ECN module definitions
Annex B The MIME media type for fast infoset documents
Annex C Description of the encoding of a fast infoset document
    C.1 Fast infoset document
    C.2 Encoding of the Document type
    C.3 Encoding of the Element type
    C.4 Encoding of the Attribute type
    C.5 Encoding of the ProcessingInstruction type
    C.6 Encoding of the UnexpandedEntityReference type
    C.7 Encoding of the CharacterChunk type
    C.8 Encoding of the Comment type
    C.9 Encoding of the DocumentTypeDeclaration type
    C.10 Encoding of the UnparsedEntity type
    C.11 Encoding of the Notation type
    C.12 Encoding of the NamespaceAttribute type
    C.13 Encoding of the IdentifyingStringOrIndex type
    C.14 Encoding of the NonIdentifyingStringOrIndex type starting on the first bit of an octet
    C.15 Encoding of the NonIdentifyingStringOrIndex type starting on the third bit of an octet
    C.16 Encoding of the NameSurrogate type
    C.17 Encoding of the QualifiedNameOrIndex type starting on the second bit of an octet
    C.18 Encoding of the QualifiedNameOrIndex type starting on the third bit of an octet
    C.19 Encoding of the EncodedCharacterString type starting on the third bit of an octet
    C.20 Encoding of the EncodedCharacterString type starting on the fifth bit of an octet
    C.21 Encoding of the length of a sequence-of type
    C.22 Encoding of the NonEmptyOctetString type starting on the second bit of an octet
    C.23 Encoding of the NonEmptyOctetString starting on the fifth bit of an octet
    C.24 Encoding of the NonEmptyOctetString type starting on the seventh bit of an octet
    C.25 Encoding of integers in the range 1 to 2^20 starting on the second bit of an octet
    C.26 Encoding of integers in the range 0 to 2^20 starting on the second bit of an octet
    C.27 Encoding of integers in the range 1 to 2^20 starting on the third bit of an octet
    C.28 Encoding of integers in the range 1 to 2^20 starting on the fourth bit of an octet
    C.29 Encoding of integers in the range 1 to 256
Annex D Examples of encoding XML infosets as fast infoset documents
    D.1 Introduction of examples
    D.2 Size of example documents (including redundancy-based compression)
    D.3 UBL order example
    D.4 UBL Order fast infoset document with an external vocabulary
    D.5 UBL order fast infoset document without an initial vocabulary
Annex E Assignment of object identifier values
BIBLIOGRAPHY

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

ISO/IEC 24824-1 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 6, Telecommunications and information exchange between systems, in collaboration with ITU-T. The identical text is published as ITU-T Rec. X.891.

ISO/IEC 24824 consists of the following parts, under the
general title Information technology - Generic applications of ASN.1:

- Part 1: Fast infoset
- Part 2: Fast Web Services

The following part is under preparation:

- Part 3: Fast infoset security

Introduction

This Recommendation | International Standard specifies a representation of an instance of the W3C XML Information Set using binary encodings (specified using the ASN.1 notation and the ASN.1 Encoding Control Notation). The encoding specified in this edition of this Recommendation | International Standard is identified by the version number 1 (see 12.9).

The technology specified in this Recommendation | International Standard is named Fast Infoset. It provides an alternative to W3C XML syntax as a means of representing instances of the W3C XML Information Set. This representation generally provides smaller encoding sizes and faster processing than a W3C XML representation.

The representation of an instance of the W3C XML Information Set specified in this Recommendation | International Standard is called a fast infoset document. Each fast infoset document is an encoding of an abstract value of an ASN.1 data type (the Document type, see 7.2) representing an instance of the W3C XML Information Set.

This Recommendation | International Standard specifies the use of several techniques that minimize the size of a fast infoset document and that maximize the speed of creating and processing such documents. These techniques are based on the use of vocabulary tables, which allow typically-small integer values (vocabulary table indexes) to be used instead of character strings that form (for example) the names of elements or attributes in an XML 1.0 serialization of an instance of the W3C XML Information Set.

There are a number of
vocabulary tables (see clause 8), of which the most basic (the eight character string tables) map typically-small integers to strings of characters. There are, however, also vocabulary tables (the element name table and the attribute name table) that provide a further level of indirection, with a vocabulary table index mapping to a set of three vocabulary table indexes, identifying a prefix, a namespace name, and a local name.

Another important technique is the use of a restricted alphabet vocabulary table. This contains entries that list a subset of ISO/IEC 10646 characters. If a character string needs to be encoded for which there is an entry in this table, then it can be encoded by identifying that this vocabulary table is being used, giving the vocabulary table index, and then encoding each character in the minimum number of bits needed for that particular subset of ISO/IEC 10646 characters. There are a number of built-in restricted alphabets that always form the first few entries of this table, covering such commonly occurring strings as dates and times, and numeric values.

A further important optimization uses the encoding algorithm vocabulary table. This table identifies specialized encodings that can be employed for commonly occurring strings, again with a number of built-in algorithms. For example, if there is a string which looks like the decimal representation of an integer in the range -32768 to 32767, then that string can be encoded by identifying that this vocabulary table is being used, giving the vocabulary table index, and then encoding the integer as a two-octet signed integer. Floating-point numbers and arrays of such numbers are supported in the same way.

In order to ensure fast processing without sacrificing compactness, many components of a fast infoset document (such as character strings and components representing information items of the XML infoset) are octet-aligned, while other components (such as lengths and vocabulary table indexes) are not necessarily octet-aligned but always end on the last bit of an octet.

To provide a formal specification of these optimized encodings, the ASN.1 Encoding Control Notation (defined in ITU-T Rec. X.692 | ISO/IEC 8825-3) is used (see A.2), but use of ECN tools for implementation is not necessary and a complete description of the encoding is provided (see Annex C).

The vocabulary tables for a particular fast infoset document can be initialized by information at the head of the document, and are normally added to dynamically, providing flexibility for an encoder. The initial vocabulary tables can be provided by a reference to the set of final vocabulary tables of some other identified fast infoset document (or by other means). This vocabulary reference can then be supplemented by further table additions to provide the initial vocabulary tables for this document. Further dynamic additions are normally made to the tables during the creation or the processing of the document.

Finally, a mechanism is provided for the generator of a fast infoset document to include data (called additional processing data) related to optional additional processing of the fast infoset document, together with a URI that identifies a complete specification of the form and semantics of that additional processing data. The optional additional processing data is ignored by any subsequent processor of the fast infoset document if the URI is not known, or the processing that it specifies is not supported or not required.

NOTE - An example of such additional processing data would be data that provides indexes
that enable immediate access to parts of the fast infoset document, so that the whole document need not be processed if the only interest is in those parts of the fast infoset document that correspond to a specific XML tag.

Annex A forms an integral part of this Recommendation | International Standard, and contains an ASN.1 module (see ITU-T Rec. X.680 | ISO/IEC 8824-1) and two ECN modules (EDM and ELM, see ITU-T Rec. X.692 | ISO/IEC 8825-3) which together specify the abstract content and the bit-level encoding of a value of the Document type, which conveys the value of an instance of the W3C XML Information Set.

Annex B forms an integral part of this Recommendation | International Standard, and contains the specification of a MIME media type identifying a fast infoset document.

Annex C does not form an integral part of this Recommendation | International Standard, and provides a complete description of the encodings formally specified in clause 12 and A.2.

Annex D does not form an i