
ECMA 407-2014 Scalable Sparse Spatial Sound System (S5) - Base S5 Coding (1st Edition).pdf

Resource ID: 704840  File size: 1.01 MB  Pages: 34
Format: PDF



ECMA-407
1st Edition / June 2014

Scalable Sparse Spatial Sound System (S5) - Base S5 Coding

COPYRIGHT PROTECTED DOCUMENT
© Ecma International 2014

Contents

1 Scope
2 Conformance
3 Normative references
4 Terms, definitions and acronyms
5 S5 Overview
6 Inverse Coding
7 Configuration Data
  7.1 Syntax of Configuration Data (S5Config)
  7.2 Configuration Identifier (S5ConfigID)
  7.3 Window Size for the Calculation of Synchronization Tags (S5SyncTagWindow)
  7.4 Accuracy of the Calculation of Synchronization Tags (S5SyncTagAccuracy)
  7.5 Downmix Configuration (S5DownmixConfig)
  7.6 Output Channel Configuration (S5ChannelConfig)
  7.7 Upmix Configuration (S5UpmixConfig)
8 Inverse Coding Parameter Data
  8.1 Syntax of Inverse Coding Parameter Data (S5InvCodeData)
  8.2 Synchronization Elements (S5SyncTag, S5SyncTag-1, S5SyncTag-2)
  8.3 Number of Parameter Sets (S5ParameterSetCount)
  8.4 Inverse Coding Parameter Data Set ID (S5ParameterSetID)
  8.5 Parameter Data Set Type (S5ParameterSetType)
  8.6 Inverse Coding Parameter Data Set (S5ParameterSet)
9 Downmix
10 Upmix
  10.1 Synchronization of Inverse Coding Parameter Data
  10.2 Expanding of S5AbrParameterSet
  10.3 Default Values of Inverse Coding Parameter Data
  10.4 Default values of S5UpmixConfig
Annex A (normative) Channel Positions and Configurations
Annex B (normative) Syntax for S5UpmixConfig
Annex C (informative) Channel Configuration and Position Tables
Annex D (informative) Loudness Adjustment
Annex E (informative) Multiplexing

Introduction

S5 denotes a scalable multichannel coding system for spatial audio data compression, which can be applied to provide a 3D audio experience with little overhead. Such a system may incorporate a wide range of state-of-the-art audio codecs. By using an audio codec, which may offer encapsulation capacity for external data, S5 data may be carried
within the audio coder stream with little overhead while maintaining a compatible bit stream syntax.

This Standard specifies the base S5 encoder and decoder in terms of configuration data, downmix, inverse coding parameter data and upmix. It provides reference and guidance on how to incorporate further components to form a scalable multichannel coding system for audio data compression.

The base S5 codec achieves data compression of multichannel audio information by mapping the audio information onto a downmix signal and onto sparse spatial data, i.e. the parameter values of a mathematical model used to reconstruct localization and ambiance. A specific method, denoted as inverse coding, is used for upmixing from the audio downmix and its associated parameter values. Compressing the downmix audio with a state-of-the-art audio codec further increases the coding efficiency of S5.

An overview is given of how the base S5 encoder/decoder may be extended by incorporating an audio codec and other components; however, the components themselves and their interfaces are not specified. Such specific S5 codecs are subject to separate standards, which share the base S5 coding standard as their common basis.

This Ecma Standard has been adopted by the General Assembly of June 2014.

"COPYRIGHT NOTICE

© 2014 Ecma International

This document may be copied, published and distributed to others, and certain derivative works of it may be prepared, copied, published, and distributed, in whole or in part, provided that the above copyright notice and this Copyright License and Disclaimer are included on all such copies and derivative works. The only derivative works that are permissible under this Copyright License and Disclaimer are:

(i) works which incorporate all or portion of this document for the purpose of providing commentary or explanation (such as an annotated version of the document),
(ii) works which incorporate all or portion of this document for the purpose of incorporating features that provide accessibility,
(iii) translations of this document into languages other than English and into different formats and
(iv) works by making use of this specification in standard conformant products by implementing (e.g. by copy and paste wholly or partly) the functionality therein.

However, the content of this document itself may not be modified in any way, including by removing the copyright notice or references to Ecma International, except as required to translate it into languages other than English or into a different format.

The official version of an Ecma International document is the English language version on the Ecma International website. In the event of discrepancies between a translated version and the official version, the official version shall govern.

The limited permissions granted above are perpetual and will not be revoked by Ecma International or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and ECMA INTERNATIONAL DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY OWNERSHIP RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE."

1 Scope

This Standard specifies the base S5 encoder and decoder in terms of configuration data, downmix, inverse coding parameter data and upmix. In addition, it provides reference and guidance on how to incorporate further components to form a scalable multichannel coding system for audio data compression.

2 Conformance

Conformant base S5 encoders generate dataflows as specified in Clauses 7, 8, and 9. Conformant base S5 decoders generate the upmix as specified in Clause 10 by
processing dataflows as specified in Clauses 7, 8, and 9.

3 Normative references

ISO/IEC 23001-8, Information technology - MPEG systems technologies - Part 8: Coding-independent code points

IETF RFC 5234, Augmented BNF for syntax specifications: ABNF

4 Terms, definitions and acronyms

For the purposes of this document, the following terms and definitions apply.

4.1 downmix
reduced number of audio channels from an input signal

4.2 upmix
increased number of audio channels from a downmix

4.3 base S5 encoder
encoding unit providing the downmix, the inverse coding parameter data and the upmix configuration

4.4 base S5 decoder
decoding unit providing the upmix based on the downmix, the inverse coding parameter data and the upmix configuration

4.5 base audio codec
audio codec component providing lossless or lossy compression and decompression of the downmix

4.6 loudness
perceived level of an audio programme

4.7 Mid (M) signal
non-directional input signal to a Mid-Side (MS) decoder

4.8 Side (S) signal
directional input signal to a Mid-Side (MS) decoder

4.9 uimsbf
unsigned integer, most significant bit first

4.10 Q format
fixed point binary format for fractional numbers, where the number of fractional bits and the number of integer bits is specified

4.11 uqmsbf
unsigned Q format, most significant bit first

NOTE This Standard uses for uqmsbf the Q format notation Qm.n, where "m" designates the number of bits of the integer part and "n" denotes the number of bits of the fractional portion to the right of the binary point. The width "w" of the corresponding bitfield is w = m + n bits. The value range covers 0 to 2^m − 2^(−n) with a constant resolution of 2^(−n). To convert a number from unsigned Q format to a decimal number, take the Q bitfield as an integer and multiply it by 2^(−n).

4.12 sqmsbf
signed Q format, most significant bit first

NOTE This Standard uses for sqmsbf the two's complement with Q format notation Qm.n, where "m" designates the number of bits of the integer part without the sign bit and "n" the number of bits of the fractional portion to the right of the binary point. The width "w" of the corresponding bitfield is w = m + n + 1 bits, which includes the sign bit as most significant bit. The value range covers −2^m to 2^m − 2^(−n) with a constant resolution of 2^(−n). To convert a number from signed Q format to a decimal number, take the Q bitfield as
a two's complement integer and multiply it by 2^(−n).

5 S5 Overview

S5 denotes a scalable multichannel coding system for spatial audio data compression, which can be applied to provide a 3D audio experience with little overhead. Such a system may incorporate a wide range of state-of-the-art audio codecs. By using an audio codec, which may offer encapsulation capacity for external data, S5 data may be carried within the base audio coder stream with little overhead while maintaining a compatible bit stream syntax.

The system of an S5 codec is described by the functional block diagrams of the S5 encoder, as depicted in Figure 1, and of the S5 decoder, as depicted in Figure 2. An S5 encoder shall at least consist of a base S5 encoder, and an S5 decoder shall at least consist of a base S5 decoder.

The base S5 encoder shall achieve compression of multichannel audio information by downmixing the f-channel signal to g channels and shall produce sparse spatial data, which is a parametric encoding of a mathematical model to reconstruct from the downmix an upmix having the localization and ambiance approaching that of the original signal. A specific method, denoted as inverse coding (see Clause 6), is used to construct an upmix of h channels from the audio downmix and its associated spatial data. Compressing the downmix audio with a state-of-the-art base audio coder can further increase the coding efficiency of S5.

The various bitstreams produced by the functional units of an S5 encoder may be encapsulated into a single bitstream by the functional unit Multiplexer (see Annex E). Ancillary data may be conveyed from the S5 encoder to the S5 decoder and may be used to encapsulate data other than coding parameters, for example loudness parameters, which may be used to adjust the perceived level of audio signals. For the loudness parameters, see Annex D.

This Standard specifies the base S5 encoder/decoder and their interfaces only. All other components and their interfaces are not specified. Such specific S5 codecs are subject to separate standards. As the base S5 encoder/decoder is agnostic to the other system components, the base S5 coding standard shall represent the common base for all S5 specific standards.

Figure 1 - Functional block diagram of the S5 encoder

Figure 2 - Functional block diagram of the S5 decoder

The subsequent clauses of this Standard specify the syntax of data streams by using Augmented Backus Naur Form (ABNF) as defined in IETF RFC 5234. In addition to this notation, the code of the data stream elements is denoted by the format and the length of their bit fields. Note that syntax and final encoding of a data stream are strictly separated. For the same syntax of a data stream, an external encoding, e.g. by a multiplexer, may vary according to the constraints of the storage or transmission environment. Examples are byte alignment or error protection. However, external encoding details are beyond the scope of this Standard and are subject to specific S5 standards or other specifications.

6 Inverse Coding

Inverse coding denotes a mathematical method for upmixing a channel-based audio signal while preserving to a high degree the localization and
ambiance information of the audio source. Inverse coding is based on the spatial representation of a left and a right signal by a real-valued composite signal, the mid (M) signal, and a real-valued differential signal, the side (S) signal. A mid-side (MS) decoder maps the samples of the MS signals onto the left and the right channel without information loss. The mapping follows the equations below:

Left = (M + S) · 1/√2
Right = (M − S) · 1/√2

Inverse coding assumes that the S signal can be approximated by processing the M signal with two specific gains Pα, Pβ and two specific delays Lα, Lβ. Figure 4 depicts inverse coding as a signal processing unit. The corresponding functions Lα, Lβ, Pα, Pβ and their parameters refer to Table 1:

Table 1 - Formulae of inverse coding gains and delays

Delay Lα:  Lα = ρ · ( −f(φ)/(2 sin φ) + √( f²(φ)/(4 sin²φ) + f²(α) − f(φ) · f(α) · sin α / sin φ ) )
Delay Lβ:  Lβ = ρ · ( −f(φ)/(2 sin φ) + √( f²(φ)/(4 sin²φ) + f²(β) + f(φ) · f(β) · sin β / sin φ ) )
Gain Pα:   Pα = f²(φ)/(4 sin²φ) + f²(α) − f(φ) · f(α) · sin α / sin φ
Gain Pβ:   Pβ = f²(φ)/(4 sin²φ) + f²(β) + f(φ) · f(β) · sin β / sin φ

The discriminant relationship of these gains and delays induces sound source separation for sound sources, even at the same frequency. It relies on an inverse problem solution, which takes into account the directivity pattern together with angular assumptions with regard to the main angle of incidence φ, a left opening angle α and a right opening angle β. These parameters of such discriminant relationship define the overall sound stage of the resulting MS signal. Figure 3 depicts this relationship for an M signal showing a cardioid polar pattern.

Figure 3 - Angular assumptions and directivity pattern with inverse coding

The following inverse coding parameters are applied:

- f: the directivity pattern, i.e. the ascertained polar diagram, with f() = 1 2 + 2 sin with 0 2 and 0 ≤ n ≤ 2
- φ: the ascertained main angle of incidence between sound source and the polar main axis of the directivity pattern (such polar axis corresponding with a microphone's main axis), with −π/2 ≤ φ ≤ π/2
- α: the stipulated left opening angle adjoining the polar main axis of the directivity pattern on the left, with 0 ≤ α ≤ π/2. For a positive main angle of incidence φ, the condition φ ≤ α shall be satisfied.
- β: the stipulated right opening angle adjoining the polar main axis of the directivity pattern on the right, with 0 ≤ β ≤ π/2. For a negative main angle of incidence φ, the condition |φ| ≤ β shall be satisfied.
- ρ: the time scaling factor for generating an S-signal, with 0.029 s ≤ ρ ≤ 0.146 s
- λ: the side signal ratio gain to control the S-signal level, with 0 ≤ λ ≤ 1, leading to signals that may be seamlessly varied between a degree of correlation of −1 and +1.

These parameters are applied for inverse coding as shown by the signal processing circuit of Figure 4:

Figure 4 - Functional block diagram for an inverse coding function

According to Figure 4, the samples of left and right channels shall be derived from the previous equations as given in Table 2.

Table 2 - Formulae of inverse coding functions

Left = 1/√2 · (M + λ · (Pα · delay(M, Lα) + Pβ · delay(M, Lβ)))
Right = 1/√2 · (M − λ · (Pα · delay …
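
The mid-side mapping and the Table 2 upmix described in Clause 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the normative processing: the function names are hypothetical, delays are taken as whole samples with zero padding at the start, the side signal is assumed to be the side signal ratio gain times the sum of the two gain-weighted, delayed copies of M, and the Right formula (cut off in this text) is assumed to mirror the Left formula with the side term subtracted.

```python
import math


def delay(signal, n):
    """Delay `signal` by n whole samples, zero-padding the start.

    Assumption: integer-sample delays and zero padding; the standard's own
    delay() primitive is not reproduced in this text.
    """
    if n < 0:
        raise ValueError("delay must be non-negative")
    return ([0.0] * n + list(signal))[:len(signal)]


def ms_decode(mid, side):
    """MS decoder: Left = (M + S) / sqrt(2), Right = (M - S) / sqrt(2)."""
    k = 1.0 / math.sqrt(2.0)
    left = [k * (m + s) for m, s in zip(mid, side)]
    right = [k * (m - s) for m, s in zip(mid, side)]
    return left, right


def inverse_coding_upmix(mid, gain_a, gain_b, delay_a, delay_b, side_gain=1.0):
    """Approximate S from M with two gains and two delays, then MS-decode.

    Assumption: S = side_gain * (gain_a * delay(M, delay_a)
                                 + gain_b * delay(M, delay_b)).
    """
    delayed_a = delay(mid, delay_a)
    delayed_b = delay(mid, delay_b)
    side = [side_gain * (gain_a * a + gain_b * b)
            for a, b in zip(delayed_a, delayed_b)]
    return ms_decode(mid, side)


if __name__ == "__main__":
    impulse = [1.0, 0.0, 0.0, 0.0]
    left, right = inverse_coding_upmix(impulse, 0.5, 0.25, 1, 2)
    print(left)
    print(right)
```

Feeding a unit impulse makes the structure visible: both channels carry the direct M sample, followed by the two delayed copies with opposite signs in Left and Right. Summing Left and Right and scaling by 1/√2 returns M, which is the lossless property of the MS mapping.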

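
The conversion rules in the NOTEs to 4.11 (uqmsbf) and 4.12 (sqmsbf) can be illustrated with a short sketch. The function names and the choice to pass the bitfield as a non-negative Python integer are assumptions made for illustration; only the rule itself (interpret the bitfield as an integer, or as a two's complement integer for the signed case, and multiply by 2^(−n)) comes from the text.

```python
def uq_to_float(bits, m, n):
    """Decode an unsigned Qm.n bitfield (uqmsbf) of width w = m + n bits."""
    width = m + n
    if not 0 <= bits < (1 << width):
        raise ValueError("bitfield does not fit in m + n bits")
    return bits * 2.0 ** -n


def sq_to_float(bits, m, n):
    """Decode a signed Qm.n bitfield (sqmsbf) of width w = m + n + 1 bits.

    The most significant bit is the sign bit; the field is read as a
    two's complement integer before scaling by 2^(-n).
    """
    width = m + n + 1
    if not 0 <= bits < (1 << width):
        raise ValueError("bitfield does not fit in m + n + 1 bits")
    if bits & (1 << (width - 1)):   # sign bit set: negative value
        bits -= 1 << width
    return bits * 2.0 ** -n


if __name__ == "__main__":
    print(uq_to_float(0b1011, 2, 2))    # Q2.2 unsigned, 4-bit field
    print(sq_to_float(0b10000, 2, 2))   # Q2.2 signed, 5-bit field
```

For Q2.2 the unsigned field spans 0 to 3.75 in steps of 0.25, and the signed field spans −4.0 to 3.75, matching the ranges 0 to 2^m − 2^(−n) and −2^m to 2^m − 2^(−n).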