Reference number
ISO/IEC 23000-12:2010/Amd.2:2012(E)
© ISO/IEC 2012

INTERNATIONAL STANDARD ISO/IEC 23000-12
First edition 2010-07-15
AMENDMENT 2
2012-05-01

Information technology
Multimedia application format (MPEG-A)
Part 12: Interactive music application format
AMENDMENT 2: Compact representation of dynamic volume change and audio equalization

Technologies de l'information
Format pour application multimédia (MPEG-A)
Partie 12: Format d'application musicale interactive
AMENDEMENT 2: Représentation compacte de changement de volume dynamique et égalisation audio

COPYRIGHT PROTECTED DOCUMENT

© ISO/IEC 2012
All rights reserved. Unless otherwise specified, no part of this publication may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm, without permission in writing from either ISO at the address below or ISO's member body in the country of the requester.

ISO copyright office
Case postale 56, CH-1211 Geneva 20
Tel. + 41 22 749 01 11
Fax + 41 22 749 09 47
E-mail copyright@iso.org
Web www.iso.org

Published in Switzerland

Foreword

ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work. In the field of information technology, ISO and IEC have established a joint technical committee, ISO/IEC JTC 1.

International Standards are drafted in accordance with the rules given in the ISO/IEC Directives, Part 2.

The main task of the joint technical committee is to prepare International Standards. Draft International Standards adopted by the joint technical committee are circulated to national bodies for voting. Publication as an International Standard requires approval by at least 75 % of the national bodies casting a vote.

Attention is drawn
to the possibility that some of the elements of this document may be the subject of patent rights. ISO and IEC shall not be held responsible for identifying any or all such patent rights.

Amendment 2 to ISO/IEC 23000-12:2010 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information.

Information technology
Multimedia application format (MPEG-A)
Part 12: Interactive music application format
AMENDMENT 2: Compact representation of dynamic volume change and audio equalization

In 6.6.5: Preset Box, replace:

    if(preset_type == 0)
        for(i=0; i<num_preset_elements; i++)
            unsigned int(8) preset_volume_element;
    if(preset_type == 1)
        unsigned int(8) num_input_channel[num_preset_elements];
        unsigned int(8) output_channel_type;
        for (i=0; i<num_preset_elements; i++)
            for (j=0; j<num_input_channel[i]; j++)
                for (k=0; k<num_output_channel; k++)
                    unsigned int(8) preset_volume_element;
    if(preset_type == 2) // dynamic track volume preset
        unsigned int(16) num_updates;
        for(i=0; i<num_updates; i++)
            unsigned int(16) updated_sample_number;
            for(j=0; j<num_preset_elements; j++)
                unsigned int(8) preset_volume_element;
    if(preset_type == 3) // dynamic object volume preset
        unsigned int(16) num_updates;
        unsigned int(8) num_input_channel[num_preset_elements];
        unsigned int(8) output_channel_type;
        for(i=0; i<num_updates; i++)
            unsigned int(16) updated_sample_number;
            for(j=0; j<num_preset_elements; j++)
                for(k=0; k<num_input_channel[j]; k++)
                    for(m=0; m<num_output_channel; m++)
                        unsigned int(8) preset_volume_element;

with:

    if(preset_type ... i<num_preset_elements; i++)
        unsigned int(8) preset_volume_element;
        if(preset_type ...
            for(j=0; j<num_eq_filters; j++)
                unsigned int(8) filter_type;
                unsigned int(16) filter_reference_frequency;
                unsigned int(8) filter_gain;
                unsigned int(8) filter_bandwidth;
    if(preset_type ...
        unsigned int(8) output_channel_type;
        for (i=0; i<num_preset_elements; i++)
            for (j=0; j<num_input_channel[i]; j++)
                for (k=0; k<num_output_channel; k++)
                    unsigned int(8) preset_volume_element;
                if(preset_type ...
                    for(k=0; k<num_eq_filters; k++)
                        unsigned int(8) filter_type;
                        unsigned int(16) filter_reference_frequency;
                        unsigned int(8) filter_gain;
                        unsigned int(8) filter_bandwidth;
    if(preset_type ...
        for(i=0; i<num_updates; i++)
            unsigned int(16) updated_sample_number;
            for(j=0; j<num_preset_elements; j++)
                unsigned int(8) preset_volume_element;
                if(preset_type ...
                    for(k=0; k<num_eq_filters; k++)
                        unsigned int(8) filter_type;
                        unsigned int(16) filter_reference_frequency;
                        unsigned int(8) filter_gain;
                        unsigned int(8) filter_bandwidth;
    if(preset_type ...
        unsigned int(8) num_input_channel[num_preset_elements];
        unsigned int(8) output_channel_type;
        for(i=0; i<num_updates; i++)
            unsigned int(16) updated_sample_number;
            for(j=0; j<num_preset_elements; j++)
                for(k=0; k<num_input_channel[j]; k++)
                    for(m=0; m<num_output_channel; m++)
                        unsigned int(8) preset_volume_element;
                    if(preset_type ...
                        for(m=0; m<num_eq_filters; m++)
                            unsigned int(8) filter_type;
                            unsigned int(16) filter_reference_frequency;
                            unsigned int(8) filter_gain;
                            unsigned int(8) filter_bandwidth;
    if(preset_type ...
        for(i=0; i<num_updates; i++)
            unsigned int(16) start_sample_number;
            unsigned int(16) duration_update;
            for(j=0; j<num_preset_elements; j++)
                unsigned int(8) end_preset_volume_element;
                if(preset_type ...
                    for(k=0; k<num_eq_filters; k++)
                        unsigned int(8) filter_type;
                        unsigned int(16) filter_reference_frequency;
                        unsigned int(8) end_filter_gain;
                        unsigned int(8) filter_bandwidth;
    if(preset_type ...
        unsigned int(8) num_input_channel[num_preset_elements];
        unsigned int(8) output_channel_type;
        for(i=0; i<num_updates; i++)
            unsigned int(16) start_sample_number;
            unsigned int(16) duration_update;
            for(j=0; j<num_preset_elements; j++)
                for(k=0; k<num_input_channel[j]; k++)
                    for(m=0; m<num_output_channel; m++)
                        unsigned int(8) end_preset_volume_element;
                    if(preset_type ...
                        for(m=0; m<num_eq_filters; m++)
                            unsigned int(8) filter_type;
                            unsigned int(16) filter_reference_frequency;
                            unsigned int(8) end_filter_gain;
                            unsigned int(8) filter_bandwidth;

In 6.6.5: Preset Box, replace:

preset_type is an integer that indicates the preset type.

Static track volume preset has the time invariant volume information related to each track involved in the preset. In this case, the output channel type is the same
as channel type of the track which has the largest number of channels among tracks involved in the preset. Type value is 0.

Static object volume preset has the time invariant volume information related to each object which is individual channel (i.e. mono) of the track involved in the preset. Type value is 1.

Dynamic track volume preset has the time variant volume information related to each track involved in the preset. In this case, the output channel type is the same as channel type of the track which has the largest number of channels among tracks involved in the preset. Type value is 2.

Dynamic object volume preset has the time variant volume information related to each object which is individual channel (i.e. mono) of the track involved in the preset. Type value is 3.

    preset_type   Meaning
    0             static track volume preset
    1             static object volume preset
    2             dynamic track volume preset
    3             dynamic object volume preset

with:

preset_type is an integer that indicates the preset type. A preset can contain volume and/or audio equalization (EQ) information. The three least significant bits (0b00000111) of preset_type represent the volume-related information, and the fourth least significant bit (0b00001000) represents the EQ-related information.

Static track volume preset has the time-invariant volume, with or without EQ information, related to each track involved in the preset. In this case, the output channel type is the same as the channel type of the track that has the largest number of channels among the tracks involved in the preset. Type value is 0 without EQ, or 8 with EQ.

Static object volume preset has the time-invariant volume, with or without EQ information, related to each object, which is an individual channel (i.e. mono) of the track involved in the preset. Type value is 1 without EQ, or 9 with EQ.

Dynamic track volume preset has the time-variant volume, with or without EQ information, related to each track involved in the preset. In this case, the output channel type is the same as the channel type of the track that has the largest number of channels among the tracks involved in the preset. Type value is 2 without EQ, or 10 with EQ.

Dynamic object volume preset has the time-variant volume, with or without EQ information, related to each object, which is an individual channel (i.e. mono) of the track involved in the preset. Type value is 3 without EQ, or 11 with EQ.

Dynamic track approximated volume preset has the time-variant approximated volume, with or without EQ information, related to each track involved in the preset. In this case, the output channel type is the same as the channel type of the track that has the largest number of channels among the tracks involved in the preset. Type value is 4 without EQ, or 12 with EQ.

Dynamic object approximated volume preset has the time-variant approximated volume, with or without EQ information, related to each object, which is an individual channel (i.e. mono) of the track involved in the preset. Type value is 5 without EQ, or 13 with EQ.

    preset_type   Meaning
    0             static track volume preset
    1             static object volume preset
    2             dynamic track volume preset
    3             dynamic object volume preset
    4             dynamic track approximated volume preset
    5             dynamic object approximated volume preset
    6             Value reserved
    7             Value reserved
    8             static track volume preset with EQ
    9             static object volume preset with EQ
    10            dynamic track volume preset with EQ
    11            dynamic object volume preset with EQ
    12            dynamic track approximated volume preset with EQ
    13            dynamic object approximated volume preset with EQ

In 6.6.5: Preset Box, replace:

num_updates
is an integer that gives the number of updates on preset_volume.

with:

num_updates is an integer that gives the number of updates on preset_volume or filter_gain. When preset_type is 4, 5, 12 or 13, it gives the number of updates on end_preset_volume or end_filter_gain.

In 6.6.5: Preset Box, before "preset_name is a null-terminated string in UTF-8 characters which gives a human-readable name for the preset.", add:

start_sample_number is an integer that indicates the time at which the gradual volume or EQ update starts.

duration_update is an integer that indicates the number of samples (time duration) over which the gradual volume or EQ update takes place. The volume level or filter gain level changes linearly in time over this duration, from the previous volume level or filter gain
level before the update to the new volume level or filter gain level. When duration_update is 0, the volume or EQ update occurs instantly.

end_preset_volume_element is an integer that indicates the new volume at the end of the gradual volume update.

In dynamic presets (preset_type equal to 2, 3, 4, 5, 10, 11, 12 or 13), the first volume update should be at the beginning time (where updated_sample_number is equal to 0) so that volume levels are defined for the whole time period of the music.

num_eq_filters is an integer representing the number of filters used for each preset element.

filter_type is an integer representing the type of filter. There are 5 filter types: Low pass filter (LPF), High pass filter (HPF), Low shelf filter (LSF), High shelf filter (HSF) and Peaking filter. The filter types and the corresponding values of filter_type are listed in the table below.

    filter_type   1     2     3     4     5
    Filter type   LPF   HPF   LSF   HSF   Peaking

filter_reference_frequency, filter_gain, end_filter_gain, and filter_bandwidth are integers representing the parameters of the filters. The exact meanings of the parameters depend on the type of filter, as specified in filter_type. The meanings of the parameters are listed in the table below.

    Filter type   filter_reference_frequency    filter_gain / end_filter_gain   filter_bandwidth
    LPF           Cut-off frequency (F in Hz)   Undefined                       Slope (S in dB/octave)
    HPF           Cut-off frequency (F in Hz)   Undefined                       Slope (S in dB/octave)
    LSF           Corner frequency (F in Hz)    Gain (G in dB)                  Slope (S in dB/octave)
    HSF           Corner frequency (F in Hz)    Gain (G in dB)                  Slope (S in dB/octave)
    Peaking       Center frequency (F in Hz)    Gain (G in dB)                  Quality factor (Q)

filter_reference_frequency is a 16-bit unsigned integer. The frequency value F is exactly the value of filter_reference_frequency: F = filter_reference_frequency (Hz). The frequency range is from 0 Hz to 65535 Hz, which covers the frequency range of 96 kHz sampled audio.

filter_gain is an 8-bit unsigned integer. The gain value G represented by filter_gain is computed by: G = filter_gain/5 - 41 (dB). A range from -41.0 dB to 10.0 dB with 0.2 dB resolution can be represented. In the cases of low pass filter (LPF) and high pass filter (HPF), filter_gain is undefined.

end_filter_gain is an 8-bit unsigned integer that
indicates the EQ filter gain at the end of the gradual EQ update. The gain value G represented by end_filter_gain is computed by: G = end_filter_gain/5 - 41 (dB). Thus a range from -41.0 dB to 10.0 dB with 0.2 dB resolution can be represented. In the cases of low pass filter (LPF) and high pass filter (HPF), end_filter_gain is undefined.

filter_bandwidth is an 8-bit unsigned integer, which indicates the slope of the filter for LPF, HPF, LSF and HSF. The slope value S in dB/octave is computed by: S = filter_bandwidth*6 (dB/octave). For the peaking filter, filter_bandwidth indicates the quality factor Q. The value Q is computed by: Q = filter_bandwidth/10.

Add the following Annexes after Annex D:

Annex E
(informative)

Compact Dynamic Volume Change Representation

E.1 Description of compact dynamic volume change representation

In dynamic volume preset of IM AF, t