ETSI TS 126 191 V14.0.0 (2017-04)

Digital cellular telecommunications system (Phase 2+) (GSM);
Universal Mobile Telecommunications System (UMTS);
LTE;
Speech codec speech processing functions;
Adaptive Multi-Rate - Wideband (AMR-WB) speech codec;
Error concealment of erroneous or lost frames
(3GPP TS 26.191 version 14.0.0 Release 14)

TECHNICAL SPECIFICATION

Reference
RTS/TSGS-0426191ve00

Keywords
GSM, LTE, UMTS

ETSI
650 Route des Lucioles
F-06921 Sophia Antipolis Cedex - FRANCE
Tel.: +33 4 92 94 42 00 Fax: +33 4 93 65 47 16
Siret No. 348 623 562 00017 - NAF 742 C
Non-profit association registered at the Sous-Préfecture de Grasse (06) No. 7803/88

Important notice

The present document can be downloaded from: http://www.etsi.org/standards-search

The present document may be made available in electronic versions and/or in print. The content of any electronic and/or print versions of the present document shall not be modified without the prior written authorization of ETSI. In case of any existing or perceived difference in contents between such versions and/or in print, the only prevailing document is the print of the Portable Document Format (PDF) version kept on a specific network drive within ETSI Secretariat.

Users of the present document should be aware that the document may be subject to revision or change of status. Information on the current status of this and other ETSI documents is available at https://portal.etsi.org/TB/ETSIDeliverableStatus.aspx

If you find errors in the present document, please send your comment to one of the following services: https://portal.etsi.org/People/CommiteeSupportStaff.aspx

Copyright Notification

No part may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm except as authorized by written permission of ETSI. The content of the PDF version shall not be modified without the written authorization of ETSI. The copyright and the foregoing restriction extend to reproduction in all media.

© European Telecommunications Standards Institute 2017. All rights reserved.

DECT™, PLUGTESTS™, UMTS™ and the ETSI logo are Trade Marks of ETSI registered for the benefit of its Members. 3GPP™ and LTE are Trade Marks of ETSI registered for the benefit of its Members and of the 3GPP Organizational Partners.
GSM and the GSM logo are Trade Marks registered and owned by the GSM Association.

Intellectual Property Rights

IPRs essential or potentially essential to the present document may have been declared to ETSI. The information pertaining to these essential IPRs, if any, is publicly available for ETSI members and non-members, and can be found in ETSI SR 000 314: “Intellectual Property Rights (IPRs); Essential, or potentially Essential, IPRs notified to ETSI in respect of ETSI standards”, which is available from the ETSI Secretariat. Latest updates are available on the ETSI Web server (https://ipr.etsi.org/).

Pursuant to the ETSI IPR Policy, no investigation, including IPR searches, has been carried out by ETSI. No guarantee can be given as to the existence of other IPRs not referenced in ETSI SR 000 314 (or the updates on the ETSI Web server) which are, or may be, or may become, essential to the present document.

Foreword

This Technical Specification (TS) has been produced by ETSI 3rd Generation Partnership Project (3GPP).

The present document may refer to technical specifications or reports using their 3GPP identities, UMTS identities or GSM identities. These should be interpreted as being references to the corresponding ETSI deliverables.

The cross reference between GSM, UMTS, 3GPP and ETSI identities can be found under http://webapp.etsi.org/key/queryform.asp.

Modal verbs terminology

In the present document “shall”, “shall not”, “should”, “should not”, “may”, “need not”, “will”, “will not”, “can” and “cannot” are to be interpreted as described in clause 3.2 of the ETSI Drafting Rules (Verbal forms for the expression of provisions). “must” and “must not” are NOT allowed in ETSI deliverables except
when used in direct citation.

Contents

Intellectual Property Rights
Foreword
Modal verbs terminology
Foreword
1 Scope
2 Normative references
3 Definitions and abbreviations
3.1 Definitions
3.2 Abbreviations
4 General
5 Requirements
5.1 Error detection
5.2 Erroneous or lost speech frames
5.3 First lost SID frame
5.4 Subsequent lost SID frames
6 Example ECU/BFH Solution
6.1 State Machine
6.2 Substitution and muting of erroneous/lost speech frames
6.2.1 BFI = 0, prevBFI = 0, State = 0 or 1
6.2.2 BFI = 0, prevBFI = 1, State = 0 to 3
6.2.3 BFI = 1, prevBFI = 0 or 1, State = 1...6
6.2.3.1 LTP gain ...

2 presented to TSG for approval;
3 Indicates TSG approved document under change control.
y the second digit is incremented for all changes of substance, i.e. technical enhancements, corrections, updates, etc.
z the third digit is incremented when editorial only changes have been incorporated in the specification;

1 Scope

This specification defines an error concealment procedure, also termed frame substitution and muting procedure, which shall be used by the AMR-WB speech codec receiving end when one or more erroneous/lost speech or lost Silence Descriptor (SID) frames are received.

The requirements of this document are mandatory for implementation in all
networks and User Equipment (UE)s capable of supporting the AMR-WB speech codec. It is not mandatory to follow the bit exact implementation outlined in this document and the corresponding C source code.

2 Normative references

The following documents contain provisions which, through reference in this text, constitute provisions of the present document.

References are either specific (identified by date of publication, edition number, version number, etc.) or non-specific.
For a specific reference, subsequent revisions do not apply.
For a non-specific reference, the latest version applies. In the case of a reference to a 3GPP document (including a GSM document), a non-specific reference implicitly refers to the latest version of that document in the same Release as the present document.

[1] 3GPP TS 26.202: "AMR Wideband Speech Codec; Interface to RAN".
[2] 3GPP TS 26.190: "AMR Wideband Speech Codec; Transcoding functions".
[3] 3GPP TS 26.193: "AMR Wideband Speech Codec; Source Controlled Rate operation".
[4] 3GPP TS 26.201: "AMR Wideband Speech Codec; Frame structure".

3 Definitions and abbreviations

3.1 Definitions

For the purposes of this document, the following definition applies:

N-point median operation: Consists of sorting the N elements belonging to the set for which the median operation is to be performed in an ascending order according to their values, and selecting the (int(N/2) + 1)-th largest value of the sorted set as the median value.

Further definitions of terms used in this
document can be found in the references.

3.2 Abbreviations

For the purposes of this document, the following abbreviations apply:

AMR-WB    Adaptive Multi Rate - WideBand
AN        Access Network
BFI       Bad Frame Indication from AN
BSI_netw  Bad Sub-block Indication obtained from AN interface CRC checks
prevBFI   Bad Frame Indication of previous frame
RX        Receive
SCR       Source Controlled Rate (operation)
SID       Silence Descriptor frame (Background noise)
CRC       Cyclic Redundancy Check
ECU       Error Concealment Unit
BFH       Bad Frame Handling
medianN   N-point median operation

4 General

The purpose of the error concealment procedure is to conceal the effect of erroneous/lost AMR-WB speech frames. The purpose of muting the output in the case of several erroneous/lost frames is to indicate the breakdown of the channel to the user and to avoid generating possibly annoying sounds as a result of the error concealment procedure.

The network shall indicate erroneous/lost speech or lost SID frames by setting the RX_TYPE values [3] to SPEECH_BAD, SID_BAD or SPEECH_LOST. If these flags are set, the speech decoder shall perform parameter substitution to conceal errors.

The example solution provided in clause 6 applies only to bad frame handling on a complete speech frame basis. Sub-frame based error concealment may be derived using similar methods.

5 Requirements

5.1 Error detection

If the most sensitive bits of the AMR-WB speech data (class A
in [4]) are received in error, the network shall indicate RX_TYPE = SPEECH_BAD, in which case the BFI flag is set. When the frame is not received, the network shall indicate RX_TYPE = RX_SPEECH_LOST, in which case the BFI flag is set as well. If a SID frame is received in error, the network shall indicate RX_TYPE = SID_BAD.

5.2 Erroneous or lost speech frames

Normal decoding of erroneous/lost speech frames would result in very unpleasant noise effects. In order to improve the subjective quality, erroneous/lost speech frames shall be substituted with either a repetition or an extrapolation of the previous good speech frame(s). This substitution is done so that the output level gradually decreases, resulting in silence at the output. Subclause 6 provides an example solution.

5.3 First lost SID frame

A lost SID frame shall be substituted by using the SID information from earlier received valid SID frames, and the procedure for valid SID frames shall be applied as described in [3].

5.4 Subsequent lost SID frames

For many subsequent lost SID frames, a muting technique shall be applied to the comfort noise that will gradually decrease the output level. For subsequent lost SID frames, the muting of
the output shall be maintained. Subclause 6 provides example solutions.

6 Example ECU/BFH Solution

6.1 State Machine

This example solution for substitution and muting is based on a state machine with seven states (Figure 1).

The system starts in state 0. Each time a bad frame is detected, the state counter is incremented by one and is saturated when it reaches 6. Each time a good speech frame is detected, the state counter is right-shifted by one. The state indicates the quality of the channel: the larger the value of the state counter, the worse the channel quality is. The control flow of the state machine can be described by the following C code (BFI = bad frame indicator, State = state variable):

    if (BFI != 0) {
        State = State + 1;
        if (State > 6)
            State = 6;      /* saturate at the highest state */
    } else {
        State = State >> 1; /* good frame: right-shift by one */
    }

In addition to this state machine, the Bad Frame Flag from the previous frame is checked (prevBFI). The processing depends on the value of the State-variable. In states 0 and 6, the processing depends on the BFI flag.

The procedure can be described as follows:

[State diagram. Transitions are labelled "Good frame (BFI=0)" and "Bad frame (BFI=1)". State labels:]
STATE = 0: BFI = 0, prevBFI = 0 or 1
STATE = 1: (BFI, prevBFI) = (1,0) or (0,1) or (0,0)
STATE = 2: (BFI, prevBFI) = (1,1) or (1,0) or (0,1)
STATE = 3: (BFI, prevBFI) = (1,1) or (1,0) or (0,1)
STATE = 4: BFI = 1, prevBFI = 0 or 1
STATE = 5: BFI = 1, prevBFI = 1
STATE = 6: BFI = 1, prevBFI = 1

Figure 1: State machine for controlling the bad frame substitution

6.2 Substitution and muting of erroneous/lost speech frames

6.2.1 BFI = 0, prevBFI = 0, State = 0 or 1

No error is detected in the received or in the previous received speech frame. The received speech parameters are used normally in the speech synthesis. The current frame of speech parameters is saved.

6.2.2 BFI = 0, prevBFI = 1, State = 0 to 3

No error is detected in the received speech frame but the previous received speech frame was bad. The LTP
gain is used normally in the speech synthesis, and the fixed codebook gain is limited with respect to the value used for the last received good subframe:

$$ g_c(n) = \begin{cases} g_c^{received}, & g_c^{received} \le 100 \ \text{or}\ g_c^{received} \le 1.25 \cdot g_c(n-1) \\ 1.25 \cdot g_c(n-1), & \text{otherwise} \end{cases} \tag{1} $$

where:
$g_c^{received}$ = current decoded fixed codebook gain,
$g_c(n-1)$ = fixed codebook gain used for the last good subframe (BFI = 0),
$g_c(n)$ = fixed codebook gain to be used for the current frame.

The rest of the received speech parameters are used normally in the speech synthesis. The current frame of speech parameters is saved.

6.2.3 BFI = 1, prevBFI = 0 or 1, State = 1...6

An error is detected in the received speech frame and the substitution and muting procedure is started.

6.2.3.1 LTP gain & fixed codebook gain concealment when RX_FRAMETYPE = SPEECH_BAD

The LTP gain $g_p$ and fixed codebook gain $g_c$ are replaced by attenuated values from the previous subframes:

$$ g_p = P_p(\mathit{state}) \cdot \mathrm{median5}\big(g_p(n-1), \ldots, g_p(n-5)\big) \tag{2} $$

$$ g_c = \begin{cases} P_c(\mathit{state}) \cdot \mathrm{median5}\big(g_c(n-1), \ldots, g_c(n-5)\big), & \mathrm{VAD\_HIST} \le 2 \\ \mathrm{median5}\big(g_c(n-1), \ldots, g_c(n-5)\big), & \mathrm{VAD\_HIST} > 2 \end{cases} \tag{3} $$

where:
$g_p$ = current decoded LTP gain,
$g_c$ = current decoded fixed codebook gain,
$g_p(n-1), \ldots, g_p(n-5)$ = LTP gains used for the last 5 subframes,
$g_c(n-1), \ldots, g_c(n-5)$ = fixed codebook gains used for the last 5 subframes,
median5() = 5-point median operation,
$P_p(\mathit{state})$ = attenuation factor (Pp(1) = 0.98, Pp(2) = 0.96, Pp(3) = 0.75, Pp(4) = 0.23, Pp(5) = 0.05, Pp(6) = 0.01),
$P_c(\mathit{state})$ = attenuation factor (Pc(1) = 0.98, Pc(2) = 0.98, Pc(3) = 0.98, Pc(4) = 0.98, Pc(5) = 0.98, Pc(6) = 0.70),
state = state number (0 to 6),
VAD_HIST = number of consecutive VAD = 0 decisions.

The higher the state value is, the more the gains are attenuated. Also the memory of the predictive fixed codebook gain is updated by using the average value of the past four values in the memory:

$$ \mathrm{ener}(0) = \frac{1}{4} \sum_{i=1}^{4} \mathrm{ener}(-i) - 3 \tag{4} $$
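The SPEECH_BAD procedure above can be sketched in C as follows. This is an illustrative sketch only, not the bit-exact reference implementation: the names `ConcealMem`, `update_state`, `median5` (as a C function), `conceal_gains_bad`, `Pp_bad` and `Pc_bad` are hypothetical, the memory layout (newest energy value stored first) is an assumption, and equation (4) is read here as the mean of the four stored values minus 3. The branch on VAD_HIST leaves the fixed codebook gain unattenuated once more than two consecutive VAD = 0 decisions have been seen, which is how equation (3) is interpreted in this sketch.

```c
#include <assert.h>
#include <string.h>

#define GAIN_HIST 5 /* gains kept from the last five subframes */
#define ENER_HIST 4 /* memory of the predictive fixed codebook gain */

/* Attenuation factors for SPEECH_BAD, indexed by state 1..6.
   Entry 0 is padding added by this sketch and is never used. */
static const float Pp_bad[7] = {1.00f, 0.98f, 0.96f, 0.75f, 0.23f, 0.05f, 0.01f};
static const float Pc_bad[7] = {1.00f, 0.98f, 0.98f, 0.98f, 0.98f, 0.98f, 0.70f};

/* Hypothetical container for the decoder memories used below. */
typedef struct {
    float gp_hist[GAIN_HIST]; /* LTP gains of the last 5 subframes */
    float gc_hist[GAIN_HIST]; /* fixed codebook gains of the last 5 subframes */
    float ener[ENER_HIST];    /* ener[0] is the newest entry (assumed layout) */
} ConcealMem;

/* State machine of clause 6.1: count up on bad frames (saturating at 6),
   right-shift by one on good frames. */
static int update_state(int state, int bfi)
{
    if (bfi != 0) {
        state = state + 1;
        if (state > 6)
            state = 6;
    } else {
        state = state >> 1;
    }
    return state;
}

/* 5-point median as defined in clause 3.1: sort ascending and take
   element int(5/2) + 1 = 3 (index 2). The input is left untouched. */
static float median5(const float x[GAIN_HIST])
{
    float s[GAIN_HIST];
    memcpy(s, x, sizeof s);
    for (int i = 1; i < GAIN_HIST; i++) { /* insertion sort */
        float v = s[i];
        int j = i - 1;
        while (j >= 0 && s[j] > v) {
            s[j + 1] = s[j];
            j--;
        }
        s[j + 1] = v;
    }
    return s[GAIN_HIST / 2];
}

/* Gain substitution for one subframe of a SPEECH_BAD frame. */
static void conceal_gains_bad(ConcealMem *m, int state, int vad_hist,
                              float *gp, float *gc)
{
    assert(state >= 1 && state <= 6);

    /* Equation (2): attenuated median of the past LTP gains. */
    *gp = Pp_bad[state] * median5(m->gp_hist);

    /* Equation (3): no attenuation after more than two VAD = 0 decisions. */
    float med = median5(m->gc_hist);
    *gc = (vad_hist > 2) ? med : Pc_bad[state] * med;

    /* Equation (4), as read by this sketch: mean of the four stored values
       minus 3 (assumption), pushed in as the newest entry. */
    float ener0 = 0.25f * (m->ener[0] + m->ener[1] + m->ener[2] + m->ener[3]) - 3.0f;
    memmove(&m->ener[1], &m->ener[0], (ENER_HIST - 1) * sizeof m->ener[0]);
    m->ener[0] = ener0;
}
```

A caller would run `update_state` once per frame and, when BFI = 1 with RX_FRAMETYPE = SPEECH_BAD, call `conceal_gains_bad` for each subframe. Updating the two gain histories with the substituted values is left to the caller in this sketch. The SPEECH_LOST case has the same structure with its own pair of attenuation tables.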
6.2.3.2 LTP gain & fixed codebook gain concealment when RX_FRAMETYPE = SPEECH_LOST

The LTP gain $g_p$ and fixed codebook gain $g_c$ are replaced by attenuated values from the previous subframes:

$$ g_p = P_p(\mathit{state}) \cdot \mathrm{median5}\big(g_p(n-1), \ldots, g_p(n-5)\big) \tag{5} $$

$$ g_c = \begin{cases} P_c(\mathit{state}) \cdot \mathrm{median5}\big(g_c(n-1), \ldots, g_c(n-5)\big), & \mathrm{VAD\_HIST} \le 2 \\ \mathrm{median5}\big(g_c(n-1), \ldots, g_c(n-5)\big), & \mathrm{VAD\_HIST} > 2 \end{cases} \tag{6} $$

where:
$g_p$ = current decoded LTP gain,
$g_c$ = current decoded fixed codebook gain,
$g_p(n-1), \ldots, g_p(n-5)$ = LTP gains used for the last 5 subframes,
$g_c(n-1), \ldots, g_c(n-5)$ = fixed codebook gains used for the last 5 subframes,
median5() = 5-point median operation,
$P_p(\mathit{state})$ = attenuation factor (Pp(1) = 0.95, Pp(2) = 0.90, Pp(3) = 0.75, Pp(4) = 0.23, Pp(5) = 0.05, Pp(6) = 0.01),
$P_c(\mathit{state})$ = attenuation factor (Pc(1) = 0.50, Pc(2) = 0.25, Pc(3) = 0.25, Pc(4) = 0.25, Pc(5) = 0.15, Pc(6) = 0.01),
state = state number (0 to 6),
VAD_HIST = number of consecutive VAD = 0 decisions.

The higher the state value is, the more the gains are attenuated. Also the memory of the predictive fixed codebook gain is updated by using the average value of the