Rec. ITU-R BS.1693

RECOMMENDATION ITU-R BS.1693

Procedure for the performance test of automated query-by-humming systems

(Question ITU-R 8/6)

(2004)

The ITU Radiocommunication Assembly,

considering

a) that metadata will be accompanying most audio broadcast transmissions in the future;
b) that the automatic generation of metadata will be necessary to offer a complete, cost-efficient service in future;
c) that query-by-humming systems enable a natural way to query audio databases;
d) that different schemes for extraction of audio metadata are developed today;
e) that Recommendation ITU-R BS.1657, Procedure for the performance testing of automated audio identification systems, describes a procedure for the performance test of automated audio identification systems;
f) that ISO/IEC JTC 1/SC 29 WG 11 is currently finalizing schemes for the coding of metadata for multimedia data;
g) that no quality assessment procedures for audio metadata extraction schemes regarding melody recognition have been standardized until now,

recommends

1 that the procedure described in Annex 1 should be used to evaluate the performance of automated query-by-humming systems.

Annex 1

Procedure for the performance test of automated query-by-humming systems

1 Introduction

In a time of ever-increasing databases filled with musical content, be it genuine audio material or associated metadata (data about data), the demand for tools to maintain these masses of data is also growing more urgent day by day. This desire is not only expressed by professionals, but also by the common Internet user and music lover, who searches the Web on numerous errands for her or his preferred musical style. In order to facilitate the retrieval of the desired material, two different levels of abstraction can be discerned here:

- The search for high-level metadata as a human listener would describe contents, e.g. melody, rhythm, timbre, instrumentation or genre. An example application for this would be a query-by-humming system, which can be used in recommendation engines.
- Extracting mid-level metadata for automatic identification of certain interpretations of musical contents. Descriptions of the technical features of the audio data (spectral contents, etc.) are distilled and compared to a database of known material, thus creating a link to relevant metadata such as artist, song name, etc.

For an overview of current state-of-the-art query-by-humming systems please refer to ISMIR 2002 (3rd International Conference on Music Information Retrieval, IRCAM Centre Pompidou, Paris, France, October 2002).

2 Motivation

To meet the demands of the music business, the recognition rate of the applied query-by-humming technology must be high and withstand common alterations to the stored representations in the song database. This problem is tackled by a number of different, often proprietary, solutions that have arisen recently (Clarisse et al., 2002; Ghias et al., 1995; Haus and Pollastri, 2001; Heinz and Brückmann, 2003). All methods, however, face the same problems regarding their robustness to modifications of the original material. This leads to the proposition that automated query-by-humming systems should ideally be as precise and tolerant to signal modifications as human perception and recognition.

Therefore, a sophisticated query-by-humming system has to be robust against different distortions regarding signal quality and variations from ideal melody inputs. Also a reliable handling of large song databases consisting of several thousands of songs should be provided. Consequently, in order to assess the quality of an automated query-by-humming system, a test environment has to be defined that covers different types of signal modifications and describes how to determine other essential system parameters. To allow the objective evaluation of query-by-humming systems, a unified test procedure is needed.

3 Quality parameters

For the evaluation of query-by-humming systems the following quality parameters have to be considered:

- Required audio input: Is it necessary to sing a certain part of the song or is it possible to sing any part? What is the minimal length of the input to provide a reliable result?
- Size of data representation: How many data (bytes) per song have to be stored in the music database?
- Size of the music database: How many songs can be held in the music database?
- Mode of identification: How does the kind of input, such as singing in mother language, humming or singing modes like "na na na", etc., any kind of instrument, influence the recognition rate and performance?
- Melody recognition speed: How long does it take to identify a melody? How does this scale with the number of songs in the music database? How does this scale with the quality of the input data?

To assess these properties in a sensible fashion and thereby to show the suitability of a system for real-world application, a test environment must exhibit constant boundary conditions regarding the characteristics under test. Relevant test conditions are:

- the size and content of the music database (see § 4);
- size of the query input (referring to the playing duration) and number of the test items (see § 4);
- exact modification rules for the test items (see § 5 and 6); and
- computing platform, which includes specification of the central processing unit (CPU), memory, and operating system (see § 7).

4 Selection of test material and size of the music database

A reference music sample database on which all systems perform their query should be defined. It should contain a mixture of different musical styles (pop songs from different countries, classic, ...) with worldwide prevalence in numbers on the most familiar songs. Special care should be taken to avoid duplicate items within the database (cover versions, etc.). A music database size of 500-1 000 songs is suggested for a statistically reliable and relevant evaluation. As the preparation of abstract
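
The guidance in § 4 on database size and on avoiding duplicate items could be checked with a small script before a test run. The following is a minimal sketch and not part of the Recommendation. It assumes the reference database is described by a list of (title, artist) pairs. The function names, the title-normalization rule and the idea of flagging equal normalized titles as possible cover versions are all hypothetical choices made for illustration.

```python
import re
from collections import defaultdict

# Suggested database size range taken from the text of section 4.
SUGGESTED_MIN_SONGS = 500
SUGGESTED_MAX_SONGS = 1000


def normalize_title(title):
    """Case-fold a title and collapse punctuation and whitespace.

    Hypothetical normalization rule; letters of any script are kept.
    """
    return re.sub(r"[\W_]+", " ", title.casefold()).strip()


def find_duplicate_titles(catalog):
    """Group (title, artist) entries whose normalized titles coincide.

    Returns a dict mapping each normalized title that occurs more than
    once to the list of matching entries (possible cover versions that
    would need manual review).
    """
    groups = defaultdict(list)
    for title, artist in catalog:
        groups[normalize_title(title)].append((title, artist))
    return {key: entries for key, entries in groups.items() if len(entries) > 1}


def size_in_suggested_range(catalog):
    """Return True if the catalog size lies within the suggested range."""
    return SUGGESTED_MIN_SONGS <= len(catalog) <= SUGGESTED_MAX_SONGS


if __name__ == "__main__":
    # Invented example entries, for illustration only.
    catalog = [
        ("Song One", "Artist A"),
        ("song  one!", "Artist B"),
        ("Another Song", "Artist C"),
    ]
    print("Possible duplicates:", find_duplicate_titles(catalog))
    print("Size within suggested range:", size_in_suggested_range(catalog))
```

A title match only marks candidates. Whether two entries really are duplicate items in the sense of § 4 remains a manual decision, since different songs can share a title and cover versions can be retitled.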