Applying the Hidden Markov Model Methodology for Unsupervised Learning of Temporal Data
AIDA01-1: Applying the Hidden Markov Model Methodology for Unsupervised Learning of Temporal Data

Cen Li (1) and Gautam Biswas (2)
(1) Department of Computer Science, Middle Tennessee State University
(2) Department of EECS, Vanderbilt University, Nashville, TN 37235, USA
biswas@vuse.vanderbilt.edu, http://www.vuse.vanderbilt.edu/biswas
June 21, 2001

AIDA01-2: Problem Description

Most real-world systems are dynamic, e.g.:
- Physical plants and engineering systems
- Human physiology
- Economic systems
These systems are complex: hard to understand, model, and analyze. But our ability to collect data on them has increased tremendously.
Task: Use data to automatically build models, extend incomplete models, and verify and validate existing models using the data available.
Why models? A formal, abstract representation of a phenomenon or process enables systematic analysis and prediction.
Our goal: Build models, i.e., create structure from data using exploratory techniques.
Challenge: Systematic and useful clustering algorithms for temporal data.

AIDA01-3: Outline of Talk

- Example problem
- Related work on temporal data clustering
- Motivation for using the Hidden Markov Model (HMM) representation
- Bayesian HMM clustering methodology
- Experimental results: synthetic data; real-world ecological data
- Conclusions and future work

AIDA01-4: Example of a Dynamic System

ARDS patients' response to respiratory therapy.
Temporal features: FiO2, RR, MAP, PaO2, PIP, PEEP, PaCO2, TV, ...
Goal: Construct profiles of patient response patterns to better understand the effects of various therapies.

AIDA01-5: Problem Description

Unsupervised classification (clustering): given data objects described by multiple temporal features,
(1) assign category information to individual objects by objectively partitioning the objects into homogeneous groups, such that within-group object similarity and between-group object dissimilarity are maximized, and
(2) form a succinct description of each derived category.
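One generic way to formalize objective (1), not taken from the slides and assuming some pairwise similarity measure s(x_i, x_j) between temporal objects, is:

\[
P^{*} \;=\; \arg\max_{P = \{C_1,\dots,C_K\}} \;
\underbrace{\sum_{k}\sum_{x_i, x_j \in C_k} s(x_i, x_j)}_{\text{within-group similarity}}
\;-\;
\underbrace{\sum_{k \neq l}\;\sum_{x_i \in C_k,\, x_j \in C_l} s(x_i, x_j)}_{\text{between-group similarity}}
\]

The methodology presented later replaces the explicit pairwise similarity with a model-based, Bayesian criterion over candidate partitions.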
AIDA01-6: Related Work on Temporal Data Clustering

AIDA01-7: Motivation for Using the HMM Representation

What are our modeling objectives? Why do we choose the HMM representation?
- The hidden states model the valid stages of a dynamic process.
- Direct probabilistic links model the probabilistic transitions among the different stages.
Mathematically, an HMM is made up of three components:
- π: initial state probabilities
- A: transition matrix
- B: emission matrix
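A minimal sketch of what these three components look like for a discrete-output HMM, together with the forward-algorithm likelihood computation used throughout HMM training. This is illustrative code, not the authors' implementation; the numerical values are arbitrary.

```python
import numpy as np

# An HMM over N hidden states and M output symbols is specified by (pi, A, B):
#   pi[i]   : probability of starting in hidden state i
#   A[i, j] : probability of a transition from state i to state j
#   B[i, k] : probability of emitting symbol k while in state i
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
B = np.array([[0.9, 0.1],
              [0.3, 0.7]])

def hmm_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence (scaled forward algorithm)."""
    alpha = pi * B[:, obs[0]]            # forward probabilities at t = 0
    scale = alpha.sum()
    loglik = np.log(scale)
    alpha = alpha / scale
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # propagate one step, weight by the emission
        scale = alpha.sum()              # rescaling avoids numerical underflow
        loglik += np.log(scale)
        alpha = alpha / scale
    return loglik

print(hmm_loglik([0, 1, 1, 0], pi, A, B))   # log P(observation sequence | model)
```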
AIDA01-8: Continuous Speech Signal (Rabiner and Huang, 1993)

Spectral sequence -> feature extraction -> statistical pattern recognition.
Isolated word recognition:
- 1 HMM per word, 2-10 states per HMM
- Direct continuous-density HMMs produce the best results
Continuous word recognition:
- Number of words? Utterance boundaries?
- Word boundaries are fuzzy / non-unique
- Matching word reference patterns to words is an exponential problem
Whole-word models are typically not used for continuous speech recognition.

AIDA01-9: Continuous Speech Recognizer Using Sub-Words (Rabiner and Huang, 1993)

(Hierarchy diagram: recognized sentence -> sentence Sw made up of words W1, W2, ... and silences -> word W1 made up of units U1(W1), U2(W1), ..., UL(W1)(W1) -> sub-word units (PLUs).)

AIDA01-10: Continuous Speech Recognition Characteristics

A lot of background knowledge is available, e.g.:
- Phonemes and syllables for isolated word recognition
- Sub-word units (phone-like units (PLUs), syllable-like units, acoustic units) for word segmentation
- Grammar and semantics for language models
As a result, the HMM structure is usually well defined, and the recognition task depends on learning the model parameters.
Clustering techniques are employed to derive efficient state representations (single-speaker versus speaker-independent recognition).
Question: What about domains where the structure is not well known and the data are not as well defined?

AIDA01-11: Past Work on HMM Clustering (applied to speech recognition)

- Rabiner et al., 1989: HMM likelihood clustering and HMM threshold clustering; Baum-Welch and Viterbi methods for parameter estimation
- Lee, 1990: agglomerative procedure
- Bahl et al., 1986; Normandin et al., 1994: HMM parameter estimation by Maximum Mutual Information (MMI)
- Kosaka et al., 1995: HMnet composition and clustering (CCL); Bhattacharyya distance measure
- Dermatas & Kokkinakis, 1996: extended Rabiner et al.'s work with a more sophisticated clustering structure and clustering by recognition error
- Smyth, 1997: clustering with finite mixture models
Limitations:
- No objective criterion for cluster partition evaluation and selection
- A uniform, pre-defined HMM model size is applied across clusters

AIDA01-12: Original Work on HMM Parameter Estimation (Rabiner et al., 1989)

Maximum likelihood method (Baum-Welch): parameter estimation by the E(xpectation)-M(aximization) method.
- Assign initial parameters
- E-step: compute the expected values of the necessary statistics
- M-step: update the model parameter values to maximize the likelihood
- Use the forward-backward procedure to cut down the computational requirements
Baum-Welch is very sensitive to initialization: use the Viterbi method to find the most likely path and use that to initialize the parameters.
Extensions to the method (to estimate model size): state splitting, state merging, etc.
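To make the E- and M-steps concrete, here is a compact, illustrative Baum-Welch iteration for a single discrete observation sequence (a sketch under my own conventions, not the authors' code). It uses the standard scaled forward-backward recursions; iterating it until the log-likelihood stops improving gives the usual maximum likelihood fit, and the Viterbi-based initialization mentioned above would simply supply the starting (pi, A, B).

```python
import numpy as np

def forward_backward(obs, pi, A, B):
    """Scaled forward-backward pass; returns scaled alpha, beta and the scaling factors."""
    T, N = len(obs), len(pi)
    alpha, beta, c = np.zeros((T, N)), np.zeros((T, N)), np.zeros(T)
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum()
        alpha[t] /= c[t]
    beta[T - 1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / c[t + 1]
    return alpha, beta, c

def baum_welch_step(obs, pi, A, B):
    """One EM iteration: E-step = state/transition posteriors, M-step = re-estimate (pi, A, B)."""
    obs = np.asarray(obs)
    T, N, M = len(obs), len(pi), B.shape[1]
    alpha, beta, c = forward_backward(obs, pi, A, B)
    gamma = alpha * beta                                  # P(state at time t | observations)
    xi = np.zeros((N, N))                                 # expected transition counts
    for t in range(T - 1):
        xi += np.outer(alpha[t], B[:, obs[t + 1]] * beta[t + 1]) * A / c[t + 1]
    pi_new = gamma[0]
    A_new = xi / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.vstack([gamma[obs == k].sum(axis=0) for k in range(M)]).T
    B_new /= gamma.sum(axis=0)[:, None]
    loglik = np.log(c).sum()                              # log P(obs | current model)
    return pi_new, A_new, B_new, loglik
```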
AIDA01-14: Our Approach to HMM Clustering

Four nested search loops:
- Loop 1: search for the optimal number of clusters in the partition
- Loop 2: search for the optimal distribution of objects to the clusters in the partition
- Loop 3: search for the optimal HMM model size for each cluster in the partition
- Loop 4: search for the optimal model parameter configuration for each model
Search space: (figure on the slide, labeled "HMM Learning")
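Read as a procedure, the four loops nest as in the following schematic (my paraphrase of the slide, not the authors' implementation). The HMM-specific pieces are passed in as functions, since the talk fills them in later: a hypothetical `fit_hmm(seqs, n)` trainer (Loop 4), a per-cluster model score such as BIC (Loop 3), and a partition-level Bayesian score (Loop 1); fitted models are assumed to expose a `loglik(sequence)` method.

```python
def bayesian_hmm_clustering(objects, fit_hmm, score_model, score_partition,
                            max_clusters, max_states):
    """Schematic of the four nested search loops; helper callables are supplied by the caller."""
    best, best_score = None, float("-inf")
    for k in range(1, max_clusters + 1):                       # Loop 1: number of clusters
        assignment = {i: i % k for i in range(len(objects))}   # arbitrary starting partition
        while True:                                            # Loop 2: object redistribution
            models = []
            for c in range(k):                                 # Loop 3: HMM size per cluster
                seqs = [objects[i] for i, a in assignment.items() if a == c]
                candidates = [fit_hmm(seqs, n)                 # Loop 4: parameter estimation
                              for n in range(1, max_states + 1)]
                models.append(max(candidates, key=score_model))
            # move every object to the cluster whose model explains it best
            new_assignment = {i: max(range(k), key=lambda j: models[j].loglik(obj))
                              for i, obj in enumerate(objects)}
            if new_assignment == assignment:                   # converged object distribution
                break
            assignment = new_assignment
        score = score_partition(models, assignment)            # Bayesian score of the partition
        if score > best_score:
            best, best_score = (models, assignment), score
    return best
```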
AIDA01-15: Introducing Heuristics to HMM Clustering

Four nested levels of search:
- Loop 1: search for the optimal number of clusters in the partition
- Loop 2: search for the optimal distribution of objects to the clusters in the partition
- Loop 3: search for the optimal HMM model size for each cluster in the partition
- Loop 4: search for the optimal model parameter configuration for each model
Heuristics applied: Bayesian model selection; HMM learning.

AIDA01-16: Bayesian Model Selection

Model posterior probability (via the marginal likelihood).
Goal: select the model M that maximizes P(M | D).
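In standard form, the model posterior and the marginal likelihood referred to here are:

\[
P(M \mid D) \;=\; \frac{P(D \mid M)\, P(M)}{P(D)},
\qquad
P(D \mid M) \;=\; \int P(D \mid \theta, M)\, P(\theta \mid M)\, d\theta .
\]

Since P(D) is the same for every candidate model, maximizing P(M | D) amounts to maximizing P(D | M) P(M); with roughly equal model priors this reduces to maximizing the marginal likelihood P(D | M).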
AIDA01-17: Computing Marginal Likelihoods

Issue of complete versus incomplete data.
Approximation techniques:
- Monte Carlo methods (Gibbs sampling)
- Candidate method (a variation of Gibbs sampling), Chickering
- Laplace approximation (multivariate Gaussian, quadratic Taylor series approximation)
- Bayesian Information Criterion, derived from the Laplace approximation: very efficient
- Cheeseman-Stutz approximation

AIDA01-18: Marginal Likelihood Approximations

- Bayesian Information Criterion (BIC): a likelihood term minus a model complexity penalty.
- Cheeseman-Stutz (CS) approximation: a likelihood term combined with a model prior probability term.
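My reconstruction of the standard forms of these two approximations, with \hat{\theta} the maximum likelihood parameter estimate, d the number of free parameters, and N the number of observations:

\[
\log P(D \mid M) \;\approx\; \mathrm{BIC}(M) \;=\; \underbrace{\log P(D \mid \hat{\theta}, M)}_{\text{likelihood}} \;-\; \underbrace{\tfrac{d}{2}\,\log N}_{\text{complexity penalty}},
\]
\[
P(D \mid M) \;\approx\; \mathrm{CS}(M) \;=\; P(D' \mid M)\,\frac{P(D \mid \hat{\theta}, M)}{P(D' \mid \hat{\theta}, M)},
\]

where D' is the data completed with the expected values of the hidden variables (here, state occupancies and cluster memberships), so that P(D' | M) has a closed form.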
AIDA01-19: Introducing Heuristics to HMM Clustering

Four nested levels of search:
- Step 1: search for the optimal number of clusters in the partition
- Step 2: search for the optimal distribution of objects to the clusters in the partition
- Step 3: search for the optimal HMM model size for each cluster in the partition
- Step 4: search for the optimal model parameter configuration for each model
Heuristic applied: Bayesian model selection.

AIDA01-20: Bayesian Model Selection for HMM Model Size Selection

Goal: select the optimal number of states, K, for an HMM that maximizes the (approximated) model posterior probability.

AIDA01-21: BIC and CS for HMM Model Size Selection
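As a minimal sketch of how such a selection could be wired up (my illustration, not the authors' code), the BIC from the formulas above can be scored for each candidate number of states K and the best K kept. The trainer `train_hmm(seqs, K)` is a hypothetical callable (e.g., a few Baum-Welch iterations) returning the fitted parameters and the data log-likelihood.

```python
import numpy as np

def n_free_params(K, M):
    """Free parameters of a K-state HMM with M discrete output symbols:
    (K - 1) initial + K*(K - 1) transition + K*(M - 1) emission probabilities."""
    return (K - 1) + K * (K - 1) + K * (M - 1)

def select_num_states(seqs, train_hmm, max_states, n_symbols):
    """Choose the number of hidden states K that maximizes the BIC score of the cluster's data."""
    n_obs = sum(len(s) for s in seqs)                      # total number of observations
    best_K, best_bic = None, float("-inf")
    for K in range(1, max_states + 1):
        params, loglik = train_hmm(seqs, K)                # Loop-4 parameter estimation
        bic = loglik - 0.5 * n_free_params(K, n_symbols) * np.log(n_obs)
        if bic > best_bic:
            best_K, best_bic = K, bic
    return best_K
```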