
    A Unified Model for Stable and Temporal Topic Detection from.ppt

    • Resource ID: 373198       File size: 1.48 MB        Pages: 46
    • Format: PPT

    A Unified Model for Stable and Temporal Topic Detection from Social Media Data
    Hongzhi Yin, Bin Cui, Hua Lu, Yuxin Huang and Junjie Yao
    Peking University, Aalborg University

    Outline
    • Motivation
    • Problem Formulation
    • A Basic Solution
      • A User-Temporal Mixture Model
    • Enhancement of the basic solution
      • Regularization Technique
      • Burst-Weighted Boosting
    • Experiments
    • Q/A

    Motivation
    • Two different types of topics are mixed together on social media platforms such as Twitter, Weibo and Delicious.
    • Temporal topics are temporally coherent, meaningful themes. They are time-sensitive and usually concern popular real-life events or hot spots, i.e., breaking events in the real world.
    • Stable topics usually concern users' regular interests and their daily routine discussions, e.g., their moods and statuses.

    One Example in Twitter
    • Temporal topic: Dead pigs in Shanghai
    • Stable topic: Big Data

    Another Example in Twitter
    • Temporal topic: Independence Day
    • Stable topic: Animal Adoption

    We can tell temporal and stable topics apart by their temporal distributions and their descriptive words.

    Motivation (Cont.)
    • Discovering different topics of events that are coherent in temporal space
      • Detecting bursty events, such as disasters (e.g., earthquakes), politics (e.g., elections), and public events (e.g., the Olympics)
      • Analyzing topic trends
    • Extracting stable topics that are coherent in user-interest space
      • Finding users' intrinsic interests and better modeling user preferences

    Problem Formulation
    • A user-time-associated document d is a text document associated with a time stamp and a user.
    • A temporal topic is a temporally coherent theme. In other words, words that emerge close together in the time dimension are clustered into one topic.
    • An example of temporal topics: given a collection of user-time-associated tweets, the desired temporal topics are the events happening at different times.
    • Formally, a temporal or stable topic is represented by a word distribution, where the word probabilities sum to one.

    Problem Formulation (Cont.)
    • A topic distribution in the time dimension is the distribution of topics given a specific time interval. Formally, it is the probability of a temporal topic given time interval t.
    • A topic distribution in user space is the distribution of topics given a specific user. Formally, it is the probability of a stable topic given user u.

    Problem Formulation (Cont.)
    • A User-Time-Keyword Matrix M is a hyper-matrix whose three dimensions refer to user, time and keyword. A cell M[u, t, w] stores the frequency of word w generated by user u within time interval t.
    • Given a collection of user-time-associated documents C, we first formulate matrix M and then address two tasks:
      • Task 1: Detecting Temporal Topics
      • Task 2: Extracting Stable Topics

    Problem Formulation (Cont.)
    • Task 1: Detecting a set of temporal topics that are event-driven
      • Detecting bursty events, such as disasters (e.g., earthquakes), politics (e.g., elections), and public events (e.g., the Olympics)
      • Analyzing topic trends
    • Task 2: Extracting a set of stable topics that are interest-driven
      • Finding users' intrinsic interests and better modeling user preferences

    A User-Time Mixture Model
    Main insights: to find both temporal and stable topics in a unified manner, we propose a topic model that simultaneously captures two observations:
    • Words generated around the same time are more likely to have the same event-driven temporal topic.
    • Words generated by the same user are more likely to have the same interest-driven stable topic.
    The former helps find event-driven temporal topics, while the latter helps identify interest-driven stable topics.

    Combine user and time information
    • We assume that when a user u generates a word w at time t, he/she is probably influenced by two factors: the breaking news/events occurring at time t, and his/her intrinsic interests.
    • Breaking events are modeled by temporal topics, and user intrinsic interests are modeled by stable topics.
    • The likelihood that user u generates word w at time t is as follows: [formula not preserved]
    • The two parameters in the formula are mixing weights that control the choice of motivation factor; they also denote the proportions of temporal topics and stable topics in the dataset. It is worth mentioning that they are learnt automatically instead of being fixed.

    Parameter Estimation
    • The log-likelihood of the whole user-time-associated document collection C is: [formula not preserved]
    • The E-M algorithm is used to estimate the parameters:
      • E-step: compute the expectation.
      • M-step: maximize; a closed-form solution exists.
    • Please refer to Section 4.2 for the details of the E-M algorithm.
    • E-step and M-step update formulas: [formulas not preserved]

    Spatial Regularization
    Intuitions
    • If two users are connected in the social network space, they are more likely to share the same or similar interests/topics.
    • A topic is interest-coherent if the people who are interested in it are also close in the network space.
    • Users' interests are similar to those of their neighbors.
    [Figure: a user marked "?" whose neighbors are all labeled DB. Is this user more likely to be a DB person or an IR person?]

    Topic Model With Spatial Regularization
    • A regularized data likelihood is defined as follows: [formula not preserved]
    • The spatial regularizer plays the role of spatial smoothing for user interests.

    Parameter Estimation
    • Maximize, using Newton-Raphson.
    • Smooth using a spatial regularizer: in each iteration, a user's interest is smoothed by his/her spatial neighbors.

    Insights
    • In topic models, words with a high occurrence rate, i.e., popular words, have a high probability of appearing at the top positions of each discovered topic.
    • These popular words are mostly general words denoting abstract concepts. In stable topics, they can illustrate the domain of a topic at first glance.
    • In temporal topics, however, words with notably bursty features are superior in expressing temporal information, since users are more interested in bursty words than in abstract concepts when browsing temporal topics.

    Example: Michael Jackson's Death
    • In this temporal topic, we expect the bursty words "mj", "michael jackson" and "moonwalk" to become the dominant words, rather than the general words "world", "news" and "death".
    • But the general words cannot be removed as stop words, since they help illustrate the stable topics.

    Burst-Weighted Boosting
    • We implement a bursty boosting step to escalate the probability of these bursty words during the procedure of detecting temporal topics.
    • We first compute the bursty degree of each word in each time interval (Yao et al., ICDE 2010).
    • A boosting step is then taken after every few E-M iterations, as follows: [formula not preserved]
    • In this step, a word w has its generation probability boosted in a temporal topic only if w's bursty period overlaps with that of the topic.

    Data Sets
    • Twitter data set (Mar. 2009 to Oct. 2009)
    • Delicious data set (Feb. 2008 to Dec. 2009)
    • Sina Weibo (2011)

    Data Sets
    • Twitter: People on this platform often discuss social events and their daily life. The data set contains 9,884,640 tweets posted by 456,024 users in the period from Mar. 2009 to Oct. 2009. Each user in this data set published at least 200 posts. We first removed all stop words.
    • Delicious: Delicious is a collaborative tagging system on which users can upload and tag web pages. We collected 200,000 users and their tagging behaviors in the period from Feb. 2008 to Dec. 2009. The data set contains 7,103,622 tags. Topics on technology and electronics cover more than half of the tags. Breaking news is also present.

    Compared Methods
    • Our models
      • BUT is the basic model.
      • EUTS is the model enhanced with spatial regularization.
      • EUTB is the model enhanced with both spatial regularization and burst-weighted boosting.
    • PLSA Model on Time Slices (Mei et al., KDD'05)
    • Individual Detection Method (Wang et al., KDD'07)
    • Topic Over Time Model (TOT) (Wang et al., KDD'06)
    • TimeUserLDA (Diao et al., ACL'12)

    Time Stamp Prediction Comparison
    [Result figures on two slides; not preserved]

    Topic Quality Comparison
    • Excellent: a nicely presented temporal topic
    • Good: a topic containing bursty features
    • Poor: a topic without obvious bursty features

    [Result slides; content not preserved]
    • Stable Topics Detected in Delicious
    • Temporal Topics Detected in Delicious
    • Stable Topics Detected in Twitter
    • Temporal Topics Detected in Twitter
    • Stable Topics (Sina Weibo)
    • Temporal Topics (Sina Weibo)
    • Temporal Topic Trends Analysis

    Thank You! Any questions?
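The User-Time-Keyword matrix M from the Problem Formulation slides can be sketched in a few lines of Python. This is only an illustration: the sparse dictionary layout, the function name `build_utk_matrix`, the whitespace tokenizer and the toy documents are assumptions, not something the slides specify.

```python
from collections import Counter

def build_utk_matrix(documents):
    """Return a sparse user-time-keyword matrix as {(user, interval, word): count}.

    `documents` is an iterable of (user, time_interval, text) triples.
    Splitting on whitespace is a simplifying assumption.
    """
    matrix = Counter()
    for user, interval, text in documents:
        for word in text.lower().split():
            matrix[(user, interval, word)] += 1
    return matrix

# Hypothetical toy collection C of user-time-associated documents.
docs = [
    ("alice", 0, "big data tools"),
    ("alice", 1, "earthquake news big data"),
    ("bob", 1, "earthquake earthquake relief"),
]
M = build_utk_matrix(docs)
```

Storing only the non-zero cells keeps the structure small, since most (user, time, word) combinations never occur in real social media data.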
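The likelihood and E-M formulas of the user-time mixture model did not survive in the slide text, so the following is a hedged sketch rather than the authors' exact formulation. It assumes the mixture form P(w | u, t) = lam_u * sum_z P(z | u) P(w | z) + (1 - lam_u) * sum_x P(x | t) P(w | x), with stable topics z, temporal topics x, and a learnt mixing weight lam_u. All names (`init_params`, `em_step`, `lam_u`, the toy counts) are hypothetical.

```python
import math
import random

def normalize(weights):
    """Scale a {key: non-negative weight} dict so its values sum to 1."""
    total = sum(weights.values())
    return {k: v / total for k, v in weights.items()}

def init_params(counts, n_stable, n_temporal, seed=0):
    """Randomly initialize all distributions; lam_u is the stable-topic weight."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in counts})
    times = sorted({t for _, t, _ in counts})
    vocab = sorted({w for _, _, w in counts})
    rand = lambda keys: normalize({k: rng.random() + 0.1 for k in keys})
    return {
        "lam_u": 0.5,
        "z_u": {u: rand(range(n_stable)) for u in users},    # P(z | u)
        "x_t": {t: rand(range(n_temporal)) for t in times},  # P(x | t)
        "w_z": {z: rand(vocab) for z in range(n_stable)},    # P(w | z)
        "w_x": {x: rand(vocab) for x in range(n_temporal)},  # P(w | x)
    }

def components(p, u, t, w):
    """Weighted joint terms for every stable topic z and temporal topic x."""
    stable = {z: p["lam_u"] * p["z_u"][u][z] * p["w_z"][z][w] for z in p["w_z"]}
    temporal = {x: (1 - p["lam_u"]) * p["x_t"][t][x] * p["w_x"][x][w]
                for x in p["w_x"]}
    return stable, temporal

def log_likelihood(p, counts):
    """Log-likelihood of the whole collection under the assumed mixture."""
    total = 0.0
    for (u, t, w), c in counts.items():
        stable, temporal = components(p, u, t, w)
        total += c * math.log(sum(stable.values()) + sum(temporal.values()))
    return total

def em_step(p, counts):
    """One E-step followed by the closed-form M-step; returns new parameters."""
    acc = {name: {k: dict.fromkeys(d, 0.0) for k, d in p[name].items()}
           for name in ("z_u", "x_t", "w_z", "w_x")}
    stable_mass = total_mass = 0.0
    for (u, t, w), c in counts.items():
        stable, temporal = components(p, u, t, w)
        norm = sum(stable.values()) + sum(temporal.values())
        for z, q in stable.items():      # expected count for stable topic z
            r = c * q / norm
            acc["z_u"][u][z] += r
            acc["w_z"][z][w] += r
            stable_mass += r
        for x, q in temporal.items():    # expected count for temporal topic x
            r = c * q / norm
            acc["x_t"][t][x] += r
            acc["w_x"][x][w] += r
        total_mass += c
    new = {name: {k: normalize(d) for k, d in acc[name].items()} for name in acc}
    new["lam_u"] = stable_mass / total_mass  # mixing weight is learnt, not fixed
    return new

# Hypothetical toy matrix M: {(user, time interval, word): frequency}.
counts = {
    ("alice", 0, "bigdata"): 5, ("alice", 1, "bigdata"): 4, ("alice", 1, "quake"): 3,
    ("bob", 0, "pets"): 6, ("bob", 1, "pets"): 5, ("bob", 1, "quake"): 4,
}
params = init_params(counts, n_stable=2, n_temporal=2)
history = [log_likelihood(params, counts)]
for _ in range(20):
    params = em_step(params, counts)
    history.append(log_likelihood(params, counts))
```

Tracking `history` is a cheap sanity check: under exact E-M updates the log-likelihood should never decrease from one iteration to the next.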
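The effect of spatial regularization, where a user's interest is smoothed by his/her neighbors in each iteration, can be illustrated with a simple averaging step. This is a deliberately simplified stand-in: the slides maximize a regularized likelihood with Newton-Raphson, whereas the sketch below just blends each user's topic distribution with the mean of the neighbors' distributions. The function name `smooth_interests`, the blending weight `gamma` and the toy network are assumptions.

```python
def smooth_interests(interests, neighbors, gamma=0.5):
    """Blend each user's topic distribution with the mean of its neighbors.

    interests: {user: {topic: probability}}; neighbors: {user: [user, ...]}.
    gamma (hypothetical) is the share of weight given to the neighborhood.
    """
    smoothed = {}
    for user, dist in interests.items():
        friends = neighbors.get(user, [])
        if not friends:
            smoothed[user] = dict(dist)  # isolated users stay unchanged
            continue
        smoothed[user] = {
            topic: (1 - gamma) * prob
            + gamma * sum(interests[f][topic] for f in friends) / len(friends)
            for topic, prob in dist.items()
        }
    return smoothed

# Toy version of the slide's figure: an undecided user "q" with three DB neighbors.
interests = {
    "a": {"DB": 0.9, "IR": 0.1},
    "b": {"DB": 0.9, "IR": 0.1},
    "c": {"DB": 0.9, "IR": 0.1},
    "q": {"DB": 0.5, "IR": 0.5},
}
neighbors = {"q": ["a", "b", "c"]}
result = smooth_interests(interests, neighbors)
```

With these toy numbers the undecided user's DB probability moves from 0.5 to about 0.7, which matches the intuition that a user surrounded by DB people is more likely a DB person. Because the update is a convex combination of probability distributions, each smoothed distribution still sums to one.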
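The burst-weighted boosting step can likewise be sketched under stated assumptions. The slides rely on the bursty degree of Yao et al. (ICDE 2010), whose definition is not given in the text, so the ratio-to-mean measure below is a stand-in, and the threshold, the boosting factor and the toy frequencies are hypothetical. What the sketch does keep from the slides is the rule that a word is boosted in a temporal topic only if its bursty period overlaps with that of the topic.

```python
def burst_degree(freq_by_interval):
    """Assumed burst measure: frequency in an interval over the word's mean frequency."""
    mean = sum(freq_by_interval) / len(freq_by_interval)
    return [f / mean if mean > 0 else 0.0 for f in freq_by_interval]

def bursty_period(degrees, threshold=1.5):
    """Interval indices whose (assumed) burst degree exceeds the threshold."""
    return {i for i, d in enumerate(degrees) if d > threshold}

def boost_topic(word_probs, word_freqs, topic_period, threshold=1.5):
    """Boost P(w | temporal topic) for words bursting inside the topic's period."""
    boosted = {}
    for word, prob in word_probs.items():
        degrees = burst_degree(word_freqs[word])
        overlap = bursty_period(degrees, threshold) & topic_period
        # Only words whose bursty period overlaps the topic's period are boosted.
        factor = max(degrees[i] for i in overlap) if overlap else 1.0
        boosted[word] = prob * factor
    total = sum(boosted.values())
    return {w: p / total for w, p in boosted.items()}  # renormalize to sum to 1

# Hypothetical temporal topic and per-interval word frequencies.
topic = {"news": 0.4, "death": 0.3, "mj": 0.2, "moonwalk": 0.1}
freqs = {
    "news": [50, 52, 49, 51],
    "death": [10, 11, 12, 10],
    "mj": [1, 1, 30, 2],
    "moonwalk": [0, 1, 12, 1],
}
boosted = boost_topic(topic, freqs, topic_period={2})
```

On this toy input the bursty word "mj" becomes the most probable word of the topic, while the general words "news" and "death" keep their relative proportions and are not removed, so they remain available for describing stable topics.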

