Advanced Topics in Data Mining Special focus- Social .ppt
《Advanced Topics in Data Mining Special focus- Social .ppt》由会员分享,可在线阅读,更多相关《Advanced Topics in Data Mining Special focus- Social .ppt(38页珍藏版)》请在麦多课文档分享上搜索。
1、Advanced Topics in Data Mining Special focus: Social Networks,Reminders,By the end of this week/ beginning of next we need to have a tentative presentation scheduleEach one of you should send me an email about a theme by Friday, February 22.,What did we learn in the last lecture?,What did we learn i
2、n the last lecture?,Degree distribution What are the observed degree distributions Clustering coefficient What are the observed clustering coefficients? Average path length What are the observed average path lengths?,What are we going to learn in this lecture?,How to generate graphs that have the de
3、sired properties Degree distribution Clustering coefficient Average path lengthWe are going to talk about generative models,What is a network model?,Informally, a network model is a process (radomized or deterministic) for generating a graph Models of static graphs input: a set of parameters , and t
4、he size of the graph n output: a graph G(,n) Models of evolving graphs input: a set of parameters , and an initial graph G0 output: a graph Gt for each time t,Families of random graphs,A deterministic model D defines a single graph for each value of n (or t)A randomized model R defines a probability
5、 space Gn,P where Gn is the set of all graphs of size n, and P a probability distribution over the set Gn (similarly for t) we call this a family of random graphs R, or a random graph R,Erds-Renyi Random graphs,Paul Erds (1913-1996),Erds-Renyi Random Graphs,The Gn,p model input: the number of vertic
6、es n, and a parameter p, 0 p 1 process: for each pair (i,j), generate the edge (i,j) independently with probability pRelated, but not identical: The Gn,m model process: select m edges uniformly at random,Graph properties,A property P holds almost surely (or for almost every graph), ifEvolution of th
7、e graph: which properties hold as the probability p increases?Threshold phenomena: Many properties appear suddenly. That is, there exist a probability pc such that for ppc the property holds a.s.What do you expect to be a threshold phenomenon in random graphs?,The giant component,Let z=np be the ave
8、rage degree If z 1, then almost surely, the largest component has size (n). The second largest component has size O(ln n) if z =(ln n), then the graph is almost surely connected.,The phase transition,When z=1, there is a phase transition The largest component is O(n2/3) The sizes of the components f
9、ollow a power-law distribution.,Random graphs degree distributions,The degree distribution follows a binomialAssuming z=np is fixed, as n, B(n,k,p) is approximated by a Poisson distributionHighly concentrated around the mean, with a tail that drops exponentially,Random graphs and real life,A beautif
10、ul and elegant theory studied exhaustivelyRandom graphs had been used as idealized network modelsUnfortunately, they dont capture reality,A random graph example,Departing from the Random Graph model,We need models that better capture the characteristics of real graphs degree sequences clustering coe
11、fficient short paths,Graphs with given degree sequences,input: the degree sequence d1,d2,dnCan you generate a graph with nodes that have degrees d1,d2,dn ? ,Graphs with given degree sequences,The configuration model input: the degree sequence d1,d2,dn process: Create di copies of node i Take a rando
12、m matching (pairing) of the copies self-loops and multiple edges are allowedUniform distribution over the graphs with the given degree sequence,Example,Suppose that the degree sequence isCreate multiple copies of the nodesPair the nodes uniformly at random Generate the resulting network,4,1,3,2,Grap
13、hs with given degree sequences,How about simple graphs ? No self loops No multiple edges,Graphs with given degree sequences,Realizability of degree sequences Lemma: A degree sequence d = d(1),d(n) with d(1)d(2) d(n) and d(1)+d(2)+d(n) even is realizable if and only if for every 1k n-1 it holds that,
- 1.请仔细阅读文档,确保文档完整性,对于不预览、不比对内容而直接下载带来的问题本站不予受理。
- 2.下载的文档,不会出现我们的网址水印。
- 3、该文档所得收入(下载+内容+预览)归上传者、原创作者;如果您是本文档原作者,请点此认领!既往收益都归您。
下载文档到电脑,查找使用更方便
2000 积分 0人已下载
下载 | 加入VIP,交流精品资源 |
- 配套讲稿:
如PPT文件的首页显示word图标,表示该PPT已包含配套word讲稿。双击word图标可打开word文档。
- 特殊限制:
部分文档作品中含有的国旗、国徽等图片,仅作为作品整体效果示例展示,禁止商用。设计者仅对作品中独创性部分享有著作权。
- 关 键 词:
- ADVANCEDTOPICSINDATAMININGSPECIALFOCUSSOCIALPPT

链接地址:http://www.mydoc123.com/p-378083.html