欢迎来到麦多课文档分享! | 帮助中心 海量文档,免费浏览,给你所需,享你所想!
麦多课文档分享
全部分类
  • 标准规范>
  • 教学课件>
  • 考试资料>
  • 办公文档>
  • 学术论文>
  • 行业资料>
  • 易语言源码>
  • ImageVerifierCode 换一换
    首页 麦多课文档分享 > 资源分类 > PPT文档下载
    分享到微信 分享到微博 分享到QQ空间

    A Short Tutorial on Game Theory.ppt

    • 资源ID:377881       资源大小:248KB        全文页数:49页
    • 资源格式: PPT        下载积分:2000积分
    快捷下载 游客一键下载
    账号登录下载
    微信登录下载
    二维码
    微信扫一扫登录
    下载资源需要2000积分(如需开发票,请勿充值!)
    邮箱/手机:
    温馨提示:
    如需开发票,请勿充值!快捷下载时,用户名和密码都是您填写的邮箱或者手机号,方便查询和重复下载(系统自动生成)。
    如需开发票,请勿充值!如填写123,账号就是123,密码也是123。
    支付方式: 支付宝扫码支付    微信扫码支付   
    验证码:   换一换

    加入VIP,交流精品资源
     
    账号:
    密码:
    验证码:   换一换
      忘记密码?
        
    友情提示
    2、PDF文件下载后,可能会被浏览器默认打开,此种情况可以点击浏览器菜单,保存网页到桌面,就可以正常下载了。
    3、本站不支持迅雷下载,请使用电脑自带的IE浏览器,或者360浏览器、谷歌浏览器下载即可。
    4、本站资源下载后的文档和图纸-无水印,预览文档经过压缩,下载后原文更清晰。
    5、试题试卷类文档,如果标题没有明确说明有答案则都视为没有答案,请知晓。

    A Short Tutorial on Game Theory.ppt

    1、A Short Tutorial on Game Theory,EE228a, Fall 2002 Dept. of EECS, U.C. Berkeley,EE228a, Fall 2002,2,Outline,Introduction Complete-Information Strategic Games Static Games Repeated Games Stackelberg Games Cooperative Games Bargaining Problem Coalitions,EE228a, Fall 2002,3,Outline,Introduction What is

    2、game theory about? Relevance to networking research Elements of a game Non-Cooperative Games Static Complete-Information Games Repeated Complete-Information Games Stackelberg Games Cooperative Games Nashs Bargaining Solution Coalition: the Shapley Value,EE228a, Fall 2002,Introduction,4,What Is Game

    3、Theory About?,To understand how decision-makers interact A brief history 1920s: study on strict competitions 1944: Von Neumann and Morgensterns book Theory of Games and Economic Behavior After 1950s: widely used in economics, politics, biology Competition between firms Auction design Role of punishm

    4、ent in law enforcement International policies Evolution of species,EE228a, Fall 2002,Introduction,5,Relevance to Networking Research,Economic issues becomes increasingly important Interactions between human users congestion control resource allocation Independent service providers Bandwidth trading

    5、Peering agreements Tool for system design Distributed algorithms Multi-objective optimization Incentive compatible protocols,EE228a, Fall 2002,Introduction,6,Elements of a Game: Strategies,Decision-makers choice(s) in any given situation Fully known to the decision-maker Examples Price set by a firm

    6、 Bids in an auction Routing decision by a routing algorithm Strategy space: set of all possible actions Finite vs infinite strategy space Pure vs mixed strategies Pure: deterministic actions Mixed: randomized actions,EE228a, Fall 2002,Introduction,7,Elements of a Game: Preference and Payoff,Preferen

    7、ce Transitive ordering among strategiesif a b, b c, then a c Payoff An order-preserving mapping from preference to R+ Example: in flow control, U(x)=log(1+x) px,EE228a, Fall 2002,Introduction,8,Rational Choice,Two axiomatic assumptions on games In any given situation a decision-maker always chooses

    8、the action which is the best according to his/her preferences (a.k.a. rational play). Rational play is common knowledge among all players in the game.,EE228a, Fall 2002,Introduction,9,Example: Prisoners Dilemma,EE228a, Fall 2002,Introduction,10,Different Types of Games,Static vs multi-stage Static:

    9、game is played only once Prisoners dilemma Multi-stage: game is played in multiple rounds Multi-round auctions, chess games Complete vs incomplete information Complete info.: players know each others payoffs Prisoners dilemma Incomplete info.: other players payoffs are not known Sealed auctions,EE22

    10、8a, Fall 2002,Introduction,11,Representations of a Game,Normal- vs extensive-form representation Normal-form like the one used in previous example Extensive-form,EE228a, Fall 2002,12,Outline,Introduction Complete-Information Strategic Games Static Games Repeated Games Stackelberg Games Cooperative G

    11、ames Nashs Bargaining Problem Coalitions: the Shapley Value,EE228a, Fall 2002,13,Static Games,Model Players know each others payoffs But do not know which strategies they would choose Players simultaneously choose their strategies Game is over and players receive payoffs based on the combination of

    12、strategies just chosen Question of Interest: What outcome would be produced by such a game?,EE228a, Fall 2002,14,Example: Cournots Model of Duopoly,Model (from Gibbons) Two firms producing the same kind of product in quantities of q1 and q2, respectively Market clearing price p=A q1 q2 Cost of produ

    13、ction is C for both firms Profit for firm iJi = (A q1 q2) qi C qi= (A C q1 q2) qidefine B A C Objective: choose qi to maximize profitqi*= argmaxqi (B q1 q2) qi,EE228a, Fall 2002,15,A Simple Example: Solution,Firm is best choice, given its competitors q,EE228a, Fall 2002,16,Solution to Static Games,N

    14、ash Equilibrium (J. F. Nash, 1950) Mathematically, a strategy profile (s1* , , si*, sn* ) is a Nash Equilibrium if for each player iUi(s1* , , s*i-1, si*, s*i+1, sn* ) Ui(s1* , , s*i-1, si, s*i+1,sn* ), for each feasible strategy si Plain English: a situation in which no player has incentive to devi

    15、ate Its fixed-point solution to the following system of equations si=argmaxs Ui(s1, , si-1, s, si+1,sn ), i Other solution concepts (see references),EE228a, Fall 2002,17,An Example on Mixed Strategies,Pure-Strategy Nash Equilibrium may not exist,Player A,Player B,Head (H),Tail (T),H,T,1, 1,1, 1,1, 1

    16、,1, 1,Cause: each player tries to outguess his opponent!,EE228a, Fall 2002,18,Example: Best Reply,Mixed Strategies Randomized actions to avoid being outguessed Players strategies and expected payoffs Players plays H w.p. p and play T w.p. 1 p Expected payoff of Player Apa pb + (1 pa) (1 pb) pa (1 pb

    17、) pb (1 pa) = (1 2 pb) + pa (4pb 2) So if pb 1/2, pa*=1 (i.e. play H); if pb 1/2, pa*=0 (i.e. play T); if pb=1/2, then playing either H or T is equally good,EE228a, Fall 2002,19,Example: Nash Equilibrium,pb,pa,0,1,1,EE228a, Fall 2002,20,Existence of Nash Equilibrium,Finite strategy space (J. F. Nash

    18、, 1950) A n-player game has at least one Nash equilibrium, possibly involving mixed strategy. Infinite strategy space (R.B. Rosen, 1965) A pure-strategy Nash Equilibrium exists in a n-player concave game. If the payoff functions satisfy diagonally strict concavity condition, then the equilibrium is

    19、unique.(s1 s2) rjJj(s1) + (s2 s1) rjJj(s2) 0,EE228a, Fall 2002,21,Distributed Computation of Nash Equilibrium,Nash equilibrium as result of “learning” Players iteratively adjust their strategies based on locally available information Equilibrium is reached if there is a steady state Two commonly use

    20、d schemes,Gauss-Siedel,Jacobian,EE228a, Fall 2002,22,Convergence of Distributed Algorithms,Algorithms may not converge for some cases,EE228a, Fall 2002,23,Suggested Readings,J.F. Nash. “Equilibrium Points in N-Person Games.” Proc. of National Academy of Sciences, vol. 36, 1950. A “must-read” classic

    21、 paper R.B. Rosen. “Existence and Uniqueness of Equilibrium Points for Concave N-Person Games.” Econometrica, vol. 33, 1965. Has many useful techniques A. Orda et al. “Competitive Routing in Multi-User Communication Networks.” IEEE/ACM Transactions on Networking, vol. 1, 1993. Applies game theory to

    22、 routing And many more,EE228a, Fall 2002,24,Multi-Stage Games,General model Game is played in multiple rounds Finite or infinitely many times Different games could be played in different rounds Different set of actions or even players Different solution concepts from those in static games Analogy: o

    23、ptimization vs dynamic programming Two special classes Infinitely repeated games Stackelberg games,EE228a, Fall 2002,25,Infinitely Repeated Games,Model A single-stage game is repeated infinitely many times Accumulated payoff for a player,J=t1+dt2+d n-1tn+=Si d i-1ti,Main theme: play socially more ef

    24、ficient moves Everyone promises to play a socially efficient move in each stage Punishment is used to deter “cheating” Example: justice system,EE228a, Fall 2002,26,Cournots Game Revisited. I,Cournots Model At equilibrium each firm produces B/3, making a profit of B2/9 Not an “ideal” arrangement for

    25、either firm, becauseIf a central agency decides on production quantity qmqm=argmax (B q) q = B/2so each firm should produce B/4 and make a profit of B2/8 An aside: why B/4 is not played in the static game?If firm A produces B/4, it is more profitable for firm B to produce 3B/8 than B/4Firm A then in

    26、 turn produces 5B/16, and so on,EE228a, Fall 2002,27,Cournots Game Revisited. II,Collaboration instead of competitionQ: Is it possible for two firms to reach an agreement to produce B/4 instead of B/3 each?A: That would depend on how important future return is to each firm,A firm has two choices in

    27、each round: Cooperate: produce B/4 and make profit B2/8 Cheat: produce 3B/8 and make profit 9B2/64 But in the subsequent rounds, cheating will causeits competitor to produce B/3 as punishment its own profit to drop back to B2/9,EE228a, Fall 2002,28,Cournots Game Revisited. III,Is there any incentive

    28、 for a firm not to cheat?Lets look at the accumulated payoffs: If it cooperates: Sc = (1+d+ d2+ d3+ ) B2/8 =B2/8(1d) If it cheats:Sd = 9B2/64 + (d+ d2+ d3+ ) B2/9 =9/64 + d/9(1d) B2 So it will not cheat if Sc Sd .,This happens only if d9/17.,Conclusion If future return is valuable enough to each pla

    29、yer, then strategies exist for them to play socially efficient moves.,EE228a, Fall 2002,29,Strategies in Repeated Games,A strategy is no longer a single action but a complete plan of actions based on possible history of plays up to current stage usually includes some punishment mechanism Example: in

    30、 Cournots game, a players strategy is,Produce B/4 in the first stage. In the nth stage, produce B/4 if both firms have produced B/4 in each of the n1 previous stages; otherwise, produce B/3.,EE228a, Fall 2002,30,Equilibrium in Repeated Games,Subgame-perfect Nash equilibrium (SPNE) A subgame starting

    31、 at stage n is identical to the original infinite game associated with a particular sequence of plays from the first stage to stage n1 A SPNE constitutes a Nash equilibrium in every subgame Why subgame perfect? It is all about creditable threats:Players believe the claimed punishments indeed will be

    32、 carried out by others, when it needs to be evoked. So a creditable threat has to be a Nash equilibrium for the subgame.,EE228a, Fall 2002,31,Known Results for Repeated Games,Friedmans Theorem (1971)Let G be a single-stage game and (e1, en) denote the payoff from a Nash equilibrium of G.If x=(x1, ,

    33、xn) is a feasible payoff from G such that xi ei,i, then there exists a subgame-perfect Nash equilibrium of the infinitely repeated game of G which achieves x, provided that discount factor d is close enough to one. Assignment: Apply this theorem to Cournots game on an agreement other than B/4.,EE228

    34、a, Fall 2002,32,Suggested Readings,J. Friedman. “A Non-cooperative Equilibrium for Super-games.” Review of Economic Studies, vol. 38, 1971. Friedmans original paper R. J. La and V. Anantharam. “Optimal Routing Control: Repeated Game Approach,“ IEEE Transactions on Automatic Control, March 2002. Appl

    35、ies repeated game to improve the efficiency of competitive routing,EE228a, Fall 2002,33,Stackelberg Games,Model One player (leader) has dominate influence over another Typically there are two stages One player moves first Then the other follows in the second stage Can be generalized to have multiple

    36、 groups of players Static games in both stages Main Theme Leader plays by backwards induction, based on the anticipated behavior of his/her follower.,EE228a, Fall 2002,34,Stackelbergs Model of Duopoly,Assumptions Firm 1 chooses a quantity q1 to produce Firm 2 observes q1 and then chooses a quantity

    37、q2 Outcome of the game For any given q1, the best move for Firm 2 is q2* = (B q1)/2 Knowing this, Firm 1 chooses q1 to maximize J1 = (B q1 q2* ) q1= q1(B q1)/2which yieldsq1* = B/2, and q2* = B/4J1* = B2/8, and J2* = B2/16,EE228a, Fall 2002,35,Suggested Readings,Y. A. Korilis, A. A. Lazar and A. Ord

    38、a. “ Achieving Network Optima Using Stackelberg Routing Strategies.” IEEE/ACM Trans on Networking, vol.5, 1997. Network leads users to reach system optimal equilibrium in competitive routing. T. Basar and R. Srikant. “Revenue Maximizing Pricing and Capacity Expansion in a Many-User Regime.” INFOCOM

    39、2002, New York. Network charges users price to maximize its revenue.,EE228a, Fall 2002,36,Outline,Introduction Complete-Information Strategic Games Static Games Repeated Games Stackelberg Games Cooperative Games Nashs Bargaining Problem Coalitions: the Shapley value,EE228a, Fall 2002,37,Cooperation

    40、In Games,Incentive to cooperate Static games often lead to inefficient equilibrium Achieve more efficient outcomes by acting together Collusion, binding contract, side payment Pareto EfficiencyA solution is Pareto efficient if there is no other feasible solution in which some player is better off an

    41、d no player is worse off. Pareto efficiency may be neither socially optimal nor fair Socially optimal Pareto efficient Fairness issues Reading assignment as an example,A,B,mum,fink,mum,fink,1, 1,9, 0,0, 9,6, 6,EE228a, Fall 2002,38,Nashs Bargaining Problem,Model Two players with interdependent payoff

    42、s U and V Acting together they can achieve a set of feasible payoffs The more one player gets, the less the other is able to get And there are multiple Pareto efficient payoffs Q: which feasible payoff would they settle on? Fairness issue Example (from Owen): Two men try to decide how to split $100

    43、One is very rich, so that U(x) x The other has only $1, so V(x) log(1+x)log1=log(1+x) How would they split the money?,EE228a, Fall 2002,39,Intuition,Feasible set of payoffs Denote x the amount that the rich man gets (u,v)=(x, log(101x), x0,100,Let 0, du/u = dv/v Or du/u + dv/v = 0, or vdu+udv=0, or

    44、d(uv)=0.Find the allocation which maximizes UV x*=76.8!,EE228a, Fall 2002,40,Nashs Axiomatic Approach (1950),A solution (u*,v*) should be Rational (u*,v*) (u0,v0), where (u0,v0) is the worst payoffs that the players can get. Feasible (u*,v*)S, the set of feasible payoffs. Pareto efficient Symmetric

    45、If S is such that (u,v)S (v,u)S, then u*=v*. Independent from linear transformations Independent from irrelevant alternatives Suppose T S. If (u*,v*)T is a solution to S, then (u*,v*) should also be a solution to T.,EE228a, Fall 2002,41,Results,There is a unique solution which satisfies the above ax

    46、ioms maximizes the product of two players additional payoffs (uu0)(vv0) This solution can be enforced by “threats” Each player independently announces his/her threat Players then bargain on their threats If they reach an agreement, that agreement takes effort; Otherwise, initially announced threats

    47、will be used Different fairness criteria can be achieved by changing the last axiom (see references),EE228a, Fall 2002,42,Suggested Readings,J. F. Nash. “The Bargaining Problem.” Econometrica, vol.18, 1950. Nashs original paper. Very well written. X. Cao. “Preference Functions and Bargaining Solutio

    48、ns.” Proc. of the 21th CDC, NYC, NY, 1982. A paper which unifies all bargaining solutions into a single framework Z. Dziong and L.G. Mason. “FairEfficient Call Admission Control Policies for Broadband Networks a Game Theoretic Framework,” IEEE/ACM Trans. On Networking, vol.4, 1996. Applies Nashs bar

    49、gaining solution to resource allocation problem in admission control (multi-objective optimization),EE228a, Fall 2002,43,Coalitions,Model Players (n2) N form coalitions among themselves A coalition is any nonempty subset of N Characteristic function V defines a gameV(S)=payoff to S in the game between S and N-S, S NV(N)=total payoff achieved by all players acting togetherV() is assumed to be super-additiveS, T N, V(S+T) V(S)+V(T) Questions of Interest Condition for forming stable coalitions When will a single coalition be formed? How to distribute payoffs among players in a fair way?,


    注意事项

    本文(A Short Tutorial on Game Theory.ppt)为本站会员(appealoxygen216)主动上传,麦多课文档分享仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对上载内容本身不做任何修改或编辑。 若此文所含内容侵犯了您的版权或隐私,请立即通知麦多课文档分享(点击联系客服),我们立即给予删除!




    关于我们 - 网站声明 - 网站地图 - 资源地图 - 友情链接 - 网站客服 - 联系我们

    copyright@ 2008-2019 麦多课文库(www.mydoc123.com)网站版权所有
    备案/许可证编号:苏ICP备17064731号-1 

    收起
    展开