End-User Programming of Intelligent Learning Agents.ppt

Format: PPT. Size: 1.48 MB. Length: 28 pages.

End-User Programming of Intelligent Learning Agents
Prasad Tadepalli, Ron Metoyer, and Margaret Burnett
In conjunction with the EUSES Consortium: End Users Shaping Effective Software

Prasad Tadepalli: Machine Learning
- Scaling average-reward reinforcement learning to large spaces
- Relational learning

- Relational learning from prior knowledge and sparse user input
- Relational reinforcement learning

Ron Metoyer: Computer Graphics & Animation
- NSF CAREER Award winner (2003)
- Complexities of animated content
- Creating characters for training
- Emphasis on usability and realism
- Real-time simulation of evacuation dynamics for large crowds

Margaret Burnett: Visual & End-User Programming
- Project director: EUSES Consortium (End Users Shaping Effective Software), an ITR project by Oregon State, Carnegie Mellon, Drexel, Nebraska, and Penn State
- Principal architect: Forms/3, FAR end-user programming support
- Co-architect: Functions for Excel users (a Microsoft Research project)

Motivation
- Task training: sports, military
- Boston Dynamics Inc.
- Who creates the training content?

Current Approaches
- Joystick control: user does all (once, not reusable)
- Scripting languages: user does all (reusable program)
- Programming by demonstration: user and system share
- Autonomous agents: system does all

Application: Quarterback Training
- QBs can benefit from 3D training content.
- Coaches: do not program or animate; need responsive, semi-intelligent agents that perform football tasks.
- Agents: should get better over time; should do so with few examples.
- Agent behavior: must morph over time (different opponents).

End-User Programming by Demonstration
- Generalizing from demonstrations is still an active area of research: there are some viable approaches under particular assumptions, but it is not a solved problem.
- Other systems allow demonstrating only reactive behaviors; they are not used to train people in strategy.
- Largely distinct from machine learning.

Our Approach to End-User Programming
- Our approach: demonstrate goals and strategies to achieve the goals.
- Allows generalization and planning by agents.
- Thus suited to training: agents can simulate both “good” characters for training (desirable strategies) and “bad” characters (strategies we know they employ).

Example
- Goal: get the football to Character A.
- Demonstration: start state, goal state.
- Research issue: “What is relevant?”
  - Any trees are ignorable background.
  - Character A can be any character.
  - The football is a unique object.
[Figures: Start state; Goal state]

Example (cont.)
- Strategy 1: pass it directly. Demonstration: passing to A. “What's relevant” issues arise again.
- Strategy 2: pass it to B, who passes to A. New issue: recursiveness. (Need to learn a general strategy of “get it to someone who can get it closer to A”.)

Machine Learning Challenges
- Learning must be on-line.
- Users can only give a few examples.
- Provide a predictable model of generalization.
- Must include support for debugging.
- Must allow safety checks.
- Expressive representation language.

Strategy Languages
- Some high-level languages exist to express strategies, e.g., Golog, CML.
- Our plan: simpler rule-based languages, suitable for learning.
- Starting point: our previous work on a decomposition-rule language:
  IF Condition(s) and Goal(s) THEN Subgoals(s1, s2, ..., sn) WHILE invariant conditions hold.

Requirements of the Learning Algorithms
- Follow HCI findings: user motivation, attention, trust.
- Need a transparent generalization procedure, e.g., no neural nets.
- Treat user input as examples of a high-level specification of strategy ... and fill in the details.
- User “steers” agent behaviors to correct faulty generalizations.

- Assertions to monitor behavior: provided, inferred, and propagated.

Learning from Exercises
- Generate examples automatically by searching for successful plans.
- Bottom-up learning of skills.
- Learn how to solve simple problems first.
- Compose known strategies for solving subgoals to solve more complex goals.

Oops! That's Not Right!
- Debugging by end-user programmers:
  - When the agents pick the right strategy but it doesn't work right.
  - When the agents pick the wrong strategies.
- These provide negative examples to the learning component.

How to Support Debugging?
- User/system collaboration: the user helps narrow the problem; the system revises its rules and runs them on the example until the user is satisfied.
- Testing and assertions: used for quality control, but designed specifically for end users. Assertions will be used to rule out bad generalizations.

Debugging (cont.)
- Draws from our previous work on end-user software development: WYSIWYT testing, fault localization, and assertions.
- Surprise-Explain-Reward strategy: empirically driven research. Draws from psychology to motivate desired behaviors via surprises (to arouse curiosity).

Research Issues
- How to learn from a small number of examples?
- How to let the user “speak” his/her own language?
- How to motivate the users and earn their trust?
- How to facilitate debugging and maintenance in a natural way?
- How to make learning safe?

Summary: The Research Question
Is it possible to empower end users ... to program in evolving task-training environments ... using machine learning and programming by demonstration?

(The End)

Leftovers

How to Support Debugging?
- User/system collaboration.
- Builds on our previous work: motivating, suggesting, and supporting ... end-user testing, end-user fault localization, and end-user assertions.

Web Navigation (*possibly cut)
- Navigation of the web to satisfy a goal:
  - Students trying to find an appropriate school that matches their interests and constraints.
  - Shoppers looking for bargain purchases.
  - Traders searching for appropriate stocks to buy and sell.
- In each case, the system should learn to retrieve the target information efficiently.

Debugging
- Negative examples are used to specialize over-general rules.
- Maintain confidences of rules based on their support among the training examples, and suggest possibly incorrect rules.
- Encourage users to enter assertions to correct errors.
- Verify assertions during rule evaluation and warn the user if they are not valid.

Agent Behavior Control
[Diagram labels: Joystick controlled; Scripting languages; Autonomous; Autonomous but “teachable”; End-user agents (program by interaction, generalize)]
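
The Debugging slide above mentions specializing over-general rules with negative examples and keeping a confidence per rule based on its support. The following is only a minimal sketch of that idea, not the authors' algorithm: it assumes states are plain sets of facts, a rule has a goal, a set of conditions, and subgoals (the WHILE invariant clause of the decomposition-rule language is omitted), and confidence is simply the fraction of covered examples the user marked as correct. All names (`Rule`, `confidence`, `specialize`, and the football facts) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical simplified rule: IF conditions and goal THEN subgoals."""
    goal: str
    conditions: frozenset
    subgoals: tuple

    def fires(self, state, goal):
        # The rule applies when the goal matches and every condition is in the state.
        return goal == self.goal and self.conditions <= state


def confidence(rule, examples):
    """Fraction of the examples the rule fires on that the user marked correct."""
    covered = [ok for state, goal, ok in examples if rule.fires(state, goal)]
    if not covered:
        return 0.0
    return sum(covered) / len(covered)


def specialize(rule, positives, negative):
    """Add one condition shared by all covered positive states but absent
    from the negative state, so the rule no longer fires on the negative."""
    covered = [s for s in positives if rule.conditions <= s]
    if not covered:
        return rule
    common = frozenset.intersection(*covered)
    candidates = sorted(common - negative - rule.conditions)
    if not candidates:
        return rule  # nothing separates the examples; leave it for the user
    return Rule(rule.goal, rule.conditions | {candidates[0]}, rule.subgoals)


# Assumed toy data: two demonstrations where passing to A worked,
# and one user-flagged failure where A was not open.
positives = [
    frozenset({"has_ball", "a_open"}),
    frozenset({"has_ball", "a_open", "raining"}),
]
negative = frozenset({"has_ball"})
examples = [(s, "ball_at_A", True) for s in positives]
examples.append((negative, "ball_at_A", False))

rule = Rule("ball_at_A", frozenset({"has_ball"}), ("pass_to_A",))
refined = specialize(rule, positives, negative)

print(confidence(rule, examples), sorted(refined.conditions))
```

In this toy setup the over-general rule fires on the flagged failure, which lowers its confidence; the refined rule adds the one fact that separates the successful demonstrations from the failure. A low-confidence rule is the kind of thing a system could surface to the user as "possibly incorrect".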
