An Oxygenated Presentation Manager.ppt
Slide 1: An Oxygenated Presentation Manager
  Larry Rudolph
  Oxygen Workshop, January, 2002
  Larry Rudolph & Shalini Agarwal

Slide 2: Goals & Overview
  Integrate Many Oxygen Technologies
  Application Driven
    Use an application that we understand
    Personally use often
    Would help if it were more human-centric
    Portable (as opposed to E-21)
  Develop Architectural Infrastructure
    Exposes new requirements
  Critique of Presentation Manager
    What is wrong with it
    What needs improvement

Slide 3: Application Scenario

Slide 4: An Oxygen Application
  Components
    Input: Vision, Speech, Touch
    Processing: Changing configuration
    Output: Projector, Handheld, Archive
  Equipment
    Today, it is too hard
      Linux laptop; Windows laptop; camera; microphone; network; projector; power blocks
    Tomorrow, much easier
      A couple of H21s

Slide 5: Camera watching laser point on screen
  Camera Challenges
    Inexpensive ones have wrong focal length
    Alignment issues
      Use edge of screen, display pattern, figure out from what is known to be visible
      We ended up displaying a pattern of concentric circles
    Relative size of laser point depends on distance
      Beyond ten feet, had to use only certain types of lasers
      Could slow down camera and let pixels saturate (too complicated)

Slide 6: Camera watching laser point on screen (cont)
  Camera Interface
    Click at point (x,y)
      Hold laser at same location for 5 seconds
    Select horizontal line ( (x1,y1) , (x1,y2) )
      Sweep laser back and forth, line is diameter of ellipse
    Select object centered at point (x,y)
      Sweep laser in circle, point is center of circle
    Previous or Next
      Click in left (right) 1/8 of screen

Slide 7: Microphone listening to speaker
  Microphone
    Many technologies: lapel mic; mic array; room microphone
  Current approach: iPAQ
    Continuous recognition
    Push to speak
  Audio server on iPAQ
    Detects start and stop
    Best results when human pushes to start and releases to stop
    Audio wave file sent to Galaxy speech system
  Galaxy outputs actions via CGI script
    A nice unifying mechanism
    One more complicated component

Slide 8: Speaker controlling presentation via iPAQ
  iPAQ output to CGI-script server
    Same actions as from speech server
  Actions are
    Next slide, Previous slide, Goto slide #n, Goto slide named
    Next item, Previous item, Goto item #n, Goto item named
    Next animation, Previous animation, Goto animation #n
    Start presentation, End presentation, Pause presentation
    Initialize camera, Test microphone
  Handheld (iPAQ) display
    GUI generated from SpeechBuilder grammar
    List of slides, items per slide
    Currently use ad-hoc solution where PowerPoint sends lists to iPAQ
    Need more automatic solution

Slide 9: Output to projector, handheld, archive
  Unlimited number of video / audio output producers
    E.g. PowerPoint is just one producer of output
  At any time, each output device has an associated producer
    This producer can receive input from several producers
  Handheld has proxy
    To reduce bandwidth to iPAQ
    Current slide, list of slides, list of commands
  Archive
    Each slide shown, audio (from a different microphone) sent to archive
    Currently just GIF of current slide

Slide 10: Processing controlling session
  Do not let PowerPoint control the world
    Slide viewer; movie player; program execution; browser; etc.
    Want to mix all types of applications
  Presenter has control of the output
    E.g. switch output producer from PowerPoint to media player
  Remove interrupting technologies
    Dynamically disconnect any input / output source
  All done via core language
    Or some other glue language, e.g. meta-glue
    Which handles all the other infrastructure issues

Slide 11: Multi-Modal Input
  Shalini Agarwal
  Oxygen Conference, January 8th, 2002

Slide 12: Initial Experience With Presentation Manager
  One Single Monolithic Context
    Command within slide, between slides, between applications
  Problem
    Too many false positives
  Preliminary Solution
    Slide tracking
      e.g. recognize “Next Slide” command only after at least 60% of words on slide have been said
      e.g. recognize “Show Demo” only after slide 17
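The slide-tracking rule above could be sketched roughly as follows. This is an illustration only, not the presenters' implementation: the `SlideTracker` class, its method names, and the word-matching rule (case-insensitive, each distinct slide word counted once) are assumptions. Only the 60% threshold comes from the slide.

```python
import re


def words(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"[a-z0-9']+", text.lower())


class SlideTracker:
    """Tracks how much of the current slide's text the speaker has said.

    Hypothetical helper: the names and matching rule are assumptions.
    """

    def __init__(self, slide_text, threshold=0.6):
        self.slide_words = set(words(slide_text))
        self.spoken = set()
        self.threshold = threshold

    def hear(self, utterance):
        """Record slide words that occur in a recognized utterance."""
        self.spoken |= self.slide_words & set(words(utterance))

    def coverage(self):
        """Fraction of distinct slide words spoken so far."""
        if not self.slide_words:
            return 1.0  # nothing to track, so the gate is always open
        return len(self.spoken) / len(self.slide_words)

    def accept(self, command):
        """Accept "next slide" only once coverage reaches the threshold."""
        if command == "next slide":
            return self.coverage() >= self.threshold
        return True


tracker = SlideTracker("Camera challenges: focal length, alignment")
tracker.hear("cheap cameras have the wrong focal length")
print(tracker.accept("next slide"))  # False: 2 of 5 slide words heard
tracker.hear("alignment was the other camera problem")
print(tracker.accept("next slide"))  # True: 4 of 5 slide words heard
```

A slide with no words leaves the gate permanently open in this sketch, so a text-free slide gets no protection against false positives.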
  Still lots of problems
    Many slide styles hard to track (e.g. figures, not words, on slide)
    Tracking for within slide different than for between slides

Slide 13: A Better Solution: Multiple Contexts
  Very Active Research Area
    Intelligent-room project; Galaxy; others
  Three layers, each having its own context
    Slide (Next Item, Next Animation)
    Presentation (Next Slide, Goto Conclusion, Goto Example)
    Session (Start Presentation, Switch to Browser, Show Questions)
  Challenges
    Each context requires its own speech recognition system
    Multicasting sound wave to each system
    Selecting the best result

Slide 14: Extending the Galaxy System
  Start with context for speech and then extend
  Note, our goals are similar but not identical to those of the Spoken Language Group
    We are not dialog-based
    Exploit their work
  Follow Galaxy
    Recognizer scores different guesses at words
    Language Processing Unit uses input grammar to select best input sentence
    Scott Cyphers gave us the nbest interface

Slide 15
  Recognizer chooses 10 best guesses at word matches (for this context)
  Language Processor picks best sentence from recognizer based on input grammar

Slide 16: System Structure
  [Diagram]

Slide 17: System Structure
  [Diagram labels: Slide Layer and Session Layer, each with a Recognizer and a Language Processor; Selector. Example phrases: next item, next movie, previous item, end presentation, start presentation, start explorer.]

Slide 18: System Structure
  [Diagram]

Slide 19: Add Recognizer for T9
  [Diagram labels: Sound Input, T9 Input; Recognizers; Language Processors for the Slide Layer, Presentation Layer, and Session Layer; Selector. Example phrases: next item, go to slide nine, start presentation.]

Slide 20: Add Recognizer for Graffiti
  [Diagram labels: Sound Input, T9 Input, Graffiti Input; Recognizers; Language Processors for the Slide Layer, Presentation Layer, and Session Layer; Selector. Example phrases: next item, go to slide nine, start presentation.]

Slide 21: Other Input Modes
  T9 (telephone keypad)
    To input a, b, or c, press “2”
    Current cell phones have a dictionary to select the correct word
    Lots of false positives (very annoying)
      Remember my introduction?
    Using an application-dependent grammar would reduce errors
  Pen-based character input
    Use strokes to input characters
    Current Palm Pilot only recognizes the “Graffiti” alphabet
    Lots of false positives (very annoying)
    Using an application-dependent grammar would reduce errors
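As a rough illustration of how an application-dependent grammar could cut T9 errors, the sketch below decodes keypad digits against the command lists of the three layers and lets a selector keep the best match. The function names, the word-match scoring, and the `min_score` cutoff are assumptions made for this example; the slides do not describe the actual recognizer or selector logic. Only the layer commands come from the slides (lowercased here).

```python
# Standard telephone keypad letter groups.
T9_KEYS = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
           "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: d for d, letters in T9_KEYS.items() for ch in letters}

# Per-layer command grammars (commands taken from the three-layer slide).
LAYERS = {
    "slide": ["next item", "next animation"],
    "presentation": ["next slide", "goto conclusion", "goto example"],
    "session": ["start presentation", "switch to browser", "show questions"],
}


def to_digits(word):
    """Digits a user would press to type `word` on a T9 keypad."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


def score(command, digit_words):
    """Fraction of the command's words whose T9 digits match the input.

    Assumed scoring rule, chosen only to keep the example small.
    """
    cmd_words = command.split()
    if len(cmd_words) != len(digit_words):
        return 0.0
    hits = sum(to_digits(w) == d for w, d in zip(cmd_words, digit_words))
    return hits / len(cmd_words)


def recognize(layer, digit_words):
    """Best (score, command) pair within one layer's grammar."""
    return max((score(c, digit_words), c) for c in LAYERS[layer])


def select(digit_words, min_score=1.0):
    """Selector: query every layer and keep the single best result."""
    best_score, command, layer = max(
        recognize(layer, digit_words) + (layer,) for layer in LAYERS)
    if best_score < min_score:
        return None  # no layer matched well enough; ignore the input
    return layer, command


print(select("6398 75433".split()))  # ('presentation', 'next slide')
```

With a general dictionary the keys 6398 are ambiguous ("next" and "newt" share them); against the command grammar only "next slide" fits, and input that matches no command is dropped rather than guessed.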
