
    By Cleophas KiioDirector, ICT.ppt


    The Best Practices in Census Data Processing Operation: Case of 2009 Census
    By Cleophas Kiio, Director, ICT
    15-sep-10

    Overview
      • Data processing activities review
      • Planning for data processing
      • Setting up the data processing site
      • Implementation
      • Data capture
      • Analysis
      • Dissemination
      • Archival

    Data Processing Activities Review
    Data processing (DP) follows the completion of field data collection and entails the following:
      • Capture
      • Cleaning/editing
      • Tabulation
      • Analysis
      • Dissemination
      • Archival

    Planning for Data Processing (DP)
    Identification of methodology/technology:
      • Keying From Paper (KFP): manual data entry, largely used in KNBS for small surveys
      • Keying From Image (KFI): scanning
      • Optical Mark Reading (OMR): scanning
      • Optical/Intelligent Character Recognition (OCR/ICR): scanning
      • Online data capture: use of PCs
      • Use of mobile devices (PDA)
    For the 2009 Census, KNBS chose scanning technology with OCR/ICR, having used the same in the 1999 Census. A study tour of the US Census Bureau was conducted to study its best practices. The major considerations were the budget and the availability of technical know-how.

    Planning for Data Processing (DP) contd
    Selection of tools and equipment:
      • Computers: 125 high-capacity computers with dual screens were acquired.
      • Servers: 3 high-end servers processed the census (32 GB memory, multiple processors and 1 terabyte of secondary storage each).
      • Storage: 3 high-capacity Storage Area Networks (SANs) were procured, initially 5 terabytes (TB) each but later upgraded to 14 TB each.
      • Capture software: given the challenges faced in the 1999 Census, where the Bureau used AFPS Pro from Top Image Systems (TIS), the Bureau chose the iCADE system (integrated Computer Assisted Data Entry), developed by the US Census Bureau.
      • Cleaning/tabulation software: CSPro (Census and Survey Processing System).
      • Scanners: 3 new Kodak 1860 high-volume scanners were acquired in addition to the 2 existing Kodak 1900 scanners used during the 1999 Census. Capable of scanning over 200 pages per minute (ppm).
      • Network infrastructure: all computers, scanners, servers and SANs were connected in a wide area network (WAN).

    Planning for Data Processing (DP) contd
    Design of questionnaires:
      • As standard practice, questionnaires are developed and designed with the technology to be used in mind.
      • The 2009 Census questionnaires were designed by highly trained Bureau staff.
      • Technical support was offered by the US Census Bureau.
      • Precision in design was critical for compatibility with the iCADE system.

    Setting Up the DP Site
      • Planning the layout (library, KFI, OCR/manual registration, server room, editing)
      • Installing the computer network
      • Installing the power supply system and provisioning for a power backup system: UPS and generators
      • Installing the furniture, lifts and air conditioning
      • Procuring high-bandwidth internet
      • A warehouse for storage
      • Recruitment of staff

    Implementation
      • Installation of systems and testing was completed after census enumeration.
      • Staff were trained on the iCADE system.
      • In 2009 there were approximately 12 million A3 questionnaires.
      • Close to 500 personnel were engaged for the processing.
      • Processing took less than one (1) year to complete.

    a) 2009 Data Capture Processes
      • Tracking of questionnaires was done with a custom-made tracking system with an inbuilt geocode list, to ascertain completeness and for flow control.
      • Guillotining: trimming/cutting off the spirals.
      • iCADE system processes: batching (registering books from each enumeration area (EA) in iCADE), scanning, auto and manual registration, exception review, OCR review and Key From Image (KFI).

    Capture Output
      • Captured data was output to a text file, then auto-formatted as input to the CSPro software.
      • OCR characters read: 2,485,008,272 with an accuracy rate of 99.86% (0.14% error).
      • KFI characters keyed: 228,771,647 with a 99.94% accuracy rate (0.055% error).
      • This means the OCR read over 90% of the characters with a very high accuracy rate. OCR review definitely helped achieve this accuracy rate, but customization algorithms had to be added to improve the quality.
      • 22,326,373 images were produced from the census questionnaires.
      • 273,201 books were processed in 144,098 batches.
      • 10,602 batches went to exception review; 133,496 batches bypassed exception review altogether and went straight into OCR.

    b) 2009 Data Analysis
    KNBS used CSPro, a freeware package from the US Census Bureau. This process required:
      • Subject matter specialists to provide editing rules
      • Programmers to implement the editing rules through programs
    The team developed the editing program with which the data is cleaned.

    Editing/cleaning and Imputation
      • Editing is the systematic inspection of invalid and inconsistent responses, and their subsequent manual or automatic correction according to predetermined rules (edit specs).
      • Imputation is the procedure of assigning values to missing, invalid or inconsistent data using a set of predefined criteria embedded in an editing program.

    Why Edit and Impute?
      • Clean up data to facilitate analysis
      • Identify types and sources of error
      • Improve the quality of census data
      • Errors must be detected and their causes identified
      • Appropriate corrective measures are taken to improve the overall data quality

    c) Data Tabulation
      • The process of producing data outputs (tables, frequencies, cross-tabulations)
      • Requires subject matter specialists to prepare dummy output layouts, supported by programmers
      • Data is then presented in these tabular layouts

    d) Data Dissemination
    Providing the public with information through census books, fliers, CDs, DVDs and online databases (CensusInfo, IMIS, SMS service).

    e) Data Archival
    Documentation for permanent storage, for further and future analysis.

    Challenges
      • The warehouse was located about 10 km from the processing centre.
      • Processing space was inadequate.
      • Printing was not perfect, which affected the OCR.
      • The limited number and constant breakdown of the KNBS dedicated lift slowed down processing.
      • Power outages posed a major challenge.
      • As the system was new, its acceptance was cautious and slow.

    Best practices: Lessons learnt
      • A comprehensive DP plan should be developed with clearly defined objectives: efficiency and effectiveness, to process in the shortest time possible; control of processing cost, to avoid budget overruns; and quality data output.
      • Carry out risk analysis beforehand to identify potential pitfalls and put mitigation measures in place.

    Best practices: Lessons learnt contd
      • Cartographic mapping should be completed 1 year before the census; geographical codes and related documentation (geo-codes) should be ready 6 months before enumeration.
      • Census tools and equipment should be acquired in good time.
      • The DP site should be ready 6 months before the enumeration date to allow for test runs.
      • Technical and maintenance support measures must be instituted and enforced.

    Best practices: Lessons learnt contd
      • Questionnaires and manuals should be ready 5 months before the census date to allow for logistics and pretesting.
      • Total quality control at the printing press must be ensured for precision printing.
      • Recruitment and training of staff should be done before the census date.
      • The DP site should be located in close proximity to the questionnaire warehouse.

    Conclusion
    Despite the challenges, it was possible to complete DP in less than a year after the census. However, with better planning and organization it is possible to complete the exercise within 6 months after enumeration. The lessons learnt may form the recommendations that, if adopted, would allow this to be attained.

    Thank You!
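The slides say that questionnaire tracking relied on an inbuilt geocode list to ascertain completeness. The custom tracking system itself is not described, so the following is only a minimal sketch of that kind of check, assuming each enumeration area (EA) has one geocode and that receipt of its books is logged by geocode. The function name and the sample codes are hypothetical.

```python
def completeness_report(expected_codes, received_codes):
    """Compare the geocode list against the EAs whose books were received.

    Both arguments are iterables of geocode strings. The code format used
    here is an assumption, not the KNBS geocode scheme.
    """
    expected = set(expected_codes)
    received = set(received_codes)
    return {
        # EAs on the geocode list with no books logged yet
        "missing": sorted(expected - received),
        # Logged codes that are not on the geocode list (possible keying errors)
        "unexpected": sorted(received - expected),
        # True only when every expected EA has been received
        "complete": expected <= received,
    }


if __name__ == "__main__":
    geocode_list = ["101-01-001", "101-01-002", "101-01-003"]  # hypothetical codes
    logged = ["101-01-001", "101-01-003", "999-99-999"]
    print(completeness_report(geocode_list, logged))
```

Using sets keeps the check independent of the order in which books arrive at the warehouse or processing centre.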

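The Capture Output figures can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted on that slide; the variable and function names are illustrative. It derives the OCR share of all captured characters, reconciles the batch counts and estimates the number of character errors implied by the quoted error rates.

```python
# Figures as quoted on the Capture Output slide
OCR_CHARS = 2_485_008_272
KFI_CHARS = 228_771_647
OCR_ERROR_RATE = 0.0014     # 0.14% error
KFI_ERROR_RATE = 0.00055    # 0.055% error
BATCHES_TOTAL = 144_098
BATCHES_EXCEPTION = 10_602
BATCHES_BYPASSED = 133_496


def capture_summary():
    """Derive summary ratios from the quoted capture figures."""
    total_chars = OCR_CHARS + KFI_CHARS
    return {
        "total_chars": total_chars,
        # Fraction of all captured characters that were read by OCR
        "ocr_share": OCR_CHARS / total_chars,
        # Fraction of batches that needed exception review
        "exception_share": BATCHES_EXCEPTION / BATCHES_TOTAL,
        # The two batch routes should add up to the batch total
        "batches_reconcile": BATCHES_EXCEPTION + BATCHES_BYPASSED == BATCHES_TOTAL,
        # Character errors implied by the quoted error rates
        "ocr_implied_errors": round(OCR_CHARS * OCR_ERROR_RATE),
        "kfi_implied_errors": round(KFI_CHARS * KFI_ERROR_RATE),
    }


if __name__ == "__main__":
    for name, value in capture_summary().items():
        print(f"{name}: {value}")
```

The OCR share comes out at roughly 91.6%, consistent with the "over 90%" statement, and about 7.4% of batches went through exception review.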

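The definitions of editing and imputation can be illustrated with a small sketch. KNBS implemented its edit specs in CSPro, and the slides do not list the actual rules, so everything below is an assumption made for illustration: the age range rule, the `MAX_AGE` bound, the `cold_deck_age` starting value and the choice of a sequential hot-deck (reuse the last valid value seen) as the imputation criterion.

```python
from typing import Optional

MAX_AGE = 120  # hypothetical upper bound for a valid age


def is_valid_age(value: Optional[int]) -> bool:
    """Edit rule (assumed): age must be present and within 0..MAX_AGE."""
    return value is not None and 0 <= value <= MAX_AGE


def edit_and_impute(records, cold_deck_age=30):
    """Apply the age edit and impute failures from the last valid record.

    Returns (cleaned_records, number_imputed). Input records are not
    modified. `cold_deck_age` is used only until a valid donor is seen.
    """
    cleaned = []
    donor_age = cold_deck_age
    imputed = 0
    for record in records:
        fixed = dict(record)  # work on a copy
        if is_valid_age(fixed.get("age")):
            donor_age = fixed["age"]      # this record becomes the donor
            fixed["age_imputed"] = False
        else:
            fixed["age"] = donor_age      # assign the donor value
            fixed["age_imputed"] = True   # flag so imputations can be audited
            imputed += 1
        cleaned.append(fixed)
    return cleaned, imputed


if __name__ == "__main__":
    sample = [{"age": 34}, {"age": None}, {"age": 250}, {"age": 7}, {"age": -1}]
    rows, count = edit_and_impute(sample)
    print([r["age"] for r in rows], count)
```

Flagging each imputed value supports the stated goal of identifying the types and sources of error, since the imputation counts can be tabulated afterwards.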