ASTM E2849-2018 Standard Practice for Professional Certification Performance Testing.pdf

• Resource ID: 1243875 · File size: 73.57 KB · Pages: 7
• Format: PDF

Designation: E2849-18
An American National Standard

Standard Practice for Professional Certification Performance Testing¹

This standard is issued under the fixed designation E2849; the number immediately following the designation indicates the year of original adoption or, in the case of revision, the year of last revision. A number in parentheses indicates the year of last reapproval. A superscript epsilon (ε) indicates an editorial change since the last revision or reapproval.

¹ This practice is under the jurisdiction of ASTM Committee E36 on Accreditation …

1. Scope

1.1 This practice covers both the professional certification performance test itself and specific aspects of the process that produced it.
1.2 This practice does not include management systems. In this practice, the test itself and its administration, psychometric properties, and scoring are addressed.
1.3 This practice primarily addresses individual professional performance certification examinations, although it may be used to evaluate exams used in training, educational, and aptitude contexts. This practice is not intended to address on-site evaluation of workers by supervisors for competence to perform tasks.
1.4 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.
1.5 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

2. Terminology

2.1 Definitions. Some of the terms defined in this section are unique to the performance testing context. Consequently, terms defined in other standards may vary slightly from those defined in the following.
2.1.1 automatic item generation (AIG), n. a process of computationally generating multiple forms of an item.
2.1.2 candidate, n. someone who is eligible to be evaluated through the use of the performance test; a person who is or will be taking the test.
2.1.3 construct validity, n. degree to which the test evaluates an underlying theoretical idea resulting from the orderly arrangement of facts.
2.1.4 differential system responsiveness, n. measurable difference in response latency between two systems.
2.1.5 examinee, n. candidate in the process of taking a test.
2.1.6 gating item, n. unit of evaluation that shall be passed to pass a test.
2.1.7 inter-rater reliability, n. measurement of rater consistency with other raters.
2.1.7.1 Discussion. See rater reliability.
2.1.8 item, n. scored response unit.
2.1.8.1 Discussion. See task.
… a task can be scored as one item; a task may also be comprised of multiple components each of which is scored as an item.
2.1.21 test, n. sampling of behavior over a limited time in which an authenticated examinee is given specific tasks under specified conditions, tasks that are scored by a uniformly applied rubric.
2.1.21.1 Discussion. A test can also be referred to as an assessment, although typically “assessment” is used for formative evaluation. This practice addresses specifically certification and licensure, as stated in 1.3. A test is designed to predict the examinee's behavior in a specified context, the “target context.”
2.1.22 trajectory, n. candidate's path through the solution to a single item, task, or test.
2.1.22.1 Discussion. Also termed the response trajectory.
2.1.23 validity, n. extent to which a test predicts target behavior for multiple candidates within a target context.

3. Significance and Use

3.1 This practice for performance testing provides guidance to performance test sponsors, developers, and delivery providers for the planning, design, development, administration, and reporting of high-quality performance tests. This practice assists stakeholders from both the user and consumer communities in determining the quality of performance tests. This practice includes requirements, processes, and intended outcomes for the entities that are issuing the performance test, developing, delivering and evaluating the test, users and test takers interpreting the test, and the specific quality characteristics of performance tests. This practice provides the foundation for both the recognition and accreditation of a specific entity to issue and use effectively a quality performance test.
3.2 Accreditation agencies are presently evaluating performance tests with criteria that were developed primarily or exclusively for multiple-choice examinations. The criteria by which performance tests shall be evaluated and accredited are ones appropriate to performance testing. As accreditation becomes more critical for acceptance by federal and state governments, insurance companies, and international trade, it becomes more critical that appropriate standards of quality and application be developed for performance testing.

4. Candidate Preparation

4.1 Number of Practice Items. A candidate shall be given access to sufficient practice items that the novelty of the item format shall not inhibit the examinee's ability to demonstrate his or her capabilities.
4.2 Scoring Rubric Available to Candidates:
4.2.1 Candidates shall have sufficient information about the scoring rubric to be able to appropriately prioritize their efforts in completing the item or test.
4.2.2 The examinee shall not be provided so much information about the scoring rubric that it diminishes the ability of stakeholders to generalize the examinee's skills from his or her test score.
4.3 Practice Tests:
4.3.1 There are two types of practice tests: one for gaining familiarity with the user interface of the test items and the other to allow the candidate to self-evaluate mastery of the content.
4.3.1.1 User Interface Preparation. A practice test or tests to familiarize candidates with the user interface shall be made available to the candidate at no charge. The practice test shall be sufficient to assure adequate candidate practice time so that the degree of familiarity with the user interface does not impair the validity of the test.
4.3.1.2 Content Self-Assessment. Practice tests that evaluate content mastery may be made available at no charge or for a fee. There is no obligation on the part of the test provider to provide a self-assessment practice test to evaluate content mastery.
NOTE 1: If a practice test is provided, it shall sample test content sufficiently to allow the candidate to predict reasonably success or failure on the test.
4.3.2 Candidates shall know specifically which type of practice test they are requesting.
4.3.3 Both types of practice test shall help candidates understand how their responses are going to be scored.

5. Procedure

5.1 Item Development. All requirements in Section 5 may be superseded by empirical, logical, or statistical arguments demonstrating that the practices of a certification body are equivalent to or superior to the practices required to meet this practice.
5.1.1 Item Time Limits:
5.1.1.1 When items or test sections can be accessed repeatedly, no item time limit is required to be enforced or recommended to the candidate.
5.1.1.2 When items can be accessed only once, item time limits shall be either suggested or enforced, with a visual timekeeping option for the examinee.
5.1.1.3 For a power test, item time limits shall be set using a standard practice such as the mean item response time measured in beta testing plus two standard deviations for successful candidates within the calibration sample. When sufficient data have been collected from test administrations, the item time shall be recalibrated to reflect performance on the actual test.
5.1.1.4 For a speeded test, item time limits shall be determined by measuring minimum acceptable time limits in the target context.
5.1.2 Differential System Responsiveness. Differential system responsiveness may be due to variance in network bandwidth, network latency, random-access memory (RAM), storage speed, operating systems, computer processing unit (CPU) count and performance, bus speed, or other factors.
NOTE 2: It is the obligation of the test developer to attempt to measure differences in latency and system responsiveness whenever possible and, if possible, to compensate appropriately for these variations.
5.1.2.1 There shall be compensation in test scoring for variances in the hardware and software environment to assure that all examinees are scored fairly.
NOTE 3: Compensation may be in adjusting item time limits, item latency scoring factors, or other compensatory variables.
5.1.2.2 An examinee taking a test under one set of conditions shall receive the same score as if he or she took the test under any admissible alternative set of conditions.
5.1.3 References/Citations. When possible, codes, guidelines, industry standards, application source code, or other evidence shall be sufficient to establish the correctness of scoring a procedure. Where such documentation does not exist, correct responses may be documented as standard practice by a vote of the subject matter expert (SME) advisory panel for the test.
5.1.4 Rater Reliability. When human raters are involved in assessing item success, rater reliability shall correlate with an established performance standard greater than 0.80.
5.1.4.1 When multiple raters are used to rate a single performance, inter-rater reliability shall correlate higher than 0.80.
5.1.5 Automated Scoring. To verify automated scoring, the test developer shall develop test cases that verify the scoring of a minimum of 95 % of anticipated responses. When items are scored automatically, for the first 100 administrations of the test, the test developer shall verify that the scoring algorithm is scoring responses correctly. Verification may be done by human observation, alternate scoring mechanisms, playback of recorded performance, or audit of collected data. Initial verification shall be performed for at least 5 % of failed items. After 100 administrations, the developer shall verify 1 % of failed items until at least 200 failed items have been checked.
5.1.6 Item Stimulus Construction. The item solution space shall enable options that would be used by at least 95 % of practitioners in addressing the problem represented by the item.
NOTE 4: The estimate of the practitioner percentage can be derived empirically from usability studies, use cases, expert panels, observation, or other empirical means.
5.1.7 Simulation Representation of Reality. Simulation rules shall represent reality as it is encountered in the target context or accurately abstract essentials of reality in the target context, unless the content of the item is for the candidate to infer the rules of the simulation.
5.1.8 Access to Help. Support available to the candidate during the examination shall reflect the support available in the target context, unless the test is designed to predict candidate behavior in an unsupported environment.
5.1.9 Reconfiguration. Reconfiguration is so commonplace in many work environments that it shall be taken into account when evaluating the valid range of interpretations of a performance test.
5.1.9.1 If minimal reconfiguration is encountered in the field, requiring the examinee to take the test with the default configuration is acceptable.
5.1.9.2 If field practice normally involves extensive reconfiguration of the tools, then the test shall allow candidates to import their industry standard configurations into the test environment, provided that doing so does not compromise exam security, provide unfair advantage over other candidates, or impact the generalizability of results.
5.1.9.3 The criterion the test developer shall use to determine “minimal reconfiguration” is whether competence measured with the default configuration will predict performance with a reconfigured system.
5.1.10 Level of Feedback. Feedback during the test shall reflect feedback available doing similar tasks in the target context.
NOTE 5: Feedback may be time compressed to minimize testing time. Interim results may be omitted if they do not impact success in performing the item.
5.1.11 Americans with Disabilities Act (ADA) Accommodations. Accommodations shall be fair to the candidate, the testing administrator, other candidates, and the potential employer alike, with no interest predominating. Before awarding accommodations, the test administrator shall discuss with the candidate what the candidate feels would be reasonable accommodations and, when feasible, shall allow the methods candidates use for accomplishing tasks in the target context. The candidate shall possess the capability to perform the required test item in full with the agreed upon accommodations. In no case shall a verbal option be given in place of a performance requirement.
5.1.12 Sensitivity and Bias. Items shall be developed with sensitivity toward the cultural context within which the candidate will be practicing the skills evaluated. The items shall not include content that would prevent people of equal ability or skill from exhibiting those abilities or skills.
5.1.13 Item Response Termination. Item termination methods used shall create an environment in which the examinee's response during a test will best predict performance in the target context.
NOTE 6: In the target context, if an examinee determines completion of the task, then the examinee shall indicate completion of the task on the test. If, in the target context, an external individual determines completion of the task, then an examiner or external indication shall terminate the item.
5.1.14 Observer Item Effects. The test developer shall minimize the intrusiveness of the item observer on the process being evaluated at or below the normal level of supervision encountered by the candidate in the target context.
5.1.15 Item Scoring:
5.1.15.1 Item scoring shall be both consistent and fair. The scoring rubric shall be applied in the same manner to all examinees' responses. The scoring rubric shall give credit to all correct responses.
5.1.15.2 There shall be a method that allows an auditor to evaluate scored states of the item, evaluate the accuracy of task and item timing, and assess the accuracy of the weighting scheme if one is applied.
5.1.15.3 When the universe of response trajectories is undefined, scoring for a reasonable set of correct paths to the correct answer shall be verified.
5.2 Test Development:
5.2.1 Equivalent Forms:
5.2.1.1 Difficulty: IRT. Test information functions shall have integrals within 2 % of each other and not depart more than 5 % anywhere along the theta range from -3.0 to +3.0.
5.2.1.2 Difficulty: Classical Test Theory. Difficulty between forms shall be equated. The recommended range of P-values is from 0.35 to 0.95.
5.2.1.3 Difficulty: AIG Equivalence. The test developer shall periodically evaluate variant forms of items to assur…
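
The time-limit rule for power tests in 5.1.1.3 (mean item response time of successful candidates plus two standard deviations) can be illustrated with a short sketch. The function name, the sample data, and the use of the sample (n-1) standard deviation are assumptions made for illustration; the practice itself does not say which standard deviation estimator to use.

```python
from statistics import mean, stdev


def power_test_item_time_limit(successful_times_s, k=2.0):
    """Return mean + k * SD of response times, in seconds.

    successful_times_s: response times of candidates who answered the item
    successfully in beta testing (hypothetical input format).
    k: number of standard deviations to add; 5.1.1.3 mentions two.
    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    if len(successful_times_s) < 2:
        raise ValueError("need at least two response times to estimate a standard deviation")
    return mean(successful_times_s) + k * stdev(successful_times_s)


# Hypothetical beta-test response times (seconds) for one item.
beta_times = [100.0, 110.0, 120.0, 130.0, 140.0]
limit = power_test_item_time_limit(beta_times)
print(f"suggested item time limit: {limit:.1f} s")
```

Once operational data are available, the same function could be rerun on response times from real administrations to recalibrate the limit, as the clause requires.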

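
The 0.80 threshold on inter-rater reliability in 5.1.4.1 can be checked with a minimal sketch like the one below. It assumes that "correlate" means the Pearson product-moment correlation between two raters' scores for the same performances; the practice does not name a statistic, and other measures (for example an intraclass correlation) may be more appropriate in some designs. All names and data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of ratings."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two rating lists of equal length >= 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return sxy / sqrt(sxx * syy)


def meets_reliability_threshold(ratings_1, ratings_2, threshold=0.80):
    """True if the two raters correlate strictly higher than the threshold."""
    return pearson_r(ratings_1, ratings_2) > threshold


# Hypothetical scores given by three raters to the same five performances.
rater_a = [1, 2, 3, 4, 5]
rater_b = [2, 4, 6, 8, 10]   # same ordering and spacing as rater_a
rater_c = [5, 1, 4, 2, 3]    # largely unrelated to rater_a

print(meets_reliability_threshold(rater_a, rater_b))
print(meets_reliability_threshold(rater_a, rater_c))
```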

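
The IRT equivalence criterion in 5.2.1.1 can be sketched numerically. The example below makes several assumptions that are not stated in the practice: items follow a two-parameter logistic (2PL) model, each form is a list of hypothetical (discrimination, difficulty) pairs, the integral is approximated with the trapezoidal rule on an even grid, and both the 2 % and 5 % tolerances are taken relative to the larger of the two values being compared.

```python
from math import exp


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = 1.0 / (1.0 + exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)


def form_information(theta, items):
    """Test information function: sum of item informations for one form."""
    return sum(item_information(theta, a, b) for a, b in items)


def forms_equivalent(form_1, form_2, lo=-3.0, hi=3.0, steps=600,
                     area_tol=0.02, point_tol=0.05):
    """Check the area (2 %) and pointwise (5 %) criteria on [lo, hi]."""
    h = (hi - lo) / steps
    thetas = [lo + i * h for i in range(steps + 1)]
    tif_1 = [form_information(t, form_1) for t in thetas]
    tif_2 = [form_information(t, form_2) for t in thetas]

    def trapezoid(values):
        # Composite trapezoidal rule on the evenly spaced theta grid.
        return h * (sum(values) - 0.5 * (values[0] + values[-1]))

    area_1, area_2 = trapezoid(tif_1), trapezoid(tif_2)
    area_ok = abs(area_1 - area_2) <= area_tol * max(area_1, area_2)
    point_ok = all(abs(x - y) <= point_tol * max(x, y)
                   for x, y in zip(tif_1, tif_2))
    return area_ok and point_ok


# Hypothetical forms: (discrimination a, difficulty b) per item.
form_a = [(1.0, -1.0), (1.2, 0.0), (0.9, 1.0)]
form_b = [(1.0, -0.99), (1.2, 0.01), (0.9, 1.01)]   # nearly identical to form_a
form_c = [(1.5, -1.0), (1.8, 0.0), (1.35, 1.0)]     # much more discriminating

print(forms_equivalent(form_a, form_b))
print(forms_equivalent(form_a, form_c))
```

A different reference for the percentages (for example, always relative to a designated base form) would change borderline results, so the tolerance definition should be fixed before forms are compared.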