Classical Test Theory and Reliability
Cal State Northridge
Psy 427
Andrew Ainsworth, PhD

Basics of Classical Test Theory
- Theory and Assumptions
- Types of Reliability
- Example

Classical Test Theory
- Classical Test Theory (CTT) is often called the “true score model”
- Called classic relative to Item Response Theory (IRT), which is a more modern approach
- CTT describes a set of psychometric procedures used to test items and scales: reliability, difficulty, discrimination, etc.

Classical Test Theory
- CTT analyses are the easiest and most widely used form of analyses. The statistics can be computed by readily available statistical packages (or even by hand)
- CTT analyses are performed on the test as a whole rather than on the item, and although item statistics can be generated, they apply only to that group of students on that collection of items

Classical Test Theory
- Assumes that every person has a true score on an item or a scale, if only we could measure it directly without error
- CTT analyses assume that a person's test score is composed of their “true” score plus some measurement error
- This is the common true score model

Classical Test Theory
- Based on the expected values of each component for each person, we can see that: [equation not preserved in source]
- E and X are random variables, t is constant
- However, this is theoretical and not done at the individual level

Classical Test Theory
- If we assume that people are randomly selected, then t becomes a random variable as well and we get: [equation not preserved in source]
- Therefore, in CTT we assume that the error:
  - Is normally distributed
  - Is uncorrelated with the true score
  - Has a mean of zero

[Figure: observed score X = T + E around a true score T]

True Scores
- Measurement error around a T can be large or small

[Figure: error distributions around true scores T1, T2, T3]

Domain Sampling Theory
- Another central component of CTT
- Another way of thinking about populations and samples
- Domain: population or universe of all possible items measuring a single concept or trait (theoretically infinite)
- Test: a sample of items from that universe

Domain Sampling Theory
- A person's true score would be obtained by having them respond to all items in the “universe” of items
- We only see responses to the sample of items on the test
- So, reliability is the proportion of variance in the “universe” explained by the test variance

Domain Sampling Theory
- A universe is made up of a (possibly infinitely) large number of items
- So, as tests get longer they represent the domain better; therefore, longer tests should have higher reliability
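
The claim that longer tests should be more reliable can be checked with a small simulation. The sketch below is illustrative only and assumes details the slides do not give: each item response is the person's true score plus independent normal error (an item-level version of X = T + E), the error SD of 2 and the sample size are arbitrary choices, and reliability is taken as the squared correlation between the total score and the true score. The helper names `pearson_r` and `simulated_reliability` are hypothetical.

```python
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation computed directly from its definition."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulated_reliability(n_items, n_people=5000, seed=427):
    """Squared correlation between total test score and true score."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    true_scores = [rng.gauss(0, 1) for _ in range(n_people)]
    totals = []
    for t in true_scores:
        # Assumed item model: response = true score + independent error (SD 2).
        totals.append(sum(t + rng.gauss(0, 2) for _ in range(n_items)))
    return pearson_r(totals, true_scores) ** 2


# Estimate reliability for tests of increasing length.
reliabilities = {k: simulated_reliability(k) for k in (5, 10, 20, 40)}
for k, rel in reliabilities.items():
    print(f"{k:>2} items: estimated reliability = {rel:.2f}")
```

Under these assumptions the estimated reliability rises as items are added, which is the pattern domain sampling theory predicts.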
- Also, if we take multiple random samples from the population, we can have a distribution of sample scores that represent the population

Domain Sampling Theory
- Each random sample from the universe would be “randomly parallel” to each other
- Unbiased estimate of reliability = squared correlation between test and true score = average correlation between the test and all other randomly parallel tests

Classical Test Theory Reliability
- Reliability is theoretically the correlation between a test score and the true score, squared
- Essentially the proportion of X that is T
- This can't be measured directly, so we use other methods to estimate it

CTT: Reliability Index
- Reliability can be viewed as a measure of consistency, or how well a test “holds together”
- Reliability is measured on a scale of 0 to 1. The greater the number, the higher the reliability.

CTT: Reliability Index
- The approach to estimating reliability depends on:
  - Estimation of “true” score
  - Source of measurement error
- Types of reliability:
  - Test-retest
  - Parallel forms
  - Split-half
  - Internal consistency

CTT: Test-Retest Reliability
- Evaluates the error associated with administering a test at two different times
- Time sampling error
- How to:
  - Give test at Time 1
  - Give SAME TEST at Time 2
  - Calculate r for the two scores
- Easy to do; one test does it all

CTT: Test-Retest Reliability
- Assume 2 administrations, X1 and X2
- The correlation between the 2 administrations is the reliability

CTT: Test-Retest Reliability
- Sources of error:
  - Random fluctuations in performance
  - Uncontrolled testing conditions: extreme changes in weather, sudden noises / chronic noise, other distractions
  - Internal factors: illness, fatigue, emotional strain, worry, recent experiences

CTT: Test-Retest Reliability
- Generally used to evaluate constant traits: intelligence, personality
- Not appropriate for qualities that change rapidly over time: mood, hunger
- Problem: carryover effects
  - Exposure to the test at Time 1 influences scores on the test at Time 2
  - Only a problem when the effects are random
  - If everybody goes up 5 points, you still have the same variability

CTT: Test-Retest Reliability
- Practice effects
  - A type of carryover effect
  - Some skills improve with practice: manual dexterity, ingenuity, or creativity
  - Practice effects may not benefit everybody in the same way
- Carryover and practice effects are more of a problem with short inter-test intervals (ITI)
- But longer ITIs have other problems: developmental change, maturation, exposure to historical events

CTT: Parallel Forms Reliability
- Evaluates the error associated with selecting a particular set of items
- Item sampling error
- How to:
  - Develop a large pool of items (i.e., the domain) of varying difficulty
  - Choose equal distributions of difficult / easy items to produce multiple forms of the same test
  - Give both forms close in time
  - Calculate r for the two administrations

CTT: Parallel Forms Reliability
- Also known as: alternative forms or equivalent forms
- Can give parallel forms at different points in time; produces error estimates of time and item sampling
- One of the most rigorous assessments of reliability currently in use
- Infrequently used in practice: too expensive to develop two tests

CTT: Parallel Forms Reliability
- Assume 2 parallel tests, X and X'
- The correlation between the 2 parallel forms is the reliability

CTT: Split Half Reliability
- What if we treat halves of one test as parallel forms? (Single test as whole domain)
- That's what a split-half reliability does
- This is testing for internal consistency
- Scores on one half of a test are correlated with scores on the second half of a test
- Big question: “How to split?”
- First hal
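
The test-retest procedure described earlier (give the same test twice, then calculate r) and the remark that a constant 5-point gain leaves reliability untouched can be illustrated with simulated data. This is a minimal sketch under assumed values, not data from the course: true scores are drawn from N(100, 15), each administration adds independent error with SD 7.5, and the "random carryover" condition adds a person-specific gain with SD 10. All variable names and the `pearson_r` helper are hypothetical.

```python
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation computed directly from its definition."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(427)  # fixed seed so the run is repeatable
n_people = 2000
true_scores = [rng.gauss(100, 15) for _ in range(n_people)]

# Two administrations of the same test: same true score, fresh error each time.
time1 = [t + rng.gauss(0, 7.5) for t in true_scores]
time2 = [t + rng.gauss(0, 7.5) for t in true_scores]
r_retest = pearson_r(time1, time2)

# Constant carryover: everybody goes up 5 points at Time 2.
time2_constant = [x + 5 for x in time2]
r_constant_gain = pearson_r(time1, time2_constant)

# Random carryover: the gain differs from person to person (assumed SD of 10).
time2_random = [x + rng.gauss(5, 10) for x in time2]
r_random_gain = pearson_r(time1, time2_random)

print(f"test-retest r:            {r_retest:.3f}")
print(f"r with constant +5 gain:  {r_constant_gain:.3f}")
print(f"r with random gain:       {r_random_gain:.3f}")
```

Because a correlation is unchanged by adding a constant to one variable, the uniform gain does not alter the estimate, while the person-specific gain adds variance unrelated to Time 1 scores and lowers it.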