An Introduction to Stata for EconomistsPart II-Data Analysis.ppt
An Introduction to Stata for Economists
Part II: Data Analysis
Kerry L. Papps

1. Overview
- Do-files
- Summary statistics
- Correlation
- Linear regression
- Generating predicted values and hypothesis testing
- Instrumental variables and other estimators
- Panel data capabilities
- Panel estimators

2. Overview (cont.)
- Writing loops
- Graphs

3. Comment on notation used
Consider the following syntax description:
    list [varlist] [in range]
- Text in typewriter-style font should be typed exactly as it appears (although there are possibilities for abbreviation).
- Italicised text should be replaced by the desired variable names etc.
- Square brackets (i.e. [ ]) enclose optional parts of a Stata command (do not actually type the brackets).

4. Comment on notation used (cont.)
For example, an actual Stata command might be:
    list name occupation
This notation is consistent with the notation in the Stata Help menu and manuals.

5. Do-files
- Do-files allow commands to be saved and executed in “batch” form.
- We will use the Stata do-file editor to write do-files.
- To open the do-file editor, click Window > Do-File Editor or click the corresponding toolbar button.
- WordPad or Notepad can also be used: save as “Text Document” with the extension “.do” (instead of “.txt”). This allows larger files than the do-file editor.

6. Do-files (cont.)
- Note: a blank line must be included at the end of a WordPad do-file (otherwise the last line will not run).
- To run a do-file from within the do-file editor, either select Tools > Do or click the corresponding toolbar button.
- If you highlight certain lines of code, only those commands will run.
- To run a do-file from the main Stata window, either select File > Do or type:
    do dofilename

7. Do-files (cont.)
- Lines can be “commented out” by preceding them with * or by enclosing text within /* and */.
- The contents of the Review window can be saved as a do-file by right-clicking on the window and selecting “Save All”.

8. Univariate summary statistics
- tabstat produces a table of summary statistics:
    tabstat varlist [, statistics(statlist)]
- Example:
    tabstat age educ, stats(mean sd semean n)
- summarize displays a variety of univariate summary statistics (number of non-missing observations, mean, standard deviation, minimum, maximum):
    summarize [varlist]

9. Multivariate summary statistics
- table displays a table of statistics:
    table rowvar [colvar] [, contents(clist varname)]
- clist can be freq, mean, sum etc.
- rowvar and colvar may be numeric or string variables.
- Example:
    table sex educ, c(mean age median inc)

10. Multivariate summary statistics (cont.)
- One “super-column” and up to 4 “super-rows” are also allowed.
- Missing values are excluded from tables by default. To include them as a group, use the missing option with table.

11. EXERCISE 1: Generating simple statistics
- Open the do-file editor in Stata. Run all your solutions to the exercises from here.
- Open nlswork.dta from the internet as follows:
    webuse nlswork
- Type summarize to look at the summary statistics for all variables in the dataset.
- Generate a wage variable, which exponentiates ln_wage:
    gen wage=exp(ln_wage)

12. EXERCISE 1 (cont.): Generating simple statistics
- Restrict summarize to hours and wage and perform it separately for non-married and married (i.e. msp=0 and 1).
- Use tabstat to report the mean, median, minimum and maximum for hours and wage.
- Report the mean and median of wage by age (along the rows) and race (across the columns):
    table age race, c(mean wage median wage)

13. Sets of dummy variables
- Dummy variables take the values 0 and 1 only.
- Large sets of dummy variables can be created with:
    tab varname, gen(dummyname)
- When using large numbers of dummies in regressions, it is useful to name them with a pattern, e.g. id1, id2, ...
- Then id* can be used to refer to all variables beginning with id.

14. Correlation
- To obtain the correlation between a set of variables, type:
    correlate varlist [weight] [, covariance]
- The covariance option displays the covariances rather than the correlation coefficients.
- pwcorr displays all the pairwise correlation coefficients between the variables in varlist:
    pwcorr varlist [weight] [, sig]

15. Correlation (cont.)
- The sig option adds a line to each row of the matrix reporting the significance level of each correlation coefficient.
- The difference between correlate and pwcorr is that the former performs listwise deletion of missing observations while the latter performs pairwise deletion.
- To display the estimated covariance matrix after a regression command, use:
    estat vce

16. Correlation (cont.)
(This matrix can also be displayed using Stata's matrix commands, which we will not cover in this course.)

17. Linear regression
- To perform a linear regression of depvar on varlist, type:
    regress depvar varlist [weight] [if exp] [, noconstant robust]
- depvar is the dependent variable.
- varlist is the set of independent variables (regressors).
- By default Stata includes a constant. The noconstant option excludes it.

18. Linear regression (cont.)
- robust specifies that Stata report the Huber-White standard errors (which account for heteroskedasticity).
- Weights are often used, e.g. when data are group averages, as in:
    regress inflation unemplrate year [aweight=pop]
- This is weighted least squares (a special case of GLS).
- Note that here year allows for a linear time trend.

19. Post-estimation commands
- After all estimation commands (e.g. regress, logit), several predicted values can be computed using predict.
- predict refers to the most recent model estimated.
- predict yhat, xb creates a new variable yhat equal to the predicted values of the dependent variable.
- predict res, residual creates a new variable res equal to the residuals.

20. Post-estimation commands (cont.)
- Linear hypotheses can be tested (e.g. t-test or F-test) after estimating a model by using test.
- test varlist tests that the coefficients corresponding to every element in varlist jointly equal zero.
- test eqlist tests the restrictions in eqlist, e.g.:
    test sex=3
- The option accumulate allows a hypothesis to be tested jointly with the previously tested hypotheses.

21. Post-estimation commands (cont.)
Example:
    regress lnw sex race school age
    test sex race
    test school = age, accum

22. EXERCISE 2: Linear regression
- Compute the correlation between wage and grade. Is it significant at the 1% level?
- Generate a variable called age2 that is equal to the square of age (the square operator in Stata is ^2).
- Create a set of race dummies with:
    tab race, gen(race)
- Regress ln_wage on: age, age2, race2, race3, ms
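
The commands introduced in these slides can be strung together in a single do-file. The following is a minimal sketch, not the author's solution to the exercises: it assumes internet access for webuse, and the regressor list stops at race3 because the exercise text is cut off at that point. The variable names wage, age2, yhat and res are created by the sketch itself.

```stata
* Illustrative do-file: a sketch under the assumptions stated above.
clear all
webuse nlswork

* Univariate summary statistics
generate wage = exp(ln_wage)
summarize hours wage
bysort msp: summarize hours wage          // separately by marital status
tabstat hours wage, statistics(mean median min max)

* Pairwise correlation with significance levels
pwcorr wage grade, sig

* Quadratic in age and a set of race dummies (race1, race2, race3)
generate age2 = age^2
tabulate race, generate(race)

* Linear regression with Huber-White standard errors
regress ln_wage age age2 race2 race3, robust

* Post-estimation: fitted values, residuals, joint test, covariance matrix
predict yhat, xb
predict res, residuals
test race2 race3
estat vce
```

Saving this as a file with the extension .do and running it with do dofilename executes the commands in batch form, as described in the Do-files slides.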