Introduction to R Lecture 3- Data Manipulation.ppt
Introduction to R Lecture 3: Data Manipulation
Andrew Jaffe
9/27/10

Overview
- Practice Solutions
- Indexing
- Data Management
- Data Summaries

Practice
Make a 2 x 2 table of sex and dog
> table(dat$sex, dat$dog)
     no yes
  F 264 229
  M 254 253

Practice
Create a BMI variable using height and weight
> dat$bmi = dat$weight*703/dat$height^2
> head(dat$bmi)
[1] 23.44931 31.29991 25.69422 23.89881 23.11172 28.13324

Practice
Create an overweight variable, which gives the value 1 for people with BMI > 30 and 0 otherwise
> dat$overweight = ifelse(dat$bmi > 30, 1, 0)
> head(dat$overweight)
[1] 0 1 0 0 0 0

Practice
Add those two variables to the dataset and save it as a text file somewhere
> write.table(dat, "lec2_practice.txt", quote = F, row.names = F, sep="\t")

Overview
- Practice Solutions
- Indexing
- Data Management
- Data Summaries

Indexing
Vectors: vector[index] takes index elements from vector and returns them
> x = c(1,3,7,34,435)
> x[1]
[1] 1
> x[c(1,4)]
[1]  1 34
> x[2:4]
[1]  3  7 34
> 2:4
[1] 2 3 4

Indexing
Replace elements in a vector combining indexing, is.na(), and rep()
> x = c(1,3,NA,6,NA,8)
> which(is.na(x))
[1] 3 5
> x[is.na(x)] = 0 # or rep(0)
> x
[1] 1 3 0 6 0 8

Indexing
Data.frames/matrices: dat[row,col]
- Can subset/extract a row: dat[row,]
- Can subset/extract a column: dat[,col]
> x = matrix(c(1,2,3,4,5,6), ncol = 3)
> x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

Indexing
> x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> x[1,]
[1] 1 3 5
> x[,1]
[1] 1 2
> x[1,1]
[1] 1
> x[1:2,1:2]
     [,1] [,2]
[1,]    1    3
[2,]    2    4

Indexing
> x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
> x[1,] = rep(1)
> x
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    4    6
> x[,1] = rep(2)
> x
     [,1] [,2] [,3]
[1,]    2    1    1
[2,]    2    4    6

Overview
- Practice Solutions
- Indexing
- Data Management
- Data Summaries

Data Management
An aside: save() and load()
- save(obj_1, ..., obj_n, file = "filename.rda")
  Saves R objects (vectors, matrices, or data.frames) as an .rda file (similar to .dta)
- load("filename.rda")
  Loads whatever objects were saved in the .rda file
- Easier than reading/writing tables

Data Management
- Your workspace can be saved as an .rda file
- You get asked this every time you close R
- save.image("filename.Rdata") saves all objects in your workspace (what ls() returns)
- Each folder might have its own .Rdata file
- Doing this is personal preference: if you have a script and it's a quick analysis, you probably don't need a saved image

Data Management
- "lec3_data.rda" can be downloaded from the website
- Similar method to read in the data: load("lec3_data.rda")
  - Put it in the same directory as your script
  - Set your working directory
  - Use the full filename

Data Management
What are the dimensions of the dataset?
> dim(dog_dat)
[1] 482   6

Data Management
How many dogs are in this dataset? Is this dataset unique?
> length(unique(dog_dat$dog_id))
[1] 482
> length(dog_dat$dog_id)
[1] 482

Data Management
What are the column/variable names?
> head(dog_dat)
  dog_id owner_id dog_type dog_wt_mo1 dog_len_mo1 dog_food_mo1
1      1      394      lab       51.5        13.8         25.8
2      2      571      lab       48.3        24.6         33.1
3      3      986   poodle       59.3        22.7         29.2
4      4      750      lab       46.4        22.3         27.6
5      5      882    husky       48.0        20.9         28.0
6      6      762   poodle       47.0        19.1         31.0
> names(dog_dat)
[1] "dog_id"       "owner_id"     "dog_type"     "dog_wt_mo1"
[5] "dog_len_mo1"  "dog_food_mo1"

Data Management
Some explanation of the variables
- dog_id: id of dog
- owner_id: id of owner
- dog_type: type of dog
- dog_wt_mo1: dog weight at month 1 (baseline)
- dog_len_mo1: dog length at month 1
- dog_food_mo1: baseline dog food consumption

Data Management
Subsetting data: separate data into two data.frames based on a variable:
> lab = dog_dat[dog_dat$dog_type == "lab",]
> head(lab)
   dog_id owner_id dog_type dog_wt_mo1 dog_len_mo1 dog_food_mo1
1       1      394      lab       51.5        13.8         25.8
2       2      571      lab       48.3        24.6         33.1
4       4      750      lab       46.4        22.3         27.6
7       7      664      lab       53.0        18.2         25.7
13     13      713      lab       48.3        23.4         31.8
15     15      480      lab       46.6        20.8         31.3

Data Management
> lab = dog_dat[dog_dat$dog_type == "lab",]
> head(which(dog_dat$dog_type == "lab"))
[1] 1 2 4 ...
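
The subsetting step above can be sketched on a small made-up data frame. This is only an illustration under stated assumptions: toy_dogs is a hypothetical stand-in for dog_dat that reuses the six dog_type values shown by head(dog_dat), and the names is_lab, lab_index, lab_logical, lab_which, and not_lab are not from the lecture.

```r
# Hypothetical toy data standing in for dog_dat (not the lecture's dataset)
toy_dogs <- data.frame(
  dog_id   = 1:6,
  dog_type = c("lab", "lab", "poodle", "lab", "husky", "poodle"),
  stringsAsFactors = FALSE
)

# Logical test with == (comparison), not = (assignment)
is_lab <- toy_dogs$dog_type == "lab"

# Option 1: index rows with the logical vector itself
lab_logical <- toy_dogs[is_lab, ]

# Option 2: index rows with the positions that which() returns
lab_index <- which(is_lab)
lab_which <- toy_dogs[lab_index, ]

# The second data.frame: every row that is not a lab
not_lab <- toy_dogs[!is_lab, ]

print(lab_index)
print(lab_logical)
print(not_lab)
```

Both options select the same rows here. One practical difference: if dog_type contained an NA, logical indexing would return an all-NA row for it, while which() drops that position.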
