In my last blog, we successfully reproduced the SAS method for survival analysis, but it was a pain (and error prone) to do one line each for one testing subject. In other words, if on day 10, you had 50 dead bees, you will have to enter 50 lines of data!
With the help of a R guru from Penn State, the following code works for our example:
- assume the external file has the following data:
>test
trt age dead censor
gfp 24 1 1
gfp 48 2 1
gfp 96 3 1
gfp 96 20 0
rpl8 24 5 1
rpl8 48 12 1
rpl8 96 25 1
rpl8 96 3 0
the code below will translate each line into multiple lines so you do not have to do the work!
for example, line 4 will produce “gfp 96 0” 20 times so survival analysis package will interpret the data correctly.
d3=NULL d4=NULL for (i in 1:nrow(test)) { d4=rep(c(test$trt[i],test$age[i],test$censor[i]), times=test$dead[i]) dim(d4)=c(3,test$dead[i]) d4=t(d4) d3=rbind(d3,d4) } dd=data.frame(d3) >names(dd)=c("trt","age","censor") dd now contains the correct data for survival analysis (same as outputed by SAS, except that the trt has 1 and 2 instead of gfp and rpl8). dd trt age censor 1 1 24 1 2 1 48 1 3 1 48 1 4 1 96 1 5 1 96 1 6 1 96 1 7 1 96 0 8 1 96 0 9 1 96 0 10 1 96 0 11 1 96 0 12 1 96 0 13 1 96 0 14 1 96 0 15 1 96 0 16 1 96 0 17 1 96 0 18 1 96 0 19 1 96 0 20 1 96 0 21 1 96 0 22 1 96 0 23 1 96 0 24 1 96 0 25 1 96 0 26 1 96 0 27 2 24 1 28 2 24 1 29 2 24 1 30 2 24 1 31 2 24 1 32 2 48 1 33 2 48 1 34 2 48 1 35 2 48 1 36 2 48 1 37 2 48 1 38 2 48 1 39 2 48 1 40 2 48 1 41 2 48 1 42 2 48 1 43 2 48 1 44 2 96 1 45 2 96 1 46 2 96 1 47 2 96 1 48 2 96 1 49 2 96 1 50 2 96 1 51 2 96 1 52 2 96 1 53 2 96 1 54 2 96 1 55 2 96 1 56 2 96 1 57 2 96 1 58 2 96 1 59 2 96 1 60 2 96 1 61 2 96 1 62 2 96 1 63 2 96 1 64 2 96 1 65 2 96 1 66 2 96 1 67 2 96 1 68 2 96 1 69 2 96 0 70 2 96 0 71 2 96 0 > library(survival) > survdiff(Surv(age,censor)~trt, data=dd, rho=0) Call: survdiff(formula = Surv(age, censor) ~ trt, data = dd, rho = 0) N Observed Expected (O-E)^2/E (O-E)^2/V trt=1 26 6 20.2 9.99 28.3 trt=2 45 42 27.8 7.27 28.3 Chisq= 28.3 on 1 degrees of freedom, p= 1.01e-07