Everyday R code (3)

 

# Excluding some rows. The expression KILL_ID %in% c(3,6,9,10) produces a logical vector with one TRUE/FALSE per row. To exclude the matching rows, negate the whole vector by putting '!' in front of the parenthesized expression.

library(dplyr)  # filter() comes from dplyr

test <- filter(cesab7, !(KILL_ID %in% c(3, 6, 9, 10)))

# Including only some rows and filtering the others out

test<-filter(cesab7, KILL_ID %in% c(3,6,9,10))
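
# To see what the two filters above are doing, here is a minimal base R sketch on a made-up data frame. The names 'toy', 'ids', 'in_list', 'kept' and 'dropped' are hypothetical and only stand in for cesab7 and its KILL_ID column.

```r
# Hypothetical toy data standing in for cesab7
toy <- data.frame(KILL_ID = c(1, 3, 5, 6, 9, 10, 12))
ids <- c(3, 6, 9, 10)

# %in% returns one TRUE/FALSE per row
in_list <- toy$KILL_ID %in% ids

# Rows whose KILL_ID is in the list (same idea as the "including" filter)
kept <- toy[in_list, , drop = FALSE]

# Rows whose KILL_ID is NOT in the list: the whole logical vector is negated
dropped <- toy[!in_list, , drop = FALSE]
```

# Every row lands in exactly one of the two results, so nrow(kept) + nrow(dropped) equals nrow(toy).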

 

## In 'data', each CO_ID is repeated across several rows: one row per SU_Q_CODE, each with its own ANSWER, while S_ID and SU_NAME are the same on every row of a given CO_ID. For example, CO_ID 12345 has SU_Q_CODE 'A' with an ANSWER of '5' and SU_Q_CODE 'B' with an ANSWER of '3'. To reshape the data to one row per CO_ID, use the 'cast' function from the reshape package as follows.

library(reshape)  # cast() comes from the reshape package

Data2 <- cast(data, CO_ID + S_ID + SU_NAME ~ SU_Q_CODE, value = "ANSWER", fun.aggregate = NULL)
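
# If the reshape package is not available, the same long-to-wide step can be sketched with base R's reshape(). This is a swapped-in alternative, not the cast() call above. The data frame 'toy_long' is made up for illustration, and the resulting columns are named ANSWER.A, ANSWER.B, and so on.

```r
# Hypothetical long data: one row per CO_ID and SU_Q_CODE
toy_long <- data.frame(
  CO_ID     = c(12345, 12345, 67890),
  S_ID      = c(1, 1, 2),
  SU_NAME   = c("survey1", "survey1", "survey1"),
  SU_Q_CODE = c("A", "B", "A"),
  ANSWER    = c("5", "3", "4"),
  stringsAsFactors = FALSE
)

# One row per CO_ID; each SU_Q_CODE becomes its own ANSWER.<code> column
toy_wide <- reshape(
  toy_long,
  idvar     = c("CO_ID", "S_ID", "SU_NAME"),
  timevar   = "SU_Q_CODE",
  v.names   = "ANSWER",
  direction = "wide"
)
```

# CO_ID 67890 has no row for question 'B', so its ANSWER.B is NA in the wide result.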

 

# For survey data, we count a customer as a respondent if they answered any of the questions. If any of the SU_Q_CODE columns A, B, C, D is not NA, we write 1 into a new column called 'responded'. Otherwise (all four are NA) we write 0.

data$responded <- ifelse(
  is.na(data$SU_Q_CODEA) &
    is.na(data$SU_Q_CODEB) &
    is.na(data$SU_Q_CODEC) &
    is.na(data$SU_Q_CODED),
  0, 1)
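
# As a check on the logic above, here is a small sketch on made-up data. 'toy' and 'question_cols' are hypothetical names. It also shows a rowSums() variant that may be easier to maintain when there are many question columns.

```r
# Hypothetical wide data: NA means the question was not answered
toy <- data.frame(
  CO_ID      = 1:3,
  SU_Q_CODEA = c("5", NA, NA),
  SU_Q_CODEB = c(NA, NA, "3"),
  SU_Q_CODEC = NA_character_,
  SU_Q_CODED = NA_character_,
  stringsAsFactors = FALSE
)

# Chained version: 0 only when every question column is NA
toy$responded <- ifelse(
  is.na(toy$SU_Q_CODEA) & is.na(toy$SU_Q_CODEB) &
    is.na(toy$SU_Q_CODEC) & is.na(toy$SU_Q_CODED),
  0, 1)

# Alternative: count the non-NA answers per row, 1 if there is at least one
question_cols <- c("SU_Q_CODEA", "SU_Q_CODEB", "SU_Q_CODEC", "SU_Q_CODED")
toy$responded2 <- ifelse(rowSums(!is.na(toy[question_cols])) > 0, 1, 0)
```

# Both columns agree here: customers 1 and 3 answered at least one question, customer 2 answered none.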

 

# ifelse can also be used this way. '|' is the element-wise "or": the result is 1 when either comment field has at least 5 characters.

data$response <- ifelse(nchar(data$comments) >= 5 | nchar(data$comments_other) >= 5, 1, 0)
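
# One thing to watch with the line above: nchar() of a missing string is NA, so a row where both comment fields are NA ends up with response NA rather than 0. The sketch below uses made-up data and a hypothetical helper, long_enough(), and assumes that a missing comment should count as "no response".

```r
# Hypothetical comment data with missing values
toy <- data.frame(
  comments       = c("great service", NA, "ok", NA),
  comments_other = c(NA, NA, "no", "much longer text"),
  stringsAsFactors = FALSE
)

# Same pattern as above: NA | NA stays NA, so row 2 becomes NA
toy$response_raw <- ifelse(
  nchar(toy$comments) >= 5 | nchar(toy$comments_other) >= 5, 1, 0)

# Hypothetical helper: TRUE only for non-missing text of at least n characters
long_enough <- function(x, n = 5) !is.na(x) & nchar(x) >= n

# NA-safe version: a missing comment is treated as "no response" (an assumption)
toy$response <- ifelse(
  long_enough(toy$comments) | long_enough(toy$comments_other), 1, 0)
```

# Whether NA should become 0 depends on how missing comments are meant to be treated. If they should stay unknown, the original one-liner is fine as it is.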
