Thursday, October 29, 2009

Extract variables from data frame by name

I had to combine a bunch of .csv files into a single data frame. Because the files had no column names, a side effect of the import was duplicate column names. A similar result can arise when using "cbind". I was only interested in the eighth variable (called "V8" by default), so I needed a way to extract all (and only) the V8s into a new data frame. Unfortunately, the V8 columns were not at regular intervals.
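As a small illustration of how such duplicates can come about, here is a minimal sketch using "cbind". The data frames "part1" and "part2" and their values are made up for this example and only stand in for two header-less files.

```r
# Two hypothetical pieces, standing in for two imported files
part1 <- data.frame(V7 = c(1, 2), V8 = c(10, 20))
part2 <- data.frame(V7 = c(3, 4), V8 = c(30, 40))

# cbind() on data frames keeps the original names, duplicates included
combined <- cbind(part1, part2)

names(combined)
# [1] "V7" "V8" "V7" "V8"
```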

names(april2008) yields:

 [1] "station" "V7"      "V8"      "V4"      "V5"      "V6"      "V7"    
 [8] "V8"      "V5"      "V6"      "V7"      "V8"      "V6"      "V7"    
[15] "V8"

I wanted to extract columns 3, 8, 12, 15.

This was achieved with the "which" function, which returns the positions at which a logical condition is TRUE. The initial data frame was called "april2008".

I used:

april2008[, which(names(april2008) == "V8")] -> newapril08

names(newapril08) yields:

[1] "V8"    "V8.1"  "V8.2"  "V8.3"
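The same approach can be tried on a small, self-contained example. This is only a sketch: the data frame "df", its values, and the names "idx" and "v8_only" are invented for illustration and are not the original April 2008 data.

```r
# Hypothetical toy data frame with duplicated column names
# (check.names = FALSE stops data.frame() from renaming the duplicates)
df <- data.frame(station = c("A", "B"),
                 V7 = c(1, 2), V8 = c(10, 20),
                 V7 = c(3, 4), V8 = c(30, 40),
                 check.names = FALSE)

# Positions of every column whose name is exactly "V8"
idx <- which(names(df) == "V8")

# Select those columns; drop = FALSE keeps a data frame
# even if only one column matches
v8_only <- df[, idx, drop = FALSE]

names(v8_only)
# [1] "V8"   "V8.1"
```

Adding drop = FALSE is a precaution rather than part of the original command: without it, a single matching column would come back as a plain vector instead of a data frame.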

1 comment:

  1. Hi friends,

    Each variable named in the expression after the operator on the right hand side of form is evaluated in object. If more than one variable is indicated in level they are combined into a data frame, else the selected variable is returned as a vector. Thanks.....
