R Data Analysis Cookbook (Second Edition)

Handling column headers/variable names

If your data file does not have column headers, set header=FALSE.

The auto-mpg-noheader.csv file does not include a header row. The first command in the following snippet reads this file with header=FALSE, so R assigns the default variable names V1, V2, and so on:

> auto <- read.csv("auto-mpg-noheader.csv", header=FALSE)
> head(auto,2)

  V1 V2 V3  V4 V5   V6   V7 V8                  V9
1  1 28  4 140 90 2264 15.5 71 chevrolet vega 2300
2  2 19  3  70 97 2330 13.5 72     mazda rx2 coupe
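
To try this without the book's data file, the following minimal sketch writes the two sample rows shown above to a temporary CSV file and reads it back with header=FALSE. The temporary file and the name auto_demo are assumptions made for illustration; they stand in for auto-mpg-noheader.csv, and the rows are assumed to be comma-separated.

```r
# Write the two sample rows to a temporary header-less CSV file.
# (Hypothetical stand-in for auto-mpg-noheader.csv.)
tmp <- tempfile(fileext = ".csv")
writeLines(c("1,28,4,140,90,2264,15.5,71,chevrolet vega 2300",
             "2,19,3,70,97,2330,13.5,72,mazda rx2 coupe"),
           tmp)

# header = FALSE tells read.csv() that the first line is data, not names.
auto_demo <- read.csv(tmp, header = FALSE)

names(auto_demo)  # "V1" "V2" "V3" "V4" "V5" "V6" "V7" "V8" "V9"
nrow(auto_demo)   # 2: both lines are kept as data rows

unlink(tmp)       # remove the temporary file
```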

If your file does not have a header row and you omit the optional header=FALSE argument, read.csv() treats the first row of data as the header. It converts those values into syntactically valid variable names, prefixing numeric values with X and replacing spaces with periods, and the first record is lost from the data. Note the meaningless variable names in the following fragment:

> auto <- read.csv("auto-mpg-noheader.csv")
> head(auto,2)

  X1 X28 X4 X140 X90 X2264 X15.5 X71 chevrolet.vega.2300
1  2  19  3   70  97  2330  13.5  72     mazda rx2 coupe
2  3  36  4  107  75  2205  14.5  82        honda accord
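
The mangled names come from the same rules that base R's make.names() function applies. The sketch below, which assumes the three sample rows shown in this recipe are comma-separated and passes them in through the text argument instead of a file, reproduces the names and shows that one record goes missing when the header setting is wrong. The object names lines, first_row, wrong, and right are illustrative only.

```r
# The three sample rows from this recipe, one string per line (no header row).
lines <- c("1,28,4,140,90,2264,15.5,71,chevrolet vega 2300",
           "2,19,3,70,97,2330,13.5,72,mazda rx2 coupe",
           "3,36,4,107,75,2205,14.5,82,honda accord")

# Values of the first row, as read.csv() sees them when it expects a header.
first_row <- c("1", "28", "4", "140", "90", "2264", "15.5", "71",
               "chevrolet vega 2300")

# make.names() prefixes X to names that start with a digit
# and replaces invalid characters such as spaces with periods.
make.names(first_row, unique = TRUE)

wrong <- read.csv(text = lines)                  # header = TRUE by default
right <- read.csv(text = lines, header = FALSE)

nrow(wrong)  # 2: the first record was consumed as the header
nrow(right)  # 3: all records are kept
```

Comparing nrow() against the number of lines in the file is a quick way to notice that a record was silently turned into column names.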

We can use the optional col.names argument to specify the column names ourselves. If col.names is given explicitly, it overrides the names in the header row, even if header=TRUE is specified; in that case the header line is still skipped rather than read as data:

> auto <- read.csv("auto-mpg-noheader.csv", header=FALSE,
+     col.names = c("No", "mpg", "cyl", "dis", "hp",
+                   "wt", "acc", "year", "car_name"))

> head(auto,2)

  No mpg cyl dis hp   wt  acc year            car_name
1  1  28   4 140 90 2264 15.5   71 chevrolet vega 2300
2  2  19   3  70 97 2330 13.5   72     mazda rx2 coupe
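
The interaction between col.names and header=TRUE can be checked with a small sketch. The three-column input and its header names (id, miles_per_gallon, cylinders) are hypothetical and chosen only for illustration; the second half shows an alternative, assigning names after reading.

```r
# Hypothetical input WITH a header row; the header names are made up.
with_header <- c("id,miles_per_gallon,cylinders",
                 "1,28,4",
                 "2,19,3")

# col.names overrides the header names; the header line is still skipped.
auto_h <- read.csv(text = with_header, header = TRUE,
                   col.names = c("No", "mpg", "cyl"))
names(auto_h)  # "No" "mpg" "cyl"
nrow(auto_h)   # 2

# Alternative: read a header-less input first, then assign names afterwards.
auto_v <- read.csv(text = c("1,28,4", "2,19,3"), header = FALSE)
names(auto_v) <- c("No", "mpg", "cyl")
names(auto_v)  # "No" "mpg" "cyl"
```

Passing col.names at read time keeps the naming next to the import call, while assigning names() afterwards is convenient when the names are only decided later.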