Reading and Writing Data to and from R


Reading files into R

Usually we will be using data already in a file that we need to read into R in order to work on it. R can read data from a variety of file formats, for example files created as text, or in Excel, SPSS or Stata. We will mainly be reading files in text format .txt or .csv (comma-separated, usually created in Excel).

To read an entire data frame directly, the external file will normally have a special form:

  • The first line of the file should have a name for each variable in the data frame.
  • Each additional line of the file has as its first item a row label, followed by the values for each variable.

Here we use the example dataset stored as airquality.csv and airquality.txt.

Input file form with names and row labels:

     Ozone Solar.R  Wind Temp Month Day
  1     41     190   7.4   67     5   1
  2     36     118   8.0   72     5   2
  3     12     149  12.6   74     5   3
  4     18     313  11.5   62     5   4
  5     NA      NA  14.3   56     5   5
   ...

By default numeric items (except row labels) are read as numeric variables. This can be changed if necessary.

The function read.table() can then be used to read the data frame directly

     > airqual <- read.table("C:/Desktop/airquality.txt")
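As a minimal sketch of the file form described above, the following writes the five rows shown to a temporary file and reads them back. The temporary file and the object name airqual2 are illustrative assumptions, not part of the course files; the colClasses argument is one way to change the default column types.

```r
# Write the five example rows to a temporary file (a stand-in for airquality.txt).
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Ozone Solar.R Wind Temp Month Day",
  "1 41 190 7.4 67 5 1",
  "2 36 118 8.0 72 5 2",
  "3 12 149 12.6 74 5 3",
  "4 18 313 11.5 62 5 4",
  "5 NA NA 14.3 56 5 5"
), tmp)

# The header line has one fewer field than the data lines, so read.table()
# treats the first line as column names and the first column as row labels.
airqual <- read.table(tmp)
str(airqual)

# Change a default type: read Month as character instead of a number.
airqual2 <- read.table(tmp, colClasses = c(Month = "character"))
class(airqual2$Month)
```

Entries written as NA in the file are read as missing values.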

Similarly, to read .csv files the read.csv() function can be used to read in the data frame directly

[Note: I have noticed that occasionally you'll need to do a double slash in your path //. This seems to depend on the machine.]

> airqual <- read.csv("C:/Desktop/airquality.csv")
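For comparison, read.csv() behaves like read.table() with a comma separator and a header line. The sketch below assumes a small comma-separated file without row labels, written to a temporary location as a stand-in for airquality.csv; the object names are hypothetical.

```r
# A small comma-separated stand-in file (assumed layout, no row labels).
tmp_csv <- tempfile(fileext = ".csv")
writeLines(c(
  "Ozone,Solar.R,Wind,Temp,Month,Day",
  "41,190,7.4,67,5,1",
  "36,118,8.0,72,5,2",
  "NA,NA,14.3,56,5,5"
), tmp_csv)

# read.csv() assumes a header line and comma-separated fields ...
via_csv <- read.csv(tmp_csv)

# ... which can also be requested explicitly from read.table().
via_table <- read.table(tmp_csv, header = TRUE, sep = ",")

all.equal(via_csv, via_table)
```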

In addition, you can read in files using the file.choose() function in R. After typing in this command in R, you can manually select the directory and file where your dataset is located.
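One way to combine file.choose() with a reading function is sketched below. The object names are assumptions; the guard with interactive() is there only because file.choose() opens a dialog and cannot run in a non-interactive session.

```r
# file.choose() opens a file dialog, so only call it in an interactive session.
if (interactive()) {
  path <- file.choose()      # path of the file selected in the dialog
  airqual <- read.csv(path)  # or in one step: read.csv(file.choose())
  head(airqual)
}
```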

  1. Read the airquality.csv file into R using the read.csv command.
  2. Read the airquality.txt file into R using the file.choose() command

Occasionally, you will need to read in data that does not already have column name information.  For example, the dataset BOD.txt looks like this:

1    8.3
2   10.3
3   19.0
4   16.0
5   15.6
7   19.8

Initially, there are no column names associated with the dataset.  We can use the colnames() command to assign column names to the dataset.  Suppose that we want to assign the column names "Time" and "demand" to the BOD.txt dataset.  To do so we do the following

> bod <- read.table("BOD.txt", header=F)

> colnames(bod) <- c("Time","demand")

> colnames(bod)

[1] "Time"   "demand"

The first command reads in the dataset; the option "header=F" specifies that there are no column names associated with the dataset.
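The same steps can be tried without the course file. This sketch writes the six rows shown above to a temporary file (an assumption for illustration) and also shows the col.names argument of read.table(), which supplies the names while reading instead of afterwards.

```r
# Stand-in for BOD.txt: six rows, no header line.
tmp <- tempfile(fileext = ".txt")
writeLines(c("1 8.3", "2 10.3", "3 19.0", "4 16.0", "5 15.6", "7 19.8"), tmp)

# Two steps: read without a header, then assign the names.
bod <- read.table(tmp, header = FALSE)
colnames(bod)                       # default names V1 and V2
colnames(bod) <- c("Time", "demand")

# One step: pass the names to read.table() directly.
bod2 <- read.table(tmp, header = FALSE, col.names = c("Time", "demand"))

all.equal(bod, bod2)
```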

Read in the cars.txt dataset and call it car1.  Make sure you use the "header=F" option to specify that there are no column names associated with the dataset.  Next, assign "speed" and "dist" to be the first and second column names of the car1 dataset.

The two videos below provide a nice explanation of different methods to read data from a spreadsheet into an R dataset.

Import Data, Copy Data from Excel to R, Both .csv and .txt Formats (R Tutorial 1.3) MarinStatsLectures

Importing Data and Working With Data in R (R Tutorial 1.4) MarinStatsLectures

Writing Data to a File


After working with a dataset, we might like to save it for future use. Before we do this, let's first set a working directory so we know where we can find all our data sets and files later.

Setting up a Directory

In the R window, click on "File" and then on "Change dir". You should then see a box pop up titled "Choose directory". For this class, choose the directory "Desktop" by clicking on "Browse", then select "Desktop" and click "OK". In the future, you may want to create a directory on your computer where you keep your data sets and code for this class.

Alternatively, you can use the setwd() function to assign a working directory.

> setwd("C:/Desktop")

To find out what your current working directory is, type

> getwd()
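A small sketch of how the two functions work together is given below. It uses R's session temporary directory as a stand-in for "C:/Desktop", and the variable names are illustrative; the idea is to remember the old directory so it can be restored afterwards.

```r
old_wd <- getwd()      # remember where we started

setwd(tempdir())       # switch to another directory (stand-in for the Desktop)
current <- getwd()     # now reports the new working directory
current

setwd(old_wd)          # switch back to the original directory
```

Relative file names such as "cars1.txt" are resolved against whatever directory getwd() reports at the time of the call.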

Setting Up Working Directories in R (R Tutorial 1.8) MarinStatsLectures

In R, we can write data frames easily to a file, using the write.table() command.

> write.table(cars1, file="cars1.txt", quote=F)

The first argument refers to the data frame to be written to the output file, the second is the name of the output file. By default R will surround character entries in the output file (including the row and column names) with quotes, so we use quote=F.

Now, let's check whether R created the file on the Desktop, by going to the Desktop and clicking to open the file. You should see a file with three columns, the first giving the index (or row number) and the other two the speed and distance. R by default creates a column of row indices. If we wanted to create a file without the row indices, we would use the command:

> write.table(cars1, file="cars1.txt", quote=F, row.names=F)
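To see the effect of row.names without opening the file by hand, the following sketch writes both versions to temporary files and prints their first lines. It assumes R's built-in cars data frame as a stand-in for cars1, and the file names are hypothetical.

```r
cars1 <- cars   # assumption: the built-in cars data stands in for cars1

out_with    <- tempfile(fileext = ".txt")
out_without <- tempfile(fileext = ".txt")

write.table(cars1, file = out_with, quote = FALSE)
write.table(cars1, file = out_without, quote = FALSE, row.names = FALSE)

head(readLines(out_with), 3)      # data lines start with the row index
head(readLines(out_without), 3)   # data lines hold only speed and dist

# The file without row indices has a full header line, so read it with header = TRUE.
back <- read.table(out_without, header = TRUE)
```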

Datasets in R


Watch the video below for a concise introduction to working with the variables in an R dataset

Working with Variables and Data in R (R Tutorial 1.5) MarinStatsLectures

Around 100 datasets are supplied with R (in the package datasets), and others are available.

To see the list of datasets currently available use the command:

> data()

We will first look at a data set on CO2 (carbon dioxide) uptake in grass plants available in R.

> CO2

[ Note: capitalization matters here; also: it's the letter O, not zero. Typing this command should display the entire dataset called CO2, which has 84 observations (in rows) and 5 variables (columns).]

To get more information on the variables in the dataset, type in

> help(CO2)
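Besides the help page, a few base R functions give a quick overview of a dataset without printing all of it. This is only a sketch of one possible inspection routine; note that the column names stored in the data frame may be shorter than the descriptive labels used in text, so it is worth checking names() first.

```r
dim(CO2)       # number of rows and columns
names(CO2)     # column names exactly as stored in the data frame
str(CO2)       # type of each variable and its first few values
head(CO2, 3)   # first three rows only
```

An individual variable can then be extracted with the `$` operator, for example CO2$uptake.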

Evaluate and report the mean and standard deviation of the variables "Concentration" and "Uptake".

Subsetting Data in R With Square Brackets and Logic Statements (R Tutorial 1.6) MarinStatsLectures
