RStudio Cloud: How to Upload an Excel Data Set Into R
This lesson introduces the data.frame, which is very similar to working with a spreadsheet in R.
Learning Objectives
At the end of this activity, you will be able to:
- Open a .csv or text file containing tabular (spreadsheet) formatted data in R.
- Quickly plot the data using the ggplot2 function qplot().
What You Need
You need R and RStudio to complete this tutorial. We also recommend that you have an earth-analytics directory set up on your computer with a /data directory within it.
- How to set up R / RStudio
- Set up your working directory
In the homework from week 1, you used the code below to create a report with knitr in RStudio.
    # load the ggplot2 library for plotting
    library(ggplot2)

    # turn off factors
    options(stringsAsFactors = FALSE)

    # download data from figshare
    download.file("https://ndownloader.figshare.com/files/9282364",
                  "data/boulder-precip.csv",
                  method = "libcurl")
Let's break the code above down. First, you use the download.file function to download a data file. In this case, the data are housed on Figshare, a popular data repository that is free to use if your data are cumulatively smaller than 20 GB.
Notice that the download.file() function is called here with two main arguments:
- url: this is the path to the data file that you wish to download.
- destfile: this is the location on your computer (in this case: /data) and the name of the file when saved (in this case: boulder-precip.csv).
So you downloaded a file from a URL on Figshare to your data directory. You named that file boulder-precip.csv. A third argument, method, tells R which download method to use (here, libcurl).
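The two arguments can also be passed by name, which makes the call easier to read. The sketch below is a minimal illustration and not part of the original lesson: `fetch_if_missing()` is a hypothetical helper, and a local `file://` URL stands in for the Figshare link so that the example runs without a network connection (the path handling assumes a Unix-like system).

```r
# Hypothetical helper (not from the lesson): download a file only if it is
# not already on disk, creating the destination directory when needed.
fetch_if_missing <- function(url, destfile) {
  dir.create(dirname(destfile), showWarnings = FALSE, recursive = TRUE)
  if (!file.exists(destfile)) {
    download.file(url = url, destfile = destfile, quiet = TRUE)
  }
  invisible(destfile)
}

# Stand-in source file so the sketch runs offline (assumption: in the lesson
# the url would be the Figshare link instead).
src <- tempfile(fileext = ".csv")
writeLines(c("DATE,PRECIP", "2013-08-21,0.1"), src)
src_url <- paste0("file://", normalizePath(src, winslash = "/"))

# destfile combines the directory (data) and the file name
dest <- file.path(tempdir(), "data", "boulder-precip.csv")
fetch_if_missing(url = src_url, destfile = dest)

# confirm that the file landed where destfile said it should
file.exists(dest)
```

Skipping the download when the file already exists is a design choice for this sketch; it avoids re-fetching the same data every time a report is knit.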
Next, you read in the data using the function read.csv().
    # import data
    boulder_precip <- read.csv(file = "data/boulder-precip.csv")

    # view first few rows of the data
    head(boulder_precip)
    ##     X       DATE PRECIP
    ## 1 756 2013-08-21    0.1
    ## 2 757 2013-08-26    0.1
    ## 3 758 2013-08-27    0.1
    ## 4 759 2013-09-01    0.0
    ## 5 760 2013-09-09    0.1
    ## 6 761 2013-09-10    1.0

    # view the format of the boulder_precip object in R
    str(boulder_precip)
    ## 'data.frame': 18 obs. of 3 variables:
    ##  $ X     : int 756 757 758 759 760 761 762 763 764 765 ...
    ##  $ DATE  : chr "2013-08-21" "2013-08-26" "2013-08-27" "2013-09-01" ...
    ##  $ PRECIP: num 0.1 0.1 0.1 0 0.1 1 2.3 9.8 1.9 1.4 ...
Challenge
What is the format associated with each column of the boulder_precip data.frame? Describe the attributes of each format. Can you perform math on each column? Why or why not?
Introduction to the Data Frame
When you read data into R using read.csv(), it imports the data into a data frame format. Data frames are the de facto data structure for most tabular data, and what you use for statistics and plotting.
A data.frame is a collection of vectors of identical lengths. Each vector represents a column, and each vector can be of a different data type (e.g. characters, integers, factors). The str() function is useful to inspect the data types of the columns.
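As a small illustration of that definition, the sketch below builds a three-row data frame from equal-length vectors and checks the type of each column. The object names `precip_sample` and `col_types` are made up for this example; the values are copied from the first rows of the output shown earlier.

```r
# Three vectors of identical length, each with its own data type
X <- c(756L, 757L, 758L)                             # integer
DATE <- c("2013-08-21", "2013-08-26", "2013-08-27")  # character
PRECIP <- c(0.1, 0.1, 0.1)                           # numeric

# Combine them into a data frame; each vector becomes one column
precip_sample <- data.frame(X, DATE, PRECIP, stringsAsFactors = FALSE)

# Inspect the structure and the class of each column
str(precip_sample)
col_types <- sapply(precip_sample, class)
col_types
```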
A data.frame can be created by hand, but most commonly they are generated when you import a text file or spreadsheet into R using the functions read.csv() or read.table().
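The sketch below puts the two routes side by side under a few assumptions: the `text` argument is used in place of a file path so the example is self-contained, and the object names (`by_hand`, `from_csv`, `from_table`) are hypothetical.

```r
# Route 1: create the data frame by hand
by_hand <- data.frame(
  DATE = c("2013-08-21", "2013-08-26", "2013-09-01"),
  PRECIP = c(0.1, 0.1, 0.0),
  stringsAsFactors = FALSE
)

# Route 2: import comma-separated text (a stand-in for a .csv file on disk)
csv_text <- "DATE,PRECIP
2013-08-21,0.1
2013-08-26,0.1
2013-09-01,0.0"

from_csv <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# read.table() needs to be told about the header row and the separator
from_table <- read.table(text = csv_text, header = TRUE, sep = ",",
                         stringsAsFactors = FALSE)

# compare the results of the two routes
all.equal(by_hand, from_csv)
all.equal(from_csv, from_table)
```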
Extracting / Specifying "Columns" by Name
You can extract a single column from your data.frame using the $ symbol followed by the name of the column (or the column header):
    # when you download the data you create a data.frame
    # view each column of the data frame using its name (or header)
    boulder_precip$DATE
    ##  [1] "2013-08-21" "2013-08-26" "2013-08-27" "2013-09-01" "2013-09-09"
    ##  [6] "2013-09-10" "2013-09-11" "2013-09-12" "2013-09-13" "2013-09-15"
    ## [11] "2013-09-16" "2013-09-22" "2013-09-23" "2013-09-27" "2013-09-28"
    ## [16] "2013-10-01" "2013-10-04" "2013-10-11"

    # view the precip column
    boulder_precip$PRECIP
    ##  [1] 0.1 0.1 0.1 0.0 0.1 1.0 2.3 9.8 1.9 1.4 0.4 0.1 0.3 0.3 0.1 0.0 0.9
    ## [18] 0.1
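Because `$` returns the column as a plain vector, the result can be passed straight to other functions. The sketch below assumes a small stand-in data frame (`precip_sample`, a hypothetical name) with three rows copied from the output above; it also shows `[[ ]]`, a base R alternative to `$`.

```r
# Stand-in for boulder_precip with three of its rows
precip_sample <- data.frame(
  DATE = c("2013-09-11", "2013-09-12", "2013-09-13"),
  PRECIP = c(2.3, 9.8, 1.9),
  stringsAsFactors = FALSE
)

# $ returns the column as a vector
precip_values <- precip_sample$PRECIP
is.numeric(precip_values)

# [[ ]] with the column name returns the same vector
identical(precip_values, precip_sample[["PRECIP"]])

# the extracted vector works with ordinary functions
max(precip_values)
class(precip_sample$DATE)
```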
View Structure of a Data Frame
You can explore the format of your data frame in a similar way to how you explored vectors in the third lesson of this module. Let's take a look.
    # when you download the data you create a data.frame
    # view each column of the data frame using its name (or header)
    # how many rows does the data frame have
    nrow(boulder_precip)
    ## [1] 18

    # view the precip column
    boulder_precip$PRECIP
    ##  [1] 0.1 0.1 0.1 0.0 0.1 1.0 2.3 9.8 1.9 1.4 0.4 0.1 0.3 0.3 0.1 0.0 0.9
    ## [18] 0.1
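A few other base R functions answer similar questions about a data frame's shape. The sketch below is an illustration on a small stand-in data frame (`precip_sample` is a hypothetical name, with values copied from the output above) rather than on the full boulder_precip object.

```r
# Stand-in for boulder_precip with three of its rows
precip_sample <- data.frame(
  DATE = c("2013-08-21", "2013-08-26", "2013-08-27"),
  PRECIP = c(0.1, 0.1, 0.1),
  stringsAsFactors = FALSE
)

# number of rows and number of columns
nrow(precip_sample)
ncol(precip_sample)

# both at once, as c(rows, columns)
dim(precip_sample)

# column names (the headers)
names(precip_sample)

# quick numeric summary of one column
summary(precip_sample$PRECIP)
```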
Plotting Your Data
You can quickly plot your data too. Note that you are using the ggplot2 function qplot() rather than the R base plot functionality. You are doing this because ggplot2 is generally more powerful and efficient to use for plotting.
    # qplot stands for quick plot. Let's use it to plot your data
    qplot(x = boulder_precip$DATE,
          y = boulder_precip$PRECIP)
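Recent ggplot2 releases mark qplot() as deprecated, so it may help to see a roughly equivalent plot written with ggplot(). The sketch below is one possible version, not the lesson's own code: it uses a hypothetical three-row stand-in (`precip_sample`) and converts DATE from character to the Date class first, on the assumption that a continuous time axis is wanted. The plotting step is skipped if ggplot2 is not installed.

```r
# Stand-in for boulder_precip with three of its rows
precip_sample <- data.frame(
  DATE = c("2013-09-11", "2013-09-12", "2013-09-13"),
  PRECIP = c(2.3, 9.8, 1.9),
  stringsAsFactors = FALSE
)

# DATE is read in as character; convert it so the x axis is treated as time
precip_sample$DATE <- as.Date(precip_sample$DATE)

# Only plot when ggplot2 is available on this machine
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  precip_plot <- ggplot(precip_sample, aes(x = DATE, y = PRECIP)) +
    geom_point()
  print(precip_plot)
}
```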
Challenge
- List three arguments that are available in the read.csv function.
- How do you figure out what working directory you are in?
- List two ways to set the working directory in RStudio.
- Explain what the $ is used for when working with a data.frame in R.
- When you use read.csv, are you executing a: a) function or b) variable?
Source: https://www.earthdatascience.org/courses/earth-analytics/time-series-data/open-plot-spreadsheet-data-in-R/