This article takes a quick tour of the main features of googlesheets4. Remember to see the articles for more detailed treatment of all these topics and more.
read_sheet(), a.k.a. range_read()
read_sheet() is the main “read” function and should evoke readr::read_csv() and readxl::read_excel(). It’s an alias for range_read(), which is the correct name for this function according to the scheme for naming googlesheets4 functions. You can use them interchangeably. googlesheets4 is pipe-friendly (and reexports %>%), but works just fine without the pipe.
read_sheet() is designed to “just work”, for most purposes, most of the time. It can read straight from a Sheets browser URL:
read_sheet("https://docs.google.com/spreadsheets/d/1U6Cf_qEOhiR9AZqTqS3mbMF3zt2db48ZP5v3rkrAEJY/edit#gid=780868077") #> Reading from "gapminder" #> Range "Africa" #> # A tibble: 624 x 6 #> country continent year lifeExp pop gdpPercap #> <chr> <chr> <dbl> <dbl> <dbl> <dbl> #> 1 Algeria Africa 1952 43.1 9279525 2449. #> 2 Algeria Africa 1957 45.7 10270856 3014. #> 3 Algeria Africa 1962 48.3 11000948 2551. #> 4 Algeria Africa 1967 51.4 12760499 3247. #> 5 Algeria Africa 1972 54.5 14760787 4183. #> # … with 619 more rows
However, these URLs are not pleasant to work with. More often, you will want to identify a Sheet by its ID:
read_sheet("1U6Cf_qEOhiR9AZqTqS3mbMF3zt2db48ZP5v3rkrAEJY") #> Reading from "gapminder" #> Range "Africa" #> # A tibble: 624 x 6 #> country continent year lifeExp pop gdpPercap #> <chr> <chr> <dbl> <dbl> <dbl> <dbl> #> 1 Algeria Africa 1952 43.1 9279525 2449. #> 2 Algeria Africa 1957 45.7 10270856 3014. #> 3 Algeria Africa 1962 48.3 11000948 2551. #> 4 Algeria Africa 1967 51.4 12760499 3247. #> 5 Algeria Africa 1972 54.5 14760787 4183. #> # … with 619 more rows
or by its name, which requires an assist from the googledrive package (googledrive.tidyverse.org):
library(googledrive) drive_get("gapminder") %>% read_sheet() #> Range "Africa" #> # A tibble: 624 x 6 #> country continent year lifeExp pop gdpPercap #> <chr> <chr> <dbl> <dbl> <dbl> <dbl> #> 1 Algeria Africa 1952 43.1 9279525 2449. #> 2 Algeria Africa 1957 45.7 10270856 3014. #> 3 Algeria Africa 1962 48.3 11000948 2551. #> 4 Algeria Africa 1967 51.4 12760499 3247. #> 5 Algeria Africa 1972 54.5 14760787 4183. #> # … with 619 more rows
Note that the name-based approach above will only work if you have access to a Sheet named “gapminder”. Sheet names cannot be used as absolute identifiers; only a Sheet ID can play that role.
For more Sheet identification concepts and strategies, see the article Find and Identify Sheets. See the article Read Sheets for more about reading from a specific (work)sheet or ranges, setting column type, and getting low-level cell data.
gs4_browse()
We’ve made a few Sheets available to “anyone with a link”, for use in examples and docs. Two helper functions make it easy to get your hands on these file IDs.
gs4_examples() lists all the example Sheets and it can also filter by matching names to a regular expression.gs4_example() requires a regular expression and returns exactly 1 Sheet ID (or throws an error).gs4_example("chicken-sheet") %>% read_sheet() #> Reading from "chicken-sheet" #> Range "chicken.csv" #> # A tibble: 5 x 4 #> chicken breed sex motto #> <chr> <chr> <chr> <chr> #> 1 Foghorn Leghorn Leghorn roost… That's a joke, ah say, that's a joke,… #> 2 Chicken Little unknown hen The sky is falling! #> 3 Ginger Rhode Island… hen Listen. We'll either die free chicken… #> 4 Camilla the Chick… Chantecler hen Bawk, buck, ba-gawk. #> 5 Ernie The Giant C… Brahma roost… Put Captain Solo in the cargo hold.
If you’d like to see a Sheet in the browser, including our example Sheets, use gs4_browse():
gs4_example("deaths") %>% gs4_browse()
gs4_get() exposes Sheet metadata, such as details on worksheets and named ranges.
ss <- gs4_example("deaths") gs4_get(ss) #> Spreadsheet name: deaths #> ID: 1VTJjWoP1nshbyxmL9JqXgdVsimaYty21LGxxs018H2Y #> Locale: en_US #> Time zone: America/Los_Angeles #> # of sheets: 2 #> # of named ranges: 2 #> #> (Sheet name): (Nominal extent in rows x columns) #> arts: 1000 x 26 #> other: 1000 x 26 #> #> (Named range): (A1 range) #> arts_data: 'arts'!A5:F15 #> other_data: 'other'!A5:F15 sheet_properties(ss) #> # A tibble: 2 x 8 #> name index id type visible grid_rows grid_columns data #> <chr> <int> <int> <chr> <lgl> <int> <int> <list> #> 1 arts 0 1512440582 GRID TRUE 1000 26 <NULL> #> 2 other 1 278837031 GRID TRUE 1000 26 <NULL> sheet_names(ss) #> [1] "arts" "other"
sheet_properties() and sheet_names() are two members of a larger family of functions for dealing with the (work)sheets within a (spread)Sheet.
The metadata exposed by gs4_get() is also revealed whenever you print an object that is (or can be converted to) a sheets_id (an S3 class we use to mark Sheet IDs).
gs4_get() is related to googledrive::drive_get(). Both functions return metadata about a file on Google Drive, such as its ID and name. However, gs4_get() reveals additional metadata that is specific to Drive files that happen to be Sheets, such as info about worksheets and named ranges.
The writing functions are the most recent additions and may still see some refinements re: user interface and which function does what. We’re very interested to hear how these functions feel in terms of ergonomics.
sheet_write() writes a data frame into a Sheet. The only required argument is the data.
df <- data.frame(x = 1:3, y = letters[1:3]) ss <- sheet_write(df) #> Creating new Sheet: "expecting-crab" ss #> Spreadsheet name: expecting-crab #> ID: 18TnL7qRppcp3xZqfAxHy7zLRxxG6ItHbQ4SQNZtOyFA #> Locale: en_US #> Time zone: Etc/GMT #> # of sheets: 1 #> #> (Sheet name): (Nominal extent in rows x columns) #> df: 4 x 2
You’ll notice the new (spread)Sheet has a randomly generated name. If that is a problem, use gs4_create() instead, which affords more control over various aspects of the new Sheet.
Let’s start over: we delete that Sheet and call gs4_create(), so we can specify the new Sheet’s name.
googledrive::drive_trash(ss) #> Files trashed: #> * expecting-crab: 18TnL7qRppcp3xZqfAxHy7zLRxxG6ItHbQ4SQNZtOyFA ss <- gs4_create("testy-hedgehog", sheets = df) #> Creating new Sheet: "testy-hedgehog" ss #> Spreadsheet name: testy-hedgehog #> ID: 1qWcXg2wdkDBNfXzhZ9vV8NSRlew73pAUVLDh_4ZwpGg #> Locale: en_US #> Time zone: Etc/GMT #> # of sheets: 1 #> #> (Sheet name): (Nominal extent in rows x columns) #> df: 4 x 2
sheet_write() can write to new or existing (work)sheets in this Sheet. Let’s write the chickwts data to a new sheet in ss.
sheet_write(chickwts, ss) #> Writing to "testy-hedgehog" #> Writing to sheet "chickwts" ss #> Spreadsheet name: testy-hedgehog #> ID: 1qWcXg2wdkDBNfXzhZ9vV8NSRlew73pAUVLDh_4ZwpGg #> Locale: en_US #> Time zone: Etc/GMT #> # of sheets: 2 #> #> (Sheet name): (Nominal extent in rows x columns) #> df: 4 x 2 #> chickwts: 72 x 2
We can also use sheet_write() to replace the data in an existing sheet.
sheet_write(data.frame(x = 4:10, letters[4:10]), ss, sheet = "df") #> Writing to "testy-hedgehog" #> Writing to sheet "df" read_sheet(ss, sheet = "df") #> Reading from "testy-hedgehog" #> Range "'df'" #> # A tibble: 7 x 2 #> x letters.4.10. #> <dbl> <chr> #> 1 4 d #> 2 5 e #> 3 6 f #> 4 7 g #> 5 8 h #> # … with 2 more rows
sheet_append() adds one or more rows to an existing sheet.
ss %>% sheet_append(data.frame(x = 11, letters[11]), sheet = "df") #> Writing to "testy-hedgehog" #> Appending 1 row(s) to "df" read_sheet(ss, sheet = "df") #> Reading from "testy-hedgehog" #> Range "'df'" #> # A tibble: 8 x 2 #> x letters.4.10. #> <dbl> <chr> #> 1 4 d #> 2 5 e #> 3 6 f #> 4 7 g #> 5 8 h #> # … with 3 more rows
A related function – range_write() – writes arbitrary data, into an arbitrary range. It has a very different “feel” from gs4_create(), sheet_write(), and sheet_append(), all of which assume we’re writing or growing a table of data in a (work)sheet. range_write() is much more surgical and limited. range_write() makes fewer assumptions about what it’s writing and why.
There is also a family of sheet_*() functions that do pure (work)sheet operations, such as add and delete.
We take one last look at the sheets we created in ss, then clean up.
sheet_properties(ss) #> # A tibble: 2 x 8 #> name index id type visible grid_rows grid_columns data #> <chr> <int> <int> <chr> <lgl> <int> <int> <list> #> 1 df 0 777669510 GRID TRUE 9 2 <NULL> #> 2 chickwts 1 1204655828 GRID TRUE 72 2 <NULL> googledrive::drive_trash(ss) #> Files trashed: #> * testy-hedgehog: 1qWcXg2wdkDBNfXzhZ9vV8NSRlew73pAUVLDh_4ZwpGg
The article Write Sheets has even more detail.