googlesheets4 provides an R interface to Google Sheets via the Sheets API v4. It is a reboot of the existing googlesheets package.

Why 4? Why googlesheets4? Did I miss googlesheets1 through 3? No. The idea is to name the package after the corresponding version of the Sheets API. In hindsight, the original googlesheets should have been googlesheets3.

Installation

You can install the released version of googlesheets4 from CRAN with:

And the development version from GitHub with:

No auth yet!

Sorry, auth hasn’t been wired up yet, but that’s the next priority. Until then, you can only use googlesheets4 to access Sheets where sharing settings say: “anyone with a link can view”. read_sheet() has an example that shows googledrive calls to achieve this or you can do in the Sheets browser UI via File > Share ….

read_sheet()

read_sheet() is the main “read” function and should evoke readr::read_csv() and readxl::read_excel(). It’s an alias for sheets_read(). Most functions in googlesheets4 actually start with sheets_. googlesheets4 is pipe-friendly (and reexports %>%), but works just fine without the pipe.

Identify and access your own Sheet

Let’s say you have a cheerful Google Sheet named “deaths”. If you want to access it by name, use googledrive to identify the document (capture its metadata, especially file id). Pass the result to functions like sheets_get() (gets spreadsheet-specific metadata) or read_sheet() (reads cells into a data frame).

If you’re willing to deal with the spreadsheet’s id, just provide that directly to googlesheets4 functions and omit googledrive.

Lesson: googledrive is the friendliest way to work with files on Google Drive, including those that are Google Sheets. You can refer to files by name. googlesheets4 is focused on operations specific to Sheets and is more programming oriented. You must pass a file id or something that contains the file id.

Specify the range and column types

We’ve made a few Sheets easy to access via sheets_example(). Here we read from a mini-Gapminder Sheet to show some of the different ways to specify (work)sheet and cell ranges. Note also that col_types gives control of column types.

library(googlesheets4)

read_sheet(sheets_example("mini-gap"), sheet = 2)
#> Reading from 'test-gs-mini-gapminder'
#> Range "'Americas'"
#> # A tibble: 5 x 6
#>   country   continent  year lifeExp      pop gdpPercap
#>   <chr>     <chr>     <dbl>   <dbl>    <dbl>     <dbl>
#> 1 Argentina Americas   1952    62.5 17876956     5911.
#> 2 Bolivia   Americas   1952    40.4  2883315     2677.
#> 3 Brazil    Americas   1952    50.9 56602560     2109.
#> 4 Canada    Americas   1952    68.8 14785584    11367.
#> 5 Chile     Americas   1952    54.7  6377619     3940.

read_sheet(sheets_example("mini-gap"), sheet = "Oceania", n_max = 3)
#> Reading from 'test-gs-mini-gapminder'
#> Range "'Oceania'"
#> # A tibble: 3 x 6
#>   country     continent  year lifeExp     pop gdpPercap
#>   <chr>       <chr>     <dbl>   <dbl>   <dbl>     <dbl>
#> 1 Australia   Oceania    1952    69.1 8691212    10040.
#> 2 New Zealand Oceania    1952    69.4 1994794    10557.
#> 3 Australia   Oceania    1957    70.3 9712569    10950.

read_sheet(sheets_example("deaths"), skip = 4, n_max = 10)
#> Reading from 'deaths'
#> Range "'arts'!5:1000"
#> # A tibble: 10 x 6
#>    Name               Profession   Age `Has kids` `Date of birth`    
#>    <chr>              <chr>      <dbl> <lgl>      <dttm>             
#>  1 David Bowie        musician      69 TRUE       1947-01-08 00:00:00
#>  2 Carrie Fisher      actor         60 TRUE       1956-10-21 00:00:00
#>  3 Chuck Berry        musician      90 TRUE       1926-10-18 00:00:00
#>  4 Bill Paxton        actor         61 TRUE       1955-05-17 00:00:00
#>  5 Prince             musician      57 TRUE       1958-06-07 00:00:00
#>  6 Alan Rickman       actor         69 FALSE      1946-02-21 00:00:00
#>  7 Florence Henderson actor         82 TRUE       1934-02-14 00:00:00
#>  8 Harper Lee         author        89 FALSE      1926-04-28 00:00:00
#>  9 Zsa Zsa Gábor      actor         99 TRUE       1917-02-06 00:00:00
#> 10 George Michael     musician      53 FALSE      1963-06-25 00:00:00
#> # ... with 1 more variable: `Date of death` <dttm>

read_sheet(
  sheets_example("deaths"), range = "other!A5:F15", col_types = "?ci??D"
)
#> Reading from 'deaths'
#> Range "'other'!A5:F15"
#> # A tibble: 10 x 6
#>    Name    Profession   Age `Has kids` `Date of birth`     `Date of death`
#>    <chr>   <chr>      <int> <lgl>      <dttm>              <date>         
#>  1 Vera R… scientist     88 TRUE       1928-07-23 00:00:00 2016-12-25     
#>  2 Mohame… athlete       74 TRUE       1942-01-17 00:00:00 2016-06-03     
#>  3 Morley… journalist    84 TRUE       1931-11-08 00:00:00 2016-05-19     
#>  4 Fidel … politician    90 TRUE       1926-08-13 00:00:00 2016-11-25     
#>  5 Antoni… lawyer        79 TRUE       1936-03-11 00:00:00 2016-02-13     
#>  6 Jo Cox  politician    41 TRUE       1974-06-22 00:00:00 2016-06-16     
#>  7 Janet … lawyer        78 FALSE      1938-07-21 00:00:00 2016-11-07     
#>  8 Gwen I… journalist    61 FALSE      1955-09-29 00:00:00 2016-11-14     
#>  9 John G… astronaut     95 TRUE       1921-07-28 00:00:00 2016-12-08     
#> 10 Pat Su… coach         64 TRUE       1952-06-14 00:00:00 2016-06-28

Other functions

sheets_get() exposes Sheet metadata. It has a nice print method, but there’s much more info in the object itself.

sheets_cells() returns a data frame with one row per cell and it gives access to raw cell data sent by the Sheets API.

spread_sheet() converts data in the “one row per cell” form into the data frame you get from read_sheet(), which involves reshaping and column typing.

What’s coming soon?

OAuth2

Writing to Sheets

Context

googlesheets4 draws on and complements / emulates other packages in the tidyverse:

  • googledrive already provides a fully-featured interface to the Google Drive API. Any “whole file” operations can already be accomplished today with googledrive: upload or download or update a spreadsheet, copy, rename, move, change permission, delete, etc. googledrive already supports OAuth2 and Team Drives.
  • readxl is the tidyverse package for reading Excel files (xls or xlsx) into an R data frame. googlesheets4 takes cues from parts of the readxl interface, especially around specifying which cells to read.
  • readr is the tidyverse package for reading delimited files (e.g., csv or tsv) into an R data frame. googlesheets4 takes cues from readr with respect to column type specification.

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.