This function uses a quick-and-dirty method to read a Sheet that bypasses the
Sheets API and, instead, parses a CSV representation of the data. This can be
much faster than range_read()
-- noticeably so for "large" spreadsheets.
There are real downsides, though, so we recommend this approach only when the
speed difference justifies it. Here are the limitations we must accept to get
faster reading:
Only formatted cell values are available, not underlying values or details on the formats.
We can't target a named range as the
range
.We have no access to the data type of a cell, i.e. we don't know that it's logical, numeric, or datetime. That must be re-discovered based on the CSV data (or specified by the user).
Auth and error handling have to be handled a bit differently internally, which may lead to behaviour that differs from other functions in googlesheets4.
Note that the Sheets API is still used to retrieve metadata on the target
Sheet, in order to support range specification. range_speedread()
also
sends an auth token with the request, unless a previous call to
gs4_deauth()
has put googlesheets4 into a de-authorized state.
Arguments
- ss
Something that identifies a Google Sheet:
its file id as a string or
drive_id
a URL from which we can recover the id
a one-row
dribble
, which is how googledrive represents Drive filesan instance of
googlesheets4_spreadsheet
, which is whatgs4_get()
returns
Processed through
as_sheets_id()
.- sheet
Sheet to read, in the sense of "worksheet" or "tab". You can identify a sheet by name, with a string, or by position, with a number. Ignored if the sheet is specified via
range
. If neither argument specifies the sheet, defaults to the first visible sheet.- range
A cell range to read from. If
NULL
, all non-empty cells are read. Otherwise specifyrange
as described in Sheets A1 notation or using the helpers documented in cell-specification. Sheets uses fairly standard spreadsheet range notation, although a bit different from Excel. Examples of valid ranges:"Sheet1!A1:B2"
,"Sheet1!A:A"
,"Sheet1!1:2"
,"Sheet1!A5:A"
,"A1:B2"
,"Sheet1"
. Interpreted strictly, even if the range forces the inclusion of leading, trailing, or embedded empty rows or columns. Takes precedence overskip
,n_max
andsheet
. Noterange
can be a named range, like"sales_data"
, without any cell reference.- skip
Minimum number of rows to skip before reading anything, be it column names or data. Leading empty rows are automatically skipped, so this is a lower bound. Ignored if
range
is given.- ...
Passed along to the CSV parsing function (currently
readr::read_csv()
).
Value
A tibble
Examples
if (require("readr")) {
# since cell type is not available, use readr's col type specification
range_speedread(
gs4_example("deaths"),
sheet = "other",
range = "A5:F15",
col_types = cols(
Age = col_integer(),
`Date of birth` = col_date("%m/%d/%Y"),
`Date of death` = col_date("%m/%d/%Y")
)
)
}
#> Loading required package: readr
#> ✔ Reading from deaths, sheet other, range A5:F15.
#> ℹ Export URL:
#> <https://docs.google.com/spreadsheets/d/1VTJjWoP1nshbyxmL9JqXgdVsimaYty21LGxxs018H2Y/export?format=csv&range=A5%3AF15&gid=278837031>
#> # A tibble: 10 × 6
#> Name Profession Age `Has kids` `Date of birth` `Date of death`
#> <chr> <chr> <int> <lgl> <date> <date>
#> 1 Vera Rubin scientist 88 TRUE 1928-07-23 2016-12-25
#> 2 Mohamed Ali athlete 74 TRUE 1942-01-17 2016-06-03
#> 3 Morley Saf… journalist 84 TRUE 1931-11-08 2016-05-19
#> 4 Fidel Cast… politician 90 TRUE 1926-08-13 2016-11-25
#> 5 Antonin Sc… lawyer 79 TRUE 1936-03-11 2016-02-13
#> 6 Jo Cox politician 41 TRUE 1974-06-22 2016-06-16
#> 7 Janet Reno lawyer 78 FALSE 1938-07-21 2016-11-07
#> 8 Gwen Ifill journalist 61 FALSE 1955-09-29 2016-11-14
#> 9 John Glenn astronaut 95 TRUE 1921-07-28 2016-12-08
#> 10 Pat Summit coach 64 TRUE 1952-06-14 2016-06-28
# write a Sheet that, by default, is NOT world-readable
(ss <- sheet_write(chickwts))
#> ✔ Creating new Sheet: dolostone-macaque.
#>
#> ── <googlesheets4_spreadsheet> ───────────────────────────────────────────
#> Spreadsheet name: dolostone-macaque
#> ID: 1tYcI63BGH1p6UY73QfkbTN-n3rq8aF99edsnwTPn2Mo
#> Locale: en_US
#> Time zone: Etc/GMT
#> # of sheets: 1
#>
#> ── <sheets> ──────────────────────────────────────────────────────────────
#> (Sheet name): (Nominal extent in rows x columns)
#> chickwts: 72 x 2
# demo that range_speedread() sends a token, which is why we can read this
range_speedread(ss)
#> ✔ Reading from dolostone-macaque.
#> ℹ Export URL:
#> <https://docs.google.com/spreadsheets/d/1tYcI63BGH1p6UY73QfkbTN-n3rq8aF99edsnwTPn2Mo/export?format=csv>
#> Rows: 71 Columns: 2
#> ── Column specification ──────────────────────────────────────────────────
#> Delimiter: ","
#> chr (1): feed
#> dbl (1): weight
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#> # A tibble: 71 × 2
#> weight feed
#> <dbl> <chr>
#> 1 179 horsebean
#> 2 160 horsebean
#> 3 136 horsebean
#> 4 227 horsebean
#> 5 217 horsebean
#> 6 168 horsebean
#> 7 108 horsebean
#> 8 124 horsebean
#> 9 143 horsebean
#> 10 140 horsebean
#> # ℹ 61 more rows
# clean up
googledrive::drive_trash(ss)
#> File trashed:
#> • dolostone-macaque <id: 1tYcI63BGH1p6UY73QfkbTN-n3rq8aF99edsnwTPn2Mo>