Tidy manipulation of spatial data
Manipulate spatial data tidily. geotidy provides a selection of spatial
functions from the sf package that are adapted to a tidy workflow. It relies
on tibbles and dplyr verbs, and forces you to be explicit about the spatial
operations you desire. The package is experimental. If you want to learn more
about the motivations behind it, see the [Motivation] section.
You can install the experimental version of geotidy from GitHub:
# install.packages("remotes")
remotes::install_github("etiennebr/geotidy")
Here’s how you manipulate spatial data with geotidy:
library(tibble)
library(dplyr)
library(geotidy)
tibble(place = "Sunset Room", longitude = -97.7404985, latitude = 30.2645315) %>%
mutate(geometry = st_point(longitude, latitude))
Many spatial formats use a model where a geometry can have many attributes. This
model is useful in many applications, but it sometimes conflicts with the
principles of tidy data, where one observation is one row. To keep spatial data
and operations tidy, geotidy does less than sf. It makes manipulations explicit
by requiring you to apply a function to the geometry column, and it doesn’t try
to guess which geometry column should receive the operation. As a result, the
code itself shows which geometry is affected. This works because geometry
columns are treated just like other tibble columns: sf often hides the geometry
column, whereas geotidy treats it like a regular column. This also makes it
easier to interact with other OGC-compliant tools, such as PostGIS or Spark
with GeoMesa.
While sf will guess which column should be buffered:
shp <- sf::st_read("")
st_buffer(shp)
geotidy forces you to be explicit and use dplyr verbs:
shp %>%
mutate(geometry = st_buffer(geometry))
If you already use dplyr with sf, geotidy should feel natural and remove some
of the casting operations. geotidy guarantees that your data will stay tidy
from start to finish. Because geometry columns are managed explicitly, it is
also easy to track several of them at once.
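As a rough illustration of tracking more than one geometry column, the sketch below keeps a point column and a derived buffer column side by side in a plain tibble. It calls sf functions directly through the `sf::` prefix rather than geotidy's wrappers, because the exact geotidy signatures are not documented here. The column names (`zone`, `centre`), the coordinates, and the buffer distance are made up for the example.

```r
library(tibble)
library(dplyr)

# Hypothetical data: two points stored as an ordinary tibble column.
places <- tibble(
  place = c("a", "b"),
  geometry = sf::st_sfc(sf::st_point(c(0, 0)), sf::st_point(c(3, 4)))
)

# Each mutate() names the geometry column it reads and the one it writes,
# so several geometry columns can coexist without ambiguity.
places <- places %>%
  mutate(
    zone = sf::st_buffer(geometry, dist = 1), # second geometry column
    centre = sf::st_centroid(zone)            # third, derived from the second
  )

names(places)
```

Every geometry column here is a regular column of the tibble, so `select()`, `rename()`, and the other dplyr verbs apply to it like to any other.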
geotidy also guarantees that returned values are either a scalar, or a vector
or list with the same length as the original geometry, and that no data is
dropped without the user’s consent (looking at you, st_cast!).
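The length guarantee can be pictured as a simple check on the result of a spatial operation. The sketch below is not how geotidy implements it: `assert_tidy_length()` is a hypothetical helper written for this example, and it only illustrates the rule that a result must be either a scalar or have one element per input geometry.

```r
# Hypothetical helper (not part of geotidy or sf): accept a result only if it
# is a scalar or has exactly one element per input geometry.
assert_tidy_length <- function(input, output) {
  n_in <- length(input)
  n_out <- length(output)
  if (n_out != 1L && n_out != n_in) {
    stop("expected length 1 or ", n_in, ", got ", n_out)
  }
  output
}

pts <- sf::st_sfc(
  sf::st_point(c(0, 0)), sf::st_point(c(1, 1)), sf::st_point(c(2, 2))
)

# One buffer per point: same length as the input, accepted.
buffers <- assert_tidy_length(pts, sf::st_buffer(pts, dist = 1))

# A single union of all points: scalar result, accepted.
merged <- assert_tidy_length(pts, sf::st_union(pts))

# A result that silently lost a geometry is rejected with an error.
dropped <- tryCatch(
  assert_tidy_length(pts, pts[1:2]),
  error = function(e) conditionMessage(e)
)
```

A result that passes this check can always be assigned back with `mutate()` without recycling surprises or lost rows.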
geotidy is not a fork or a separation from sf. It just adds a constrained
layer on top of sf to facilitate a tidy workflow. It is an experiment that
could be integrated into sf and is likely to change.