An Intake driver that provides access to SQLite databases identified by either local or remote URLs.
After experimenting for a while, we ultimately decided not to use Intake catalogs to
distribute our data, so this repository is no longer maintained.
See `PUDL Data Access <https://catalystcoop-pudl.readthedocs.io/en/nightly/data_access.html>`__
for more information on how to access the data we publish.

.. readme-intro

.. image:: https://github.com/catalyst-cooperative/intake-sqlite/workflows/tox-pytest/badge.svg
   :target: https://github.com/catalyst-cooperative/intake-sqlite/actions?query=workflow%3Atox-pytest
   :alt: Tox-PyTest Status
.. image:: https://img.shields.io/codecov/c/github/catalyst-cooperative/intake-sqlite?style=flat&logo=codecov
   :target: https://codecov.io/gh/catalyst-cooperative/intake-sqlite
   :alt: Codecov Test Coverage
.. image:: https://img.shields.io/readthedocs/intake-sqlite?style=flat&logo=readthedocs
   :target: https://intake-sqlite.readthedocs.io/en/latest/
   :alt: Read the Docs Build Status
.. image:: https://img.shields.io/pypi/v/intake-sqlite?style=flat&logo=python
   :target: https://pypi.org/project/intake-sqlite
   :alt: PyPI Latest Version
.. image:: https://img.shields.io/conda/vn/conda-forge/intake-sqlite?style=flat&logo=condaforge
   :target: https://anaconda.org/conda-forge/intake-sqlite
   :alt: conda-forge Version
.. image:: https://img.shields.io/pypi/pyversions/intake-sqlite?style=flat&logo=python
   :target: https://pypi.org/project/intake-sqlite
   :alt: Supported Python Versions
.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
   :target: https://github.com/psf/black
   :alt: Any color you want, so long as it’s black.

This package provides a (very) thin wrapper around the more general
`intake-sql <https://github.com/intake/intake-sql>`__ driver, which can be used to
generate `Intake data catalogs <https://github.com/intake/intake>`__ from SQL databases.
The ``intake-sql`` driver takes an
`SQL Alchemy database URL <https://docs.sqlalchemy.org/en/14/core/engines.html#database-urls>`__
and uses it to connect to and extract data from the database. This works just fine with
`SQLite databases <https://www.sqlite.org/index.html>`__, but only when the database
file is stored locally and can be referenced with a simple path.
For example, this path::

  /home/zane/code/catalyst/pudl-work/sqlite/pudl.sqlite

would correspond to this SQL Alchemy database URL::

  sqlite:////home/zane/code/catalyst/pudl-work/sqlite/pudl.sqlite

But you can’t access a remote SQLite DB this way.
Rather than using an SQL Alchemy database URL to reference the SQLite DB, this Intake
driver takes a local path or a remote URL, like::

  ../pudl-work/sqlite/pudl.sqlite
  https://global-power-plants.datasettes.com/global-power-plants.db
  s3://cloudy-mc-cloudface-databucket/v1.2.3/mydata.db

For local paths, it resolves the path and prepends ``sqlite:///`` before handing it off
to ``intake-sql`` to do all the hard work.
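
The local-path handling can be sketched in a few lines of standard-library Python. This is an illustrative approximation, not the driver's actual code: the function name ``local_path_to_sqlite_url`` is hypothetical. It assumes the usual SQLAlchemy convention that an SQLite URL is ``sqlite:///`` followed by the file path, so an absolute POSIX path ends up with four slashes in total.

```python
from pathlib import Path


def local_path_to_sqlite_url(path: str) -> str:
    """Build an SQLAlchemy-style SQLite URL from a local path (hypothetical helper)."""
    # Resolve relative segments such as ".." so the URL does not depend
    # on the current working directory.
    absolute = Path(path).expanduser().resolve()
    # "sqlite:///" plus an absolute POSIX path gives "sqlite:////...".
    return "sqlite:///" + absolute.as_posix()


# The result depends on the directory this is run from.
print(local_path_to_sqlite_url("../pudl-work/sqlite/pudl.sqlite"))
```

Resolving the path first means the resulting URL stays valid even if the process later changes its working directory.
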
For remote URLs it uses `fsspec <https://filesystem-spec.readthedocs.io/en/latest/>`__
to `cache a local copy <https://filesystem-spec.readthedocs.io/en/latest/features.html?highlight=simplecache#caching-files-locally>`__
of the database, and then gives ``intake-sql`` a database URL that points to the cached
copy.
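
A rough sketch of the remote case, assuming ``fsspec`` is installed along with whatever backend the URL scheme needs (for example ``aiohttp`` for ``https://`` or ``s3fs`` for ``s3://``). The helper names ``chain_simplecache`` and ``cached_sqlite_url`` are hypothetical, and the driver's real implementation may differ. The sketch relies on fsspec's URL chaining (``simplecache::`` in front of the remote URL) and on ``fsspec.open_local()``, which returns the path of the cached local copy.

```python
from pathlib import Path


def chain_simplecache(urlpath: str) -> str:
    """Prefix a URL with fsspec's ``simplecache::`` protocol (hypothetical helper)."""
    if urlpath.startswith("simplecache::"):
        return urlpath
    return "simplecache::" + urlpath


def cached_sqlite_url(urlpath: str, cache_dir: str) -> str:
    """Cache a remote SQLite file locally and return a database URL for the copy."""
    # Third-party dependency, imported lazily so chain_simplecache() works without it.
    import fsspec

    # open_local() downloads the file if it is not cached yet and returns
    # the filesystem path of the cached copy.
    local_path = fsspec.open_local(
        chain_simplecache(urlpath),
        simplecache={"cache_storage": cache_dir},
    )
    return "sqlite:///" + Path(local_path).as_posix()


# Requires network access, so it is left commented out here:
# cached_sqlite_url(
#     "https://global-power-plants.datasettes.com/global-power-plants.db",
#     "/home/zane/.cache/intake",
# )
```

Because the cached copy is an ordinary local file, everything downstream can treat it exactly like the local-path case.
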
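
.. code:: python

  import intake_sqlite

  # Build a catalog from a remote SQLite database; the file is cached
  # locally in the directory given by "cache_storage".
  gpp_cat = intake_sqlite.SQLiteCatalog(
      urlpath="https://global-power-plants.datasettes.com/global-power-plants.db",
      storage_options={"simplecache": {"cache_storage": "/home/zane/.cache/intake"}},
  )
  # List the names of the entries in the catalog.
  list(gpp_cat)
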
`Catalyst Cooperative <https://catalyst.coop>`__ is a small group of data
wranglers and policy wonks organized as a worker-owned cooperative consultancy.
Our goal is a more just, livable, and sustainable world. We integrate public
data and perform custom analyses to inform public policy
(`Hire us! <https://catalyst.coop/hire-catalyst>`__). Our focus is primarily on mitigating
climate change and improving electric utility regulation in the United States.

* `GitHub Discussions <https://github.com/catalyst-cooperative/pudl/discussions>`__
* `Sign up for our email list <https://catalyst.coop/updates/>`__
* `Office Hours <https://calend.ly/catalyst-cooperative/pudl-office-hours>`__
* `@CatalystCoop <https://twitter.com/CatalystCoop>`__
* `[email protected] <mailto:[email protected]>`__