Pentaho Data Integration (ETL), a.k.a. Kettle
Visual crawler & ETL IDE written in C#/WPF
Declarative stream processing for mundane tasks and data engineering
A data orchestrator for machine learning, analytics, and ETL.
A curated list of awesome ETL frameworks, libraries, and software.
Linq to database provider.
A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow
A Python stream processing engine modeled after Yahoo! Pipes
Data processing & ETL framework for Ruby
Dex: The Data Explorer. A data visualization tool written in Java/Groovy/JavaFX, capable of powerful ETL and publishing web visualizations....
Sync data between persistence engines, like ETL, only not stodgy
Pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLSe...
Actively curated list of awesome BI tools. PRs welcome!
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/min...
ETL best practices with Airflow, with examples
Python scripts for ETL (extract, transform and load) jobs for Ethereum blocks, transactions, ERC20 / ERC721 tokens, transfers, receipts, logs, contracts, internal...
[DEPRECATED] Detect threats with log data and improve cloud security posture
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
Streaming reference architecture for ETL with Kafka and Kafka-Connect. You can find more on...
Embedded Template Library
A Go daemon that syncs MongoDB to Elasticsearch in real time
React components to build CSV files on the fly, based on an array or object literal of data
This repository is a getting started guide to Singer.
Data ETL & Analysis on the dataset 'Baby Names from Social Security Card Applications - National Data'....
Example project implementing best practices for PySpark ETL jobs and applications.
A hackable data integration & analysis tool to enable non-technical users to edit data processing jobs and visualise data on demand....
Transform, query, and download geospatial data on the web.
Archived repository. For current repo, see: https://github.com/etlegacy/etlegacy
A serverless cluster computing system for the Go programming language
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases....
A lightweight stream processing library for Go