ETL

Create pipelines to extract, transform, and load data. Use "big data" tools like Apache Spark and Kafka to build horizontally scalable ETL pipelines.

Simplify BigQuery ETL jobs using SQLAlchemy

Extract and move data between BigQuery and relational databases using a plugin for SQLAlchemy.
Manage Data Pipelines with Apache Airflow

Use Apache Airflow to build and monitor better data pipelines.
Becoming Familiar with Apache Kafka and Message Queues

An overview of how Kafka works, along with a look at comparable message brokers.
Learning Apache Spark with PySpark & Databricks

Get started with Apache Spark in part 1 of our series, where we leverage Databricks and PySpark.
Building an ETL Pipeline: From JIRA's REST API to SQL

Build a pipeline that extracts raw data from JIRA's Cloud API, transforms it, and loads the data into a SQL database.
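
As a rough illustration of the extract, transform, and load steps these posts describe, here is a minimal sketch using only the Python standard library. The sample records, the `issues` table, and the function names are all hypothetical; the nested `fields` layout is only an assumption about what an issue-tracker API response might look like, and a real pipeline would fetch the records over HTTP rather than hard-code them.

```python
import sqlite3

# Hypothetical sample standing in for the "extract" step; the record
# layout is an assumption, not a documented API response.
RAW_ISSUES = [
    {"key": "DEMO-1", "fields": {"summary": " Fix login ", "status": {"name": "Done"}}},
    {"key": "DEMO-2", "fields": {"summary": "Add export", "status": {"name": "In Progress"}}},
]


def transform(issue):
    """Flatten one nested record into a (key, summary, status) row."""
    fields = issue["fields"]
    return (
        issue["key"],
        fields["summary"].strip(),         # trim stray whitespace
        fields["status"]["name"].lower(),  # normalize status casing
    )


def load(rows, conn):
    """Write rows into a SQL table, replacing rows that share a key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS issues ("
        "issue_key TEXT PRIMARY KEY, summary TEXT, status TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO issues VALUES (?, ?, ?)", rows)
    conn.commit()


def run_pipeline(raw_issues, conn):
    """Transform every raw record, load the result, and return the row count."""
    rows = [transform(issue) for issue in raw_issues]
    load(rows, conn)
    return len(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(RAW_ISSUES, connection), "rows loaded")
```

Using `INSERT OR REPLACE` keyed on `issue_key` keeps the load step idempotent, so re-running the pipeline on the same records does not create duplicates.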