Create pipelines to extract, transform, and load data. Use Big Data tools like Apache Spark and Kafka to build horizontally scalable ETL pipelines.
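At its core, an ETL pipeline is three composable stages. A minimal pure-Python sketch of that shape (the sources, sinks, and sample records here are stand-ins; in the article the same pattern is scaled out with Spark and Kafka):

```python
def extract():
    # Stand-in source: in a real pipeline this would be a Kafka topic,
    # an API, or raw files landing in object storage.
    yield from [{"id": 1, "amount": "12.50"}, {"id": 2, "amount": "7.25"}]

def transform(records):
    # Normalize types; Spark would perform this as a distributed map.
    for record in records:
        yield {"id": record["id"], "amount": float(record["amount"])}

def load(records):
    # Stand-in sink: in practice a warehouse or database write.
    return list(records)

rows = load(transform(extract()))
```

Because each stage is a generator, records stream through one at a time; the same decoupling is what lets Spark and Kafka parallelize each stage horizontally.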
Extract and move data between BigQuery and relational databases using PyBigQuery, a SQLAlchemy dialect for BigQuery.
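Because PyBigQuery registers a `bigquery://` dialect with SQLAlchemy, moving data between BigQuery and a relational database is just a matter of reading from one engine and writing to the other. A sketch of that pattern, with in-memory SQLite standing in for both engines so it runs without GCP credentials (the `orders` table and columns are hypothetical):

```python
from sqlalchemy import create_engine, text

# With PyBigQuery installed, the source could instead be
# create_engine("bigquery://your-project"); SQLite stands in here.
source = create_engine("sqlite://")
target = create_engine("sqlite://")

# Seed the stand-in source with sample rows.
with source.begin() as conn:
    conn.execute(text("CREATE TABLE orders (id INTEGER, total FLOAT)"))
    conn.execute(text("INSERT INTO orders VALUES (1, 9.99), (2, 4.50)"))

# Extract from one engine, load into the other.
with source.connect() as src, target.begin() as dst:
    rows = src.execute(text("SELECT id, total FROM orders")).fetchall()
    dst.execute(text("CREATE TABLE orders (id INTEGER, total FLOAT)"))
    dst.execute(
        text("INSERT INTO orders VALUES (:id, :total)"),
        [{"id": r[0], "total": r[1]} for r in rows],
    )

with target.connect() as conn:
    copied = conn.execute(text("SELECT COUNT(*) FROM orders")).scalar()
```

The win is that the extract/load code is dialect-agnostic: swapping the engine URL is all it takes to point either end at BigQuery.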
Use Apache Airflow to build and monitor better data pipelines.
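In Airflow, a pipeline is declared as a DAG of tasks with explicit dependencies, which is what the scheduler monitors and retries. A minimal sketch of a DAG definition file, assuming Airflow 2.x is installed; the DAG name, task ids, and callables are hypothetical:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    ...  # hypothetical callable standing in for a real extract step

def load():
    ...  # hypothetical callable standing in for a real load step

with DAG(
    dag_id="example_etl",            # hypothetical DAG name
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task        # load runs only after extract succeeds
```

The `>>` dependency is what makes the pipeline "better" to monitor: each task's state, logs, and retries are tracked individually in the Airflow UI.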
An overview of how Kafka works, along with a look at comparable message brokers.
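Kafka's core abstraction is an append-only, partitioned log: producers only ever append, and each consumer tracks its own read offset. A toy in-memory model of a single partition (not the Kafka API) to illustrate the idea:

```python
class TopicLog:
    """Toy model of a single-partition Kafka topic: an append-only log."""

    def __init__(self):
        self._log = []

    def produce(self, message):
        self._log.append(message)    # brokers only ever append
        return len(self._log) - 1    # offset assigned to the new record

    def consume(self, offset):
        return self._log[offset:]    # consumers track their own offsets

topic = TopicLog()
topic.produce("order-created")
topic.produce("order-shipped")

# Independent consumers at different offsets see different slices,
# and reads never mutate the log.
fresh_consumer = topic.consume(0)    # replays everything from the start
caught_up = topic.consume(2)         # nothing new past offset 2 yet
```

This is the property that distinguishes Kafka from queue-style brokers like RabbitMQ: messages are not deleted on delivery, so any number of consumers can replay the log independently.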
Get started with Apache Spark in part 1 of our series, where we leverage Databricks and PySpark.
Build a pipeline which extracts raw data from the JIRA Cloud API, transforms it, and loads the data into a SQL database.
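The shape of that pipeline can be sketched with the standard library: an extract step against JIRA's search endpoint, a pure transform that flattens the nested issue payload, and a load into SQLite. The Atlassian domain and auth token are placeholders, and the network call is defined but not executed here; the demo runs the transform and load on a canned payload shaped like the real API response:

```python
import json
import sqlite3
from urllib.request import Request, urlopen

# Placeholder domain; JIRA Cloud's issue search lives under /rest/api/3/search.
JIRA_SEARCH_URL = "https://your-domain.atlassian.net/rest/api/3/search"

def extract(jql, token):
    # Pull raw issues from the JIRA Cloud API (not invoked in this demo).
    req = Request(
        f"{JIRA_SEARCH_URL}?jql={jql}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urlopen(req) as resp:
        return json.load(resp)["issues"]

def transform(issues):
    # Flatten the nested JSON payload into rows ready for SQL.
    return [
        (i["key"], i["fields"]["summary"], i["fields"]["status"]["name"])
        for i in issues
    ]

def load(rows, conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS issues (key TEXT, summary TEXT, status TEXT)"
    )
    conn.executemany("INSERT INTO issues VALUES (?, ?, ?)", rows)

# Demo with a canned payload shaped like the API's search response.
sample = [{"key": "PROJ-1", "fields": {"summary": "Fix login", "status": {"name": "Done"}}}]
conn = sqlite3.connect(":memory:")
load(transform(sample), conn)
```

Keeping `transform` a pure function of the raw payload is the main design choice: it can be unit-tested without network access or a database.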