Data Engineering

The systematic collection and transformation of data via the creation of tools and pipelines.

Cleaning PySpark DataFrames

Easy DataFrame cleaning techniques, ranging from dropping problematic rows to selecting important columns.
Spark
18 min read
April 27
Learning Apache Spark with PySpark & Databricks

Get started with Apache Spark in part 1 of our series, where we leverage Databricks and PySpark.
Spark
13 min read
April 26
Building an ETL Pipeline: From JIRA's REST API to SQL

Build a pipeline that extracts raw data from JIRA's Cloud API, transforms it, and loads it into a SQL database.
Data Engineering
11 min read
March 28
Working With GraphQL Fragments and Mutations

Make your GraphQL queries more dynamic with Fragments, plus get started with Mutations.
GraphQL
5 min read
March 19
Building a Client For Your GraphQL API

Now that we have an understanding of GraphQL queries and API setup, it's time to get that data.
GraphQL
6 min read
March 09
Writing Your First GraphQL Query

Begin to structure complex queries against your GraphQL API.
GraphQL
8 min read
March 07
Welcome to SQL: Modifying Databases and Tables

Brush up on SQL fundamentals such as creating tables, schemas, and views.
SQL
10 min read
February 19
Google BigQuery's Python SDK: Creating Tables Programmatically

Explore the benefits of Google BigQuery and use the Python SDK to programmatically create tables.
Big Data
8 min read
February 02