Hackers and Slackers

Apache

Tutorials for Apache Big Data technologies including Apache Spark, Apache Kafka, Apache Airflow, and more critical tools for data engineers.

Join and Aggregate PySpark DataFrames

Perform SQL-like joins and aggregations on your PySpark DataFrames.
Todd Birchard
Jun 24, 2019 • 7 mins
Spark

Working with PySpark RDDs

Working with Spark's original data structure API: Resilient Distributed Datasets.
Todd Birchard
Jun 6, 2019 • 8 mins
Spark

Manage Data Pipelines with Apache Airflow

Use Apache Airflow to create standardized and easily reproducible data pipelines in Python.
Todd Birchard
Jun 3, 2019 • 13 mins
Apache

Structured Streaming in PySpark

Become familiar with building a structured stream in PySpark using the Databricks interface.
Todd Birchard
May 13, 2019 • 8 mins
Spark

Becoming Familiar with Apache Kafka and Message Queues

Getting to know Apache Kafka: a horizontally scalable event streaming platform. Learn what makes Kafka critical to high-volume, low-latency data pipelines.
Todd Birchard
May 4, 2019 • 6 mins
Apache

Cleaning PySpark DataFrames

Easy DataFrame cleaning techniques ranging from dropping rows to selecting important data.
Todd Birchard
Apr 27, 2019 • 18 mins
Spark

Transforming PySpark DataFrames

Apply transformations to PySpark DataFrames such as creating new columns, filtering rows, or modifying string & number values.
Todd Birchard
Apr 26, 2019 • 15 mins
Spark

Learning Apache Spark with PySpark & Databricks

Get started with Apache Spark in part 1 of our series, where we leverage Databricks and PySpark.
Todd Birchard
Apr 25, 2019 • 13 mins
Spark

From CSVs to Tables: Infer Data Types From Raw Spreadsheets

The quest to never explicitly set a table schema ever again.
Todd Birchard
Jan 23, 2019 • 9 mins
Big Data

Community of hackers obsessed with data science, data engineering, and analysis. Openly pushing a pro-robot agenda.

©2025 Hackers and Slackers, All Rights Reserved.