Hackers and Slackers

Data Engineering

Collect and transform data on a large scale. Build data pipelines, work with a horizontally scalable architecture, or simply scrape and collect data.
Create Google BigQuery Tables via the Python SDK

Use Google Cloud's Python SDK to insert large datasets into Google BigQuery, take advantage of schema detection, and manipulate data programmatically.
Todd Birchard
Feb 8, 2021 • 11 mins
Google Cloud
Scrape Structured Data with Python and Extruct

Supercharge your scraper to extract quality page metadata by parsing JSON-LD data via Python's extruct library.
Todd Birchard
Jun 29, 2020 • 13 mins
Python
Simplify BigQuery ETL jobs using SQLAlchemy

Extract and move data between BigQuery and relational databases using PyBigQuery: a connector for SQLAlchemy.
Todd Birchard
Nov 16, 2019 • 8 mins
Data Warehouses
Using Amazon Redshift as your Data Warehouse

Get the most out of Redshift by tuning your cluster's performance and learning how to query your data optimally.
Todd Birchard
Jul 29, 2019 • 12 mins
Data Warehouses
Join and Aggregate PySpark DataFrames

Perform SQL-like joins and aggregations on your PySpark DataFrames.
Todd Birchard
Jun 24, 2019 • 7 mins
Spark
Working with PySpark RDDs

Work with Spark's original data structure API: Resilient Distributed Datasets (RDDs).
Todd Birchard
Jun 6, 2019 • 8 mins
Spark
Manage Data Pipelines with Apache Airflow

Use Apache Airflow to create standardized and easily reproducible data pipelines in Python.
Todd Birchard
Jun 3, 2019 • 13 mins
Apache
Structured Streaming in PySpark

Become familiar with building a structured stream in PySpark using the Databricks interface.
Todd Birchard
May 13, 2019 • 8 mins
Spark
Becoming Familiar with Apache Kafka and Message Queues

Get to know Apache Kafka: a horizontally scalable event streaming platform. Learn what makes Kafka critical to high-volume, low-latency data pipelines.
Todd Birchard
May 4, 2019 • 6 mins
Apache
Cleaning PySpark DataFrames

Easy DataFrame cleaning techniques ranging from dropping rows to selecting important data.
Todd Birchard
Apr 27, 2019 • 18 mins
Spark

Community of hackers obsessed with data science, data engineering, and analysis. Openly pushing a pro-robot agenda.

©2023 Hackers and Slackers, All Rights Reserved.