Data Engineering

The systematic collection and transformation of data via the creation of tools and pipelines.

MongoDB Stitch Serverless Functions

A crash course in MongoDB Cloud’s bread and butter: serverless functions

At times, I've found my opinion of MongoDB Atlas and MongoDB Stitch to waver between two extremes. Sometimes I'm struck by the allure of a cloud which fundamentally disregards schemas (wooo no schema party!). Other times, such as when Mongo decides to upgrade to a new version and you find all your production instances broken, I like the ecosystem a

NoSQL · Todd Birchard
todd · Jan 15
Nov 26

Scraping Data on the Web with BeautifulSoup

The honest act of systematically stealing data without permission

There are plenty of reliable and open sources of data on the web. Datasets are freely released to the public domain by the likes of Kaggle, Google Cloud, and of course local & federal government. Like most things free and open, however, following the rules to obtain public data can be a bit... boring. I'm not suggesting we go and

Python · Todd Birchard
Nov 11

Create a REST API Endpoint Using AWS Lambda

Use Python and MySQL to Build an Endpoint

Now that you know your way around API Gateway, you have the power to create vast collections of endpoints. If only we could get those endpoints to actually receive and return some stuff.

We'll create a GET function which will solve the common task of retrieving data from a database. The sequence will look something like:

  • Connect to the database
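A rough sketch of what such a GET handler might look like follows. Only the first step (connecting to the database) survives in this excerpt, so everything after it (the query, the row-to-dict conversion, the JSON response) is an assumption about the rest of the sequence. The `users` table and the `get_connection` helper are hypothetical, and `sqlite3` stands in for MySQL purely so the example runs anywhere; a real function would open the connection with a MySQL driver instead.

```python
import json
import sqlite3


def get_connection():
    """Hypothetical helper: returns a DB-API connection.

    Assumption: sqlite3 is a stand-in so this sketch is runnable without a
    MySQL server. The table and rows below are made-up sample data.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("grace",)])
    conn.commit()
    return conn


def lambda_handler(event, context):
    """Entry point in the shape AWS Lambda expects for Python functions."""
    # Step 1: connect to the database
    conn = get_connection()
    try:
        # Step 2 (assumed): run a read-only query
        cursor = conn.execute("SELECT id, name FROM users ORDER BY id")
        # Step 3 (assumed): turn each row into a dict keyed by column name
        columns = [col[0] for col in cursor.description]
        rows = [dict(zip(columns, row)) for row in cursor.fetchall()]
    finally:
        # Always release the connection, even if the query fails
        conn.close()

    # Step 4 (assumed): return a response API Gateway can proxy to the client
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(rows),
    }
```

Calling `lambda_handler({}, None)` locally is a quick way to check the response shape before wiring the function to an API Gateway endpoint.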

MySQL, Google Cloud, and a REST API that Generates Itself

Deploy a MySQL database that auto-creates endpoints for itself.

It wasn’t too long ago that I haphazardly forced us on a journey of exploring Google Cloud’s Cloud SQL service. The focus of this exploration was Google’s accompanying REST API for all of its Cloud SQL instances. That API turned out to be a relatively disappointing administrative API which did little to extend the features you’d

MySQL · Todd Birchard
Oct 23

Extract Nested Data From Complex JSON

Steal our code and never manually walk through JSON objects again

We're all data people here, so you already know the scenario: it happens perhaps once a day, perhaps 5, or even more. There's an API you're working with, and it's great. It contains all the information you're looking for, but there's just one problem: the complexity of nested JSON objects is endless, and suddenly the job you love needs to

Python · Todd Birchard
Oct 10

Reading and Writing to CSVs in Python

Playing with tabular data the native Python way.

Tables. Cells. Two-dimensional data. We here at Hackers & Slackers know how to talk dirty, but there's one word we'll be missing from our vocabulary today: Pandas. Before the remaining audience closes their browser windows in fury, hear me out. We love Pandas; so much so that we tend to recklessly gunsling this 30MB library to perform simple tasks. This isn't

Python · Todd Birchard
Sep 27

Using MongoDB Atlas as your Flask Database

Since you prefer using Python and Flask, I’ll assume we both prefer enjoyable dev.

It's been roughly a year since MongoDB launched their Stitch "back-end as a service" product, and I've been tinkering with Mongo on the cloud ever since. Alright fine, "tinkering with" may better be described as "accidentally became dependent on it after developing new features in production environments," but I can't really complain thus far. If you're not familiar, MongoDB Atlas is

Flask · Todd Birchard
Jul 31

Hacking Your Tableau Linux Server

Cracking Tableau's master Postgres account

Let's say you're a Data Scientist. Well, maybe not a data scientist... I mean, those online data analysis courses were definitely worth it, and you've made it this far without being quizzed on Bayesian linear regression. So maybe you're an analyst or something, but whatever: you use Tableau, so you must be a Scientist™.

I've admitted a few times in the

Tableau · Todd Birchard
Jul 26

Using Pandas with AWS Lambda Functions

Forcefully use the Pandas library in your AWS Lambda functions

In one corner we have Pandas: Python's beloved data analysis library. In the other, AWS: the unstoppable cloud provider we're obligated to use for all eternity. We should have known this day would come.

While not the prettiest workflow, uploading Python package dependencies for usage in AWS Lambda is typically straightforward. We install the packages locally to a virtual env,


Working with XML tree data in Python

Make use of Python's native XML library to walk through and extract data

Life is filled with things we don't want to do; you're a developer so you probably understand this to a higher degree than most people. Sometimes we waste weeks of our lives thanks to an unreasonable and unknowledgeable stakeholder. Other times, we need to deal with XML trees.

At some point or another you're going to need to work with

Python · Todd Birchard
Jun 19