Pandas

Analyze data with Python's Pandas. Start from the basics or see real-life examples of using Pandas to solve problems.

Importing Excel Datetimes Into Pandas II

Pandas and Excel Pt. 2

What if, like during my data import task a few months back, the dates & times are in separate columns?  This gives us a few new issues.  Let's import that Excel file!

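import datetime

import pandas as pd
import xlrd  # note: xlrd 2.0+ only reads .xls; opening .xlsx needs xlrd < 2.0

# Load the sheet into a DataFrame.
df = pd.read_excel("hasDatesAndTimes.xlsx", sheet_name="Sheet1")

# Open the same workbook directly with xlrd.
book = xlrd.open_workbook("hasDatesAndTimes.xlsx")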
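
For a feel of the separate-columns problem, here is a minimal sketch of one way to glue a date column and a time column back together. The DataFrame is built by hand rather than read from the Excel file, and the column names ("date", "time", "timestamp") are assumptions made up for illustration, not names from the original spreadsheet.

```python
import datetime

import pandas as pd

# Hypothetical stand-in for the imported sheet: one column of dates, one of times.
df = pd.DataFrame(
    {
        "date": [datetime.date(2018, 8, 12), datetime.date(2018, 8, 13)],
        "time": [datetime.time(21, 50, 16), datetime.time(9, 5, 0)],
    }
)

# Combine each row's date and time into a single datetime value.
combined = [
    datetime.datetime.combine(d, t) for d, t in zip(df["date"], df["time"])
]
df["timestamp"] = pd.to_datetime(combined)

print(df["timestamp"].iloc[0])  # 2018-08-12 21:50:16
```

This may not be the approach the full post takes; it only shows that once both pieces are proper Python objects, datetime.datetime.combine does the joining.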
Pandas | Matthew Alhonte (matt) | Jan 15 | Aug 20

Importing Excel Datetimes Into Pandas

Pandas & Excel, Part 1

Different file formats are different!  For all kinds of reasons!

A few months back, I had to import some Excel files into a database. In this process I learned so much about the delightfully unique way Excel stores dates & times!  

The basic datetime will be a decimal number, like 43324.909907407404.  The number before the decimal is the day,
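
As a rough sketch of what that decimal means: the whole part counts days and the fractional part is the fraction of a day that has elapsed. The snippet below assumes Excel's default 1900 date system and uses 1899-12-30 as day zero, which is the usual convention for lining serial numbers up with modern calendar dates. The helper name excel_serial_to_datetime is made up for this example.

```python
import datetime

# Day zero for Excel's default (1900) date system, chosen so that
# modern serial numbers line up with calendar dates.
EXCEL_EPOCH = datetime.datetime(1899, 12, 30)


def excel_serial_to_datetime(serial: float) -> datetime.datetime:
    """Convert an Excel serial date (days plus fraction of a day) to a datetime."""
    return EXCEL_EPOCH + datetime.timedelta(days=serial)


converted = excel_serial_to_datetime(43324.909907407404)
print(converted.date())  # 2018-08-12
```

The time portion comes out at about 21:50:16. Workbooks saved with the 1904 date system would need a different epoch, so treat this as an illustration rather than a general-purpose converter.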

Pandas | Matthew Alhonte (matt) | Jan 15 | Aug 13

Lazy Pandas and Dask

Picking Low-Hanging Fruit With Dask

Ah, laziness.  You love it, I love it, everyone agrees it's just better.

Flesh-and-blood pandas are famously lazy.  Pandas the package, however, uses Eager Evaluation.  What's Eager Evaluation, you ask?  Is Pandas really judgey, hanging out on the street corner and being fierce to the style choices of people walking by?  Well, yes, but that's not the most relevant sense in
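
To make the eager/lazy distinction concrete, here is a small sketch in plain Python. It uses neither Pandas nor Dask; a list comprehension stands in for eager evaluation and a generator stands in for lazy evaluation, and the function names are invented for the example.

```python
def squares_eager(n):
    # Eager: every value is computed and stored before the function returns.
    return [i * i for i in range(n)]


def squares_lazy(n):
    # Lazy: nothing is computed until a consumer asks for the next value.
    for i in range(n):
        yield i * i


eager = squares_eager(5)
lazy = squares_lazy(5)

first_two = [next(lazy), next(lazy)]  # only two squares computed so far
rest = list(lazy)  # the remaining values are computed on demand here

print(eager)      # [0, 1, 4, 9, 16]
print(first_two)  # [0, 1]
print(rest)       # [4, 9, 16]
```

The lazy version never holds more than it has been asked for, which is the general idea behind deferring work until a result is actually needed.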

Pandas | Matthew Alhonte (matt) | Jan 15 | Aug 06

All That Is Solid Melts Into Graphs

Reshaping Pandas dataframes with a real-life example, and graphing it with Altair

Last few Code Snippet Corners were about using Pandas as an easy way to handle input and output between files & databases.  Let's shift gears a little bit!  Among other reasons, because earlier today I discovered a package that exclusively does that, which means I can stop importing the massive Pandas package when all I really wanted to do with

Python | Matthew Alhonte (matt) | Jan 15 | Jul 30

Automagically Turn JSON into Pandas DataFrames

Let pandas do the heavy lifting for you when turning JSON into a DataFrame.

In his post about extracting data from APIs, Todd demonstrated a nice way to massage JSON into a pandas DataFrame. This method works great when our JSON response is flat, because dict.keys() only gets the keys on the first "level" of a dictionary. It gets a little trickier when our JSON starts to become nested though, as I experienced
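
A quick sketch of the nesting problem, and of one way pandas can deal with it. The records below are invented for illustration (they are not from any real API), and pd.json_normalize is shown as one option; the excerpt does not confirm that it is the method the full post uses.

```python
import pandas as pd

# Hypothetical nested API response; the field names are made up for illustration.
records = [
    {"id": 1, "user": {"name": "Ada", "address": {"city": "London"}}},
    {"id": 2, "user": {"name": "Grace", "address": {"city": "New York"}}},
]

# dict.keys() only sees the first level, so the nested fields stay hidden.
top_level_keys = list(records[0].keys())
print(top_level_keys)  # ['id', 'user']

# json_normalize flattens nested dicts into dotted column names.
df = pd.json_normalize(records)
print(sorted(df.columns))  # ['id', 'user.address.city', 'user.name']
```

Each nested level becomes part of the column name, so the resulting DataFrame is flat even though the input was not.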


Trash Pandas: Messy, Convenient DB Operations via Pandas

(And a way to clean it up with SQLAlchemy)

Let's say you were continuing our task from last week: Taking a bunch of inconsistent Excel files and CSVs, and putting them into a database.

Let's say you've been given a new CSV that conflicts with some rows you've already entered, and you're told that these rows are the correct values.
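
Here is a minimal sketch of that situation: existing rows, plus a batch of corrections that should win wherever the keys collide. It deliberately swaps in the standard-library sqlite3 module and SQLite's INSERT OR REPLACE instead of the SQLAlchemy route the post's subtitle mentions, and the table name, column names, and values are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO readings (id, value) VALUES (?, ?)",
    [(1, 10.0), (2, 20.0)],
)

# Rows from the hypothetical new CSV: id 2 conflicts with an existing row, id 3 is new.
corrections = [(2, 25.0), (3, 30.0)]

# INSERT OR REPLACE overwrites any existing row that has the same primary key.
conn.executemany(
    "INSERT OR REPLACE INTO readings (id, value) VALUES (?, ?)",
    corrections,
)
conn.commit()

rows = conn.execute("SELECT id, value FROM readings ORDER BY id").fetchall()
print(rows)  # [(1, 10.0), (2, 25.0), (3, 30.0)]
conn.close()
```

INSERT OR REPLACE is SQLite syntax; other databases spell the same idea differently, so this is a sketch of the behavior wanted rather than portable SQL.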

Why Not Use Pandas' Built-in Method?

Pandas' built-in to_

Pandas | Matthew Alhonte (matt) | Jan 15 | Jul 23

A Dirty Way of Cleaning Data (ft. Pandas & SQL)

Code Snippet Corner ft. Pandas & SQL

Warning: The following is FANTASTICALLY not-secure.  Do not put this in a script that's going to be running unsupervised.  This is for interactive sessions where you're prototyping the data cleaning methods that you're going to use, and/or just manually entering stuff.  Especially if there's any chance there could be something malicious hiding in the data to be uploaded.  We're

Pandas | Matthew Alhonte (matt) | Jan 15 | Jul 16

Extract Massive Amounts of Data from APIs in Python

Abusing APIs for all they’re worth

Taxation without representation. Colonialism. Not letting people eat cake. Human beings rightfully meet atrocities with action in an effort to change the world for the better. Cruelty by mankind justifies revolution, and it is this writer's opinion that API limitations are one such cruelty.

The data we need and crave is stashed in readily available APIs all around us. It's

Python | Todd Birchard (todd) | Jan 15 | Jul 04

Using Pandas to Make Dealing With DBs Less Of a Hassle

Use SQLAlchemy with PyMySQL to make database connections easy.

Manually opening and closing cursors? Iterating through DB output by hand? Remembering which function is the actual one that matches the Python data structure you're gonna be using?

There has to be a better way!

There totally is.

One of Pandas' most useful abilities is easy I/O. Whether it's a CSV, JSON, an Excel file, or a database -
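
To give a sense of how little ceremony is involved, here is a small sketch of a round trip between a DataFrame and a database. It uses an in-memory SQLite database through the standard-library sqlite3 module, which is an assumption made to keep the example self-contained; the post itself is about SQLAlchemy with PyMySQL. The table and column names are invented.

```python
import sqlite3

import pandas as pd

conn = sqlite3.connect(":memory:")

# Write a small DataFrame to a table: no cursor handling, no manual INSERT loop.
df = pd.DataFrame({"name": ["a", "b", "c"], "score": [1, 2, 3]})
df.to_sql("scores", conn, index=False)

# Read a query result straight back into a DataFrame.
high = pd.read_sql("SELECT name, score FROM scores WHERE score >= 2", conn)
print(high)

conn.close()
```

The query result arrives as a DataFrame with named columns, so there is no iterating through rows by hand.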

Python | Matthew Alhonte (matt) | Jan 15 | Jul 03