Working with PySpark RDDs
Working with Spark's original data structure API: Resilient Distributed Datasets.
PowerPivot 3: Managing the Data Model
Analyzing ginormous files with Microsoft PowerPivot.
Manage Data Pipelines with Apache Airflow
Use Apache Airflow to build and monitor better data pipelines.
Recasting Low-Cardinality Columns as Categoricals
Downcast strings in Pandas to their proper data types using HDF5.
PowerPivot 2: What's the Deal with Delimiters?
Working with large flat files in PowerPivot.
Removing Duplicate Columns in Pandas
Dealing with duplicate column names in your Pandas DataFrame.
Using Hierarchical Indexes With Pandas
Use Pandas' MultiIndex to make your data work harder for you.
Managing Flask Session Variables
Using Flask-Session and Flask-Redis to store user session variables.
Power to the Pivot Redux: Enter PowerPivot
Dipping into Microsoft's PowerPivot add-on for Excel.
Reshaping Pandas DataFrames
A guide to DataFrame manipulation using groupby, melt, pivot tables, pivot, transpose, and stack.