Manage files in your Google Cloud Storage bucket using the google-cloud-storage Python library.
Working with Spark's original data structure API: Resilient Distributed Datasets.
Use Apache Airflow to build and monitor better data pipelines.
Downcast strings in Pandas to their proper data types using HDF5.
Dealing with duplicate column names in your Pandas DataFrame.
Use Pandas' MultiIndex to create smarter datasets. Speed up your workflow by easily selecting and aggregating related data.
Store temporary data generated during user sessions more efficiently. Integrate Redis with Flask-Session for a fast, reliable, cloud-based data store.
A guide to DataFrame manipulation using groupby, melt, pivot tables, pivot, transpose, and stack.
Become familiar with building a structured stream in PySpark using the Databricks interface.
Create beautiful data visualizations out of the box with Python’s Seaborn.