We've had quite a journey exploring the magical world of PySpark together. After covering DataFrame transformations, structured streams, and RDDs, there are only so many things left to cross off the list before we've gone too deep.
To round things up for this series, we're a to take a look back at some powerful DataFrame operations we missed. In particular we'll be focusing on operations which modify DataFrames as a whole, such as
Joining DataFrames in PySpark
I'm going to assume you're already familiar with the concept of SQL-like joins. To demonstrate these in PySpark, I'll create two simple DataFrames: