This is not a guide to setting up Spark or creating a bucket on Google Cloud Platform. The documentation for setting up the Cloud Storage connector is lacking, so I wrote this quick guide to accessing your Google Cloud Storage files with PySpark.

Go to the Google Cloud Storage connector download page and download the connector version that matches your Spark-Hadoop version. In my case, I was using spark-2.4.6-bin-hadoop2.7, so I downloaded the Cloud Storage connector for Hadoop 2.x.
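If you are not sure which connector build you need, the Hadoop version is part of the Spark distribution name. Here is a minimal sketch that pulls the Hadoop major version out of that name. The helper `hadoop_major_version` is a hypothetical name of my own, not part of Spark or the connector, and it assumes the distribution name follows the `spark-<version>-bin-hadoop<version>` pattern.

```python
import re


def hadoop_major_version(spark_dist_name: str) -> int:
    """Return the Hadoop major version embedded in a Spark distribution name.

    Hypothetical helper: assumes a name such as 'spark-2.4.6-bin-hadoop2.7'.
    """
    match = re.search(r"hadoop(\d+)", spark_dist_name)
    if match is None:
        raise ValueError(f"No Hadoop version found in {spark_dist_name!r}")
    return int(match.group(1))


# The distribution used in this guide was built against Hadoop 2.x.
print(hadoop_major_version("spark-2.4.6-bin-hadoop2.7"))  # prints 2
```

A result of 2 means you want the connector built for Hadoop 2.x.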

Once the .jar file is downloaded, place it in the jars folder of your Spark installation: C:\%path/to/your/spark%\spark\spark-2.4.6-bin-hadoop2.7\jars.
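If you prefer to script the copy instead of dragging the file over by hand, something like the sketch below should work. The function name `install_connector` is hypothetical, and both arguments are assumptions you need to fill in for your own machine: the path to the jar you downloaded and the root folder of your Spark installation (the folder that contains `jars`).

```python
import shutil
from pathlib import Path


def install_connector(jar_path, spark_home):
    """Copy a downloaded connector .jar into <spark_home>/jars.

    Hypothetical helper: both paths are supplied by the caller.
    Returns the path of the copied file.
    """
    jar_path = Path(jar_path)
    jars_dir = Path(spark_home) / "jars"

    # Fail early with a clear message instead of copying to the wrong place.
    if not jar_path.is_file():
        raise FileNotFoundError(f"Connector jar not found: {jar_path}")
    if not jars_dir.is_dir():
        raise NotADirectoryError(f"Spark jars folder not found: {jars_dir}")

    # copy2 keeps the original file name and metadata.
    return Path(shutil.copy2(jar_path, jars_dir))
```

You would call it with your own locations, for example the downloaded jar in your Downloads folder and the spark-2.4.6-bin-hadoop2.7 folder as `spark_home`.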

Next thing, go to…

Jayce Jiang

Data Engineer at Disney, who previously worked at Bytedance.
