Do you already know Python and work with Pandas? Do you work with Big Data? Then PySpark should be your friend!
PySpark is the Python API for Apache Spark, a general-purpose distributed data processing engine. Spark spreads a computation across the cores of a single machine or the nodes of a cluster, which makes it possible to analyse large amounts of data in a short time.