Hadoop + Spark + Python Docker Container
If you want to learn Hadoop, Spark, and Python (PySpark), we have published a Docker container to support your learning. The source code is available on GitHub, and the container is published on Docker Hub. An example notebook is also provided to help you get started (see below).