My first Datalabs Spark cluster

In this tutorial, you will create a Spark cluster within Datalabs and use it inside a notebook.

A Spark cluster created this way must use the same Conda environment as your notebook and requires project storage.

You should already be familiar with the material in the tutorial My first Jupyter project.

  1. What is a Spark cluster?
  2. Getting ready
  3. Create a Spark cluster
  4. Start Spark session
  5. Perform Spark calculation (Python)
  6. Perform Spark calculation (R)
  7. Delete Spark cluster
  8. Conclusion
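
As a preview of the Python steps covered later, the sketch below shows one common way to connect a notebook to a running Spark cluster with pyspark and run a small check. This is an illustrative example, not the exact Datalabs setup: the master URL and application name are placeholders, and the tutorial sections below describe the values to use for your own cluster.

```python
# Minimal sketch: attach a notebook to an existing Spark cluster with pyspark.
# The master URL below is a placeholder; substitute the address of your own cluster.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("spark://my-spark-cluster:7077")  # hypothetical cluster address
    .appName("my-first-spark-session")
    .getOrCreate()
)

# Quick sanity check: distribute a small range of numbers across the cluster and sum them.
total = spark.sparkContext.parallelize(range(100)).sum()
print(total)  # 4950

spark.stop()  # release cluster resources when you are done
```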