Cloud Computing
Apache Spark
Dell Zhang
Birkbeck, University of London
2018/19
Spark: The Definitive Guide
https://github.com/databricks/Spark-The-Definitive-Guide
https://pages.databricks.com/the-apache-spark-collection.html
What is Spark?
• Apache Spark is a unified computing engine and a set of libraries for parallel data processing on computer clusters.
– The most actively developed open source engine for this task.
– The de facto tool for any developer or data scientist interested in big data.
Spark is one popular answer to the question “What’s beyond MapReduce?”