Practical Apache Spark in 10 Minutes

By ActiveWizards

Editor's note: This is a summary of a series of articles on this subject from our friends at ActiveWizards. Each article in the series is intended as a 10-minute tutorial on a particular Apache Spark topic.

Apache Spark is a powerful open-source processing engine built around speed, ease of use, and sophisticated analytics. It was originally developed at UC Berkeley in 2009; Databricks was founded by the creators of Spark in 2013.

The Spark engine runs in a variety of environments, from cloud services to Hadoop or Mesos clusters. It is used to perform ETL, interactive queries (SQL), advanced analytics (e.g., machine learning), and streaming over large datasets in a wide range of data stores (e.g., HDFS, Cassandra, HBase, S3). Spark supports a variety of popular development languages, including Java, Python, and Scala.

Part 1 - Ubuntu installation

In this article, we are going to walk you through the installation process of Spark as …

kdnuggets.com