Learn to analyze large data sets with Apache Spark through 10+ hands-on examples. Take your big data skills to the next level.
What Will I Learn?
- An overview of the architecture of Apache Spark.
- Work with Apache Spark's primary abstraction, resilient distributed datasets (RDDs), to process and analyze large data sets.
- Develop Apache Spark 2.0 applications using RDD transformations and actions and Spark SQL.
- Scale up Spark applications on a Hadoop YARN cluster through Amazon's Elastic MapReduce (EMR) service.
- Analyze structured and semi-structured data using Datasets and DataFrames, and develop a thorough understanding of Spark SQL.
- Share information across the nodes of an Apache Spark cluster using broadcast variables and accumulators.
- Apply advanced techniques to optimize and tune Apache Spark jobs by partitioning, caching, and persisting RDDs.
- Follow best practices for working with Apache Spark in the field.
Includes:
- 3.5 hours on-demand video
- Top-responding instructor
- 10 Articles
- 1 Supplemental Resource
- Full lifetime access
- Access on mobile and TV
- Certificate of Completion