An in-depth course on the most powerful big data tools.
Who this course is for:
The course is designed for data engineers who want to deepen their knowledge of Spark, as well as Hadoop and Hive.
The course covers the following core topics:
- Hadoop (basic components, vendor distributions)
- HDFS architecture
- YARN architecture
- Data formats
- Spark
- Spark Streaming and Flink
- Hive
- Orchestration, Monitoring and CI/CD
- and more
You will put all of this into practice and consolidate your knowledge through challenging homework assignments and a final project.
After taking this course, you will be able to:
- Use Hadoop to process data
- Interact with its components via console clients and APIs
- Work with semi-structured data in Hive
- Write and optimize Spark applications
- Write tests for Spark applications
- Use Spark to process tabular, streaming, geo-data, and even graphs
- Configure CI and monitoring of Spark applications
Required knowledge:
- Experience writing code in at least one of the following languages: Python, Java, or Scala
- Basic knowledge of SQL and experience with any relational database
- A computer or a Linux-based virtual machine with at least 8 GB of RAM
Upon completing the course, you will:
- Receive a full set of training materials: video recordings of all webinars, class presentations, and solutions to the assignments and projects as code on GitHub, along with other supplementary materials
- Receive a certificate of completion
- Receive an invitation to interview with partner companies (offered to the most successful students)