Course duration:
5 days
Price:
€1,800.00
Cloudera Data Platform (7) Admin
Code: DSC01
The course gives participants a comprehensive understanding of all the steps required to operate and maintain a Hadoop cluster using Cloudera Manager.
At the end of the course, participants will be able to sit the CCA Administrator Exam (CCA131).
Delivery mode
In the classroom or Live Virtual Classroom
Certificate of attendance
A certificate of attendance is issued at the end of the course
Course contents
- Introduction
- The Case for Apache Hadoop
- Hadoop Cluster Installation
- The Hadoop Distributed File System (HDFS)
- MapReduce and Spark on YARN
- Hadoop Configuration and Daemon Logs
- Getting Data Into HDFS
- Planning Your Hadoop Cluster
- Installing and Configuring Hive, Impala, and Pig
- Hadoop Clients Including Hue
- Advanced Cluster Configuration
- Hadoop Security
- Managing Resources
- Cluster Maintenance
- Cluster Monitoring and Troubleshooting
- CDP New Features
- Structured Streaming with Apache Kafka
- Aggregating and Joining Streaming DataFrames
- Message Processing with Apache Kafka
Participants
System administrators and IT managers
Prerequisites
Basic knowledge of Linux
Objectives
At the end of the course, students will be able to:
- Configure and deploy production-scale clusters that provide key Hadoop-related services, including YARN, HDFS, Impala, Hive, Spark, Kudu, and Kafka
- Configure and deploy a cluster properly so that it integrates with the data center
- Ingest, store, and access data in HDFS, Kudu, and cloud object stores such as Amazon S3
- Load file-based and streaming data into the cluster using Kafka and Flume
- Configure automatic resource management to ensure service-level agreements are met for multiple users of a cluster
- Troubleshoot, diagnose, and solve cluster issues
- Write, configure, and deploy Apache Spark applications on a Hadoop cluster
- Use the Spark shell and Spark applications to explore, process, and analyze distributed data
- Query data using Spark SQL, DataFrames, and Datasets
- Use Spark Streaming to process a live data stream
Languages
Italian