Building Codeless Pipelines on Cloud Data Fusion

Building Codeless Pipelines on Cloud Data Fusion, provided by Qwiklabs, is an online course with about 7 hours of material. Upon completion of the course, you can receive an e-certificate from Qwiklabs. The course is taught in English and offers a free certificate. Visit the course page at Qwiklabs for detailed price information.

Overview
  • This fundamental-level Quest offers hands-on practice with Cloud Data Fusion, a cloud-native, code-free, data integration platform. ETL Developers, Data Engineers and Analysts can greatly benefit from the pre-built transformations and connectors to build and deploy their pipelines without worrying about writing code. This Quest starts with a quickstart lab that familiarises learners with the Cloud Data Fusion UI. Learners then get to try running batch and realtime pipelines as well as using the built-in Wrangler plugin to perform some interesting transformations on data.

Syllabus
    • Getting Started with Cloud Data Fusion
      • In this lab you will learn how to create a Data Fusion instance and deploy a sample pipeline.
    • Building Batch Pipelines in Cloud Data Fusion
      • This lab will teach you how to use the Pipeline Studio in Cloud Data Fusion to build an ETL pipeline. Pipeline Studio exposes the building blocks and built-in plugins you need to build your batch pipeline, one node at a time. You will also use the Wrangler plugin to build and apply transformations to the data that flows through the pipeline.
    • Building Transformations and Preparing Data with Wrangler in Cloud Data Fusion
      • In this lab you’ll work with Wrangler directives, which are used by the Wrangler plugin, the “Swiss Army Knife” of plugins in the Data Fusion platform. Directives keep your transformations encapsulated in one place and let you group transformation tasks into manageable blocks.
    • Building Realtime Pipelines in Cloud Data Fusion
      • In addition to batch pipelines, Data Fusion also lets you create realtime pipelines that process events as they are generated. Currently, realtime pipelines execute using Apache Spark Streaming on Cloud Dataproc clusters. In this lab you will learn how to build a streaming pipeline using Data Fusion.