Automate a deduplication process using Talend Cloud

 

Subscription only: This content is available to Talend Academy subscription users only.

Prerequisites

Talend Cloud Data Preparation, Talend Cloud Data Stewardship, Talend Cloud Administration

Third-party software

MySQL

Description

This advanced use case walks through cleansing and deduplicating customer data. It also shows how to automate the process by reading completed tasks from Talend Data Stewardship and by running tasks and plans in Talend Cloud.

You play the role of a business analyst by cleansing the customer data using Talend Data Preparation. Then you act as a data steward by managing merging tasks in Talend Data Stewardship. You also play the role of a developer, using Talend Studio to automate the cleansing and the creation of deduplication tasks. Finally, as an administrator, you publish the implemented tasks, create a plan, and schedule them in Talend Management Console.