Serverless Processing with Google and Talend Cloud

 

Subscription only: This content is available to Talend Academy subscription users only.

 

Prerequisites

Talend Data Integration Basics, Talend Data Integration Advanced, Talend Cloud Administration, Talend Big Data Spark Batch

Third-party software

Google Dataproc, Google Cloud Storage, Google BigQuery

Description

 

 

The MovieStar company is a globally recognized film production company. One of its serverless projects is to clean and enrich its movie datasets and make the resulting data available on Google Cloud Platform (GCP), both in Google Cloud Storage (GCS) buckets and in cloud data warehouse tables (BigQuery tables).

 

Because the input datasets can be large, Big Data clusters (Google Dataproc) are used to process the data.

 

The MovieStar company has set up a development environment consisting of a Dataproc cluster that starts only when data is being processed, and GCP resources that store the results.

 

This use case covers the development phase of the project and shows you everything that you must configure to process data in a cloud environment using Talend Cloud and Talend Big Data products.