Everyone’s talking about big data. But do companies actually know how to extract relevant and clean data to derive meaningful insights?

We wanted to equip our developers with the right knowledge about data processing on Google Cloud. So, we sent them to a four-day workshop, ‘Data Engineering on Google Cloud Platform’.

Held from Jan 30 to Feb 2, the workshop gave participants hands-on experience in how to:

  • design and build data processing systems;
  • build end-to-end data pipelines;
  • analyse data; and
  • carry out machine learning

on the Google Cloud Platform.

The workshop also featured presentations and demos and covered structured, unstructured, and streaming data.

The modules covered were as follows:

  • Module 1: Google Cloud Dataproc Overview
  • Module 2: Running Dataproc Jobs
  • Module 3: Integrating Dataproc with Google Cloud Platform
  • Module 4: Making Sense of Unstructured Data with Google’s Machine Learning APIs
  • Module 5: Serverless data analysis with BigQuery
  • Module 6: Serverless, autoscaling data pipelines with Dataflow
  • Module 7: Getting started with Machine Learning
  • Module 8: Building ML models with TensorFlow
  • Module 9: Scaling ML models with CloudML
  • Module 10: Feature Engineering
  • Module 11: Architecture of streaming analytics pipelines
  • Module 12: Ingesting Variable Volumes
  • Module 13: Implementing streaming pipelines
  • Module 14: Streaming analytics and dashboards
  • Module 15: High throughput and low-latency with Bigtable

Our developers Amiruddin (left) and Dzarief (right)


Workshop in session.


Lava is an authorised Cloud Partner of Google and a reseller of G Suite (previously known as Google Apps), Google Maps for Work, and Google Cloud Platform in Malaysia. With more than a decade of experience in the industry, we’re proud to say we’re one of the leading cloud consultants and service providers in the Asia Pacific region.