Description
What you’ll learn
- Understand an end-to-end data engineering project for the retail domain
- Design and implement scalable ETL pipelines for retail data
- Implement key techniques such as incremental loading, SCD type 2, a metadata-driven approach, Medallion architecture, error handling, CDM, CI/CD, and more
- Develop and deploy data solutions with CI/CD practices
This project focuses on building a data lake on Google Cloud Platform (GCP) for the retail domain.
The goal is to centralize, clean, and transform data from multiple sources, enabling retailers to streamline billing and revenue tracking.
GCP Services Used:
- Google Cloud Storage (GCS): Stores raw and processed data files.
- BigQuery: Serves as the analytical engine for storing and querying structured data.
- Dataproc: Used for large-scale data processing with Apache Spark.
- Cloud Composer (Apache Airflow): Automates ETL pipelines and workflow orchestration.
- Cloud SQL (MySQL): Stores the transactional source data (the retailer and supplier databases).
- GitHub & Cloud Build: Enable version control and CI/CD implementation.
- CI/CD (Continuous Integration & Continuous Deployment): Automates deployment pipelines for data processing and ETL workflows.
Techniques involved:
- Metadata-driven approach
- SCD type 2 implementation
- CDM (Common Data Model)
- Medallion architecture
- Logging and monitoring
- Error handling
- Optimizations
- CI/CD implementation
- Many more best practices
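To make the SCD type 2 item in the list above concrete, here is a minimal sketch in plain Python. It is not the course's implementation (which presumably runs in BigQuery or Spark); the `DimRow` fields, the `apply_scd2` helper, and the sample product keys are all hypothetical names chosen for illustration. The idea it shows: when a tracked attribute changes, the current row is closed out and a new current row is appended, so history is preserved.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DimRow:
    """One version of a dimension record (hypothetical layout)."""
    key: str                   # business key, e.g. a product id
    attrs: Tuple               # tracked attributes
    valid_from: date
    valid_to: Optional[date]   # None means the version is still open
    is_current: bool


def apply_scd2(dim: List[DimRow],
               incoming: Dict[str, Tuple],
               load_date: date) -> List[DimRow]:
    """Return a new dimension after applying one batch of source records."""
    current = {row.key: row for row in dim if row.is_current}
    # Historical (already closed) versions are carried over untouched.
    result = [row for row in dim if not row.is_current]

    for key, row in current.items():
        new_attrs = incoming.get(key)
        if new_attrs is None or new_attrs == row.attrs:
            # Not in this batch, or unchanged: keep the current version.
            result.append(row)
        else:
            # Changed: close the old version and open a new one.
            result.append(replace(row, valid_to=load_date, is_current=False))
            result.append(DimRow(key, new_attrs, load_date, None, True))

    for key, attrs in incoming.items():
        if key not in current:
            # Brand-new business key: insert its first version.
            result.append(DimRow(key, attrs, load_date, None, True))

    return result
```

Calling `apply_scd2` once per load, with the batch keyed by business key, keeps exactly one current row per key while older versions retain their validity window.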
Data Sources:
- MySQL Retailer Database
- MySQL Supplier Database
- API Reviews (api-reviews)
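As a rough illustration of how a metadata-driven, incremental extract over database sources like these might look, the sketch below drives every extract from a small config list instead of hard-coding one job per table. Everything in it is an assumption made for the example: the `PIPELINE_CONFIG` entries, the `orders` and `suppliers` table names, the `updated_at` watermark column, and the use of `sqlite3` as a stand-in for the MySQL sources.

```python
import sqlite3

# Hypothetical metadata: one entry per source table. In a real pipeline this
# would typically live in a config file or a control table.
PIPELINE_CONFIG = [
    {"source": "retailer_db", "table": "orders",
     "load_type": "incremental", "watermark_column": "updated_at"},
    {"source": "supplier_db", "table": "suppliers",
     "load_type": "full", "watermark_column": None},
]


def build_extract_query(entry, last_watermark):
    """Build the SELECT for one config entry.

    Table and column names come from trusted config, so they are formatted
    into the SQL; the watermark value is passed as a bound parameter.
    """
    if entry["load_type"] == "incremental" and last_watermark is not None:
        sql = (f'SELECT * FROM {entry["table"]} '
               f'WHERE {entry["watermark_column"]} > ?')
        return sql, (last_watermark,)
    # Full load, or first run of an incremental table.
    return f'SELECT * FROM {entry["table"]}', ()


def run_extracts(conn, config, watermarks):
    """Extract every configured table; return rows and updated watermarks."""
    extracted = {}
    new_watermarks = dict(watermarks)
    for entry in config:
        table = entry["table"]
        sql, params = build_extract_query(entry, watermarks.get(table))
        cursor = conn.execute(sql, params)
        columns = [col[0] for col in cursor.description]
        rows = [dict(zip(columns, values)) for values in cursor.fetchall()]
        extracted[table] = rows
        wm_col = entry["watermark_column"]
        if wm_col and rows:
            # Remember the highest value seen so the next run skips these rows.
            new_watermarks[table] = max(row[wm_col] for row in rows)
    return extracted, new_watermarks
```

Adding a new table then means adding one config entry rather than writing a new job. The returned watermarks would need to be persisted between runs, for example in a small audit table.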
Expected Outcomes:
- Efficient data pipeline: Automates the ingestion and transformation of retail data.
- Structured data warehouse: Gold tables in BigQuery for analytical queries.
- After analysis, Looker is used to generate dashboards and reports based on the gold-layer tables.
- All processes (data extraction, loading into GCS, transformation in BigQuery) are managed with Apache Airflow, which provides automation, scheduling, and monitoring.
Who this course is for:
- Aspiring data engineers and data professionals
- Anyone preparing for data engineering interviews




