End-to-end real-time data pipeline using Kafka, Spark, Delta Lake, DuckDB, and Power BI. Simulates clickstream analytics with batch + streaming workflows for modern data engineering. End-to-end Azure ...
Implemented pandas-based cleaning rules in data_preprocessing.py, transformed salesorder.csv → clean_salesorder.csv, and tested the pipeline across multiple DAG runs.
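A minimal sketch of what pandas-based cleaning rules for a sales-order file might look like. The snippet does not list the actual rules in data_preprocessing.py, so the column names (`Order ID`, `Customer`, `Quantity`), the sample rows, and the function name `clean_sales_orders` are all hypothetical and serve only as an illustration.

```python
import io

import pandas as pd


def clean_sales_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Apply a few illustrative cleaning rules (assumed, not the project's real ones)."""
    out = df.copy()
    # Normalize header names: "Order ID" becomes "order_id".
    out.columns = [c.strip().lower().replace(" ", "_") for c in out.columns]
    # Trim stray whitespace around customer names.
    out["customer"] = out["customer"].str.strip()
    # Coerce quantity to a number; unparseable values become NaN.
    out["quantity"] = pd.to_numeric(out["quantity"], errors="coerce")
    # Drop rows missing a key or a valid quantity, then drop repeated orders.
    out = out.dropna(subset=["order_id", "quantity"])
    out = out.drop_duplicates(subset=["order_id"], keep="first")
    # Restore integer types now that no missing values remain.
    out = out.astype({"order_id": int, "quantity": int})
    return out.reset_index(drop=True)


# Hypothetical stand-in for the contents of salesorder.csv.
RAW = """Order ID,Customer,Quantity
1001, Alice ,2
1001, Alice ,2
1002,Bob,abc
1003,Carol,5
,Dave,1
"""

raw = pd.read_csv(io.StringIO(RAW))
clean = clean_sales_orders(raw)
print(clean)
```

In a real run the input would presumably come from `pd.read_csv("salesorder.csv")` and the result would be written with `clean.to_csv("clean_salesorder.csv", index=False)`. Keeping the rules in one pure function makes them easy to call from an orchestrated task and to re-check across repeated DAG runs.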