Monitoring the Data Lake: Detecting Data Anomalies in ETL Pipelines
Clean, correct data is fundamental to any business, and monitoring your data lake for anomalies is how you keep it that way. This talk works through hands-on, pseudo-code examples of tests that you can run against your tables and data models. You will learn about the different classes of tests, how to set them up, and which metrics to monitor.
- Learn how to think about your testing strategy
- Discover types of problems you can encounter
- Develop an approach and run tests in production
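
To give a flavor of the kind of test meant here, the following Python sketch shows two common checks: a null-rate metric for a column and a volume check on daily row counts. It is not taken from the talk. The function names, the convention that a missing value is None, and the threshold of three standard deviations are all illustrative assumptions.

```python
from statistics import mean, stdev


def null_rate(rows, column):
    """Return the fraction of rows in which `column` is missing (None or absent)."""
    if not rows:
        return 0.0
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)


def is_volume_anomaly(history, today, threshold=3.0):
    """Flag today's row count if it lies more than `threshold` sample
    standard deviations away from the mean of the historical counts."""
    if len(history) < 2:
        # Too little history to estimate spread; do not raise an alert.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Every historical load had the same size, so any change is unusual.
        return today != mu
    return abs(today - mu) / sigma > threshold


# Hypothetical daily row counts for one table, followed by today's load.
daily_counts = [1000, 1020, 980, 1010, 990]
print(is_volume_anomaly(daily_counts, 400))
print(null_rate([{"id": 1}, {"id": None}, {"id": 3}, {"id": None}], "id"))
```

In production, checks like these would typically run after each pipeline load, with the row counts and null rates read from the warehouse rather than hard-coded, and the threshold tuned per table.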