How Yelp Built Their Data Pipeline with Amazon Redshift


The tech world has changed dramatically since Yelp was launched back in 2004. Eight years later and Yelp started its change. It transformed from running a huge monolithic application on-premises to one built on microservices running in the AWS cloud. By the end of 2014, there were more than 150 production services running, with over 100 of them owning data. Its main part of the cloud stack is better known as PaaSTA, based on Mesos and Docker, offloading data to warehouses such as Redshift, Salesforce and Marketo. Data enters the pipeline through Kafka, which in turn receives it from multiple different “producer” sources.

Yelp Data Pipeline
Yelp Data Pipeline


Download the Data Pipeline Resource Bundle

You'll get additional resources like:

  • Full stack breakdown and tech checklist
  • Summary slides with links to resources
  • PDF version of the blog post
Mike Pavloski

Mike Pavloski

Join 11,000 of your peers.
Subscribe to our newsletter SF Data.
People at Facebook, Amazon and Uber read it every week.

Every Monday morning we'll send you a roundup of the best content from and around the web. Make sure you're ready for the week! See all issues.