
At intermix.io, we run a fleet of over ten Amazon Redshift clusters. In this post, I’ll describe how we use AWS IAM authentication for our database users. 

AWS access is managed by configuring policies and attaching them to IAM identities (users, groups of users, or roles) or AWS resources. A policy is an object in AWS that, when associated with an identity or resource, defines its permissions. Policies are evaluated when an IAM principal (user or role) makes a request, and the permissions within them determine whether the request is allowed or denied. Policies reside in AWS as JSON documents. There is support for six types of policies: identity-based policies, resource-based policies, permissions boundaries, organization SCPs, ACLs, and session policies.

By using IAM credentials, we can enforce security policies on users and passwords. It’s a scalable and secure way to manage access to your cluster(s).

The approach we describe is useful in environments where security is a top concern. Examples are industries with regulatory requirements such as Finance or Healthcare. Other use cases include enterprises with strict IT security compliance requirements.


Secure Ways to Manage Access to Amazon Redshift

Amazon Redshift started as a data warehouse in the cloud. A frequent initial use case was business intelligence. Over the years, though, the use cases for Amazon Redshift have evolved. Redshift is still used for reporting. But it’s now also part of mission-critical data services and applications. intermix.io is an example – we built our application on top of Redshift. And so it’s critical to track and manage user access to a cluster.

The standard way for users to log onto Amazon Redshift is by providing a database username and password, because Amazon Redshift is based on PostgreSQL. But using a username/password combination has a few drawbacks. The biggest one is that there is no way to enforce security best practices for password changes. That may not be an issue in a small company where only 2-3 people can access your cluster, but it’s a different story for enterprises with a large pool of users, or for a small company whose headcount is growing fast. It’s easy to lose track of everybody who has access to your most valuable data sets.

Luckily, AWS has recently developed alternatives to using a username and password. There are three options for maintaining login credentials for your Amazon Redshift database:

  1. Permit users to create user credentials and log in with their IAM credentials.
  2. Permit users to log in with federated single sign-on (SSO) through a SAML 2.0-compliant identity provider.
  3. Generate temporary database credentials. Permissions are granted through an AWS Identity and Access Management (IAM) permissions policy. By default, these credentials expire after 15 minutes, but you can configure them to expire up to an hour after creation.

This post will discuss option #3: using IAM credentials to generate expiring database credentials. Amazon Redshift provides the GetClusterCredentials API operation and the get-cluster-credentials command for the AWS Command Line Interface (AWS CLI). Both generate temporary database user credentials programmatically. You can also configure your SQL client with the Amazon Redshift JDBC or ODBC drivers, which manage the process of calling the GetClusterCredentials operation, retrieving the database user credentials, and establishing a connection between the SQL client and the Amazon Redshift database. Check out JDBC and ODBC Options for Creating Database User Credentials for more information.

How intermix.io Uses IAM to Generate Temporary Passwords

The Old Way: Manual

Handling credentials used to be a manual process, and that’s a pain in the neck. We were already using AWS Secrets Manager, so it would have been a pretty trivial exercise to add auto-rotation to our secrets and trigger ALTER <user> queries to update them with new credentials. That would have been my initial approach, but one of my new colleagues pointed out the option of using IAM: Redshift allows you to get time-scoped credentials associated with an IAM role from within Redshift itself.

The New Way: Using IAM to Generate Expiring Passwords

That turned into a somewhat larger undertaking. First, I had to understand how we were using Redshift across our platform. With a reasonable idea of how the change would work, I changed our Redshift connection logic to pull credentials before connecting. That turned out to be a pretty easy change. We’re a Python shop, and Boto3 – the AWS SDK for Python – is exhaustive. Boto3 enables Python developers to create, configure, and manage AWS services.
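Here’s a minimal sketch of what that connection logic can look like. The connect_to_redshift helper and the psycopg2 driver are our choices for illustration, not the exact code we run:

import boto3
import psycopg2  # Redshift speaks the PostgreSQL protocol, so a Postgres driver works

def connect_to_redshift(cluster_id, db_name, db_user, host, port=5439):
    """Fetch temporary IAM-backed credentials, then open a regular connection."""
    redshift = boto3.client("redshift")

    # Ask Redshift for an expiring username/password pair.
    creds = redshift.get_cluster_credentials(
        ClusterIdentifier=cluster_id,
        DbUser=db_user,
        DbName=db_name,
    )

    # Connect with the temporary credentials over SSL.
    return psycopg2.connect(
        host=host,
        port=port,
        dbname=db_name,
        user=creds["DbUser"],  # e.g. "IAM:adam"
        password=creds["DbPassword"],
        sslmode="require",
    )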

We deployed the change into one of our testing environments. Everything went well until we hit the rate limit for the Redshift API: we were making too many requests to GetClusterCredentials. We have a lot of in-flight transformations that all connect to Redshift, and making a GetClusterCredentials call alongside every connect exhausted the rate limit. But that wasn’t insurmountable. It was pretty easy to add a caching mechanism to our connection logic so that we didn’t need to generate a new credential every time.
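The cache can be as simple as a dictionary keyed by cluster, user, and database, refreshed before the credentials expire. A minimal sketch, assuming the default 15-minute expiry (the 10-minute TTL is our placeholder):

import time
import boto3

_redshift = boto3.client("redshift")
_cache = {}  # (cluster_id, db_user, db_name) -> (credentials, fetched_at)

# Reuse credentials for 10 minutes; by default they expire after 15.
CACHE_TTL_SECONDS = 600

def get_cached_credentials(cluster_id, db_user, db_name):
    """Return temporary credentials, reusing cached ones while still fresh."""
    key = (cluster_id, db_user, db_name)
    hit = _cache.get(key)
    if hit and time.monotonic() - hit[1] < CACHE_TTL_SECONDS:
        return hit[0]

    # Cache miss or stale entry: call GetClusterCredentials again.
    creds = _redshift.get_cluster_credentials(
        ClusterIdentifier=cluster_id,
        DbUser=db_user,
        DbName=db_name,
    )
    _cache[key] = (creds, time.monotonic())
    return creds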

Once we got the caching mechanism deployed, we were able to disable logins. Access to the cluster was now only available through IAM credentials.

That left us with an awkward situation, though. Our developers have to connect to our clusters to run queries or test new features. They needed a way to generate IAM credentials and connect to a remote cluster.

We already understood how to generate IAM credentials and had code that handled that, so we solved the problem by creating a task in our execution and deployment framework. The task connects to a Redshift cluster in any of our environments using IAM credentials. You can see it in use below!

Generate IAM Credentials

The code we used to do this isn’t that important, because the heavy lifting is done by the Redshift API and some IAM policies. So you can do this yourself, even without seeing what I’ve done. All you need is an AWS client for your preferred programming language and an IAM policy granting your users access to generate credentials on your clusters.


Getting Started

The first thing you’re going to want to do is create an IAM policy with the Redshift Actions needed to create an IAM credential on a Redshift cluster. 

The following is a policy that allows the IAM role to call the GetClusterCredentials operation, which automatically creates a new user and specifies groups the user joins at login. The "Resource": "*" wildcard grants the role access to any resource, including clusters, database users, or user groups.
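Reconstructed from that description, a minimal version of the policy might look like this (a sketch to adapt, not a recommended production policy):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "redshift:GetClusterCredentials",
        "redshift:CreateClusterUser",
        "redshift:JoinGroup"
      ],
      "Resource": "*"
    }
  ]
}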

Of course, this example is not secure and strictly for demonstration. Please see Resource policies for GetClusterCredentials for more information and examples to achieve more granular access control.

Once you have that policy created, go ahead and create a group to contain the users to whom you want to grant access; if you’ve already got that group created, great! Add the users you’d like into the group, then attach the policy you defined in the previous step under the Permissions tab. At this point, all of those users will be able to generate IAM credentials for existing users on your clusters.

The following shows how to use the AWS CLI to generate temporary database credentials for an existing user named adam.

As a side note, if the user doesn’t exist in the database and AutoCreate is true, a new user is created with PASSWORD disabled. If the user doesn’t exist and AutoCreate is false, the request fails.

aws redshift get-cluster-credentials --cluster-identifier clusterA --db-user adam --db-name dbA --duration-seconds 3600

The result is the following: 

{
  "DbUser": "IAM:adam",
  "Expiration": "2020-10-08T21:10:53Z",
  "DbPassword": "EXAMPLEjArE3hcnQj8zt4XQj9Xtma8oxYEM8OyxpDHwXVPyJYBDm/gqX2Eeaq6P3DgTzgPg=="
}

Check out the official documentation for a more in-depth exploration of the CLI method.

However, using the AWS CLI is manual and can be error-prone. My recommendation is to create a utility to generate the credentials and connect to the cluster on behalf of the user. This utility could also generate ODBC or JDBC connection strings if that’s how your users connect to a cluster.

Here’s a quick Python-based sketch that outputs a JDBC URL. It assumes Boto3 can find AWS credentials that carry the policy above; the cluster, database, and user names are placeholders:
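import boto3

def get_jdbc_url(cluster_id, db_name, db_user, duration=3600):
    """Fetch temporary credentials and build a JDBC connection string."""
    redshift = boto3.client("redshift")

    # Look up the cluster endpoint (host and port).
    cluster = redshift.describe_clusters(ClusterIdentifier=cluster_id)["Clusters"][0]
    endpoint = cluster["Endpoint"]

    # Generate the expiring credentials.
    creds = redshift.get_cluster_credentials(
        ClusterIdentifier=cluster_id,
        DbUser=db_user,
        DbName=db_name,
        DurationSeconds=duration,
        AutoCreate=False,  # fail rather than silently create new users
    )

    return "jdbc:redshift://{}:{}/{}?UID={}&PWD={}".format(
        endpoint["Address"],
        endpoint["Port"],
        db_name,
        creds["DbUser"],
        creds["DbPassword"],
    )

print(get_jdbc_url("clusterA", "dbA", "adam"))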

You’re now in a situation where you have an IAM policy and a group containing your users. You’ve also configured your Redshift users with passwords DISABLED.

In short, you’ve secured your Redshift cluster. Your security team can enforce credential rotation on users using standard IAM behavior vs. direct database implementation.

For more on best practices when working with Amazon Redshift, read our post on 3 Things to Avoid When Setting Up an Amazon Redshift Cluster. Or download our best-practices guide for more tips on enterprise-grade deployments for Amazon Redshift.

Here at Intermix, we’re constantly building products and innovating for Amazon Redshift users, and we’d like to think we have our finger on the pulse of what Redshift customers are saying. The number one driver of change that we’ve seen for our clients is that they’re experiencing huge growth in data and query volumes.

A 2016 report by IDG found that the average company now manages 163 terabytes (163,000 gigabytes) of information—and the figures have undoubtedly only increased since the survey was released. Of course, having more data at hand also means that you need higher query volumes to extract all of the insights they contain. 

We’ve observed three main drivers behind the growth in data and query volumes among our customers:

  1. Growth in the number of data sources connected to a Redshift cluster, and the volume of data coming from those sources. Among many of our clients, we’ve observed exponential data growth rates—even doubling every year.
  2. Growth in the number of workflows that are running once data resides inside a cluster. The reason is simple: the more data and sources you have, the more ways you have to join and combine that data.
  3. Growth in the number of data consumers who want more data, in more combinations, at a higher frequency. Much of this growth is driven by how easy dashboard tools make it to explore data.  

When building data pipelines with Amazon Redshift, all of these growth factors make it quite challenging to monitor and detect anomalies across all of the queries running on the system. For example, a lot of our clients use Looker as their tool for data exploration, and one of the most common complaints with the platform is slow Looker dashboards.

One way of approaching this problem is to follow best practices for setting up your workload management (WLM) in Amazon Redshift. You get a more granular view of your users and the resources they use, and the WLM gives you the power to isolate them from each other.

But even after optimizing your WLM configuration, you’re still lacking critical insights into the individual queries, users, and apps behind your workloads.

Until now, it’s been quite hard to surface all of this information in a single place. That’s why we built a way to intuitively and powerfully visualize these insights, helping to make your life as a data engineer easier when working with Amazon Redshift.

The new feature is called “Query Insights,” and it’s available immediately in your Intermix dashboard. Below, we’ll check out some examples of using the Query Insights feature in Intermix.

Detect a Huge Increase in Query Volumes in Amazon Redshift

In this example, we’ll show how you can use Query Insights to quickly identify that a huge spike in query volume happened, as well as the specific queries that caused it.

The Intermix dashboard clearly shows that there has been a query count spike at 8:59 p.m. on January 24. In the left panel in the Intermix user interface, click on “query group” to segment the chart by query group. Click on the chart to reveal a slide-in list of query groups, sorted in descending order of Count (i.e. the number of queries). Finally, click on the “View Queries” icon to jump to the Query Details list, where you can see exactly which queries are causing the problem.

intermix.io dashboard on queries


Identifying Which Query Group Is Causing a Batch Pipeline Slowdown in Amazon Redshift

In this example, we’ll show you how to use Query Insights to find the cause of an overall increase in latency.

Click on the graph icon in the main Query Insights dashboard to group the data by app. This reveals that the “intermix_collector” app is experiencing a large increase in latency. Next, click on the “intermix_collector” app so that you can group by a custom attribute.

This app has two custom attributes, “dag” and “task”, that appear on the left navigation panel. This is because these queries are tagged using the Intermix annotation format. First group by “dag”, and then by “task”, to isolate the specific task that is seeing a spike in latency. Finally, group by “query group” to see the specific queries that are causing a problem.

query insights in intermix.io


Find the Query in Amazon Redshift Causing a Looker PDT Latency Spike

In this example, let’s monitor our Looker PDT queries to find the slowest ones.

First, click on the “Looker PDT” tag in the dashboard in order to only view data for Looker PDT queries. As you can see in the left navigation panel, the Looker PDT app has a few custom attributes. Click on “model” to group the queries by model name. We immediately see that the “supply chain” model is the slowest. We can click on it and then group by “query group” to find the individual queries causing a problem.

query optimization

What’s Next

Query Insights is a tremendously valuable tool in your Redshift toolkit, but we’re only getting started. Keep your eyes open for a new feature, “Transfer Insights,” coming soon; it will allow you to monitor the users and apps that are loading data and rows into your Amazon Redshift cluster.

Want to try Query Insights out for yourself? Take the opportunity to get a personalized tour of Query Insights, as well as see what’s next on our product roadmap and provide your feature requests and feedback. Please find a time on my calendar at this link, or sign up today for a free trial of Intermix.

What are Query Groups in Amazon Redshift and intermix?

Amazon Redshift is a robust, enterprise-class cloud data warehouse—but that doesn’t mean it’s always the most user-friendly solution. Getting the most out of your Redshift deployment can be a challenge at the best of times (which is one reason why we wrote our article “15 Performance Tuning Techniques for Amazon Redshift”).

In particular, Redshift queries can cause performance issues and mysterious slowdowns if you don’t know how to optimize them. But without a dedicated way to analyze Redshift query performance, how can you hope to isolate the problem?

That’s why we’ve introduced the Query Groups feature to Intermix, our performance monitoring solution for Amazon Redshift. Query Groups is a powerful asset that intelligently classifies and ranks query workloads on your cluster. Using Query Groups, you can answer questions like which query workloads dominate your cluster, and which are suddenly growing in volume, execution time, or queue time.

How Do Query Groups Work in Amazon Redshift and intermix?

Using Intermix, the queries in a Query Group are grouped together using a proprietary algorithm and ranked by volume, execution time, and queue time. More metrics may be added in the future, depending on user demand. All of the queries in a Query Group share a SQL structure and operate on the same tables.

Example: Find the Queries Causing a Query Spike

At 8:17 a.m. on August 7, the below cluster experienced an eightfold spike in queries—wow! Typically, this type of event is caused by a handful of new queries that suddenly increased in volume. But how do you find which queries, and who ran them?

intermix.io’s Query Groups feature can quickly determine which of your queries are responsible.

First, click on the new Query Groups page in the left navigation panel. By default, Query Groups are sorted by Rank. In this case, we want to re-sort by “Rank Change,” which will order the list of Query Groups by the “fastest movers.” In other words, this will help us quickly see the groups which have been moving up the ranks in the past week.

Sure enough, we see a handful of Query Groups which suddenly started running. Clicking into the first one, we can isolate the exact queries that are causing the problem.

You could also use the same procedure to determine the queries that underwent a spike in latency or queue time.

What’s Next for Query Groups?

Query Groups in Redshift and Intermix offer the potential to make your Redshift performance tuning process radically more efficient—and we’re just getting started. We plan to expand the “grouping” concept to more dimensions in the future.

Sound like what you’ve been looking for? We’re here to assist. Sign up today for a free trial of the Intermix platform. Query Groups are just one of the ways that we’ve helped Redshift customers get the most from their cloud data warehouse.

Monitoring your data apps with App Tracing gives you better control and can help train new hires to use tools. Amazon Redshift training, for example, can become much easier when you use intermix.io App Tracing. If you don’t know how monitoring data apps benefits your organization and tools, the following article will give you a few ideas.

App Tracing surfaces important information about how apps & users interact with your data. It can help answer questions like which apps drive the most query volume, and which users are behind it.

What is a “Data App”?

Data apps typically fall into one or more of three categories:

  1. Data integration services: Vendors who ETL data from external systems or applications into your data environment.
  2. Workflow orchestration: Tools for workflow orchestration – typically batch processing on your data pipeline.
  3. Visualization & Analysis: Reporting, modeling, and visualization apps used by analysts and data scientists to highlight emerging trends in information.

Data Integration

Intermix can fulfill all three of these functions. It integrates with your favorite tools to show you which end-users connect to your data warehouse. It comes ready to integrate with popular tools like Looker.

Whether you want to connect your data to more BI tools or you want insights that lead to better Amazon Redshift training, Intermix can help.

Visualization & Analysis

Intermix App Tracing thrives on visualization and analysis. If you want to know how much data an app uses, just click on the icon to get a graph that shows a timeline of the app’s data usage. 

You can also use visualization and analysis to learn how your apps and users consume data over time.

Visualization makes information easier for everyone to understand. Whether you have years of experience in IT or you just want an overview of your data use, Intermix’s App Tracing gives you a straightforward approach to view information.

How it Works

App Tracing requires the data app to annotate the executed SQL with a comment. The comment encodes metadata about the application which submitted this query.
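As an illustration only (the annotate_sql helper and JSON key names below are hypothetical; the Tag Generator mentioned under “Supported Apps” produces the real format), tagging a query might look like this:

import json

def annotate_sql(sql, app, dag=None, task=None):
    """Prepend a metadata comment to a SQL statement.

    The JSON-in-a-comment shape and key names are illustrative only.
    """
    metadata = {"app": app, "dag": dag, "task": task}
    tag = json.dumps({k: v for k, v in metadata.items() if v is not None})
    return "/* {} */\n{}".format(tag, sql)

annotated = annotate_sql(
    "SELECT count(*) FROM orders",
    app="intermix_collector",
    dag="daily_rollup",
    task="count_orders",
)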

Intermix.io will automatically index all data contained in the annotation and make it accessible as first-class labels in our system: in Discover searches, Saved Searches, and aggregations on the Throughput Analysis page.

Supported Apps

Out of the box, we support popular data apps such as Looker.

Don’t see your data app? No problem. Any queries tagged with our format will be automatically detected. See here for instructions on using the Tag Generator to create tags to embed into your SQL.

Example: Which Looker User is Causing a Concurrency Spike?

In the below example, a query spike in WLM 3 causes a bottleneck in query latency. The result is that queries which would otherwise take 13-14 seconds to execute are stuck in the queue for greater than 3 minutes.

App Tracing detects that the majority of these queries are from Looker. How do you know which user is causing this?

Click on the chart, and a widget will pinpoint the specific Looker user(s) who ran those queries. In this example, we see that user 248 is responsible.

App Tracing in Looker

Armed with this information, you can now monitor that user’s activity and set an alarm, as described below.

Monitoring & Setting an Alarm

See all the activity for this user by heading to Discover and using the new ‘App’ filter to search for Looker user 248.

To set up an alarm and get email notifications, save that search and stream its metrics to CloudWatch, as shown below.

CloudWatch App Tracing with intermix.io

See What Customers are Saying

The following Slack conversation took place the morning we soft-launched app tracing in June 2018:

comment_app_tracing

Make Amazon Redshift More Effective

Intermix.io makes Amazon Redshift more powerful. We’ve seen clients use Intermix.io to get far more out of their clusters.

Using Apache Airflow?

If you’re using Amazon Redshift in combination with Apache Airflow, and you’re trying to monitor your DAGs  – we’d love to talk! We’re running a private beta for a new Airflow plug-in with a few select customers. Go ahead and click on the chat widget on the bottom right of this window. Answer three simple questions, schedule a call, and then mention “Airflow” at the end and we’ll get you set up! As a bonus, we’ll throw in an extended trial of 4 weeks instead of 2!

Looker is a powerful tool for self-service data analytics. A lot of companies use Looker together with Amazon Redshift for powerful business intelligence and insights. By making it easy for users to create custom reports and dashboards, Looker helps companies derive more value from their data.

Unfortunately, “slow Looker dashboards” is one of the most frequent issues we hear with Amazon Redshift. Some of our customers who use Looker tell us that queries that should take seconds to execute instead take minutes, while dashboards seem to “hang”.

The good news is that we can probably help: the issue is likely a mismatch between your Looker workloads and your Amazon Redshift configuration. In this post, we’ll explain the causes of slow Looker dashboards, and how to fine-tune Amazon Redshift to get blazing-fast performance from Looker.

The Problem: Slow Looker Dashboards

Analytics stacks often grow out of a simple experiment. Somebody spins up an Amazon Redshift cluster, builds a few data pipelines, and then connects a Looker dashboard to it. The data is popular, so you set more people up with dashboards—and at some point, the problems start.

Looker performance issues can range from slow dashboards to long execution times for persistent derived tables (PDTs). In some cases, these problems can even appear at the very start of the journey. Consider this post on the Looker forum, which complains that “first-run query performance is terrible”:

Slow Looker Customer Question Screenshot
Image 1: Support Request on Looker Discourse

The key to solving bottlenecks lies in balancing your Looker workloads with your Redshift setup. First, let’s discuss how Amazon Redshift processes queries, and then we’ll look closer at how Looker generates workloads.

Amazon Redshift Workload Management and Query Queues

A key feature in Amazon Redshift is the workload management (WLM) console. Redshift operates in a queuing model. The WLM console allows you to set up different query queues, and then assign a specific group of queries to each queue.

For example, you can assign data loads to one queue, and your ad-hoc queries to another. By separating your workloads, you ensure that they don’t block each other. You can also assign the right amount of concurrency, a.k.a. “slot count,” to each queue. The default configuration for Redshift is one queue with a concurrency of 5.

It’s easy to overlook WLM and queuing when getting started with Redshift. But as your query volumes grow and you run more than 5 concurrent queries, your queries will start to get stuck in the queue as they wait for other queries to finish. When that happens, you’re experiencing the “slow Looker dashboards” phenomenon.

Slow Looker Dashboards: Understanding LookML and Persistent Derived Tables

There are two components of the Looker platform, LookML and persistent derived tables (“PDTs”), that make it easy for a company to explore its data.

But we’ll see how they can also generate high query volumes with heavy workloads that can slow down your Redshift clusters.

LookML – Abstracting Query Structure from Content

LookML is a data modeling language that separates query structure from content. In other words, the query structure (e.g. how to join tables) is independent of the query content (e.g. what columns to access, or which functions to compute). A LookML project represents a specific collection of models, views and dashboards. The Looker app uses a LookML model to construct SQL queries and run them against Redshift.

The benefit of separating structure from content is that business users can run queries without having to write SQL. That abstraction makes a huge difference. Analysts with SQL skills only define the data structure once in a single place (a LookML project), and business users then leverage that data structure to focus on the content they need. Looker uses the LookML project to generate ad-hoc queries on the fly. The below image illustrates the process behind LookML:

LookML data flow
Image 2: LookML separates content of queries from structure of queries.

Persistent Derived Tables

Some Looks create complex queries that need to create temporary tables, e.g. to store an intermediate result of a query. These tables are ephemeral, and the queries to create the table run every time a user requests the data. It’s essential for these derived tables to perform well, so that they don’t put excessive strain on a cluster.

In some cases where a query takes a long time to run, creating a so-called PDT (“persistent derived table”) is the better option. Looker writes PDTs into a scratch Redshift schema, and refreshes the PDT on a set schedule. Compared to temporary tables, PDTs reduce query time and database load, because when a user requests the data from the PDT, it has already been created.

There’s a natural progression from single queries to PDTs when doing LookML modeling. When you’re starting out, you connect all tables into a LookML model to get basic analytics. To get new metrics or roll-ups and to iterate quickly, you start using derived tables. Finally, you leverage PDTs to manage the performance implications.

Slow Looker Dashboards: The Impact of LookML and PDTs on Query Volume

The separation of structure from content via LookML can have dramatic implications for query volume. The SQL structure of one productive analyst can be reused by countless other users.

A Simple Math Example

Consider a simplified scenario with a single-node Amazon Redshift cluster, 5 business users, and a single LookML project. Each user has 10 dashboards, each with 20 Looks (i.e. specific charts). Behind each Look is a single query. With each refresh, they will trigger a total of 5 (users) * 10 (dashboards) * 20 (Looks) = 1,000 queries.

With a single-node Amazon Redshift cluster and a default WLM setup, you will process 5 queries at a time. You’ll need 1,000/5 = 200 cycles to process all of these queries. While 5 of these queries process, all of the other ones will have to wait in the queue. The below image shows a screenshot from the intermix.io dashboards that shows what your queue wait times can look like.

Queue wait time for Looker Queries in intermix.io
Image 3: Queue wait time for Looker Queries in intermix.io

Let’s assume each query takes 15 seconds to run. For all queries to run, we’re looking at a total of 200 * 15 = 3,000 seconds (50 minutes). In other words, your last 15-second query will finish running after 50 minutes.

Even if you add a node now, i.e. you double the amount of queries you can process, you’re only cutting that total wait time in half—that’s still 25 minutes.

Now let’s also add PDTs into the mix. Our PDTs will generate more workloads, often with complex, memory-intensive and long-running queries. The PDTs then compete with the already slow ad-hoc queries for resources.

There are a few possible remedies: for example, throttling the number of per-user queries, reducing the row limit for queries, or allowing fewer data points. But the whole point of using Looker is to derive meaningful conclusions from huge amounts of data. Imposing query and row limits, or using fewer data points, doesn’t make sense.

3 Steps to Configure your Amazon Redshift Cluster for Faster Looker Dashboards

The good news is that there are only 3 steps to getting faster Looker dashboards:

  1. Optimize your Amazon Redshift WLM for your Looker workloads.
  2. Optimize your Looker workloads.
  3. Optimize your Amazon Redshift node count.

By following these 3 steps, you’ll also be able to optimize your query speeds, your node count, and your Redshift spend.


Step 1: Optimize the Amazon Redshift WLM for Looker Workloads

We’ve written before about “4 Simple Steps To Set-up Your WLM in Amazon Redshift For Better Workload Scalability”; that post walks through the steps in detail.

The same logic applies to your Looker queries: have them run in a queue that’s separate from your loads and transforms. This allows you to define the right concurrency and memory configuration for that queue. Having enough concurrency means each Looker query will run, while having enough memory means you’ll minimize the volume of disk-based queries. A sketch of what that queue split can look like follows below.
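For example, a wlm_json_configuration parameter along these lines gives a hypothetical looker user group its own queue; the group names, concurrency, and memory numbers are placeholders to adapt:

[
  {
    "user_group": ["looker"],
    "query_concurrency": 10,
    "memory_percent_to_use": 40
  },
  {
    "user_group": ["etl"],
    "query_concurrency": 3,
    "memory_percent_to_use": 40
  },
  {
    "query_concurrency": 5,
    "memory_percent_to_use": 20
  }
]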

During peak times, Concurrency Scaling for Amazon Redshift gives your Redshift clusters additional capacity to handle any bursts in query load. Concurrency scaling works by off-loading queries to new, “parallel” clusters in the background. Queries are routed based on their WLM configuration and rules.

In your intermix.io dashboard, you can see the high watermark/peak concurrency for your Looker queries. You’ll also see how much memory they consume, telling you what memory percentage you should assign to each slot.

By using the right settings, you can balance your Redshift usage with your Looker workloads. Doing this step alone will give you much faster dashboards.

Step 2: Optimize Your Looker Workloads

What is a redundant Looker workload? It’s a query that’s running but doesn’t need to be (for example, if users are refreshing their dashboards more frequently than they need to). By reducing that refresh rate, your Redshift cluster will have to process fewer queries, which in turn drives down concurrency.

Looker User in intermix.io dashboard
Image 4: Identifying High-Volume Looker Users in intermix.io

With intermix.io’s app tracing feature, you can see which of your Looker users are driving most of the query volume, down to the single Look. Below, you can see feedback from one of our customers  during our private beta for app tracing:

Image 5: Finding high volume Looker users

Step 3: Optimize Your Amazon Redshift Node Count

Once you’ve squeezed all the juice out of your WLM, it’s time to adjust your Redshift node count. If you’re still encountering concurrency issues or disk-based queries, it may be time to add more nodes. In most cases, though, there’s an opportunity to reduce node count and save on your Redshift spend.

Consider the case of our customer Remind, a messaging app for schools. By configuring their workload management, they managed to reduce their Amazon Redshift spend by 25 percent.

That’s it! There are a few more tweaks you can do that will improve performance. Examples are setting your dist/sort keys for your PDTs, or moving some PDTs into your ELT process. But the 3 steps in this post will give you the biggest immediate return on your Looker investment.

Ready to scale, get fast dashboards, and handle more data and users with Looker?

Sign up today for a free trial of intermix.io. As your company grows, you can be confident that your Looker dashboards will always be lightning fast.