By: Brad Lathrop
By: Dean Jurecic
Microsoft Power BI is a robust analytics platform that empowers users to:
- Bring data from disparate sources into one place
- Build a model for that data
- Create interactive reports to enhance data storytelling
The popular analytics platform can bring invaluable benefits to an organization, especially when an instance is properly implemented and routinely optimized.
For this blog post, we’re going to focus on two different modes available in Power BI for data storage (that is, bringing data into Power BI): Import vs DirectQuery.
Power BI Import
Power BI’s Import data storage mode enables users to easily bring data into Power BI Desktop. Navigate to your data source, choose the tables you wish to add to Power BI Desktop, and then simply click Import. The platform does the rest.
Using Import is Microsoft’s top recommendation over other data storage modes available in the platform. Why? Because import “takes advantage of the high performance query engine of Power BI, and provides a highly interactive and fully featured experience”.
High performance is always critical to successful reporting and analytics. Using the Import data storage mode is usually your best bet for performance in Power BI.
The drawback of Import, however, is that you must refresh the dataset if there are changes in the underlying data that you want to see updated in the Power BI assets you’re working with.
This is not so with DirectQuery, the alternative data storage mode we’re going to discuss in this blog post.
Power BI DirectQuery
Power BI’s DirectQuery data storage mode works much differently than Import. DirectQuery allows users to connect directly to a data source and maintain that direct connection while creating Power BI reports. This means that any changes in the underlying dataset are reflected in the report in real-time. Manual data refreshes are not required with DirectQuery, which is a big advantage over Import.
Like Import, DirectQuery isn’t without its drawbacks. The main culprit here? Performance.
DirectQuery is not recommended for most Power BI scenarios – and Import is – because the direct connection chews up processing power. This reduces query engine speed and is not desirable in most circumstances.
So why use DirectQuery then?
When to Use DirectQuery
Using the DirectQuery data storage method makes sense when you’re working with a dataset that’s constantly changing and you want to instantly see those changes reflected in the reports you’re working with.
Remember that the major advantage of using DirectQuery is that you don’t have to manually refresh data and that you’re always working with current data. This means your visualizations are updating to reflect any changes to the dataset in real-time. This is highly valuable and important for some Power BI instances and situations, but not all.
Another advantage DirectQuery has is that it’s very useful when working with massive datasets. In some cases, using Import to bring in enormous quantities of data is not possible nor the most efficient step to take. In these cases, DirectQuery is highly useful.
Client Example – Import vs. DirectQuery
We are currently working with one of the largest health systems in the northeast, helping the organization implement and optimize their Power BI instance and data warehousing. The health system is working with an enormous amount of data, so we are using DirectQuery for storage as Import simply isn’t feasible in this scenario!
On a different note, we are also working with a large, international retailer that too has massive quantities of data, just not as much as the health system client previously mentioned. Performance is a top priority for this client, so we recommended the Import mode and the company has been using it ever since.
If you’re a Power BI user or report developer, it’s really important to understand the advantages and disadvantages of Import vs DirectQuery.
Import is simpler and promises higher query engine performance relatively speaking, but it does require the manual refreshing of the underlying data set.
DirectQuery allows you to work with the most recent data without having to refresh the dataset, meaning your reports reflect the most current data as you work on them. The issue is performance, which can be a problem for many but not for all.
If you need help with implementing Power BI or optimizing an existing instance, we can help! Contact us today!
Thanks for Reading! Questions?
Thanks for reading! We hope you found this blog post useful. Feel free to let us know if you have any questions about this article by simply leaving a comment below. We will reply as quickly as we can.
Keep Your Data Analytics Knowledge Sharp
Get fresh Key2 content and more delivered right to your inbox!