By: Ken Adams
Not Enough Space
On my current engagement, we are building a reporting platform that meets the needs of our client while still adhering to best practices. One of the practices we’ve established for the platform is saving space. Our client uses SQL Server 2017 and has a large number of databases, primarily because their customers do not allow multi-tenancy.
Inside the client databases, analysts can create any number of objects, and some of those objects get quite large. That has made the environment very difficult to maintain and manage. Space is an ongoing problem, as in some cases the data must be kept online for a year or more.
The new reporting platform will streamline the flow of data from cradle to grave, and while we can’t fully eliminate the multi-tenancy requirement, we can certainly help reduce the amount of storage required. Data compression was introduced in SQL Server 2008 and applies to both tables and indexes. It was initially an Enterprise-only feature, but as of SQL Server 2016 SP1 it is available in lower editions, down to Standard, making compression widely available to most of the marketplace.
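If you’re not sure whether a given instance supports compression, a quick look at the edition and version will tell you. Here’s a minimal check using the built-in SERVERPROPERTY function (the edition notes in the comments are my reading of the licensing change described above):

-- Check edition and version. Before SQL Server 2016 SP1, data compression
-- required Enterprise edition (EngineEdition = 3); from 2016 SP1 onward it
-- is also available in Standard and other lower editions.
SELECT SERVERPROPERTY('Edition')        AS Edition,
       SERVERPROPERTY('EngineEdition')  AS EngineEdition,
       SERVERPROPERTY('ProductVersion') AS ProductVersion;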
I don’t want to get into the minutiae of compression in this post, as there has been so much written on it over the years. Instead, I want to share a simple script that can help in adding compression to existing tables and indexes.
Adding Compression to Existing Tables and Indexes
We started by developing a prototype for the reporting platform: a series of import, staging, and ultimately relational tables. Once we got through requirements and prototyping, we were ready to establish proper guidelines for the way forward. As previously mentioned, we decided that compression was part of that way forward. Please note that the functionality might not be right for your environment; there has been much written on that topic as well.
We did some initial investigation on some of the larger tables to determine the effect of adding compression. We decided that PAGE compression suited our needs best and began testing the before and after of adding it. Again, you can find many articles on testing for compression and adding it to objects; I recommend this one.
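The workhorse for that testing is the sp_estimate_data_compression_savings system procedure, which samples the object into tempdb and reports its current size alongside the estimated compressed size. For example, to estimate PAGE compression savings for the Stage.Student table used later in this post:

-- Estimate PAGE compression savings for a single table.
-- NULL for @index_id and @partition_number covers all indexes and partitions.
EXEC sp_estimate_data_compression_savings
     @schema_name      = 'Stage',
     @object_name      = 'Student',
     @index_id         = NULL,
     @partition_number = NULL,
     @data_compression = 'PAGE';

Comparing the size_with_current_compression_setting(KB) and size_with_requested_compression_setting(KB) columns in the output shows whether the rebuild is likely to pay off.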
I really wanted to take it a step further and find a way to look for uncompressed objects, generating both the logic for testing those objects and the DDL for adding compression to them.
WITH CompressionCheck
AS
(
    SELECT DISTINCT
        SCHEMA_NAME(o.schema_id) AS SchemaName,
        OBJECT_NAME(o.object_id) AS BaseTableName,
        SCHEMA_NAME(o.schema_id) + '.' + OBJECT_NAME(o.object_id) AS TableName,
        i.name AS IndexName,
        ps.reserved_page_count AS PageCount
    FROM sys.partitions p
        INNER JOIN sys.objects o ON
            p.object_id = o.object_id
        INNER JOIN sys.indexes i ON
            p.object_id = i.object_id
            AND i.index_id = p.index_id
        INNER JOIN sys.dm_db_partition_stats ps ON
            i.object_id = ps.object_id
            AND ps.index_id = i.index_id
    WHERE p.data_compression_desc = 'NONE'
        AND SCHEMA_NAME(o.schema_id) <> 'SYS'
)
SELECT *
    , IIF(IndexName IS NULL, REPLACE(CONCAT_WS(' ', 'EXEC sp_estimate_data_compression_savings', '''' + SchemaName + ''',', '''' + BaseTableName + ''',', '''NULL'', ', '''NULL'', ', '''PAGE'''), '''NULL''', 'NULL'), '') AS CheckTableSavingsSQL
    , IIF(IndexName IS NULL, CONCAT_WS(' ', 'ALTER TABLE', TableName, 'REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);'), CONCAT_WS(' ', 'ALTER INDEX', IndexName, 'ON', TableName, 'REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);')) AS CompressionSQL
FROM CompressionCheck
ORDER BY PageCount;
Update 7/12/19: For those who might want to do this in batches, working from smallest to largest, I added a page count sort. Happy compressing!
Executing those calls returns the 2 rows seen below:
This was a good enough indication that adding PAGE compression would give us the desired result and wouldn’t be a hindrance in our environment. I went back to the previous results, grabbed the rows under CompressionSQL, and pasted them into a new query window.
ALTER TABLE Import.TestOnline REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);
ALTER TABLE Stage.Student REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);
Running the SQL above adds PAGE compression to the two tables and ultimately returns some space to the database. When you run the compression check query again, those tables no longer show up.
SELECT DISTINCT
    SCHEMA_NAME(o.schema_id) AS SchemaName,
    OBJECT_NAME(o.object_id) AS BaseTableName,
    SCHEMA_NAME(o.schema_id) + '.' + OBJECT_NAME(o.object_id) AS TableName,
    i.name AS IndexName
FROM sys.partitions p
    INNER JOIN sys.objects o ON
        p.object_id = o.object_id
    INNER JOIN sys.indexes i ON
        p.object_id = i.object_id
        AND i.index_id = p.index_id
WHERE p.data_compression_desc = 'NONE'
    AND SCHEMA_NAME(o.schema_id) <> 'SYS'

--(0 rows affected)
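If you want to quantify the space that came back, sp_spaceused gives a quick before-and-after picture around the rebuild. A minimal sketch using one of the tables from the example above:

-- Capture the table's footprint, rebuild with compression, then compare.
EXEC sp_spaceused 'Stage.Student';

ALTER TABLE Stage.Student REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);

EXEC sp_spaceused 'Stage.Student';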
As noted, this is not the most elegant solution, but it did the trick for us. You could take it even further and step through the rows, executing the SQL for each object along the way (a sketch of that follows below). We wanted the ability to take snippets of code and run those to evaluate the results. If we felt the object was a good candidate, we then ran the ALTER TABLE statement.
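Here is a minimal sketch of that stepping-through approach. It is not what we ran, and it assumes the output of the script above, including the CompressionSQL and PageCount columns, has been captured into a hypothetical temp table named #ToCompress:

-- Walk the generated statements smallest-first and execute each one.
DECLARE @sql nvarchar(max);

DECLARE compress_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT CompressionSQL
    FROM #ToCompress
    ORDER BY PageCount;   -- smallest objects first

OPEN compress_cur;
FETCH NEXT FROM compress_cur INTO @sql;

WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @sql;                   -- log what is about to run
    EXEC sys.sp_executesql @sql;  -- apply the compression
    FETCH NEXT FROM compress_cur INTO @sql;
END

CLOSE compress_cur;
DEALLOCATE compress_cur;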
You’ll also note that the stored procedure call is not generated for the indexes; we were only running the estimate for the tables. Some NULL handling is required in those stored procedure calls because NULL arguments are simply ignored by the CONCAT_WS function. Hence passing 'NULL' in as a string and then stripping the quotes with the outer REPLACE.
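To see why that workaround is needed, here is CONCAT_WS’s documented NULL behavior in isolation:

-- CONCAT_WS skips NULL arguments entirely, so a literal NULL would
-- silently drop a parameter from the generated EXEC statement.
SELECT CONCAT_WS(' ', 'a', NULL, 'b');        -- returns: a b
SELECT CONCAT_WS(' ', 'a', '''NULL''', 'b');  -- returns: a 'NULL' b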
Anyway, I hope you found this useful and can add it to your SQL tool belt. If you run into any issues or would like to see more topics like this one, let us know! We’d be glad to assist. Thanks for reading!
Helpful Compression Links:
- https://docs.microsoft.com/en-us/sql/relational-databases/data-compression/data-compression?view=sql-server-2017
- https://thomaslarock.com/2018/01/when-to-use-row-or-page-compression-in-sql-server/
Questions?
Thanks for reading! We hope you found this blog post to be useful. Do let us know if you have any questions or topic ideas related to BI, analytics, the cloud, machine learning, SQL Server, (Star Wars), or anything else of the like that you’d like us to write about. Simply leave us a comment below, and we’ll see what we can do!
Great article, thanks Ken. I am considering applying compression to one of the databases I am responsible for to free up some space. It contains some tables in excess of 600 GB, and I am concerned about transaction log usage during the table rebuild operation, since we don’t have a lot of free space on the transaction log drive. My understanding is that this will be a minimally logged operation, so it won’t have much impact on the log. Could you please confirm whether that is the case?
Philip,
I really appreciate you reading the blog, and I appreciate your question as well. Yes, it’s my understanding too that enabling compression is a minimally logged operation, similar to rebuilding an index. Oddly enough, you can’t find a direct statement from Microsoft saying exactly that. I’d make sure to test this thoroughly in another environment before applying it in production, and still leave yourself some headroom for database growth.
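One related check worth doing first: rebuilds are minimally logged only under the SIMPLE or BULK_LOGGED recovery models, while under FULL recovery they are fully logged. So confirm what your database is using:

-- Minimal logging for rebuilds applies under SIMPLE or BULK_LOGGED recovery;
-- under FULL, the rebuild is fully logged and will hit the transaction log hard.
SELECT name, recovery_model_desc
FROM sys.databases
WHERE name = DB_NAME();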
I would work through the objects a few at a time rather than trying to compress everything at once. You can also add a join to the query I provided that will let you sort the objects by size, then work from smallest to largest. Please note that enabling compression on larger objects will take some time. Good luck, and let me know if you have any additional questions.
WITH CompressionCheck
AS
(
    SELECT DISTINCT
        SCHEMA_NAME(o.schema_id) AS SchemaName,
        OBJECT_NAME(o.object_id) AS BaseTableName,
        SCHEMA_NAME(o.schema_id) + '.' + OBJECT_NAME(o.object_id) AS TableName,
        i.name AS IndexName,
        ps.reserved_page_count AS PageCount
    FROM sys.partitions p
        INNER JOIN sys.objects o ON
            p.object_id = o.object_id
        INNER JOIN sys.indexes i ON
            p.object_id = i.object_id
            AND i.index_id = p.index_id
        INNER JOIN sys.dm_db_partition_stats ps ON
            i.object_id = ps.object_id
            AND ps.index_id = i.index_id
    WHERE p.data_compression_desc = 'NONE'
        AND SCHEMA_NAME(o.schema_id) <> 'SYS'
)
SELECT *
    , IIF(IndexName IS NULL, REPLACE(CONCAT_WS(' ', 'EXEC sp_estimate_data_compression_savings', '''' + SchemaName + ''',', '''' + BaseTableName + ''',', '''NULL'', ', '''NULL'', ', '''PAGE'''), '''NULL''', 'NULL'), '') AS CheckTableSavingsSQL
    , IIF(IndexName IS NULL, CONCAT_WS(' ', 'ALTER TABLE', TableName, 'REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);'), CONCAT_WS(' ', 'ALTER INDEX', IndexName, 'ON', TableName, 'REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE);')) AS CompressionSQL
FROM CompressionCheck
ORDER BY PageCount;
– Ken Adams, Key2 Consulting