SQL Server: Does Parallel Plan for rebuild index impact the fragmentation of index?

The title for this post might sound ironical since rebuild of the index is used to remove the fragmentation of the index so how does rebuild of index affect the fragmentation of index? Well, this is what I would like to explain in this blog post.

Generally, for rebuilding of indexes (alter index … rebuild) optimizer tries to generate a serial plan with no parallelism, however for large tables, the index rebuild compels the optimizer to generate parallel plan to distribute the keys across multiple threads to perform parallel sorts so that indexes rebuild can be performed faster at the expense of higher CPU.

The Parallel Index rebuild Query Plan uses range partitioning where in each of the thread is given a range of keys which it sorts and build a smaller index subunits.Finally there is Gather stream operator which links all the index subunits together (logically) to form a logically contiguous index page links but physically non-contiguous.

Thus when the optimizer decides to go for parallel plan to rebuild the index, the parallel threads does extent allocation in noncontiguous allocation and thereby doesn’t eliminate the fragmentation completely and still leads to some fragmentation due to extent allocation in non-contiguous location.

Let me illustrate this with an example

create table T1
(
a int primary key,
b char(80)
)

--Performing large number of random inserts in the table which will cause fragmentation in the table

Declare @i int;
SET @i = 1
while (@i <=500000)
begin
insert into T1 values(@i,'Test')
insert into T1 values(1000000-@i,'Test')
set @i = @i + 1
end

--Let us know check the fragmentation in the clustered index of the table caused due to random inserts

select database_id,object_id,index_id,index_level,
avg_fragmentation_in_percent,fragment_count,
avg_fragment_size_in_pages,page_count 
from sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('T1'),1,1,'DETAILED')

database_id object_id index_id index_level avg_fragmentation_in_percent fragment_count avg_fragment_size_in_pages page_count
12                   197575742                  1              0        66.789458                                             2358                      1.4966                                      3529
12                    197575742                1              1         90.9                                                           11                           1                                               11
12                    197575742                 1             2          0                                                              1                             1                                                 1

As we see in the above output, the leaf level of the clustered index (level 0) is around 67% fragmentated with 2358 fragments with around 1.5 pages per fragments. Hence the total clustered index takes around 3541 pages (3529+11+1).

Now let me rebuild the index by using the following command

alter index all on T1 rebuild WITH(MAXDOP=2)

Even if I do not specific MAXDOP=2 , SQL instance on my laptop will use parallelism of 2 by default since it is a dual-core machine.

After rebuild the index, let us check the fragmentation again.

select database_id,object_id,index_id,index_level,
avg_fragmentation_in_percent,fragment_count,
avg_fragment_size_in_pages,page_count 
from sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('T1'),1,1,'DETAILED')

database_id object_id index_id index_level avg_fragmentation_in_percent fragment_count avg_fragment_size_in_pages page_count
12                    197575742            1               0                        0.08688                                     8                                143.875                              1151
12                    197575742            1               1                        0                                                  2                                      1                                        2
12                    197575742            1               2                         0                                                  1                                      1                                         1

As expected we see in the above output, the fragmentation in the leaf level of the index is reduced to 0.09% with 8 fragments at leaf level and 2 fragments at the intermediate level. The leaf level has on an average 144 pages per fragments which intermediatel level has 1 page per fragments

Now let us truncate the table,populate the table again with the same script above and rebuild the index but this time WITH MAXDOP=1 setting and check the fragmentation of the index

TRUNCATE TABLE t1;
GO

Declare @i int;
SET @i = 1
while (@i <=50000)
begin
insert into T1 values(@i,'Test')
insert into T1 values(100000-@i,'Test')
set @i = @i + 1
end

alter index all on T1 rebuild WITH(MAXDOP=1)
GO

select database_id,object_id,index_id,index_level,
avg_fragmentation_in_percent,fragment_count,
avg_fragment_size_in_pages,page_count 
from sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('T1'),1,1,'DETAILED')

database_id object_id index_id index_level avg_fragmentation_in_percent fragment_count avg_fragment_size_in_pages page_count
12                  197575742       1          0                        0                                                        3                            383.333333333333                     1150
12                   197575742       1          1                         0                                                        1                                            2                                          2
12                   197575742       1          2                         0                                                        1                                            1                                           1

As we see in the output now, the leaf level of the clustered index has now only 3 fragments with each fragments having around 384 pages on an average per fragment. Further the entire leaf level now requires only 1150 pages as compared 1151 pages when the index is rebuild WITH MAXDOP=2. if we compare the intermediate level we now have only 1 fragments which consists of 2 pages as compared to 2 fragments observed previously

A fragment is a chunk or a segment in the datafile with continuous allocation of pages. So more number of fragments is an indication of more number of dis-continuous allocation of segments hence more fragmentation.

My laptop didn't had enough resources nor did I have patience for data load, but if you want to see difference more prominent, you can load more rows in the above script and if you have more cpus and hence more parallel threads used for rebuild, the amount of fragments will be huge

Using the above learning we conclude the following

Conclusion

For large databases/datawarehouse generally of the order of TBs which are hosted on the servers with large number of CPUs i.e SMP architecture, while rebuilding indexes it is recommended to rebuild with (MAXDOP=1). This can lead to index rebuild being slower but at the same time we ensure the indexes are completely defragmentated and further serial plan uses less resources (CPU and memory) as compared to parallel plan which causes high utilization of CPU and Memory.

Further, due to same reason in SQL 2008 R2 Fastrack Datawarehouse, Microsoft recommends to rebuild the index WITH (MAXDOP=1).

So, I would not recommend to use serial plan for rebuild indexes always. The answer is our favourite "Its depends", if the parallel index rebuild doesn't use much resources and reduces the fragmentation to a tolerable limit we can go for the parallel plan which is default. However if either the resource utilization caused by the index rebuild is high or if the fragmentation of the index is not reduced within the desired limit, we should go for MAXDOP=1

Further Reading
------------------

http://msdn.microsoft.com/en-us/library/ms191292(v=sql.90).aspx

http://blogs.msdn.com/b/sqlqueryprocessing/archive/2006/12/13/index-build-strategy-in-sql-server-part-2-offline-parallel-no-partitioning-non-stats-plan-no-histogram.aspx

http://blogs.msdn.com/b/sqlqueryprocessing/archive/2006/12/11/index-build-strategy-in-sql-server-part-2-offline-parallel-no-partitioning.aspx

Hope this helps!!!

Parikshit Savjani
Premier Field Engineer

SQL Server: Does Parallel Plan for rebuild index impact the fragmentation of index?

Web Application Analytics using IIS Logs & Powerview

I killed Reindexing Job and now it is in KILLED/ROLLBACK State for more than 9 hours

Did you know: A Full Database Backup for a database in Full Recovery model don’t truncate the T-Log and hence doesn’t break Log backup chain

Long Blocking Chain Caused due to Compile Locks

Troubleshooting Non-Buffer Latch Contention: LATCH_XX

When do we need asymmetric storage with AlwaysON FCI and Availability Storage?

Leave a Reply Cancel reply

Similar Posts

Leave a Reply Cancel reply