SQL Server: Does Parallel Plan for rebuild index impact the fragmentation of index?

The title for this post might sound ironical since rebuild of the index is used to remove the fragmentation of the index so how does rebuild of index affect the fragmentation of index? Well, this is what I would like to explain in this blog post.

Generally, for rebuilding of indexes (alter index … rebuild) optimizer tries to generate a serial plan with no parallelism, however for large tables, the index rebuild compels the optimizer to generate parallel plan to distribute the keys across multiple threads to perform parallel sorts so that indexes rebuild can be performed faster at the expense of higher CPU.

The Parallel Index rebuild Query Plan uses range partitioning where in each of the thread is given a range of keys which it sorts and build a smaller index subunits.Finally there is Gather stream operator which links all the index subunits together (logically) to form a logically contiguous index page links but physically non-contiguous.

Thus when the optimizer decides to go for parallel plan to rebuild the index, the parallel threads does extent allocation in noncontiguous allocation and thereby doesn’t eliminate the fragmentation completely and still leads to some fragmentation due to extent allocation in non-contiguous location.

Let me illustrate this with an example

create table T1
(
a int primary key,
b char(80)
)

--Performing large number of random inserts in the table which will cause fragmentation in the table

Declare @i int;
SET @i = 1
while (@i <=500000)
begin
insert into T1 values(@i,'Test')
insert into T1 values(1000000-@i,'Test')
set @i = @i + 1
end

--Let us know check the fragmentation in the clustered index of the table caused due to random inserts

select database_id,object_id,index_id,index_level,
avg_fragmentation_in_percent,fragment_count,
avg_fragment_size_in_pages,page_count 
from sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('T1'),1,1,'DETAILED')


database_id object_id index_id index_level avg_fragmentation_in_percent fragment_count avg_fragment_size_in_pages page_count
12                   197575742                  1              0        66.789458                                             2358                      1.4966                                      3529
12                    197575742                 1              1         90.9                                                           11                            1                                               11
12                    197575742                 1              2           0                                                               1                             1                                                 1

As we see in the above output, the leaf level of the clustered index (level 0) is around 67% fragmentated with 2358 fragments with around 1.5 pages per fragments. Hence the total clustered index takes around 3541 pages (3529+11+1).

Now let me rebuild the index by using the following command

alter index all on T1 rebuild WITH(MAXDOP=2)

Even if I do not specific MAXDOP=2 , SQL instance on my laptop will use parallelism of 2 by default since it is a dual-core machine.

After rebuild the index, let us check the fragmentation again.

select database_id,object_id,index_id,index_level,
avg_fragmentation_in_percent,fragment_count,
avg_fragment_size_in_pages,page_count 
from sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('T1'),1,1,'DETAILED')

database_id object_id index_id index_level avg_fragmentation_in_percent fragment_count avg_fragment_size_in_pages page_count
12                    197575742            1               0                        0.08688                                     8                                143.875                              1151
12                    197575742            1               1                         0                                                  2                                      1                                        2
12                    197575742            1               2                         0                                                  1                                      1                                         1

As expected we see in the above output, the fragmentation in the leaf level of the index is reduced to 0.09% with 8 fragments at leaf level and 2 fragments at the intermediate level. The leaf level has on an average 144 pages per fragments which intermediatel level has 1 page per fragments

Now let us truncate the table,populate the table again with the same script above and rebuild the index but this time WITH MAXDOP=1 setting and check the fragmentation of the index

TRUNCATE TABLE t1;
GO

Declare @i int;
SET @i = 1
while (@i <=50000)
begin
insert into T1 values(@i,'Test')
insert into T1 values(100000-@i,'Test')
set @i = @i + 1
end

alter index all on T1 rebuild WITH(MAXDOP=1)
GO

select database_id,object_id,index_id,index_level,
avg_fragmentation_in_percent,fragment_count,
avg_fragment_size_in_pages,page_count 
from sys.dm_db_index_physical_stats(DB_ID(),OBJECT_ID('T1'),1,1,'DETAILED')

database_id object_id index_id index_level avg_fragmentation_in_percent fragment_count avg_fragment_size_in_pages page_count
12                   197575742       1          0                        0                                                        3                            383.333333333333                     1150
12                   197575742       1          1                         0                                                        1                                            2                                          2
12                   197575742       1          2                         0                                                        1                                            1                                           1

As we see in the output now, the leaf level of the clustered index has now only 3 fragments with each fragments having around 384 pages on an average per fragment. Further the entire leaf level now requires only 1150 pages as compared 1151 pages when the index is rebuild WITH MAXDOP=2. if we compare the intermediate level we now have only 1 fragments which consists of 2 pages as compared to 2 fragments observed previously

A fragment is a chunk or a segment in the datafile with continuous allocation of pages. So more number of fragments is an indication of more number of dis-continuous allocation of segments hence more fragmentation.

My laptop didn't had enough resources nor did I have patience for data load, but if you want to see difference more prominent, you can load more rows in the above script and if you have more cpus and hence more parallel threads used for rebuild, the amount of fragments will be huge

Using the above learning we conclude the following

Conclusion

For large databases/datawarehouse generally of the order of TBs which are hosted on the servers with large number of CPUs i.e SMP architecture, while rebuilding indexes it is recommended to rebuild with (MAXDOP=1). This can lead to index rebuild being slower but at the same time we ensure the indexes are completely defragmentated and further serial plan uses less resources (CPU and memory) as compared to parallel plan which causes high utilization of CPU and Memory.

Further, due to same reason in SQL 2008 R2 Fastrack Datawarehouse, Microsoft recommends to rebuild the index WITH (MAXDOP=1).

So, I would not recommend to use serial plan for rebuild indexes always. The answer is our favourite "Its depends", if the parallel index rebuild doesn't use much resources and reduces the fragmentation to a tolerable limit we can go for the parallel plan which is default. However if either the resource utilization caused by the index rebuild is high or if the fragmentation of the index is not reduced within the desired limit, we should go for MAXDOP=1

Further Reading
------------------

http://msdn.microsoft.com/en-us/library/ms191292(v=sql.90).aspx

http://blogs.msdn.com/b/sqlqueryprocessing/archive/2006/12/13/index-build-strategy-in-sql-server-part-2-offline-parallel-no-partitioning-non-stats-plan-no-histogram.aspx

http://blogs.msdn.com/b/sqlqueryprocessing/archive/2006/12/11/index-build-strategy-in-sql-server-part-2-offline-parallel-no-partitioning.aspx

Hope this helps!!!

Parikshit Savjani
Premier Field Engineer

SQL Server 2012 AlwaysON Automatic Failover Set – Caveat

As I was going through the documentation of AlwaysON, I came across the following caveat with SQL Server AlwaysON Automatic Failover set.

While we can have upto synchronous secondary replica,  we can have only one secondary replica to be part of Automatic Failover Set. Now, what that means is only of the synchronous secondary replica can be set for Automatic failover if the primary replica has failed.

Caveat

If the synchronous secondary replica which forms the Automatic Failover set with Primary Replica goes down due to some reason, the Primary Replica is exposed until the same secondary replica node is up again.

Scenario

I am taking the following Example from Msdn

Replica Availability   Mode and Failover Mode Settings
A (Current Primary) Synchronous   commit with automatic failover
B Synchronous   commit with automatic failover
C Synchronous   commit with manual failover only


if the Secondary B fails due to some reason and during the same time, if the primary replica does down, the server cannot perform the Automatic Failover inspite of having another synchronous secondary replica. It will wait for Manual failover in which case availability of the server will be impacted.

While this scenario might be highly unlikely wherein you have the failure of two replicas involved in Automatic Failover Set at the same time and as a result the Product Team had taken this conscious decision to have only 1 secondary replica involved in AlwaysON v1 Automatic Failover set.

However if we have lot of many customer with similar scenario, the Product  Team might consider allowing multiple secondary replicas in the Automatic Failover Set.

I have raised a Connect Item to flag the same as a suggestion to the Product Team and if you think it is valid suggestion, please vote for it so that Product Team takes it with priority

https://connect.microsoft.com/SQLServer/feedback/details/757086/sql-alwayson-allows-only-a-single-secondary-to-form-an-automatic-failover-set

However currently, if you think you might be vulnerable to hit this scenario you can use custom Powershell scripts to Test whether the Secondary Replica part of the Automatic Failover set is online and if goes down, you can switch the other synchronous secondary replica as the Failover target for automatic Failover.

To develop the above script, you can use the new Test-SqlAvailabilityReplica Powershell cmdlet to test the secondary replica and switch another secondary replica for automatic failover using Set-SqlAvailabilityReplica cmdlet.

Hope this helps !!!

Parikshit Savjani
Premier Field Engineer

Performance Tuning with Columnstore index using Batch Processing

As we all are aware, SQL 2012 introduced columstore indexes which is based Microsoft’s xvelocity in-memory engine  and as you might have heard and read in numerous articles it is known to provide 5-50X improvement in performance of queries.

Reading and Hearing 5-50X improvement in performance of queries causes many eyes to pop-out and they begin to consider building columnstore indexes in most of the tables assuming it would provide significant improvement in their application.  In this blog post, I would like to emphasize on when should one consider building columnstore index and further ensuring whether the query is benefited from batch processing mode of columnstore index.

Let me start with

When should you not consider creating Columnstore index

  •  On small tables with fewer rows, typically in thousands of records one should not create Columnstore index.
  • If the queries filters on the columns with high selectivity with no aggregation. In order words if the query plan already  perform Seek operation.
  • If the table is highly transactional  which requires frequent inserts, updates and deletes.

The above points are all characteristics of an OLTP database and hence columnstore indexes are not good candidate for OLTP system which consists of normalized transactional databases.

Now,

When should you consider creating Columnstore index

  • On large tables, containing  large number of records typically containing millions of records.
  • Queries which perform aggregations, sorting, group by, order by which requires full table scan of the tables
  • The table should refresh periodically and should be transactional in nature.

The above points are all characteristics of an DW database with every large fact table in which measures are aggregated grouped by dimension columns and in which the tables are refreshed less frequently by ETL operations.

Here is how I interpret columnstore indexes, while seeks are optimized by clustered and nonclustered indexes, Tables Scans, Index Scans, Aggregations, Sorts  are optimized by using Columnstore indexes.

However, Columnstore index  alone doesn’t provide the 5-50X performance improvement, the performance gains is observed when columnstore index scan is used in conjunction with batch processing which is the new vector based query execution engine designed to optimize columnstore index scan. Batch processing operates on 1000 rows batch at a time which provides that eye popping performance improvement which you might be aiming at.

Batch processing is supported by Columnstore Index Scan, Hash Aggregates, Hash Join, Hash Build, filter  operators in the query plan  and hence when you would like performance tune the query using columnstore indexes, you need to focus on achieving the batch processing mode and avoid row processing mode.

Batch Processing and Segment Elimination alongwith Columnstore Index Scan can provide maximum performance gain.

If you have gone through Eric Hanson’s session from TechEd here , he has clearly specified under following conditions, the query might not go batch processing mode

  • Outer Joins
  • IN & Exists
  • UNION ALL

In the same session, Eric has shown workarounds to rewrite the query to achieve batch processing mode which gives significant performance improvement.

I encountered one such query in which query didn’t go for batch processing mode by default and I had to rewrite the query to force it to go for batch processing mode.

Consider the following Demo

Use Adventureworks2012
GO

IF OBJECT_ID('dbo.Demo') is not NULL
DROP TABLE Demo

SELECT * INTO Demo FROM Sales.SalesOrderDetail
GO

DECLARE @i int = 1

WHILE(@i<=5)
BEGIN
INSERT INTO Demo(SalesOrderID,CarrierTrackingNumber,OrderQty,ProductID,SpecialOfferID,UnitPrice,UnitPriceDiscount,LineTotal,rowguid,ModifiedDate)
SELECT SalesOrderID,CarrierTrackingNumber,OrderQty,ProductID,SpecialOfferID,UnitPrice,UnitPriceDiscount,LineTotal,newi(),ModifiedDate FROM Sales.SalesOrderDetail

SET @i=@i+1

END

CREATE CLUSTERED INDEX IDX_CLUS_Demo ON Demo(SalesOrderDetailId)
GO

CREATE NONCLUSTERED INDEX IDX_NONCLUS_Demo ON Demo(SalesOrderId,Productid)
GO

CREATE NONCLUSTERED COLUMNSTORE INDEX IDX_CS_Demo ON
Demo
(
SalesOrderID,
SalesOrderDetailid,
CarrierTrackingNumber,
OrderQty,
ProductID,
SpecialOfferID,
UnitPrice,
UnitPriceDiscount
)
GO

Now when I fire the following query

SELECT s.OrderDate, s.ShipDate, sum(d.OrderQty),avg(d.UnitPrice),avg(d.UnitPriceDiscount)
FROM Demo d 
join Sales.SalesOrderHeader s
on d.SalesOrderID=s.SalesOrderID
WHERE d.OrderQty>500
GROUP BY s.OrderDate,s.ShipDate

If I go by the rules which I described above, the query contains inner join between a very large table (Demo) and smaller table SalesOrderHeader and the query doesn’t use OuterJoin, IN or Union All so it should go for batch processing mode.

But to my surprise, the query was going for row processing mode and the elapsed time for the query was around 300 ms on my machine.

I started wondering why it is not using the batch processing mode and Nested Loop Join caught my attention since Loop Join doesn’t support Batch Processing Mode which might be forcing the query to go for row processing mode.

So I used the query hint OPTION(hash Join) to force hash Join and Voila there I see my Batch Processing Mode used by the query plan.

With Batch Processing Mode, my query completed within the flash of an eye and it was blistering fast.

From this example, my intention is not to promote query hints but we need to understand that with SQL 2012, optimizer code may not be as matured to account for all possible ways to execute batch processing mode. Hence we need to rewrite our code or use such hints which will influence the optimizer to choose batch processing since batch processing and segment elimination can give you that mind blowing performnace improvement which you might be aiming for with columnstore index.

Parikshit Savjani
Premier Field Engineer

Troubleshooting Non-Buffer Latch Contention: LATCH_XX

Latches are lightweight synchronization primitives that are used by the SQL Server engine to guarantee consistency of in-memory structures including; index, data pages and internal structures

Latch waits are one of the possible causes of slowness in SQL Server which can limit the throughput (tps) of OLTP application.

Whenever we experience slowness in performance of SQL Server, one of the few DMVs we are interested to look is sys.dm_os_wait_stats.Cumulative wait information is tracked by SQL Server and can be accessed using the Dynamic Management View (DMW) sys.dm_os_wait_stats. SQL Server employs three latch wait types as defined by the corresponding “wait_type” in the sys.dm_os_wait_stats DMV:

Buffer (BUF) latch: used to guarantee consistency of index and data pages for user objects. They are also used to protect access to data   pages    that SQL Server uses for system objects. For example pages that manage allocations are protected by buffer latches. These include the Page Free Space (PFS), Global Allocation Map (GAM), Shared Global Allocation Map (SGAM) and Index Allocation Map (IAM) pages. Buffer latches are reported in sys.dm_os_wait_stats with a wait_type of PAGELATCH_*.

Non-buffer (Non-BUF) latch: used to guarantee consistency of any in-memory structures other than buffer pool pages. Any waits for non-buffer latches will be reported as a wait_type of LATCH_*.

IO latch: a subset of buffer latches that guarantee consistency of the same structures protected by buffer latches when these structures require loading into the buffer pool with an I/O operation. IO latches prevent another thread loading the same page into the buffer pool with an incompatible latch. Associated with a wait_type of PAGEIOLATCH_*.

Troubleshooting Buffer Latch contention is very nicely documented in the whitepaper here by SQLCAT Team

If you see significant PAGEIOLATCH waits it means that SQL Server is waiting on the I/O subsystem. While a certain amount of PAGEIOLATCH waits is expected and normal behavior, if the average PAGEIOLATCH wait times are consistently above 10 milliseconds (ms) you should investigate why the I/O subsystem is under pressure.

However, there is less documentation available on Non-Buffer latch but it is scattered and  hence I thought it would be worth to compose this blog

  1.  Check sys.dm_os_wait_stats to      confirm whether LATCH_XX is the major wait on the SQL Server
    SELECT * from sys.dm_os_wait_stats
    Order by wait_time_ms desc
  2. If LATCH_XX is found to be  major wait, next check sys.dm_os_latch_stats which is major non-buffer      latch on which SQL Server is waiting on
    SELECT   latch_class, wait_time_ms / 1000.0 AS [Wait In sec],
    waiting_requests_count AS [Count of wait],
    100.0 * wait_time_ms / SUM (wait_time_ms) OVER() AS Percentage,
    FROM sys.dm_os_latch_stats
    WHERE latch_class NOT IN ('BUFFER')
    AND wait_time_ms > 0
  3. Monitor master.dbo.sysprocesses  for waitresource column

Based on the above data collection, you may find the one of the following possible cause of excessive non-buffer page latch (in my experience).

  • Is there any profiler trace in particular client side tracing running on the server

When you run SQL Profiler against an instance of SQL Server 2000 or of SQL Server 2005, each active SPID of the instance requires an exclusive latch (LATCH_EX) to write the event to SQL Profiler. The SPID writes the event to the local memory buffer first. Then, the event is transferred to SQL Profiler. If the local memory buffer is full, the SPID discards the event. After the events in the local memory buffer are transferred to SQL Profiler, the SPID can write new events to the local memory buffer. If other SPIDs are waiting to write events to SQL Profiler, these SPIDs wait in a queue.

You can detect this case when you will see TRACE,TRACE_CONTROLLER,TRACE_IO_SYNC as one of the major latch waits, further in master.dbo.sysprocesses you will see TRACE related waitresource column

  • Excessive  Parallelism

While parallelism is not the direct cause of NON-BUF Latches, however whenever we have excessive parallelism on muti-core processors machine (generally more than 16+ logical processors) , almost all the queries qualifies for the parallel plans.While LATCH_XX latches are used to synchronize internal structures used in parallel plan.

You can detect this case when you see ACCESS_METHODS_SCAN_RANGE_GENERATOR, ACCESS_METHODS_KEY_RANGE_GENERATOR as one of the major waittypes in sys.dm_os_latch_stats.

Further you may also see high CXPACKETS in sys.dm_os_wait_stats alongwith LATCH_EX latches as major waits. While waitresource column in master.dbo.sysprocesses might show waits on PARALLEL_PAGE_SUPPLIER.

  • Very Heavy  insert/update/deletes on Heap/BLOB data structures on the SQL instance

When there is heavy inserts/updates/deletes on heap/BLOB data structures, the internal memory structures which is used for allocation/deallocation of the pages to heap are synchronized using Non-BUF Page Latch.In one of my customer site, they had an audit table which was a heap table which was inserted heavily since every transaction in the application is logged in the audit table which causes excessive waits on LATCH_EX waits.

You can detect this case when you see  ALLOC_FREESPACE_CACHE, ALLOC_EXTENT_CACHE, ALLOC_CREATE_FREESPACE_CACHE as one of the major waittype in sys.dm_os_latch_stats

While waitresource column in master.dbo.sysprocesses might show waits on IDES

  • Excessive  Auto-grow or Autoshrink Operation

Due to insufficient sizing or capacity planning, you might experience frequent Autogrow or if you Autoshrink turned ON, you might see frequent AUTOSHRINK. During growing or shrinking, SQL Server acquires FCB, FGCB_ALLOC latch to synchronize the access to the filegroup.You can detect this case when you see FCB, FGCB_ALLOC as one of the major waittype in sys.dm_os_latch_stats.

In this case, the waitresource columns in master.dbo.sysprocesses might show waits on FCB and FGCB_ALLOC

Please Note: The above information is purely based on my research and experience. The content should be treated AS IS and under no circumstance should be treated as offical documentation from Microsoft

Parikshit Savjani
Premier Field Engineer

SQL 2012 Resources !!!

As we all are excited with the release of SQL 2012, it is important for all of us to quickly rampup and equip ourselves with SQL 2012. In this blog post, I am sharing some of the important SQL 2012 resources which are available which can help you in your readiness.

SQL 2012 Product Guide
==================

http://www.microsoft.com/en-us/download/details.aspx?id=29418

SQL 2012 Developer Training Kit (Wiki)
=============================

http://social.technet.microsoft.com/wiki/contents/articles/6982.sql-server-2012-developer-training-kit-bom-en-us.aspx


Technet Virtual Labs: SQL 2012 Virtual Labs

====================================

http://technet.microsoft.com/en-us/virtuallabs/hh859575

SQL 2012 Technical Upgrade Guide
=========================

 http://download.microsoft.com/download/9/5/3/9533501A-6F3E-4D03-A6A3-359AF6A79877/SQL_Server_2012_Upgrade_Technical_Reference_Guide_White_Paper.pdf

Memory Configuration and Sizing Considerations in SQL 2012
==============================================

http://support.microsoft.com/kb/2663912

 
If you are aware of any other useful resources, Feel free to share and I will incorporate it in this blog post. I will update this blog post as we have more useful resources.

Happy Learning !!!

Parikshit Savjani
Premier Field Engineer

Identifying Top MDX Queries running in SSAS environment

SQL Server has management studio reports, performance dashboard reports to identify top cpu, duration, io queries which is possible due to the DMVs viz sys.dm_exec_query_stats which is available to track the queries.

However with SSAS, we do not have any DMV similar to sys.dm_exec_query_stats which tracks and captures the statistics of the queries fired against SSAS cube. In this blog post, I would discuss on how we can achieve something similar with SSAS.

While doing performance tuning of the cube, it is very important to identify top mdx queries which are running slow or consuming high amount of resources on the server.To achieve this, we need to use profiler to capture the MDX queries.

Note: Profiler is a resource intensive tool and should be run only on the server only when there is no cpu,memory or io pressure on the server.

Using Profiler, we need to capture following events (Make sure all columns is checked)

Query Begin
Query End
Entire Query Processing Events

Once the above events are configured, the profiler needs to be run on the server for around 20-30 mins during peak activity on the server to ensure most of the hard hitting mdx queries are captured.

Once the Mdx queries are captured using profiler, the trace file(s) needs to be moved to another non-production workstation which has SQL Server Database Engine installed. Like SQL Server, we can load the analysis service profiler trace into a table in SQL database.

You can open the profiler trace and click as File –> Save as –> Trace Table, specify the sql instance name, database name, table name.

Depending upon the amount of the data captured in profiler trace, the time taken to load the trace in table may vary accordingly. You can monitor the status by looking at the bottom left corner of the profiler trace which shows the status as Saving 20%

Once the data is saved in the trace table, you can use the following query to identify TOP 10 slow duration queries running in the environment.

Select TOP 10 TextData AS [Query],SPID,Duration
FROM UserLoad                      -- UserLoad is Table in which profiler trace is loaded
where EventClass=10              -- EventClass 10 represents Query End event
Order by Duration DESC

In order to tune the mdx queries, it is important to identify where the query has spent most of its time, is it the storage engine or formula engine. The tuning approach will be decided once it is clear where most of the query processing time is spent.

Hence to identify this we can use the following query

Select TOP 10 TextData AS [Query],SPID,Duration,
b.[Storage Engine],Duration-b.[Storage Engine] AS [Formula Engine]
FROM UserLoad a cross apply
(
SELECT SUM(duration) AS [Storage Engine] FROM UserLoad
where spid = a.spid and Rownumber < a.Rownumber and EventClass=11 -- Eventclass 11 represents Query Subcube events
) b
where EventClass=10
Order by Duration DESC

From the above query, we can identify where the most of the time is spent by the query.

If the query spends most of its time in Storage Engine, we can use aggregations, cache warmers or memory tuning approach to improve the response time of the query.

If the query spends most of its time in Formula Engine, we need to check for cell by cell evaluation mode, cell security or empty cell calculation all of which can cause query to spent more time in formula engine.

Using the above approach, you can perform MDX query tuning in SSAS.

Hope this helps !!!

Parikshit Savjani
Premier Field Engineer

Placing a Drillthrough Report in a different folder in SQL Server Reporting Services

When you design a drillthrough report in your reporting server project using BIDS, while adding the drillthrough actions for the report, in the dropdown you only get option to select the reports (rdl files ) designed in the same project. Hence when the entire project is uploaded, all the drillthrough reports can deployed in the same folder.

Now you may want to move the parameterized drillthrough reports on a different folder to implement security and to promote reusability. However when you move the drillthrough reports to seperate folder, the dashboard report breaks and shouts at you since it is not able to locate the drillthrough report in the same folder.

How do you achieve this ?

There are two ways to achieve the above requirements

1. Using to Jump to URL in the Drillthrough Action while designing the reports in BIDS

In this approach, you need to deploy the drillthrough reports first on the report server and get the URL to render the report. You can use this URL in the JUMP to URL option while defining the Drillthrough Actions in the Dashbaord Report.

The only drawback of this approach is, in the dashboard report when the end user clicks on the drillthrough report, the drillthrough report is rendered from here the user can’t go back to the original dashboard report.

 

2. Creating a Linked Report in the same folder

In this approach, after  we move the drillthrough report in a seperate folder, we need to create a linked report with the same name as the drillthrough report and place it in the same folder as the dashboard.

The Linked report will serve the requirement.

 

Hope this small tip helps !!!

Parikshit Savjani
Premier Field Engineer

Considerations for Configuring Log Shipping for a cross-domain clustered instance of SQL Server over a Firewall

Last week, one of my customers had engaged me for consulting to configure Log Shipping on clustered instance of SQL Servers which are running in different domains. In other words, the primary and secondary servers are in different domains.

While Log Shipping is pretty easy to configure but configuring log-shipping for a cross-domain clustered instance of SQL server over a Firewall can be trickier which requires some consideration so I plan to discuss these consideration in this blog post for the benefit of larger audience

First Let us discuss Clustering and Log Shipping as HADR Solution

Clustering and Log Shipping
=======================

Until SQL 2012 AlwaysON, we didn’t have a good HADR solution with SQL Server. Clustering serves as a very good High Availability Solution due to minimum failover time as well as SQL Network Name allowing the Application a transparent failover.  However due to the shared storage, Clustering cannot be used as a Disaster recovery solution. While Log Shipping/Database Mirroring serves as a good DR Solution but may not be a good High Availability solution since RPO and RTO depends on the frequency of tlog backup, Network latency.

So Clustering along with Database Mirroring or Log Shipping together serves as a Good HADR Solution where we can take the best of both the worlds

Let us now discuss various considerations in Log Shipping scenario

Considerations for Log Shipping for a Clustered instance
=============================================

For a clustered instance, there is one important consideration. The network share for the Log Shipping should be created as a Fileshare resource on the Cluster.

Further, the fileshare resource should be created in the SQL Group while the shared folder should be created on the shared drive. The dependency of the Fileshare resource should be on Shared Drive in the SQL Group and SQL Network Name.

The reason for creating Fileshare in the SQL Group of the Cluster is so that when the instance is failed over to Node 2, the Log shipping continues to run transparently without any impact. Further the UNC path for the share would be \\<SQL Network Name>\<Folder Name>. This will ensure the copy job will always be able to access the network share on shared drive.

How to create file shares on a cluster
http://support.microsoft.com/kb/224967

 

Considerations for Log Shipping on a cross domain servers
================================================

While it is possible to establish Log Shipping on a cross domain servers, the only requirement is the two servers should have network connectivity to each other and should be able to access each other.

However the major challenge while establishing Log Shipping on a cross domain servers is permission on the network share which should be accessible to copy job running as under SQL Agent account on the secondary server.

For the SQL Agent account which will be a domain account, to be authenticated by the Primary server, there should be two way trust between the domains.

However if you are not able to establish the two way trust between the domains due to some reason, we can still configure Log Shipping by using Network PassThrough Authentication.

With Network PassThrough Authentication, the SQL Agent account for the servers should have same name and same password.

More information on Network Passthrough Authentication
http://support.microsoft.com/kb/102716

 

Considerations for Log Shipping over a Firewall
=======================================

While Log shipping uses SMB communication to copy the log files from the Network Share to the local destination folder and hence in order to configure over Firewall, following ports should be open

If,

Netbios over TCP/IP is enabled

TCP ports 137,139
UDP ports 137,138

Direct Hosting of SMB over TCP/IP

TCP port 445

More information on port requirement for Windows Server
http://support.microsoft.com/kb/832017

Further, with Cluster, the above ports should be open in Firewall on Virtual IP address for inbound rules while for outbound rules, it should be open on Physical IP address of the Nodes.

More information for Firewall configuration on Cluster
https://msmvps.com/blogs/clustering/archive/2005/08/11/62372.aspx

While in my case, we had to take all the above consideration to ensure we are a able to setup Log shipping for a clustered instance of SQL Server across cross domain server over a Firewall.

Hope this Helps !!

Parikshit Savjani
Premier Field Engineer

SQL Server Deployment Checklist

While most experienced DBAs would maintain their checklist before installing and setting up SQL Server in the production environment. However SQL Server is a product which is very commonly used by Small and Medium Enterprise who do not necessarily have experienced DBAs. Infact the person, installing SQL Server may not be even a DBA.

I came across an excellent Blog post from John who has enlisted a comprehensive checklist of the consideration for installing SQL Server setup. Although SQL Server configuration is very easy due to its User Friendly setup and minimal administration. If all the points mentioned in this checklist is taken care off, you can avoid most problems/issues which you might encounter post deployment of SQL Server

http://blogs.msdn.com/b/johnhicks/archive/2008/03/03/sql-server-checklist.aspx

Would like to appreciate the author’s efforts to pen down a comprehensive checklist !!!

Hope this Helps !!

Parikshit Savjani
Premier Field Engineer

SQL Server Predeployment IO Stress Testing using sqlio

While there are some really good in-depth whitepapers and blogs (specifically by CSS Team and Brent Ozar & his team) to address this topic. However the information in these blogs,whitepapers is kind-off scattered and for a novice the information in some these whitepapers can be overwhelming.In this blog post I would like to explain you the approach one would want to take to perform IO Stress Testing in order to meet expected workload from Application hosted on SQL Server.

If you are migrating to a new server, it is important to capture the IO statistics which is generated by your current workload which can help you to make a decision whether the new server will be able to withstand your current IO workload while providing the expected response time.

In order to capture the current Disk IO statistics, we need to capture the following performance counters

Logical Disk & Physical Disk


Disk Read Bytes/sec   )
Disk Write Bytes/sec  )    Referred as IO bandwidth
Disk Bytes/sec             )
Disk Reads/sec            }
Disk Transfers/sec      }    Equates to IOPs in SAP World
Disk Writes/sec           }
Disk sec/reads             )
Disk sec/Writes           )     Referred as Disk Latency
Disk sec/Transfers      )
Disk Bytes/Read          }
Disk Bytes/Writes       }    IO Transfer Size (generally ranges from 8k to 64k)
Disk Bytes/Transfer   }
Avg Disk Read Queue Length   )
Avg Disk Write Queue Length  )   (Disk Transfers/sec) * (Disk sec/Transfer)
Avg Disk Queue Length             )
If you need more information on these counter, I would recommend you to read this blog.

On my server, I captured the above mentioned performance counters during peak workload and I derived at the following IO statistics generated by current workload.

 

IO Metrics E: (Log Drive) F: (Data Drive) Total Physical Disk
Disk MB/sec 100MB/sec 80MB/s 180MB/sec
Disk IOPs 125 202 344
Disk Bytes/Transfer 23KB 50KB 37KB
Avg Disk Queue Length 5.7 17 23.47
Disk Read Latency 23ms 57ms 51ms
Disk Write Latency 77ms 56ms 67ms

 

Ideally the Disk latency is expected to be within 10-15ms so the above counters already suggests the current Disk performance is poor and not meeting the expected standards.

As a result of poor Disk performance, we start seeing IO_COMPLETION, WRITELOG and PAGEIOLATCH_*  wait types in our environment.

Once we have established the IO Statistics for our current workload we can now look at tools available to benchmark our new Storage which will be hosting our SQL Server and will have to provide throughput which exceeds our current workload within the acceptable Disk latency (< 10-15ms).

Our recommended tool for performing SQL IO stress testing is sqlio. There are other third party tools available  viz IOMeter and CrystalDiskMark, however I prefer to use sqlio. it is purely your own comfort and in this blog post I will be talking about sqlio.

We have another tool called SQLIOSim but as per the guidance from Microsoft CSS Blog, It should not be used for  performance testing purpose and should only be used to test the storage from perspective of data integrity and consistency.

Before performing the Stress test, it helps to get the following information from the Storage Vendor

Which RAID configuration is used ?
RAID 10 is recommended for SQL Server.

How many LUNs are configured?
Generally we recommend separate LUNs for datafiles,log files and Tempdb files.

What is strip_unit_size used for LUN ?
It should be multiple of file_allocation_unit_size to address Disk Alignment issues

How much is the Storage Controller Cache available? How much is Read Cache and how much is the write cache?

Larger the Storage Cache, better is the response time at the expense of cost. For SQL Server, it is recommended to have 80/20 or 90/10 ratio of write cache/read cache for SQL Server since large write cache tends to absorb the checkpoint spikes and improves the write performance of SQL Server.

Once we have the above information available, we need to install sqlio.msi which by default installs itself in
C:\Program Files\sqlio                       For 32 bit
C:\Program Files(x86)\sqlio              For 64 bit

More information about SQLIO command line options is already documented by Brent Ozar in his blog post and hence I would not like to duplicate the efforts here.

To start with first we should create a file called testfile.dat in the drive which needs to be stress tested. So we need to make the necessary changes in param.txt to change the drive letter and change the file size as mentioned in the Brent’s blog.

Practically the file size should be around 3 to 4 times the Storage controller cache size. So if the cache size is 5GB we might want to create the testfile.dat of 20GB size. In my case, we create a file of 100GB size.

We can fire the following command to create the file.
sqlio -kW -s10 -fsequential -t8 -o8 -b64 -LS -Fparam.txt timeout /T 10

During this run of SQLIO, ignore the data points and just let the file be created.

The advantage of using SQLIO for IO Stress testing is gives lot of flexibility to generate various IO patterns as generated by SQL Server.

SQL Server generates both random and sequential IO patterns while transfer size of the IO varies from 8k to 64K but doesn’t exceed 64K except for backup and restore operations.

So in our test, we have 3 variable parameters viz

1. IO Pattern (Sequential or Random)  –> 2 values
2. IO Type (Read or Write) –> 2 values
2. IO Transfer Size (8k,64k,128k,256k,512k)  –> 5 values
3. Outstanding IO requests (1,2,4,……1024) –> 11  values

As we all know the rules of testing, if we have 3 variable parameters on which we need to perform the test, we need to vary one parameter at a time while keeping rest of the parameters constant.

So we will have around ( 2*2* 5* 11= 220)  test runs of sql io. I prefer to create bat files for these test runs beforehand and also capture the above mentioned perfmon counters while running the tests.


Random Read Test

sqlio -dE -BH -kR -frandom -t1 -o1 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -frandom -t2 -o1 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -frandom -t4 -o1 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -frandom -t4 -o2 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -frandom -t4 -o4 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -frandom -t4 -o8 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -frandom -t4 -o12 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -frandom -t4 -o16 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -frandom -t4 -o64 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -frandom -t4 -o128 -s120 -b64 -LS testfile.dat

Note: Do not exceed the no. of threads (-t) beyond the number of processors on the server.

In this test run, as mentioned earlier we keep other parameters constant while varying only 1 of the parameter. So we have random read and io transfer size is fixed to 64KB while outstanding io ( no. of threads (-t) * no. of io per threads (-o)) is increased in steps from 1 to 512.

For the above test, I created a bat files and routed the results to text file. In the output, we need to capture MBs/sec, IOs/sec and Avg_Latency(ms) for each test runs.

 

We captured the result in the excel sheet as shown below

 

Outstanding IO IO MB/sec IOPs Disk Latency
1 11.5 184 4
2 23 369 4
4 42.9 686 5
8 65.9 1054 7
16 109.6 1754.2 8
32 152.67 2442.7 12
48 151.21 2419.4 19
64 149.09 2385.45 26
128 149.77 2396.33 52
256 147.71 2363.38 107
512 151.83 2429.4 210

 

Based on the above results we plotted 3 graphs as shown below

 

So we see that as the outstanding IOs increases the IOPs and IO MB/sec (in other words Disk throughput) also increases until 32 outstanding IOs, beyond which there is no increase in Disk throughput even when there is increase in outstanding IOs.

Any guesses why the graph saturates at 32 outstanding IOs?
No marks for guessing that 32 is HBA queue depth setting which can further increased to 256 if required. For SQL Server generally, the HBA queue depth between 32 and 64 is recommended so you might want to increase the queue depth setting if Disk throughput at 32 outstanding ios doesn’t meet the current workload requirement.

However from SQL Server perspective, the most important parameter which you are interested is the Disk Response time which should be within 15 ms. As seen in the graph, the until 48 outstanding IO requests, the Disk responds under the acceptable limits beyond which it begin to increase exponentially.

Large Random Read Test

In this test, we keep the outstanding IO request constant to 32 (as found from the above test) generating random reads of various IO sizes varying from 8K to 512K.

sqlio -dE -BH -kR -frandom -t4 -o8 -s120 -b8 -LS testfile.dat
sqlio -dE -BH -kR -frandom -t4 -o8 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -frandom -t4 -o8 -s120 -b128 -LS testfile.dat
sqlio -dE -BH -kR -frandom -t4 -o8 -s120 -b256 -LS testfile.dat
sqlio -dE -BH -kR -frandom -t4 -o8 -s120 -b512 -LS testfile.dat

Following are the results from our testing

 

IO Transfer Size IO MB/sec IOPs Disk Latency
8KB 37.88 4849.78 6
64KB 162.33 2597.38 11
128KB 226.57 1812.63 17
256KB 326.37 1305.48 24
512KB 380.92 761.84 41

 

 

 

In this test, between 8k to 64KB if we get the acceptable response time and our IOPs requirement meets the our current workload requirement,  Disk is said to meet the test requirements.

 

 

 

 

 

Similarly,

We need to perform the following test
Random Write Test

sqlio -dE -BH -kW -frandom -t1 -o1 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t2 -o1 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t4 -o1 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t4 -o2 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t4 -o4 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t4 -o8 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t4 -o12 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t4 -o16 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t4 -o32 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t4 -o64 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t4 -o128 -s120 -b64 -LS testfile.dat

Large Random Write Test

sqlio -dE -BH -kW -frandom -t4 -o8 -s120 -b8 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t4 -o8 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t4 -o8 -s120 -b128 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t4 -o8 -s120 -b256 -LS testfile.dat
sqlio -dE -BH -kW -frandom -t4 -o8 -s120 -b512 -LS testfile.dat

Sequential Read Test

sqlio -dE -BH -kR -fsequential -t1 -o1 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -fsequential -t2 -o1 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -fsequential -t4 -o1 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -fsequential -t4 -o2 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -fsequential -t4 -o4 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -fsequential -t4 -o8 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -fsequential -t4 -o12 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -fsequential -t4 -o16 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -fsequential -t4 -o64 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -fsequential -t4 -o128 -s120 -b64 -LS testfile.dat

Large Sequential Read Test

sqlio -dE -BH -kR -fsequential -t4 -o8 -s120 -b8 -LS testfile.dat
sqlio -dE -BH -kR -fsequential -t4 -o8 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kR -fsequential -t4 -o8 -s120 -b128 -LS testfile.dat
sqlio -dE -BH -kR -fsequential -t4 -o8 -s120 -b256 -LS testfile.dat
sqlio -dE -BH -kR -fsequential -t4 -o8 -s120 -b512 -LS testfile.dat

Sequential Write Test

sqlio -dE -BH -kW -fsequential -t1 -o1 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t2 -o1 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t4 -o1 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t4 -o2 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t4 -o4 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t4 -o8 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t4 -o12 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t4 -o16 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t4 -o32 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t4 -o64 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t4 -o128 -s120 -b64 -LS testfile.dat

Large Sequential Write Test

sqlio -dE -BH -kW -fsequential -t4 -o8 -s120 -b8 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t4 -o8 -s120 -b64 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t4 -o8 -s120 -b128 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t4 -o8 -s120 -b256 -LS testfile.dat
sqlio -dE -BH -kW -fsequential -t4 -o8 -s120 -b512 -LS testfile.dat

 

 

The results can be captured and verifies whether it meets the expected workload requirement within the acceptable response time (< 15ms).

 

 

Hope this Helps !!

Parikshit Savjani
Premier Field Engineer

Disclaimer: This post is provided “AS IS” with no warranties, and confers no rights