Added AutoIndexing & IndexingMode Parameters to the Add-DocumentCollection Cmdlet

A couple of weeks back, I released PowerShell cmdlets for DocumentDB on GitHub (further info on this can be found in my blog here). Today I merged & added AutoIndexing & IndexingMode parameters to the Add-DocumentCollection cmdlet in the module.

https://github.com/savjani/Azure-DocumentDB-Powershell-Cmdlets

Azure DocumentDB collections are auto-indexed by default, allowing developers to focus on application development & not worry about the schema or index definitions for the database. Automatic indexing of documents is enabled by consistent, write-optimized, lock-free, and log-structured index maintenance techniques.

Anyone who has worked on databases knows that, while indexes are useful to expedite querying, there is additional overhead to write, maintain & store them. Hence it is important that we use indexing wisely & do not index attributes which may never be used for querying or filtering.

Secondly, DocumentDB supports two indexing modes, viz. Consistent (synchronous) & Lazy (asynchronous). If you are someone like me with a database background, you would know that for a continuous transactional OLTP workload we prefer Consistent indexing, wherein the indexes are updated & maintained as data is inserted, updated or deleted. However, for batch data-warehousing type workloads we prefer to drop the indexes, load the data & recreate the indexes, which gives us the best write performance. Similarly, in DocumentDB, if we intend to perform a bulk load of documents, write performance is improved with the indexing mode set to Lazy as compared to the synchronous Consistent mode.

At the time of writing this blog, the AutoIndexing & IndexingMode settings are only supported during creation of a collection & cannot be altered after the collection is created. Hence it is important that we plan & choose the right settings for our collection while creating it.

To support these settings while creating a collection using PowerShell, I have added the following optional parameters to the Add-DocumentCollection cmdlet.

AutoIndexing – ($true or $false): This parameter specifies whether we want automatic indexing to be turned ON or OFF for the collection. It is an optional parameter &, if omitted, automatic indexing is turned ON by default.

IndexingMode – (Consistent or Lazy): This parameter specifies the indexing mode for the collection. It is an optional parameter &, if omitted, it is set to Consistent by default.

A good reference for DocumentDB indexing policy is the following article from the Product Group:

https://azure.microsoft.com/en-us/documentation/articles/documentdb-indexing-policies/

Usage:

Creating a New Database & New Document Collection with default AutoIndexing & IndexingMode

Import-Module 'C:\Users\pariks\Documents\GitHub\Azure-DocumentDB-Powershell-Cmdlets\Azrdocdb\Azrdocdb1\bin\Debug\Azrdocdb.dll'
$ctx = New-Context -Uri <uri> -Key <key>
$db = Add-Database -Context $ctx -Name "DocDBPS2"
$coll = Add-DocumentCollection -Context $ctx -DatabaseLink $db.SelfLink -Name "DocCollPS"

Get-database -Context $ctx -SelfLink $db.SelfLink | ft

Creating a New Database & New Document Collection & Setting Automatic Indexing Policy

Import-Module 'C:\Users\pariks\Documents\GitHub\Azure-DocumentDB-Powershell-Cmdlets\Azrdocdb\Azrdocdb1\bin\Debug\Azrdocdb.dll'
$ctx = New-Context -Uri <uri> -Key <key>
$db = Add-Database -Context $ctx -Name 'DocDB'
$coll = Add-DocumentCollection -Context $ctx -DatabaseLink $db.SelfLink -Name 'DocCollPS' -AutoIndexing $true
Get-database -Context $ctx -SelfLink $db.SelfLink | ft

Creating a New Database & New Document Collection & Setting Automatic Indexing Policy & IndexingMode

Import-Module 'C:\Users\pariks\Documents\GitHub\Azure-DocumentDB-Powershell-Cmdlets\Azrdocdb\Azrdocdb1\bin\Debug\Azrdocdb.dll'
$ctx = New-Context -Uri <uri> -Key <key>
$db = Add-Database -Context $ctx -Name 'DocDB'
$coll = Add-DocumentCollection -Context $ctx -DatabaseLink $db.SelfLink -Name 'DocCollPS' -AutoIndexing $true -IndexingMode 'Lazy'
Get-database -Context $ctx -SelfLink $db.SelfLink | ft

Hope this helps!

Parikshit Savjani
Premier Field Engineer

Releasing PowerShell Cmdlets for Azure DocumentDB on GitHub

Azure DocumentDB is Microsoft's new budding NoSQL schema-free database offering, which is a relatively recent addition to the mammoth Microsoft data platform stack.

Azure DocumentDB provides great flexibility to the developer world, which feels rather restricted by the RDBMS data model & its adherence to data types, schemas, index designs & patterns. Azure DocumentDB gives developers the flexibility to store entire records or JSON documents in an Azure DocDB database, uniquely identified by a key-value pair. All the attributes within a document are indexed by default with range or hash indexes, thereby expediting query processing against the documents & improving query performance for reporting & analytics. Hence the developer just needs to serialize the data structures into JSON format & store them in the Azure DocumentDB database, while Azure DocumentDB takes care of indexing & storage of the attributes. Azure DocumentDB supports querying the attributes using the familiar SQL language over the JSON structure, making it compatible with most reporting & analytics platforms.

While I was learning & evaluating Azure DocumentDB, I found that one component missing from Azure DocumentDB was out-of-the-box PowerShell cmdlets for managing & administering the database. I saw this as an opportunity to kick-start a new community project on GitHub & get my hands dirty with a GitHub public repository, which I had been aspiring to learn & adopt for quite some time now.

With this excitement & modesty, I have created a project on GitHub for Azure DocumentDB PowerShell cmdlets to perform some basic management operations against Azure DocumentDB:

https://github.com/savjani/Azure-DocumentDB-Powershell-Cmdlets

Further, I invite community members to extend this modest list of cmdlets into a full-blown DocDB management & administration module similar to SQLPS.

PowerShell Cmdlet Definitions & Usage

Creating a New Database

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
$db = Add-Database -Context $ctx -Name "DocDBPS2"
Get-database -Context $ctx -SelfLink $db.SelfLink | ft

Creating a New Database & New Document Collection

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
$db = Add-Database -Context $ctx -Name "DocDBPS2"
$coll = Add-DocumentCollection -Context $ctx -DatabaseLink $db.SelfLink -Name "DocCollPS"
Get-database -Context $ctx -SelfLink $db.SelfLink | ft

List all the Databases in the Given Azure DocumentDB Database Account

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
Get-databases -Context $ctx | ft

List Database Properties for a given Database

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
$db = Get-databases -Context $ctx | Where-Object {$_.Id -eq "DocDBPS"}
Get-database -Context $ctx -SelfLink $db.SelfLink | ft

Drop Database

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
$db = Get-databases -Context $ctx | Where-Object {$_.Id -eq "DocDBPS"}
remove-database -Context $ctx -SelfLink $db.SelfLink

Get Database Account Consistency Level

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
Get-DatabaseAccountConsistencyLevel -Context $ctx

Set Database Account Consistency Level

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
Set-DatabaseAccountConsistencyLevel -Context $ctx -DefaultConsistencyLevel Eventual
Get-DatabaseAccountConsistencyLevel -Context $ctx

Hope this helps :)

Parikshit Savjani
Premier Field Engineer

Ungrouping Column using DAX in PowerPivot or SSAS Tabular

To start this blog post, let me first explain the problem. Consider a scenario where you receive an Excel sheet or data in the following format:

Category SubCategory
Bikes Mountain Bikes
Road Bikes
Touring Bikes
Components Handlebars
Bottom Brackets
Brakes
Chains
Cranksets
Derailleurs
Forks
Headsets
Mountain Frames
Pedals
Road Frames
Saddles
Touring Frames
Wheels
Clothing Bib-Shorts
Caps
Gloves
Jerseys

In the above case, the data is already grouped by Category. If you try to load this data into the PowerPivot Data Model & create a Power View column chart report on Category, the view would look similar to the one shown below.

[Image: Power View column chart with most of the data under a blank Category]

Most of the data is grouped under a blank Category, since the PowerPivot Data Model doesn't know or understand that the data was already grouped by Category.

To ungroup the data back in the PowerPivot Data Model, I would first add an Index column to the existing data, which can easily be added while loading the data using Power Query. The PowerPivot Data Model now looks as shown below.

[Image: PowerPivot Data Model with the added Index column]

Next, we add a calculated column, viz. MaxIndex, to calculate & store the index value of the last non-blank Category for each row. The DAX calculation is shown below:

=MAXX(FILTER(FILTER(Table1,LEN(Table1[Category])>0),Table1[Index]<=EARLIER(Table1[Index])),Table1[Index])

The output of the above calculation looks like the following:

[Image: the MaxIndex calculated column values]

The above table is now like a parent-child hierarchy where MaxIndex is the parent id. So to ungroup the Category column, we can now use the LOOKUPVALUE function as shown below:

=LOOKUPVALUE([Category],[Index],[MaxIndex])

The output of the above calculation is as shown below

[Image: the ungrouped Category value computed for every row]

We can now hide the Category, MaxIndex & Index columns from client tools to make them invisible in Power View. Now if you visualize the same data in a column chart in Power View, the result is what we were looking for.

[Image: Power View column chart with each SubCategory under its correct Category]

Hope this helps !!!

Parikshit Savjani
Sr. Premier Field Engineer

How SQL Server Writes Transaction Log Sequentially

Last week, in one of the SQL community forums, a friend of mine asked the following question:

If I have 2 Tlog files for one database, how will these files fillup?

A) File 1 will be filled first and then File 2
B) Both the files will be filled together.

If B is true, then how will sequential writing be maintained?

The reason this question was asked is that he observed that, with multiple log files, a single bulk insert operation left both log files 50% filled rather than a single log file 100% filled.

If you are looking for the reason why this happened, please skip to the end of the post; but in this blog post I would also like to show how we can study the internals of SQL Server with the help of XEvents. To monitor the file writes, we can use the sqlserver.file_write_completed XEvent to understand how SQL Server performs the sequential write operation on each transaction log file when multiple log files are present.

For my testing, I reused the same example used by Jonathan Kehayias in his blog, where he explains the proportional fill algorithm.

Create a database with a single data file & 5 log files, & a single empty table into which we perform the bulk insert.

 
USE [master]
GO
/****** Object: Database [MultipleLogFiles] Script Date: 5/30/2015 4:54:10 PM ******/
CREATE DATABASE [MultipleLogFiles]
CONTAINMENT = NONE
ON PRIMARY
( NAME = N'MultipleLogFiles', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\MultipleLogFiles.mdf' , SIZE = 5316608KB , MAXSIZE = UNLIMITED, FILEGROWTH = 0)
LOG ON
( NAME = N'MultipleLogFiles_log_1', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\MultipleLogFiles_log_1.ldf' , SIZE = 1048576KB , MAXSIZE = 2048GB , FILEGROWTH = 0),
( NAME = N'MultipleLogFiles_log_2', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\MultipleLogFiles_log_2.ldf' , SIZE = 1048576KB , MAXSIZE = 2048GB , FILEGROWTH = 0),
( NAME = N'MultipleLogFiles_log_3', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\MultipleLogFiles_log_3.ldf' , SIZE = 1048576KB , MAXSIZE = 2048GB , FILEGROWTH = 0),
( NAME = N'MultipleLogFiles_log_4', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\MultipleLogFiles_log_4.ldf' , SIZE = 1048576KB , MAXSIZE = 2048GB , FILEGROWTH = 0),
( NAME = N'MultipleLogFiles_log_5', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\MultipleLogFiles_log_5.ldf' , SIZE = 1048576KB , MAXSIZE = 2048GB , FILEGROWTH = 0)
GO

IF OBJECT_ID('SalesOrderHeader') IS NOT NULL
DROP TABLE [dbo].[SalesOrderHeader]
GO

SET NOCOUNT ON
GO

CREATE TABLE [dbo].[SalesOrderHeader](
[SalesOrderID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[RevisionNumber] [tinyint] NOT NULL,
[OrderDate] Date NOT NULL,
[DueDate] Date NOT NULL,
[ShipDate] Date NULL,
[Status] [tinyint] NOT NULL,
[OnlineOrderFlag] [bit] NOT NULL,
[SalesOrderNumber] [nvarchar](25) NOT NULL,
[PurchaseOrderNumber] [nvarchar](25) NULL,
[AccountNumber] [nvarchar](15) NULL,
[CustomerID] [int] NOT NULL,
[SalesPersonID] [int] NULL,
[TerritoryID] [int] NULL,
[BillToAddressID] [int] NOT NULL,
[ShipToAddressID] [int] NOT NULL,
[ShipMethodID] [int] NOT NULL,
[CreditCardID] [int] NULL,
[CreditCardApprovalCode] [varchar](15) NULL,
[CurrencyRateID] [int] NULL,
[SubTotal] [money] NOT NULL,
[TaxAmt] [money] NOT NULL,
[Freight] [money] NOT NULL,
[TotalDue] [money] NOT NULL,
[Comment] [nvarchar](128) NULL,
[rowguid] [uniqueidentifier] NOT NULL,
[ModifiedDate] Date NOT NULL
)
GO

Create an XEvent session to capture the checkpoint_begin, checkpoint_end, file_written, file_write_completed & databases_log_flush XEvents during the bulk insert operation.

IF EXISTS(SELECT * FROM sys.server_event_sessions WHERE name='MultipleLogFiles')
DROP EVENT SESSION [MultipleLogFiles] ON SERVER;

DECLARE @sqlcmd nvarchar(4000) = '
CREATE EVENT SESSION MultipleLogFiles
ON SERVER
ADD EVENT sqlserver.checkpoint_begin
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.checkpoint_end
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.file_written
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.file_write_completed
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.databases_log_flush
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD TARGET package0.asynchronous_file_target(
SET filename=''C:\XEvents\MultipleLogFiles.xel'',
metadatafile=''C:\XEvents\MultipleLogFiles.xem'')
WITH (MAX_MEMORY = 8MB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, TRACK_CAUSALITY = ON, MAX_DISPATCH_LATENCY = 5 SECONDS)'

EXEC (@sqlcmd)

Start the XEvent session, perform the bulk insert & a checkpoint to flush the dirty buffers to the data files, & finally stop the XEvent session.

ALTER EVENT SESSION MultipleLogFiles
ON SERVER
STATE=START
GO
-- Load data into the test database
INSERT INTO dbo.SalesOrderHeader
SELECT RevisionNumber,
DATEADD(DD, 1126+number, OrderDate),
DATEADD(DD, 1126+number, DueDate),
DATEADD(DD, 1126+number, ShipDate),
soh.Status, OnlineOrderFlag, SalesOrderNumber,
PurchaseOrderNumber, AccountNumber,
CustomerID, SalesPersonID, TerritoryID,
BillToAddressID, ShipToAddressID,
ShipMethodID, CreditCardID, CreditCardApprovalCode,
CurrencyRateID, SubTotal, TaxAmt, Freight,
TotalDue, Comment, rowguid,
DATEADD(DD, 1126+number, ModifiedDate)
FROM AdventureWorks2012.Sales.SalesOrderHeader AS soh
CROSS JOIN master.dbo.spt_values AS sv
WHERE sv.type = N'P'
AND sv.number > 0 AND sv.number < 6
GO 50
-- Flush all dirty pages to disk
CHECKPOINT
GO
-- Stop the Event Session
ALTER EVENT SESSION MultipleLogFiles
ON SERVER
STATE=STOP
GO

Load the XEvent data from the .xel file into a table & then extract the XML event data into another table called MultipleLogFileResultsParsed for reporting.

IF OBJECT_ID('MultipleLogFileResults') IS NOT NULL
DROP TABLE MultipleLogFileResults
GO
IF OBJECT_ID('MultipleLogFileResultsParsed') IS NOT NULL
DROP TABLE MultipleLogFileResultsParsed
GO
-- Create results table to load data from XE files
CREATE TABLE MultipleLogFileResults
(RowID int identity primary key, event_data XML)
GO
-- Load the event data from the file target
INSERT INTO MultipleLogFileResults(event_data)
SELECT
CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file('C:\XEvents\MultipleLogFiles*.xel',
'C:\XEvents\MultipleLogFiles*.xem',
null, null)
GO
-- Parse the event data
SELECT
RowID,
event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
DATEADD(hh,
DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP),
event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
COALESCE(event_data.value('(event/data[@name="database_id"]/value)[1]', 'int'),
event_data.value('(event/action[@name="database_id"]/value)[1]', 'int')) AS database_id,
event_data.value('(event/data[@name="mode"]/text)[1]', 'nvarchar(4000)') AS [mode],
event_data.value('(event/data[@name="file_handle"]/value)[1]', 'nvarchar(4000)') AS [file_handle],
event_data.value('(event/data[@name="offset"]/value)[1]', 'bigint') AS [offset],
event_data.value('(event/data[@name="page_id"]/value)[1]', 'int') AS [page_id],
event_data.value('(event/data[@name="file_id"]/value)[1]', 'int') AS [file_id],
event_data.value('(event/data[@name="file_group_id"]/value)[1]', 'int') AS [file_group_id],
event_data.value('(event/data[@name="wait_type"]/text)[1]', 'nvarchar(100)') AS [wait_type],
event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS [duration],
event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(4000)') AS [sql_text],
event_data.value('(event/data[@name="cpu"]/value)[1]', 'int') AS [cpu],
event_data.value('(event/data[@name="reads"]/value)[1]', 'bigint') AS [reads],
event_data.value('(event/data[@name="writes"]/value)[1]', 'bigint') AS [writes],
CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 1, 36) AS uniqueidentifier) as activity_id,
CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 38, 10) AS int) as event_sequence
INTO MultipleLogFileResultsParsed
FROM MultipleLogFileResults
ORDER BY Rowid

To analyze the XEvent data, I used Power View, which is my favorite tool for data analytics & reporting.

[Image: Power View report of file_write_completed events by file_id over time]

[Image: the same report filtered down to the transaction log file writes]

From the above screenshot we observe the following

  • When the transaction started, it started filling the Transaction Log 5 (in this case file id 6)
  • Next it sequentially started filling Transaction Log 1 file id 2
  • Next it sequentially started filling Transaction Log 2 file id 3
  • Next it sequentially started filling Transaction Log 3 file id 4
  • Lastly we see writes on Data File id 1 since we fired CHECKPOINT at the end of transaction to flush & write all dirty buffers to Data Files
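
These observations can also be reproduced without Power View by aggregating the parsed XEvent table directly. Below is a minimal sketch (the server & database names are placeholders, & Invoke-Sqlcmd from the sqlps module is assumed):

# Sketch: summarize the file_write_completed events captured above, per file.
# Assumes the MultipleLogFileResultsParsed table created earlier; server &
# database names are placeholders - adjust them for your environment.
Import-Module sqlps -DisableNameChecking

$query = @"
SELECT file_id,
       COUNT(*)         AS write_events,
       MIN([timestamp]) AS first_write,
       MAX([timestamp]) AS last_write
FROM   dbo.MultipleLogFileResultsParsed
WHERE  event_name = 'file_write_completed'
GROUP BY file_id
ORDER BY MIN([timestamp]);
"@

Invoke-Sqlcmd -ServerInstance 'MYSQLSERVER' -Database 'MultipleLogFiles' -Query $query |
    Format-Table -AutoSize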

At this point, if we fire DBCC LOGINFO or look at the size of each Tlog, it might appear puzzling why SQL Server skipped filling transaction log 4 (file id 5) & filled transaction log 5 (file id 6) instead. In fact, after seeing the above data you would understand that when the transaction started, the current active VLF was somewhere in the middle of transaction log 5 (file id 6), from where it started filling the log & sequentially moved to the 1st Tlog file (file id 2), the 2nd Tlog file (file id 3) & then the 3rd Tlog file (file id 4).

The following is the snapshot of DBCC LOGINFO at the end of transaction.

[Image: DBCC LOGINFO output at the end of the transaction]

Hence, to answer my friend's question:

Yes, SQL Server writes sequentially into the VLFs of a single file, then moves to the next file in a round-robin manner & loops back to the first file after the Tlog is truncated.

So why did we observe that both Tlog files were 50% used rather than a single transaction log file being 100% full?

This is because, in his case, when the transaction started the current active VLF was in the middle of Tlog file id 1, & hence when the log flush reached the end of Tlog 1, it sequentially moved to the following VLFs in the second file & ended up filling 50% of the first file & 50% of the second file.

As observed in the output of DBCC LOGINFO earlier, when we have multiple log files the VLFs are ordered sequentially from the 1st log file to the last one. Hence, if the current active VLF (status 2) is somewhere in the middle of the Tlog, after reaching the end of the 1st file SQL Server moves to the next sequential VLF (FSeqNo in DBCC LOGINFO), which is at the start of the second log file & not at the start of the 1st log file, which is why both log files end up partially filled.
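
To see this VLF ordering for yourself, DBCC LOGINFO can also be pulled into PowerShell. A minimal sketch (server & database names are placeholders; Invoke-Sqlcmd from the sqlps module is assumed):

# Sketch: list the VLF layout so the FileId / FSeqNo / Status sequence is visible.
# Server & database names are placeholders - adjust them for your environment.
Import-Module sqlps -DisableNameChecking

Invoke-Sqlcmd -ServerInstance 'MYSQLSERVER' -Database 'MultipleLogFiles' -Query 'DBCC LOGINFO;' |
    Select-Object FileId, FSeqNo, Status, Parity, StartOffset |
    Sort-Object FSeqNo |
    Format-Table -AutoSize
# The VLF with Status = 2 is the current active VLF; reading the rows in FSeqNo order
# shows how the log moves from one physical file to the next & wraps around.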

Further, after the above snapshot of DBCC LOGINFO, when we perform the next few transactions, the current active VLF is 92. After 92 is filled, SQL Server moves to the next VLF & renumbers its FSeqNo from 53 to 93, its status changes to 2 & its parity changes to 128.

[Image: DBCC LOGINFO output after a few more transactions]

Hope this explains the sequential writing of Transaction Log

Parikshit Savjani
Premier Field Engineer

SQL Server AlwaysON Group Failover in Cluster causes Node to lose network connectivity

Last week, I worked on an issue which might sound weird or strange when you hear it for the first time.

“Whenever we fail SQL Server AlwaysON Group from Cluster, Node A loses its network connectivity.” (Sounds weird, Right ???)

Here is the email which I received from my customer

We’re able to re-create the issue on demand now, as every time we try to fail the SQL cluster group over from A to B, as soon as the SQL network name resource goes offline on the A server, the default gateway on the public NIC gets deleted and the box drops off the network. This only happens when failing over from A to B (B to A does not cause any issues), and also only explicitly occurs when the SQL Network resource is taken offline on the A server when being failed over to B (the core cluster name resource and group can fail over between nodes without issue).

The first obvious thing which comes to mind is that it can't be the SQL Server application causing this, since it sits on top of the Windows platform & the failover mechanism doesn't have anything to do with the default gateway setting on the NICs. But as Sherlock Holmes says,

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”

After spending a few days on troubleshooting & research, we found that the root cause of the issue was incorrect IP routing on the cluster nodes.

More Specifically,

The issue was caused by the SQL IP resource being registered in a persistent route on the public NIC of Node A. This incorrect persistent route, which is not bound to the interface, causes the active route & persistent route to go down when the SQL IP resource goes offline or is failed over.

More details on this issue can be found in the following articles:

http://blogs.technet.com/b/networking/archive/2009/05/21/active-route-gets-removed-on-windows-server-2008-offline-cluster-ip-address.aspx

http://support.microsoft.com/kb/2161341

To resolve the issue, the Windows Networking team was involved to rectify the persistent route setting & IP routing table.
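
If you want to check whether a node is in the same situation, the persistent routes can be reviewed on each cluster node before involving the networking team. A rough sketch (the route values shown are placeholders, & Get-NetRoute requires Windows Server 2012 or later):

# Sketch: review the persistent (static) routes configured on this node.
# A persistent route whose gateway is the SQL IP resource, rather than the public
# NIC's real default gateway, is a candidate for the behavior described above.
route print        # the "Persistent Routes" section at the bottom lists them

Get-NetRoute -PolicyStore PersistentStore |
    Select-Object DestinationPrefix, NextHop, InterfaceAlias

# Removing an incorrect persistent route (placeholder values shown - coordinate
# with your networking team before changing routing tables):
# route delete 0.0.0.0 mask 0.0.0.0 10.10.10.25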

Hope this helps anyone who hears of or faces this strange issue with a SQL AlwaysOn cluster on Windows Server 2008 or above.

Parikshit Savjani
Premier Field Engineer

Consideration for changing the default port of default SQL instance on Multi-subnet Failover Cluster

Security hardening or security audits in an enterprise are some of the motivations for DBAs to run a SQL Server default instance on a non-default port. For standalone instances, there are no challenges in changing the default instance port to a non-default port; however, for a clustered instance there are some considerations, since the SQL Server cluster resource DLL performs its IsAlive check by logging into the SQL Server instance.

First, to clear up a common confusion or myth: when a client connects to a default instance of SQL Server, whether it is listening on the default port (1433) or a non-default port, the SQL Browser service is not involved & has no role to play. This is the same reason you observe that the SQL Browser service is disabled when you install only a default instance of SQL Server on a given server. The SQL Browser service has a role to play only for named instances, not when a default instance is listening on a non-default port.
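
As a side note, it is easy to confirm which port a default instance is actually listening on, either from the registry or from the instance itself. A small sketch (the instance id MSSQL11.MSSQLSERVER, server name & port below are assumptions for illustration; Invoke-Sqlcmd comes from the sqlps module):

# Sketch: confirm the TCP port configured for the (default) instance.
# The instance id, server name & port are placeholders - adjust them for your build.
Import-Module sqlps -DisableNameChecking

$tcpKey = 'HKLM:\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQLServer\SuperSocketNetLib\Tcp\IPAll'
Get-ItemProperty -Path $tcpKey | Select-Object TcpPort, TcpDynamicPorts

# Or ask a live connection which local port it arrived on. Note the explicit
# "server,port" format (2000 is a placeholder): without SQL Browser, a default
# instance on a non-default port must be reached via server,port or an alias.
Invoke-Sqlcmd -ServerInstance 'SQLFCI,2000' -Query 'SELECT local_tcp_port FROM sys.dm_exec_connections WHERE session_id = @@SPID;'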

For a clustered default instance of SQL Server, when we change the port of the SQL instance to a non-default port, the SQL Server resource might fail to come online & you might see the following error messages in the cluster log.

<< The following excerpt is from a clustered default instance of SQL 2012 running on a non-default port >>
00002840.00003ef0::2015/01/16-23:25:19.637 INFO [RES] SQL Server : [sqsrvres] Server name is SQLFCI
00002840.00003ef0::2015/01/16-23:25:19.637 INFO [RES] SQL Server : [sqsrvres] Service name is MSSQLSERVER
00002840.00003ef0::2015/01/16-23:25:19.637 INFO [RES] SQL Server : [sqsrvres] Dependency expression for resource ‘SQL Network Name (SQLFCI)’ is ‘([5c22a982-9400-416b-b5bd-95da07998601] or [50660f59-4f33-4cd1-9dc0-84f6b32ebb76])’
00002840.00003ef0::2015/01/16-23:25:19.637 INFO [RES] SQL Server : [sqsrvres] This instance is running with IP OR dependency. Resource DLL Will enable MultiSubnetFailover=yes in connection string
00002840.00003ef0::2015/01/16-23:25:19.637 WARN [RES] SQL Server : [sqsrvres] This is default instance. If non-default TCP/IP listening port number is specified, and SQL Server fails to come online, configure an alias on each possible owner node and try again. If it does not work, please contact customer support
00002840.00003ef0::2015/01/16-23:25:19.637 INFO [RES] SQL Server : [sqsrvres] Starting service MSSQLSERVER…
0000125c.00004990::2015/01/16-23:25:19.886 INFO [NM] Received request from client address SQLFCINodeA.
00002840.00003ef0::2015/01/16-23:25:20.842 INFO [RES] SQL Server : [sqsrvres] Service status checkpoint was changed from 0 to 1 (wait hint 20000). Pid is 19508
0000125c.00003fe8::2015/01/16-23:25:23.639 INFO [NM] Received request from client address SQLFCINodeA.
00002840.00003ef0::2015/01/16-23:25:23.842 INFO [RES] SQL Server : [sqsrvres] Service is started. SQL Server pid is 19508
00002840.00003ef0::2015/01/16-23:25:23.842 INFO [RES] SQL Server : [sqsrvres] Connect to SQL Server …
00002840.00003ef0::2015/01/16-23:25:54.855 ERR [RES] SQL Server : [sqsrvres] ODBC Error: [08001] [Microsoft][SQL Server Native Client 11.0]TCP Provider: The remote computer refused the network connection.
00002840.00003ef0::2015/01/16-23:25:54.855 ERR [RES] SQL Server : [sqsrvres] ODBC Error: [HYT00] [Microsoft][SQL Server Native Client 11.0]Login timeout expired (0)
00002840.00003ef0::2015/01/16-23:25:54.855 ERR [RES] SQL Server : [sqsrvres] ODBC Error: [08001] [Microsoft][SQL Server Native Client 11.0]A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online. (1225)
00002840.00003ef0::2015/01/16-23:25:54.855 INFO [RES] SQL Server : [sqsrvres] Could not connect to SQL Server (rc -1)
00002840.00003ef0::2015/01/16-23:25:54.855 INFO [RES] SQL Server : [sqsrvres] SQLDisconnect returns following information
00002840.00003ef0::2015/01/16-23:25:54.855 ERR [RES] SQL Server : [sqsrvres] ODBC Error: [08003] [Microsoft][ODBC Driver Manager] Connection not open (0)
00002840.00003ef0::2015/01/16-23:25:59.857 INFO [RES] SQL Server : [sqsrvres] Connect to SQL Server …
00002840.00003ef0::2015/01/16-23:26:30.860 ERR [RES] SQL Server : [sqsrvres] ODBC Error: [08001] [Microsoft][SQL Server Native Client 11.0]TCP Provider: The remote computer refused the network connection.

With the enhanced verbose logging in SQL 2012, as seen above, we now get a message saying that if you are running a default instance on a non-default TCP/IP port, you should configure an alias (TCP or NP) on each possible owner node to bring the SQL Server resource online.

00002840.00003ef0::2015/01/16-23:25:19.637 WARN  [RES] SQL Server <SQL Server>: [sqsrvres] This is default instance. If non-default TCP/IP listening port number is specified, and SQL Server fails to come online, configure an alias on each possible owner node and try again. If it does not work, please contact customer support

In the case of a multi-subnet failover cluster, we observed that creating the alias brought the SQL Server resource online; however, the SQL Agent resource failed to come online. When we looked at the SQLAGENT.OUT log, we saw the following error messages:


2015-01-16 17:36:54 – ! [150] SQL Server does not accept the connection (error: 1225). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2015-01-16 17:36:54 – ! [000] Unable to connect to server ‘(local)’; SQLServerAgent cannot start
2015-01-16 17:36:59 – ! [298] SQLServer Error: 1225, TCP Provider: The remote computer refused the network connection. [SQLSTATE 08001]
2015-01-16 17:36:59 – ! [165] ODBC Error: 0, Login timeout expired [SQLSTATE HYT00]
2015-01-16 17:36:59 – ! [298] SQLServer Error: 1225, A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online. [SQLSTATE 08001]
2015-01-16 17:36:59 – ! [382] Logon to server ‘(local)’ failed (DisableAgentXPs)
2015-01-16 17:36:59 – ? [098] SQLServerAgent terminated (normally)

This issue was specific to multi-subnet failover clustering, since we observed that if we disabled the cluster checkpoints & modified the following registry value from 1 to 0, the SQL Agent resource would come online.

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL11.MSSQLSERVER\Cluster\RunWithIPORDependency

However, it would remain online only until the SQL Server resource was restarted. When the SQL Server resource comes online again, SQL Server detects it as a multi-subnet cluster & flips the bit back from 0 to 1 (note: this is not caused by cluster checkpointing).

This prevents the SQL Agent resource from coming online. To work around this issue, we can set the ServerHost registry value for SQL Agent to point to the alias created, which ensures SQL Agent can connect to SQL Server & the resource comes online.

  1. Remove the cluster checkpoint:

Cluster res "SQL Network Name(SQLFCI)" /removecheckpoint:"Software\Microsoft\Microsoft SQL Server\MSSQL11.MSSQLSERVER\SQLServerAgent"

  2. Set the ServerHost registry key to the alias name (a sketch is shown after this list).

  3. Add the cluster checkpoint back:

Cluster res "SQL Network Name(SQLFCI)" /addcheckpoint:"Software\Microsoft\Microsoft SQL Server\MSSQL11.MSSQLSERVER\SQLServerAgent"
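
For step 2, here is a rough PowerShell sketch of what that looks like on a node. The alias name, port number & instance id are illustrative assumptions; the alias is the same one you would otherwise create with SQL Server Configuration Manager or cliconfg.exe:

# Sketch: create a TCP client alias that resolves to the non-default port & point
# SQL Agent's ServerHost value at it. The alias name, port & instance id below are
# placeholders - adjust them & repeat on every possible owner node.
$aliasName = 'SQLFCIALIAS'
$aliasData = 'DBMSSOCN,SQLFCI,2000'   # DBMSSOCN = TCP/IP, then the network name, then the port

# 64-bit client alias (repeat under ...\Wow6432Node\... for 32-bit clients)
$connectTo = 'HKLM:\SOFTWARE\Microsoft\MSSQLServer\Client\ConnectTo'
if (-not (Test-Path $connectTo)) { New-Item -Path $connectTo -Force | Out-Null }
New-ItemProperty -Path $connectTo -Name $aliasName -Value $aliasData -PropertyType String -Force | Out-Null

# Point SQL Agent at the alias (do this while the checkpoint from step 1 is removed)
$agentKey = 'HKLM:\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL11.MSSQLSERVER\SQLServerAgent'
Set-ItemProperty -Path $agentKey -Name 'ServerHost' -Value $aliasName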

After pointing the ServerHost key to the alias, which resolves to the non-default port, SQL Agent is able to connect to SQL Server & the resource comes online.

I have filed a Connect bug for this issue; feel free to vote for it to bump up the priority:

https://connect.microsoft.com/SQLServer/feedback/details/1090514/sql-agent-fails-to-come-online-in-multi-subnet-clustered-default-instance-of-sql-server-running-on-non-default-port

Parikshit Savjani
Premier Field Engineer

What is your role in Cloud?

As I interact with more & more customers who are onboarding onto Azure, the Microsoft cloud solution, some of the common myths or assumptions which I hear or encounter are:

“We just need to deploy the application or database in the cloud & Microsoft will take care of the OS, storage, patching, etc.”

“After applications move to the cloud, we no longer need an infrastructure team or DBAs to manage the platform.”

The above statements may hold true to some extent when organizations adopt the cloud in Software as a Service (SaaS) or Platform as a Service (PaaS) scenarios. However, in Infrastructure as a Service (IaaS) & hybrid cloud scenarios the above statements are not completely true, & in this blog post I would like to (re)define the roles & responsibilities in an enterprise as it starts its journey towards the cloud.

In my opinion, most enterprise organizations will not be able to completely migrate to public cloud infrastructure, for multiple reasons, so hybrid scenarios will be the most common form of cloud adoption, which is where Microsoft has a clear edge over its competition.

These roles & responsibilities are my personal imagination of future roles, based on my experience in the field.

But before I start, a disclaimer

“This is a personal weblog. The opinions expressed here represent my own and not those of my employer. In addition, my thoughts and opinions change from time to time…I consider this a necessary consequence of having an open mind. This weblog is intended to provide a semi-permanent point in time snapshot and manifestation of the various memes running around my brain, and as such any thoughts and opinions expressed within out-of-date posts may not be the same, nor even similar, to those I may hold today.”

System Administrators

In cloud scenarios, the system administrator would be responsible for the following:

  • Building VM images which adhere to company standards. This is a one-time effort which can then be leveraged by the DBAs or other application teams to spin up their servers within a few minutes.
  • Updating the VM images with the latest patches & anti-virus updates.
  • Using or configuring agents (runbooks, Chef & Puppet add-ins) to update & patch the server VMs already in use in the organization's environment.
  • Monitoring system performance & resource utilization.

Network Administrators

In hybrid cloud scenarios, this will be the most important role, since an enterprise cloud should be designed as an extension of the on-premises datacenters, which is made possible by the network administrators.

  • Configure site-to-site VPN to ensure seamless connectivity from on-premise to cloud data centers.
  • Configure IP Sec, Network Security & Firewalls for the cloud data centers.
  • Configure DNS Servers & assign IP address to the VMs running on cloud.
  • Monitor Network Performance & optimize performance over Internet by using network compression techniques or other options to get the max out of the bandwidth.

Domain Administrators

In hybrid cloud scenarios, the servers need to be added to a domain, which allows users, groups & service accounts to secure & access resources based on the policies governed by the domain administrators.

  • Configure Azure Directory Sync to extend the on-premise AD & Exchange to Azure
  • Ensure the Group Policies are applied to the VM Images
  • Ensure the same organizational account is integrated & used by users to log in to VMs, Office 365 or databases in the cloud.

 Storage Administrators

In cloud scenarios, this role can be thinned out a bit & merged with the system administrator role; nevertheless, it can have the following responsibilities:

  • Configure Storage Pools to combine multiple Data Disks to obtain higher throughput & IOPS.
  • Configure Azure Storage for backups & BLOB storage.
  • Manage & govern storage to decide which storage should be used to achieve the desired throughput within a given cost budget
  • Monitor Storage performance

 Security Administrators

In cloud scenarios, data security is a big concern, which is why this role will be important; it will have the following responsibilities:

  • Based on the sensitivity of the data, define data classification layers to govern & decide which data can move to the public cloud & which should remain on-premises.
  • Harden the network security & firewalls.
  • Govern & define policies on which data can be geographically replicated & which cannot, for political reasons.

 Database Administrators (My Favorite)

In IaaS cloud scenarios, the DBA role is almost unchanged & has similar responsibilities as on-premises:

  • Setup SQL Instance, configure tempdb, create logins, install database, configure user roles.
  • Configure Server Properties .
  • Backups & restore of databases.
  • Monitor SQL Performance & identify blocking or bad performing queries.

In PaaS Cloud Scenarios

  • Configure database, create users, roles & assign permissions.
  • Backup & Restore Databases.
  • Manage SQL Performance & identify blocking or bad performing queries.

One of the skillsets which all these roles will need to possess is PowerShell, along with leveraging Azure Automation to create orchestration runbooks, which will empower each role to automate some of the repetitive jobs involved in managing a large-scale enterprise cloud environment.

In some organizations the above-defined roles might overlap or merge across different teams, while in large, complex enterprise organizations they might be isolated & very well defined.

Hope this helps !!!

Parikshit Savjani
Premier Field Engineer

Cloud Revolution & our adoption

This is my first attempt at blogging on a non-technical topic (non-technical but still related to technology), since I never felt it was my cup of tea. However, the recent changes & developments which I am witnessing around me have provoked a lot of new thoughts in me which I wanted to pen down for my own reference & for the benefit of others who think alike.

But before I start, a disclaimer

“This is a personal weblog. The opinions expressed here represent my own and not those of my employer. In addition, my thoughts and opinions change from time to time…I consider this a necessary consequence of having an open mind. This weblog is intended to provide a semi-permanent point in time snapshot and manifestation of the various memes running around my brain, and as such any thoughts and opinions expressed within out-of-date posts may not be the same, nor even similar, to those I may hold today.”

For the past couple of years or even more, we have been hearing a lot of buzz about the cloud, & there is cloud all around us. The more important fact to realize is that the cloud is no longer just a buzzword but a reality, & if we as IT pros ignore this fact we are going to miss the boat. Huge tech companies have made large investments to make it a reality, & businesses have already started to realize the cost savings & benefits of the cloud.

While there is enough proof & content out there showing how businesses can benefit from the cloud, I like the hotel analogy, which I can relate to easily. Being consultants or tourists, we like to travel to places, but do we have to buy real estate at every place we visit in order to stay there? Thanks to hotels, we just book a room, check in at the desired time, check out when we leave & pay for the days we have spent there. Just imagine how limited our travel or our reach would have been if hotels didn't exist & we were supposed to buy a room at every place we visited.

Similarly, businesses can now think of growing their reach by adopting the cloud & not worrying about the cost of data centers, real estate, servers, RAM, CPUs, storage, etc. Businesses can now afford to experiment or innovate even more with IT, since the cost of spinning up VMs in the cloud is negligible, & if they fail they aren't going to lose much.

The strongest argument against the cloud is data security. However, I assume there was a similar argument when the internet was evolving: was it safe to send data packets outside your organization? Yet today, can organizations grow or even survive without a presence on the internet? There are, & will be, processes & protocols developed for data security in the cloud.

But are IT pros ready for this cloud revolution? We are so much in love with our on-premises technologies; that is our comfort zone, which has improved & matured over a period of time to become more & more polished & easy to work with, while the road to the cloud is not so mature & is still evolving.

An IT pro like myself might be reluctant to move to the cloud since we are going out of our comfort zone: we have invested years of our time to learn & become experts in a product or technology, & now we need to move on to something else. Further, there are some gaps between on-premises technology, where features are available out of the box, & the cloud, where we are expected to code them ourselves.

The bad news is that, whether we like it or not, it is coming our way & at a rapid pace. For instance, features which were not available 6 months back are now available, & as I see it, the offerings in the cloud may mature at a more rapid pace than their respective on-premises offerings did, since the product teams know exactly which features consumers like the most.

As I see it, we are living in the times that are witnessing the cloud revolution, & we should take every opportunity to learn the cloud before it's too late, rather than ignoring it on the assumption that it is not mature yet. In the 2000s, people who could spell .NET were hired, & it was a skillset which lasted long & was worth the investment of time; it gave & is still giving people handsome returns. The cloud skillset is definitely one such investment which is going to give returns to IT pros for a very long time.

To conclude, I will leave you with the following thought

You can either witness this change or “Be the change”.

Hope this helps !!!

Parikshit Savjani
Premier Field Engineer

The mysterious case of data load in a partitioned table

I have chosen this title for my blog post since troubleshooting this issue was like solving a mystery. One of my customers recently faced slow performance while loading data into a partitioned table. Whenever there is a performance issue with an insert statement, the usual suspects which come to mind are blocking or disk IO, so we started investigating the waits for the query in sys.dm_exec_requests & sys.dm_os_waiting_tasks, but to our surprise the insert query was always in a running or runnable state, which didn't explain its slow performance.

Further, tracking the disk IO activity, we found that the actual insert operation completed within a few minutes; however, before the writes were performed there was increased read activity on the SQL instance & some kind of wait which could not be identified from the sys.dm_os_waiting_tasks & sys.dm_exec_requests DMVs, since the query appeared to be in a running or runnable state.

The table is a partitioned table around 1 TB in size, & during the data load there was internal memory pressure on the SQL Server, with PLE & Free Pages going down while Lazy writes/sec increased, which to some extent is expected during a data load, but we didn't have any performance baseline to compare against.

Normally, I wouldn't check the query plan for slow insert performance since, unlike a select query which can be optimized by indexing or a stats refresh, there is limited scope for tuning inserts, updates or deletes. However, I captured the query plan to understand the execution path chosen by the query, which helped me solve the mystery.

[Image: the bad execution plan showing a Merge Join & Index Scan]

In the execution plan, we see a sorted Merge Join operator & an Index Scan operation, which is used to perform unique key validation before performing the actual insert of the data. Performing an index scan on a 1 TB table is definitely not a good plan.

First question: why did the optimizer choose a Merge Join & Index Scan for unique key validation? The answer is the yellow exclamation warning shown on the Index Scan operator, which is a missing statistics indication.

When we look at more details on the Index Scan operator, we find missing statistics on PtnId, which is a hidden column added to partitioned indexes starting with SQL 2008. More details on PtnId can be found in Craig's blog below: http://blogs.msdn.com/b/craigfr/archive/2008/08/05/partitioned-indexes-in-sql-server-2008.aspx

[Image: Index Scan operator details showing the missing statistics on the hidden PtnId column]

The only supported way to rebuild the missing statistics for PtnId is to rebuild the partitioned index on the partitioned table. Although rebuilding the index on a large partitioned table is not an easy task & can run for multiple days, it was worth the time & resources spent.
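
For reference, the rebuild itself is a single ALTER INDEX statement; below is a minimal sketch of driving it from PowerShell. The table, database & instance names are placeholders, Invoke-Sqlcmd from the sqlps module is assumed, & options such as SORT_IN_TEMPDB, ONLINE or MAXDOP should be chosen to suit your environment:

# Sketch: rebuild the partitioned table's indexes so the hidden PtnId statistics
# are recreated. Table, database & instance names are placeholders; a rebuild of
# this size can run for many hours or days, so raise the query timeout accordingly.
Import-Module sqlps -DisableNameChecking

$rebuild = @"
ALTER INDEX ALL ON dbo.BigPartitionedTable
REBUILD WITH (SORT_IN_TEMPDB = ON, MAXDOP = 8);
"@

Invoke-Sqlcmd -ServerInstance 'MYSQLSERVER' -Database 'MyDatabase' -Query $rebuild -QueryTimeout 65535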

After rebuilding the indexes on the partitioned table, the performance of the data load improved as expected & we saw the following execution plan:

[Image: the good execution plan showing a Nested Loop Join & Index Seek]

The good execution plan uses a Nested Loop Join, which is a non-blocking operator, & performs an Index Seek operation on the partitioned table, which reduces the memory footprint & disk IO & improves the performance of the insert statement drastically.

Hope you enjoyed the mysterious case of data loading in a partitioned table

Hope this helps !!!

Parikshit Savjani
Premier Field Engineer