FUN – Self-Assessment of Your Social Presence on Facebook Using Power BI

If you read the title of this post and wondered, "What on earth is that?", trust me, I have no idea how I came up with it either. It just felt like an appropriate title for analyzing your own Facebook data to understand how active you are year over year or month over month, who your good friends are, who likes your posts the most, and to rediscover some long-lost friends you were once close to but have lost touch with.

Assumption: The modern definition of a good friend is one who likes all of your posts and activity on Facebook ;)

Analyzing your Facebook activity might feel like a completely useless task until you discover insights about your life, or changes in your persona, reflected in that activity. Data analysis is like a treasure hunt: all your effort seems wasted until you hit the jackpot. In this blog I share some of the insights I discovered after analyzing my Facebook data with Power BI and relating them to changes in my life.

Warning: The following data analysis might be a complete waste of your time and highly non-productive. Any accidental learning from this post might be completely coincidental & at your own risk.

Here is the Power BI dashboard of my Facebook activity:
Fbdashboard

Let's dig into some of the insights I discovered from the dashboard:

  • I have published 203 posts from 2009 till date, with around 2,835 likes, which averages to about 14 likes per post. Not bad at all: at any given point in time I have at least 14 friends who like my activity on Facebook.
  • I got the most likes (210 and 135) when my books were published, and my friends clearly appreciated me getting published, which means I should keep doing it :)
  • I was most active on Facebook in 2012, but my activity dropped significantly in 2013, the year I got married (no further insight is warranted here, as it is self-explanatory).
  • The word cloud, a new custom visual available in the Power BI gallery, highlights the good friends who have liked my posts and activity on Facebook. If I slice the data for the year 2013, the word cloud looks as shown below.
    The year 2013 is the only time my wife Charmi liked my Facebook posts the most; then we got married and the rest is history. So I now have data to share with my wife showing that post-marriage, husbands are no longer appreciated. On the other hand, can husbands afford not to like any of their wives' posts?

WordCloud

On the other hand, when I sliced the data for the year 2009, I found a few friends who used to like my posts a lot but with whom I have since lost touch, so I am planning to reconnect with them. Slicing the data for the year 2015, I found some new friends in there.

  • Another interesting insight comes from the type of my posts: until 2013 the majority were status updates, but of late I don't share many status updates; I occasionally just log in to Facebook and re-share videos or links I find interesting. I guess that has something to do with maturity: earlier I posted anything and everything I did as a status update, but slowly I have lost the taste for it.

Postsbytype

  • Finally, I discovered that I am most active on Facebook in December, which may be due to the holidays, and least active in June and November, the months when I like to take vacations, since June is our anniversary and November is my wife's birthday.

PostsbyMonth

If anyone is interested in reusing my Power BI dashboard, please ping me or message me and I will be happy to share the *.pbix file for your own self-assessment.

Hope this helps !!!

Parikshit Savjani
Sr. Premier Field Engineer

My book on SQL on Azure Succinctly released

I am extremely excited to share the release of my book SQL on Azure Succinctly with Syncfusion publications, which covers deployments of SQL Server on Azure.

You can download the latest version of the ebook from the following link
http://www.syncfusion.com/resources/techportal/details/ebooks/sqlazure

In recent years, we have witnessed how information technology has contributed to the growth and reach of businesses around the globe. Companies which are primarily technology companies, like Amazon, Facebook, and Uber, have built their businesses solely on the basis of information technology and have grown unimaginably. In today’s world we cannot imagine a business without a website, app, or web presence, but does having a website, app, or web presence guarantee a business will be successful? Until a business is successful, companies aren’t sure whether their investment in information technology and IT infrastructure is fruitful or not.

Setting up and maintaining a website, app, database, or data warehouse incurs significant capital and operational expenses (CAPEX and OPEX) for a company whose core business may not be IT. As a result, small and medium businesses around the world are looking to cloud solution providers to offer IT as a service, paying only for what they use with elastic scale-up and scale-down flexibility. Even large enterprises that can afford their own data centers and infrastructure are looking at cloud providers, since data is growing at a rapid pace and investing in new hardware and infrastructure for new projects, development, and testing may not make sense until a project becomes critical and revenue generating.

Cloud services allow big enterprises to innovate and research at an optimized cost without investing upfront on the infrastructure in their data centers. Further, when products are designed to run as a service, the product release cycle can be agile and fast. All of these benefits make cloud services an attractive offering and optimal alternative for businesses today.

In this book, I introduce you to Microsoft SQL Server (RDBMS) Offerings on Azure and talk about the use cases, considerations, configurations, and design optimizations for running SQL Server on Azure to give you predictable performance at optimized cost.

This book is primarily focused on instructing individuals with prior knowledge of SQL Server how to manage, optimize, and design applications running SQL Server on Azure.

Disabling a Non-Clustered Columnstore Index doesn't appear to execute in a Stored Procedure in SQL 2012

Consider the following scenario: you have a large fact table with a non-clustered columnstore index. In SQL 2012 the columnstore index is not updatable, so when you load or purge the table at night, the first step is to disable the non-clustered columnstore index, followed by the insert/update/delete against the table, and the final step is to rebuild the non-clustered columnstore index so the table is available for querying and reporting the following day with the desired performance.

The stored procedure used to load or purge the table would be of the form shown below:

CREATE PROCEDURE dbo.usp_Update_Fact_table
AS
BEGIN

ALTER INDEX [csindex_wk_fact] ON [dbo].[Wk_Fact] DISABLE; -- Disable the columnstore index before the update

UPDATE [dbo].[Wk_Fact] -- Update the fact table (the actual UPDATE statement is a placeholder here)

ALTER INDEX [csindex_wk_fact] ON [dbo].[Wk_Fact] REBUILD; -- Rebuild the columnstore index after the update

END

When you execute the above stored procedure, it fails with the following error:

UPDATE statement failed because data cannot be updated in a table with a columnstore index. Consider disabling the columnstore index before issuing the UPDATE statement, then rebuilding the columnstore index after UPDATE is complete.

From the above error message it might appear that the ALTER INDEX … DISABLE statement is skipped during execution of the stored procedure and the error is raised while executing the UPDATE query. However, the error does not occur during execution; it actually occurs during compilation of the stored procedure.

When the stored procedure is compiled, SQL Server detects that the fact table has a non-clustered columnstore index, which cannot be modified, and therefore raises the above error at compile time. This can be verified by requesting the estimated showplan of the stored procedure, which fails with the same error:

SET SHOWPLAN_XML ON
GO
EXEC dbo.usp_Update_Fact_table
GO

The query optimizer detects the non-clustered columnstore index during the optimization phase of plan generation; this is an optimization to avoid spending resources compiling, executing, and caching a query plan that would ultimately fail during the execution phase.

To overcome the above error, you have the following options:

  1. Run ALTER INDEX … DISABLE outside the stored procedure, as the first step of the batch load/purge job, prior to executing the insert/update/delete stored procedure (see the sketch after this list).
  2. If you need to disable the index as part of the same stored procedure, you can execute the DML against the fact table as dynamic SQL, which avoids the compile-time check, as shown below:

    CREATE PROCEDURE dbo.usp_Update_Fact_table
    AS
    BEGIN

    ALTER INDEX [csindex_wk_fact] ON [dbo].[Wk_Fact] DISABLE; -- Disable the columnstore index before the update

    EXEC sp_executesql N'UPDATE [dbo].[Wk_Fact]'; -- Update the fact table as dynamic SQL (placeholder statement)

    ALTER INDEX [csindex_wk_fact] ON [dbo].[Wk_Fact] REBUILD; -- Rebuild the columnstore index after the update

    END

  3. Upgrade to SQL 2014 or above, where updatable columnstore indexes are available and these compile-time checks do not occur.
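
For option 1, the batch job could be structured along the lines of the following sketch, which keeps the index maintenance outside the stored procedure (the index, table, and procedure names are reused from the example above, and the procedure is assumed to contain only the DML):

-- Step 1 of the batch load/purge job: disable the non-clustered columnstore index
ALTER INDEX [csindex_wk_fact] ON [dbo].[Wk_Fact] DISABLE;
GO
-- Step 2: run the load/purge stored procedure, which now contains only the insert/update/delete
EXEC dbo.usp_Update_Fact_table;
GO
-- Step 3: rebuild the columnstore index so the table is ready for querying and reporting
ALTER INDEX [csindex_wk_fact] ON [dbo].[Wk_Fact] REBUILD;
GO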

Hope this helps !!!

Parikshit Savjani
Senior Premier Field Engineer

Added Auto Indexing & Indexing Mode parameters in Add-DocumentCollection Cmdlet

A couple of weeks back I released PowerShell cmdlets for DocumentDB on GitHub (further information can be found in my blog here). Today I merged and added the AutoIndexing and IndexingMode parameters to the Add-DocumentCollection cmdlet in the module.

https://github.com/savjani/Azure-DocumentDB-Powershell-Cmdlets

Azure DocumentDB collections are auto-indexed by default, allowing developers to focus on application development and not worry about schema or index definitions for the database. Automatic indexing of documents is enabled by consistent, write-optimized, lock-free, and log-structured index maintenance techniques.

Anyone who has worked with databases knows that while indexes are useful to expedite querying, they add overhead to write, maintain, and store. Hence it is important to use indexing wisely and not index attributes that may never be used for querying or filtering.

Secondly, DocumentDB supports two indexing modes: Consistent (synchronous) and Lazy (asynchronous). If, like me, you have a database background, you will know that for continuous transactional OLTP workloads we prefer consistent indexing, where indexes are updated and maintained as data is inserted, updated, or deleted. For batch data-warehousing workloads, however, we prefer to drop the index, load the data, and recreate the index, which gives the best write performance. Similarly, in DocumentDB, if we intend to bulk-load documents, write performance is improved with the indexing mode set to Lazy rather than the synchronous Consistent mode.

At the time of writing this blog, the AutoIndexing and IndexingMode settings are only supported during the creation of a collection and cannot be altered afterward, hence it is important to plan and choose the right settings for the collection while creating it.

To support these settings when creating a collection using PowerShell, I have added the following optional parameters to the Add-DocumentCollection cmdlet.

AutoIndexing ($true or $false): Specifies whether automatic indexing should be turned on or off for the collection. It is an optional parameter; if omitted, automatic indexing is turned on by default.

IndexingMode (Consistent or Lazy): Specifies the indexing mode for the collection. It is an optional parameter; if omitted, it is set to Consistent by default.

A good reference for DocumentDB indexing policies is the following article from the product group:

https://azure.microsoft.com/en-us/documentation/articles/documentdb-indexing-policies/

Usage:

Creating a New Database & New Document Collection with default AutoIndexing & IndexingMode

Import-Module 'C:\Users\pariks\Documents\GitHub\Azure-DocumentDB-Powershell-Cmdlets\Azrdocdb\Azrdocdb1\bin\Debug\Azrdocdb.dll'
$ctx = New-Context -Uri <uri> -Key <key>
$db = Add-Database -Context $ctx -Name 'DocDBPS2'
$coll = Add-DocumentCollection -Context $ctx -DatabaseLink $db.SelfLink -Name 'DocCollPS'

Get-Database -Context $ctx -SelfLink $db.SelfLink | ft

Creating a New Database & New Document Collection & Setting Automatic Indexing Policy

Import-Module 'C:\Users\pariks\Documents\GitHub\Azure-DocumentDB-Powershell-Cmdlets\Azrdocdb\Azrdocdb1\bin\Debug\Azrdocdb.dll'
$ctx = New-Context -Uri <uri> -Key <key>
$db = Add-Database -Context $ctx -Name 'DocDB'
$coll = Add-DocumentCollection -Context $ctx -DatabaseLink $db.SelfLink -Name 'DocCollPS' -AutoIndexing $true
Get-Database -Context $ctx -SelfLink $db.SelfLink | ft

Creating a New Database & New Document Collection & Setting Automatic Indexing Policy & IndexingMode

Import-Module 'C:\Users\pariks\Documents\GitHub\Azure-DocumentDB-Powershell-Cmdlets\Azrdocdb\Azrdocdb1\bin\Debug\Azrdocdb.dll'
$ctx = New-Context -Uri <uri> -Key <key>
$db = Add-Database -Context $ctx -Name 'DocDB'
$coll = Add-DocumentCollection -Context $ctx -DatabaseLink $db.SelfLink -Name 'DocCollPS' -AutoIndexing $true -IndexingMode 'Lazy'
Get-Database -Context $ctx -SelfLink $db.SelfLink | ft

Hope this helps :)

Parikshit Savjani
Premier Field Engineer

Releasing Powershell Cmdlets for Azure DocumentDB in Github

Azure DocumentDB is Microsoft's budding NoSQL, schema-free database offering and a relatively recent addition to the mammoth Microsoft data platform stack.

Azure DocumentDB provides great flexibility to developers who feel restricted by the RDBMS data model and its adherence to data types, schemas, and index designs and patterns. It allows developers to store entire records as JSON documents, each uniquely identified by a key. All attributes within a document are indexed by default with range or hash indexes, expediting query processing against the documents and improving query performance for reporting and analytics. A developer just needs to serialize the data structures into JSON and store them in DocumentDB, while DocumentDB takes care of indexing and storing the attributes. DocumentDB supports querying the attributes using the familiar SQL language over the JSON structure, making it compatible with most reporting and analytics platforms.

While learning and evaluating Azure DocumentDB, I found that one missing component was out-of-the-box PowerShell cmdlets for managing and administering the database. This was an opportunity to kick-start a new community project on GitHub and get my hands dirty with a public GitHub repository, something I had been aspiring to do for quite some time.

With this excitement and modesty, I have created a GitHub project of PowerShell cmdlets for Azure DocumentDB to perform some basic management operations:

https://github.com/savjani/Azure-DocumentDB-Powershell-Cmdlets

I invite community members to extend this modest list of cmdlets into a full-blown DocumentDB management and administration module similar to SQLPS.

PowerShell Cmdlet Definitions & Usage

Creating a New Database

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
$db = Add-Database -Context $ctx -Name 'DocDBPS2'
Get-database -Context $ctx -SelfLink $db.SelfLink | ft

Creating a New Database & New Document Collection

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
$db = Add-Database -Context $ctx -Name 'DocDBPS2'
$coll = Add-DocumentCollection -Context $ctx -DatabaseLink $db.SelfLink -Name 'DocCollPS'
Get-database -Context $ctx -SelfLink $db.SelfLink | ft

List all the Databases in the Given Azure DocumentDB Database Account

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
Get-databases -Context $ctx | ft

List Database Properties for a given Database

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
$db = Get-databases -Context $ctx | Where-Object {$_.Id -eq 'DocDBPS'}
Get-database -Context $ctx -SelfLink $db.SelfLink | ft

Drop Database

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
$db = Get-databases -Context $ctx | Where-Object {$_.Id -eq 'DocDBPS'}
remove-database -Context $ctx -SelfLink $db.SelfLink

Get Database Account Consistency Level

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
Get-DatabaseAccountConsistencyLevel -Context $ctx

Set Database Account Consistency Level

Import-Module C:\Users\pariks\Documents\WindowsPowerShell\Modules\DocDB\Azrdocdb.dll
$ctx = New-Context -Uri <uri> -Key <key>
Set-DatabaseAccountConsistencyLevel -Context $ctx -DefaultConsistencyLevel Eventual
Get-DatabaseAccountConsistencyLevel -Context $ctx

Hope this helps :)

Parikshit Savjani
Premier Field Engineer

Ungrouping Column using DAX in PowerPivot or SSAS Tabular

To start this blog post, let me first explain the problem. Consider a scenario where you receive an Excel sheet or data in the following format:

Category     SubCategory
Bikes        Mountain Bikes
             Road Bikes
             Touring Bikes
Components   Handlebars
             Bottom Brackets
             Brakes
             Chains
             Cranksets
             Derailleurs
             Forks
             Headsets
             Mountain Frames
             Pedals
             Road Frames
             Saddles
             Touring Frames
             Wheels
Clothing     Bib-Shorts
             Caps
             Gloves
             Jerseys

In the above case, the data is already grouped by Category. If you load this data into the PowerPivot data model and create a Power View column chart report on Category, the view looks similar to the one shown below.

image

Most of the data falls under a blank Category, since the PowerPivot data model does not understand that the data was already grouped by Category.

To ungroup the data in the PowerPivot data model, I first add an Index column to the existing data, which can easily be added while loading the data with Power Query. The PowerPivot data model now looks as shown below.

image

Next, we add a calculated column, MaxIndex, to calculate and store the index value of the last non-blank Category value for each row. The DAX calculation is shown below:

=MAXX(FILTER(FILTER(Table1,LEN(Table1[Category])>0),Table1[Index]<=EARLIER(Table1[Index])),Table1[Index])

The output of the above calculation looks like the following:

image

The above table now resembles a parent-child hierarchy where MaxIndex is the parent ID. To ungroup the Category column, we can use the LOOKUPVALUE function as shown below:

=LOOKUPVALUE([Category],[Index],[MaxIndex])

The output of the above calculation is as shown below

image

We can now hide the Category, MaxIndex, and Index columns from client tools so they are invisible in Power View. If you now visualize the same data in a column chart in Power View, the result is what we were looking for.

image

Hope this helps !!!

Parikshit Savjani
Sr. Premier Field Engineer

How SQL Server Writes Transaction Log Sequentially

Last week, in one of the SQL community forums, a friend of mine asked the following question:

If I have 2 Tlog files for one database, how will these files fill up?

A) File 1 will be filled first and then File 2
B) Both the files will be filled together.

If B is true, then how is sequential writing maintained?

The question arose because he observed that with multiple log files, a single bulk insert operation left both log files 50% full rather than a single log file 100% full.

If you are just looking for the reason, jump to the end of the post; in this blog post I also want to show how we can study SQL Server internals with the help of Extended Events (XEvents). To monitor the file writes, we can use the sqlserver.file_write_completed event to understand how SQL Server performs sequential write operations across multiple log files.

For my testing, I reused the same example Jonathan Kehayias used in his blog where he explains the proportional fill algorithm.

Create a database with a single data file, five log files, and a single empty table into which we perform the bulk insert.

 
USE [master]
GO
/****** Object: Database [MultipleLogFiles] Script Date: 5/30/2015 4:54:10 PM ******/
CREATE DATABASE [MultipleLogFiles]
CONTAINMENT = NONE
ON PRIMARY
( NAME = N'MultipleLogFiles', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\MultipleLogFiles.mdf' , SIZE = 5316608KB , MAXSIZE = UNLIMITED, FILEGROWTH = 0)
LOG ON
( NAME = N'MultipleLogFiles_log_1', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\MultipleLogFiles_log_1.ldf' , SIZE = 1048576KB , MAXSIZE = 2048GB , FILEGROWTH = 0),
( NAME = N'MultipleLogFiles_log_2', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\MultipleLogFiles_log_2.ldf' , SIZE = 1048576KB , MAXSIZE = 2048GB , FILEGROWTH = 0),
( NAME = N'MultipleLogFiles_log_3', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\MultipleLogFiles_log_3.ldf' , SIZE = 1048576KB , MAXSIZE = 2048GB , FILEGROWTH = 0),
( NAME = N'MultipleLogFiles_log_4', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\MultipleLogFiles_log_4.ldf' , SIZE = 1048576KB , MAXSIZE = 2048GB , FILEGROWTH = 0),
( NAME = N'MultipleLogFiles_log_5', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\MultipleLogFiles_log_5.ldf' , SIZE = 1048576KB , MAXSIZE = 2048GB , FILEGROWTH = 0)
GO

IF OBJECT_ID('SalesOrderHeader') IS NOT NULL
DROP TABLE [dbo].[SalesOrderHeader]
GO

SET NOCOUNT ON
GO

CREATE TABLE [dbo].[SalesOrderHeader](
[SalesOrderID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[RevisionNumber] [tinyint] NOT NULL,
[OrderDate] Date NOT NULL,
[DueDate] Date NOT NULL,
[ShipDate] Date NULL,
[Status] [tinyint] NOT NULL,
[OnlineOrderFlag] [bit] NOT NULL,
[SalesOrderNumber] [nvarchar](25) NOT NULL,
[PurchaseOrderNumber] [nvarchar](25) NULL,
[AccountNumber] [nvarchar](15) NULL,
[CustomerID] [int] NOT NULL,
[SalesPersonID] [int] NULL,
[TerritoryID] [int] NULL,
[BillToAddressID] [int] NOT NULL,
[ShipToAddressID] [int] NOT NULL,
[ShipMethodID] [int] NOT NULL,
[CreditCardID] [int] NULL,
[CreditCardApprovalCode] [varchar](15) NULL,
[CurrencyRateID] [int] NULL,
[SubTotal] [money] NOT NULL,
[TaxAmt] [money] NOT NULL,
[Freight] [money] NOT NULL,
[TotalDue] [money] NOT NULL,
[Comment] [nvarchar](128) NULL,
[rowguid] [uniqueidentifier] NOT NULL,
[ModifiedDate] Date NOT NULL
)
GO

Create an XEvent session to capture the checkpoint_begin, checkpoint_end, file_written, file_write_completed, and databases_log_flush events during the bulk insert operation.

IF EXISTS(SELECT * FROM sys.server_event_sessions WHERE name='MultipleLogFiles')
DROP EVENT SESSION [MultipleLogFiles] ON SERVER;

DECLARE @sqlcmd nvarchar(4000) = '
CREATE EVENT SESSION MultipleLogFiles
ON SERVER
ADD EVENT sqlserver.checkpoint_begin
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.checkpoint_end
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.file_written
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.file_write_completed
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD EVENT sqlserver.databases_log_flush
( WHERE (sqlserver.database_id = '+ cast(DB_ID() as varchar(3))+')),
ADD TARGET package0.asynchronous_file_target(
SET filename=''C:\XEvents\MultipleLogFiles.xel'',
metadatafile=''C:\XEvents\MultipleLogFiles.xem'')
WITH (MAX_MEMORY = 8MB, EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS, TRACK_CAUSALITY = ON, MAX_DISPATCH_LATENCY = 5 SECONDS)'

EXEC (@sqlcmd)

Start the XEvent session, perform the bulk insert, issue a checkpoint to flush the dirty buffers to the data files, and finally stop the XEvent session.

ALTER EVENT SESSION MultipleLogFiles
ON SERVER
STATE=START
GO
-- Load data into the test database
INSERT INTO dbo.SalesOrderHeader
SELECT RevisionNumber,
DATEADD(DD, 1126+number, OrderDate),
DATEADD(DD, 1126+number, DueDate),
DATEADD(DD, 1126+number, ShipDate),
soh.Status, OnlineOrderFlag, SalesOrderNumber,
PurchaseOrderNumber, AccountNumber,
CustomerID, SalesPersonID, TerritoryID,
BillToAddressID, ShipToAddressID,
ShipMethodID, CreditCardID, CreditCardApprovalCode,
CurrencyRateID, SubTotal, TaxAmt, Freight,
TotalDue, Comment, rowguid,
DATEADD(DD, 1126+number, ModifiedDate)
FROM AdventureWorks2012.Sales.SalesOrderHeader AS soh
CROSS JOIN master.dbo.spt_values AS sv
WHERE sv.type = N'P'
AND sv.number > 0 AND sv.number < 6
GO 50
-- Flush all dirty pages to disk
CHECKPOINT
GO
-- Stop the Event Session
ALTER EVENT SESSION MultipleLogFiles
ON SERVER
STATE=STOP
GO

Load the XEvent data from the .xel file into a table, then extract the XML event data into another table called MultipleLogFileResultsParsed for reporting.

IF OBJECT_ID('MultipleLogFileResults') IS NOT NULL
DROP TABLE MultipleLogFileResults
GO
IF OBJECT_ID('MultipleLogFileResultsParsed') IS NOT NULL
DROP TABLE MultipleLogFileResultsParsed
GO
-- Create results table to load data from XE files
CREATE TABLE MultipleLogFileResults
(RowID int identity primary key, event_data XML)
GO
-- Load the event data from the file target
INSERT INTO MultipleLogFileResults(event_data)
SELECT
CAST(event_data AS XML) AS event_data
FROM sys.fn_xe_file_target_read_file('C:\XEvents\MultipleLogFiles*.xel',
'C:\XEvents\MultipleLogFiles*.xem',
null, null)
GO
-- Parse the event data
SELECT
RowID,
event_data.value('(event/@name)[1]', 'varchar(50)') AS event_name,
DATEADD(hh,
DATEDIFF(hh, GETUTCDATE(), CURRENT_TIMESTAMP),
event_data.value('(event/@timestamp)[1]', 'datetime2')) AS [timestamp],
COALESCE(event_data.value('(event/data[@name="database_id"]/value)[1]', 'int'),
event_data.value('(event/action[@name="database_id"]/value)[1]', 'int')) AS database_id,
event_data.value('(event/data[@name="mode"]/text)[1]', 'nvarchar(4000)') AS [mode],
event_data.value('(event/data[@name="file_handle"]/value)[1]', 'nvarchar(4000)') AS [file_handle],
event_data.value('(event/data[@name="offset"]/value)[1]', 'bigint') AS [offset],
event_data.value('(event/data[@name="page_id"]/value)[1]', 'int') AS [page_id],
event_data.value('(event/data[@name="file_id"]/value)[1]', 'int') AS [file_id],
event_data.value('(event/data[@name="file_group_id"]/value)[1]', 'int') AS [file_group_id],
event_data.value('(event/data[@name="wait_type"]/text)[1]', 'nvarchar(100)') AS [wait_type],
event_data.value('(event/data[@name="duration"]/value)[1]', 'bigint') AS [duration],
event_data.value('(event/action[@name="sql_text"]/value)[1]', 'nvarchar(4000)') AS [sql_text],
event_data.value('(event/data[@name="cpu"]/value)[1]', 'int') AS [cpu],
event_data.value('(event/data[@name="reads"]/value)[1]', 'bigint') AS [reads],
event_data.value('(event/data[@name="writes"]/value)[1]', 'bigint') AS [writes],
CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 1, 36) AS uniqueidentifier) as activity_id,
CAST(SUBSTRING(event_data.value('(event/action[@name="attach_activity_id"]/value)[1]', 'varchar(50)'), 38, 10) AS int) as event_sequence
INTO MultipleLogFileResultsParsed
FROM MultipleLogFileResults
ORDER BY Rowid

To analyze the XEvent data, I used Power View, which is my favorite tool for data analytics and reporting.
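
If you do not have Power View handy, a rough equivalent of the breakdown shown in the screenshots below can be produced in T-SQL against the parsed table (a sketch only; the column names come from the parsing query above):

-- Summarize write events per file, in the order in which each file first received writes
SELECT  event_name,
        [file_id],
        COUNT(*)            AS write_events,
        MIN(event_sequence) AS first_event_sequence,
        MAX(event_sequence) AS last_event_sequence
FROM    MultipleLogFileResultsParsed
WHERE   event_name IN ('file_write_completed', 'file_written')
GROUP BY event_name, [file_id]
ORDER BY event_name, MIN(event_sequence);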

LogWrites

LogWritesfiltered

From the above screenshots we observe the following:

  • When the transaction started, it began filling transaction log 5 (in this case file id 6)
  • Next it sequentially moved on to filling transaction log 1 (file id 2)
  • Next it sequentially moved on to filling transaction log 2 (file id 3)
  • Next it sequentially moved on to filling transaction log 3 (file id 4)
  • Lastly we see writes on data file id 1, since we issued CHECKPOINT at the end of the transaction to flush and write all dirty buffers to the data files

At this point, if we run DBCC LOGINFO or look at the size of each transaction log, it might seem puzzling why SQL Server skipped transaction log 4 (file id 5) and filled transaction log 5 (file id 6) instead. In fact, after seeing the above data you can understand that when the transaction started, the current active VLF was somewhere in the middle of transaction log 5 (file id 6); from there SQL Server started filling the log and sequentially moved on to the 1st log file (file id 2), the 2nd log file (file id 3), and then the 3rd log file (file id 4).
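
For reference, the snapshot below was captured with DBCC LOGINFO (undocumented but widely used) in the context of the test database:

USE [MultipleLogFiles];
GO
-- Returns one row per VLF; FileId, FSeqNo, Status (2 = active) and Parity are the columns discussed here
DBCC LOGINFO;
GO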

The following is the snapshot of DBCC LOGINFO at the end of the transaction.

LogInfo

Hence, to answer my friend's question:

Yes, SQL Server writes sequentially into the VLFs of a single file, then moves to the next file in a round-robin manner, and loops back to the first file after the transaction log is truncated.

So why did he observe that both transaction log files were 50% used rather than a single log file being 100% full?

This is because in his case, when the transaction started, the current active VLF was in the middle of log file 1; when the log writes reached the end of file 1, they sequentially moved on to the following VLFs in the second file, filling 50% of the first file and 50% of the second.

As observed in the DBCC LOGINFO output earlier, when we have multiple log files the VLFs are ordered sequentially from the first log file to the last. So if the current active VLF (status 2) is somewhere in the middle of the log, after reaching the end of the first file SQL Server moves to the next sequential VLF (FSeqNo in DBCC LOGINFO), which is at the start of the second log file and not at the start of the first. This is why both log files end up partially filled.

Further, after the above snapshot of DBCC LOGINFO, when we perform the next few transactions the current active VLF is FSeqNo 92; after 92 is filled, SQL Server moves to the next VLF and changes its FSeqNo from 53 to 93, its status to 2, and its parity to 128.

Loginfo2

Hope this explains the sequential writing of the transaction log.

Parikshit Savjani
Premier Field Engineer

SQL Server AlwaysON Group Failover in Cluster causes Node to lose network connectivity

Last week, I worked on an issue which might sound weird or strange when you hear it for the first time.

“Whenever we fail over the SQL Server AlwaysON group in the cluster, Node A loses its network connectivity.” (Sounds weird, right?)

Here is the email I received from my customer:

We’re able to re-create the issue on demand now, as every time we try to fail the SQL cluster group over from A to B, as soon as the SQL network name resource goes offline on the A server, the default gateway on the public NIC gets deleted and the box drops off the network. This only happens when failing over from A to B (B to A does not cause any issues), and also only explicitly occurs when the SQL Network resource is taken offline on the A server when being failed over to B (the core cluster name resource and group can fail over between nodes without issue).

The first obvious thought is that it can't be the SQL Server application causing this, since it sits on top of the Windows platform and the failover mechanism has nothing to do with the default gateway setting on the NICs. But as Sherlock Holmes says,

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”

After spending a few days on troubleshooting and research, we found that the root cause of the issue was incorrect IP routing on the cluster nodes.

More specifically:

The issue was caused by the SQL IP resource being registered in a persistent route on the public NIC of Node A. This incorrect persistent route, not being bound to the interface, caused the active route and the persistent route to go down when the SQL IP resource went offline or was failed over.

More details on this issue can be found in the following posts:

http://blogs.technet.com/b/networking/archive/2009/05/21/active-route-gets-removed-on-windows-server-2008-offline-cluster-ip-address.aspx

http://support.microsoft.com/kb/2161341

To resolve the issue, the Windows networking team was engaged to rectify the persistent route setting and the IP routing table.

Hope this helps anyone who hears about or faces this strange issue with SQL AlwaysON on a Windows Server 2008 or later cluster.

Parikshit Savjani
Premier Field Engineer

Considerations for changing the default port of a default SQL instance on a Multi-subnet Failover Cluster

Security hardening or security audits in an enterprise are among the motivations for DBAs to run a default SQL Server instance on a non-default port. For standalone instances there is no challenge in changing the default instance's port to a non-default port; however, for clustered instances there are some considerations, since cluster.exe performs the IsAlive check by logging into the SQL Server instance.

First, to clear up any confusion or myth: when a client connects to a default instance of SQL Server, whether it listens on the default port (1433) or a non-default port, the SQL Browser service is not involved and has no role to play. This is why the SQL Browser service is disabled when you install only a default instance of SQL Server on a given server. The SQL Browser service has a role to play only for named instances, not when a default instance is listening on a non-default port.

For a clustered default instance of SQL Server, when we change the port of the SQL instance to a non-default port, the SQL Server resource might fail to come online and you might see the following error messages in the cluster log:

<< The following abstract is for Clustered default instance of SQL 2012 running on non-default port>>
00002840.00003ef0::2015/01/16-23:25:19.637 INFO [RES] SQL Server : [sqsrvres] Server name is SQLFCI
00002840.00003ef0::2015/01/16-23:25:19.637 INFO [RES] SQL Server : [sqsrvres] Service name is MSSQLSERVER
00002840.00003ef0::2015/01/16-23:25:19.637 INFO [RES] SQL Server : [sqsrvres] Dependency expression for resource ‘SQL Network Name (SQLFCI)’ is ‘([5c22a982-9400-416b-b5bd-95da07998601] or [50660f59-4f33-4cd1-9dc0-84f6b32ebb76])’
00002840.00003ef0::2015/01/16-23:25:19.637 INFO [RES] SQL Server : [sqsrvres] This instance is running with IP OR dependency. Resource DLL Will enable MultiSubnetFailover=yes in connection string
00002840.00003ef0::2015/01/16-23:25:19.637 WARN [RES] SQL Server : [sqsrvres] This is default instance. If non-default TCP/IP listening port number is specified, and SQL Server fails to come online, configure an alias on each possible owner node and try again. If it does not work, please contact customer support
00002840.00003ef0::2015/01/16-23:25:19.637 INFO [RES] SQL Server : [sqsrvres] Starting service MSSQLSERVER…
0000125c.00004990::2015/01/16-23:25:19.886 INFO [NM] Received request from client address SQLFCINodeA.
00002840.00003ef0::2015/01/16-23:25:20.842 INFO [RES] SQL Server : [sqsrvres] Service status checkpoint was changed from 0 to 1 (wait hint 20000). Pid is 19508
0000125c.00003fe8::2015/01/16-23:25:23.639 INFO [NM] Received request from client address SQLFCINodeA.
00002840.00003ef0::2015/01/16-23:25:23.842 INFO [RES] SQL Server : [sqsrvres] Service is started. SQL Server pid is 19508
00002840.00003ef0::2015/01/16-23:25:23.842 INFO [RES] SQL Server : [sqsrvres] Connect to SQL Server …
00002840.00003ef0::2015/01/16-23:25:54.855 ERR [RES] SQL Server : [sqsrvres] ODBC Error: [08001] [Microsoft][SQL Server Native Client 11.0]TCP Provider: The remote computer refused the network connection.
00002840.00003ef0::2015/01/16-23:25:54.855 ERR [RES] SQL Server : [sqsrvres] ODBC Error: [HYT00] [Microsoft][SQL Server Native Client 11.0]Login timeout expired (0)
00002840.00003ef0::2015/01/16-23:25:54.855 ERR [RES] SQL Server : [sqsrvres] ODBC Error: [08001] [Microsoft][SQL Server Native Client 11.0]A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online. (1225)
00002840.00003ef0::2015/01/16-23:25:54.855 INFO [RES] SQL Server : [sqsrvres] Could not connect to SQL Server (rc -1)
00002840.00003ef0::2015/01/16-23:25:54.855 INFO [RES] SQL Server : [sqsrvres] SQLDisconnect returns following information
00002840.00003ef0::2015/01/16-23:25:54.855 ERR [RES] SQL Server : [sqsrvres] ODBC Error: [08003] [Microsoft][ODBC Driver Manager] Connection not open (0)
00002840.00003ef0::2015/01/16-23:25:59.857 INFO [RES] SQL Server : [sqsrvres] Connect to SQL Server …
00002840.00003ef0::2015/01/16-23:26:30.860 ERR [RES] SQL Server : [sqsrvres] ODBC Error: [08001] [Microsoft][SQL Server Native Client 11.0]TCP Provider: The remote computer refused the network connection.

With the enhanced verbose logging in SQL 2012, as seen above, we now get a message saying that if the default instance is running on a non-default TCP/IP port, we should create an alias (TCP or NP) to bring the SQL Server resource online.

00002840.00003ef0::2015/01/16-23:25:19.637 WARN  [RES] SQL Server <SQL Server>: [sqsrvres] This is default instance. If non-default TCP/IP listening port number is specified, and SQL Server fails to come online, configure an alias on each possible owner node and try again. If it does not work, please contact customer support

In the case of a multi-subnet failover cluster, we observed that creating the alias brought the SQL Server resource online; however, the SQL Agent resource failed to come online. In the SQLAgent.out log we see the following error messages:


2015-01-16 17:36:54 – ! [150] SQL Server does not accept the connection (error: 1225). Waiting for Sql Server to allow connections. Operation attempted was: Verify Connection On Start.
2015-01-16 17:36:54 – ! [000] Unable to connect to server ‘(local)’; SQLServerAgent cannot start
2015-01-16 17:36:59 – ! [298] SQLServer Error: 1225, TCP Provider: The remote computer refused the network connection. [SQLSTATE 08001]
2015-01-16 17:36:59 – ! [165] ODBC Error: 0, Login timeout expired [SQLSTATE HYT00]
2015-01-16 17:36:59 – ! [298] SQLServer Error: 1225, A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online. [SQLSTATE 08001]
2015-01-16 17:36:59 – ! [382] Logon to server ‘(local)’ failed (DisableAgentXPs)
2015-01-16 17:36:59 – ? [098] SQLServerAgent terminated (normally)

This issue was specific to multi-subnet failover clustering, since we observed that if we disabled the cluster checkpoints and modified the following registry value from 1 to 0, the SQL Agent resource would come online:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL11.MSSQLSERVER\Cluster\RunWithIPORDependency

However, it would remain online only until the SQL Server resource was restarted. When the SQL Server resource comes online again, SQL Server detects the multi-subnet cluster and flips the bit back from 0 to 1 (note: this is not caused by cluster checkpointing).

This prevents the SQL Agent resource from coming online. To work around the issue, we can set the ServerHost registry value for SQL Agent to point to the alias created earlier, which ensures SQL Agent can connect to SQL Server and the resource comes online:

  1. Remove the cluster checkpoint:

Cluster res "SQL Network Name(SQLFCI)" /removecheckpoint:"Software\Microsoft\Microsoft SQL Server\MSSQL11.MSSQLSERVER\SQLServerAgent"

  2. Set the ServerHost registry value to the alias name (a sketch follows this list).

  3. Add the cluster checkpoint back:

Cluster res "SQL Network Name(SQLFCI)" /addcheckpoint:"Software\Microsoft\Microsoft SQL Server\MSSQL11.MSSQLSERVER\SQLServerAgent"
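
For step 2, a minimal sketch of the registry change (the key path is the SQLServerAgent key checkpointed above for this MSSQL11.MSSQLSERVER instance; substitute your own instance ID, and the alias name is shown as a placeholder):

reg add "HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL11.MSSQLSERVER\SQLServerAgent" /v ServerHost /t REG_SZ /d <AliasName> /f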

After pointing the ServerHost key to the alias, which resolves to the non-default port, SQL Agent is able to connect to SQL Server and the resource comes online.

I have filed a Connect bug for this issue; feel free to vote on it to bump up the priority:

https://connect.microsoft.com/SQLServer/feedback/details/1090514/sql-agent-fails-to-come-online-in-multi-subnet-clustered-default-instance-of-sql-server-running-on-non-default-port

Parikshit Savjani
Premier Field Engineer