SQL Server Faq

Transactional Replication in SQL 2008 Fails “The distribution agent failed to create temporary files in ‘C:Program FilesMicrosoft SQL Server100COM’ directory. System returned errorcode 5.” Tweet

| 4 Comments

 

It has been really long time since I have updated this blog as my Job involves a lot of travel however I try my best to update it frequently. In this blog I am going to discuss a issue which one of Customers encountered recently.

Recently One of Customers had upgraded to SQL 2008 with Transactional Replication and after upgrading they found that distribution agent fails intermittently with following error

2010-07-22 11:36:51.609 Agent message code 21100. The distribution agent failed to create temporary files in ‘C:Program FilesMicrosoft SQL Server100COM’ directory. System returned errorcode 5.
2010-07-22 11:36:51.656 ErrorId = 2988, SourceTypeId = 2
ErrorCode = ’21100′  ErrorText = ‘The distribution agent failed to create temporary files in ‘C:Program FilesMicrosoft SQL Server100COM’ directory. System returned errorcode 5.’

Initially it appeared that we were hitting one of the known issue in SQL 2008 which is documented in the following KB article.  http://support.microsoft.com/kb/956032/
However
we tried to apply the hotfix mentioned in the KB article but that did not help.

Error Code 5 in the above error message clearly indicates Access is denied on COM Folder so we tried giving Full permission to Everyone but still we were receiving the above errors intermittently.

We also observed that the issue happens for just one Publication with one table in it , this table has a XML data column. There are other similar publications in your environment but those are not heavily used. All the failure happens for update on XML column of that table.

From SQL 2008 its seems that we are doing streaming in the background whenever we update a BLOG Column , in our case XML column. I am still not sure if this streaming for XML is strictly OLEDB streaming or a variant for the same but definitely there is some streaming going on because of which replication is creating temp files with extension name of “.LOB” (e.g. “C:Program FilesMicrosoft SQL Server100COM1a8b7ed71f0bda4183193049a69c0113_18_0.LOB”)

If you do not change the XML column this streaming is not happening and thus no .LOB file being generated , thus its sure that any update to the XML column is generating this stream by default.  We do not see this behavior in SQL 2005. The issue goes away in the next retry of the distribution agent. Thus its very intermittent issue. So we decided to capture procmon while the distribution Agent was running by filtering just on the folder path ‘C:Program FilesMicrosoft SQL Server100COM

This is what we see from the procmon.

 

distrib.exe          6068       CreateFile           C:Program FilesMicrosoft SQL Server100COM1a8b7ed71f0bda4183193049a69c0113_0_0.LOB             DELETE PENDING             Access: Generic Read/Write, Delete, Disposition: OverwriteIf, Options: Sequential Access, Synchronous IO Non-Alert, Non-Directory File, Delete On Close, Open No Recall, Attributes: HT, ShareMode: , AllocationSize: 0             

McShield.exe    1588       CloseFile              C:Program FilesMicrosoft SQL Server100COM1a8b7ed71f0bda4183193049a69c0113_0_0.LOB             SUCCESS                              7/29/2010 17:52               

Looking at the chain of events above , we can see that Distrib.exe got a DELETE PENDINGnotification , this could be because of the fact that McShield.exe was still holding the file, immediately followed by this we see Distrib.exe restarting with the different process ID 5296 and these events match with the time frame when we last saw  the error message in Replication Monitor.

Based on the above chain of events , we found McShield.exe  is not releasing the file handle soon and thus forcing the Distrib.Exe to fail/restart with a Access Denied Error message.

So to resolve the above error message we need to exclude the folder C:Program FilesMicrosoft SQL Server100COM from scanning in the Anti-virus.

In the KB Article http://support.microsoft.com/kb/309422 we have already documented the folders to be excluded for running antivirus on running on servers which run sql server. However the above folder is not mentioned so I thought I would document this in my Blog to help other to avoid failure of replication.

Credits: The issue was drilled down by one of our Escalation Engineer and my colleaque Rishi Maini while from the Customer end Ragvendra T helped us to resolve it.

Parikshit Savjani
Premier Field Engineer|Microsoft Services

 

4 Comments

  1. You rock!!!

    “So to resolve the above error message we need to exclude the folder C:Program FilesMicrosoft SQL Server100COM from scanning in the Anti-virus.”

    Symantec in my customer’s case.

    This was about to turn into a real hair-puller until I found your information.
    Thanks!!!

  2. Thank you! we tried the MS hotfix but did not work for us. The AV was the cause after doing this changes we don’t have problem any more :)

  3. I am having this issue but our disritbution instance is running on a windows 2008 cluster. I’m concerned that using the C drive on one of the nodes may have an issue in case of a fail over.

    I would like to change the location to a file share or one of the clustered luns . Any idea how to change the default location from C:Program FilesMicrosoft SQL Server100COM?

  4. For Cluster, you need not worry for these files which are created in COM Folder since they are just Temp files and will be created once the instance is failed over to another folder in the local path for COM folder.

    For Cluster, what you need to worry about is the location of the snapshot folder which should be in the fileshare.

Leave a Reply

Required fields are marked *.

*