Error in Exchange 2013 CU2 when adding LAG copy

During my deployment of Exchange 2013 I noticed an error in my application logs that took me by surprise: Error MapiExceptionTooManyMountedDatabases. I would have expected this had I been on Exchange 2013 CU1, but I had recently upgraded all my servers to CU2, so it was totally unexpected. You all remember the big announcement in July on the features available in CU2; the most anticipated one was the increase in the number of databases per server to Exchange 2010 levels. This received praise from system admins everywhere! So if CU2 can support 100 databases and I’m on CU2, then what gives? Let me first give you a brief description of my environment.

My 2013 environment consists of 20 Dell R810 mailbox servers running Windows Server 2012. These servers are split into 2 DAGs of 10 servers each, and each DAG is further split across 2 datacenters in an Active/Active design of 5 mailbox servers per datacenter. Each DAG has a File Share Witness at a 3rd primary site and a secondary FSW at a 4th site for further redundancy. As far as databases go, I have 120 databases per DAG, each with an active copy, 3 passive copies and a lagged copy, for a total of 5 copies. That works out to 60 database copies per server, so without CU2 I wasn’t able to add my lags. I do run separate CAS roles, but I will save that stuff for another article.
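
The copy math above can be checked with a quick back-of-the-envelope calculation. This is a minimal sketch in Python, assuming the copies are spread evenly across the servers in a DAG; the function name is made up for illustration, and the 50 and 100 limits are the per-server database limits discussed in this post.

```python
def copies_per_server(databases, copies_per_database, servers):
    """Average number of database copies each server hosts, assuming an even spread."""
    total_copies = databases * copies_per_database
    return total_copies / servers

# Figures from the environment described above (per DAG).
DATABASES_PER_DAG = 120
SERVERS_PER_DAG = 10

without_lag = copies_per_server(DATABASES_PER_DAG, 4, SERVERS_PER_DAG)  # active + 3 passive
with_lag = copies_per_server(DATABASES_PER_DAG, 5, SERVERS_PER_DAG)     # plus the lagged copy

for label, count in (("without lag", without_lag), ("with lag", with_lag)):
    print(f"{label}: {count:.0f} copies per server, "
          f"fits 50 limit: {count <= 50}, fits 100 limit: {count <= 100}")
```

The first four copies stay under the old limit of 50 per server, while adding the lagged copy pushes each server past it.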

I had already run the Exchange 2013 scripts you find in the Exchange 2013 calculator prior to CU2, so I could only use them for the first 4 copies. For the LAG copy, I had to perform the arduous steps of doing this process manually… 240 times. I am sure you are all asking why I couldn’t script it. The answer is that the original calculator script is full of diskpart, directory and database creation commands, and I decided it was quicker to add the copies by hand than to deconstruct the script to serve my needs. Plus I use PowerShell ISE, and it is really easy to do this type of work in that GUI.

So I began my journey and fired off my cmdlets on server number one. All the database copies were created, but when I ran Get-MailboxDatabaseCopyStatus I noticed that the LAG copies I had just created showed a Failed and Suspended status.

[Screenshot: Get-MailboxDatabaseCopyStatus output showing the new LAG copies as Failed and Suspended]

If you open the Exchange Admin Center, you will also see the database copy suspended, and if you open the copy window, you will see an error stating there are too many databases.

[Screenshot: the copy window in the Exchange Admin Center showing the too many databases error]

The same error above is also present in the logs.

Log Name:      Application
Source:        MSExchangeRepl
Date:          9/24/2013 3:14:21 PM
Event ID:      4057
Task Category: Service
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      Servername.contoso.com
Description:
The Microsoft Exchange Replication service encountered an unexpected error in log replay for database 'databasename\servername'. Error MapiExceptionTooManyMountedDatabases: Unable to mount database. (hr=0x8004060e, ec=-2147219954)
Diagnostic context:
Lid: 65256
Lid: 10722   StoreEc: 0x8004060E
Lid: 1494    ---- Remote Context Beg ----
Lid: 37952   dwParam: 0x241EE58B
Lid: 39576   StoreEc: 0x977
Lid: 35200   dwParam: 0x95AC
Lid: 58864   StoreEc: 0x8004060E
Lid: 43248   StoreEc: 0x8004060E
Lid: 48432   StoreEc: 0x8004060E
Lid: 54336   dwParam: 0x241EF045
Lid: 1750    ---- Remote Context End ----
Lid: 1047    StoreEc: 0x8004060E
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="MSExchangeRepl" />
<EventID Qualifiers="49156">4057</EventID>
<Level>2</Level>
<Task>1</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2013-09-24T19:14:21.000000000Z" />
<EventRecordID>897512</EventRecordID>
<Channel>Application</Channel>
<Computer>BrokeServer.contoso.com</Computer>
<Security />
</System>
<EventData>
<Data>databasename\servername</Data>
<Data>MapiExceptionTooManyMountedDatabases: Unable to mount database. (hr=0x8004060e, ec=-2147219954)
Diagnostic context:
Lid: 65256
Lid: 10722   StoreEc: 0x8004060E
Lid: 1494    ---- Remote Context Beg ----
Lid: 37952   dwParam: 0x241EE58B
Lid: 39576   StoreEc: 0x977
Lid: 35200   dwParam: 0x95AC
Lid: 58864   StoreEc: 0x8004060E
Lid: 43248   StoreEc: 0x8004060E
Lid: 48432   StoreEc: 0x8004060E
Lid: 54336   dwParam: 0x241EF045
Lid: 1750    ---- Remote Context End ----
Lid: 1047    StoreEc: 0x8004060E</Data>
</EventData>
</Event>
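
The hr and ec values in the event look different, but they are the same 32-bit number: hr is printed as unsigned hex and ec as a signed decimal. Here is a small Python sketch of the conversion, which can help when searching for the error under either form. The helper name is hypothetical.

```python
def to_signed_32(value):
    """Interpret an unsigned 32-bit integer as a signed (two's complement) one."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

hr = 0x8004060E      # hr value from the event description
ec = -2147219954     # ec value from the same event

print(to_signed_32(hr))         # signed view of hr
print(to_signed_32(hr) == ec)   # do both fields carry the same code?
```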

Nothing in these errors suggested that the doubling of supported databases in CU2 had actually been applied. The only thing I could do at this point was contact support, open a ticket and then call Tim McMichael to help me out. I don’t like queues, they suck to be honest, and since my company is federated with Microsoft, I IM Escalation Engineers over Lync to see if they can take my ticket. Tim McMichael knows more about Exchange clustering than anyone I know, and he is on my speed dial. I want to say that all the credit for this fix goes to him and the dev team at Microsoft; I am only documenting this find so others may save some time should this issue occur. For some great reading, head over to Tim’s blog.

Tim and I did all the basic troubleshooting, then escalated to Time Traces on the server, several of them, but they weren’t showing anything productive. We next ran the TList debugger command on the server. If you haven’t run TList before, it’s pretty cool. You need the Windows debuggers installed, but it essentially gives you a list of all running processes, their PIDs, the threads running in each process and any errors they may have. The full information on this debug tool is on the Windows Dev Center site.

In the output of this command, we were looking at the total number of Store Worker Processes running on the server.

     Command Line: "C:\Program Files\Microsoft\Exchange Server\V15\bin\Microsoft.Exchange.Store.Worker.exe" -id:217f0c3a-d483-425b-b7ea-71184cc3430e -pipe:3936 -readykey:Global\WorkerReadyKey-28d3fb96-6c8b-4b95-832e-f4b4383904aa
28316 Microsoft.Exchange.Store.Worker.exe
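
Counting the worker processes by eye gets tedious with dozens of databases. Assuming the TList output has been saved as text and that each process appears on a line of the form "PID image-name" as in the excerpt above, a rough Python sketch of the count could look like this. The function name and the sample text are illustrative only, and real output may need a looser pattern.

```python
import re

# Matches lines such as "28316 Microsoft.Exchange.Store.Worker.exe".
WORKER_LINE = re.compile(
    r"^\s*(\d+)\s+Microsoft\.Exchange\.Store\.Worker\.exe\s*$",
    re.IGNORECASE | re.MULTILINE,
)

def store_worker_pids(tlist_text):
    """Return the PIDs of all store worker processes found in saved TList output."""
    return [int(pid) for pid in WORKER_LINE.findall(tlist_text)]

# Made-up sample in the assumed format; only the first PID comes from the post.
sample = """\
28316 Microsoft.Exchange.Store.Worker.exe
 1234 svchost.exe
30122 Microsoft.Exchange.Store.Worker.exe
"""

pids = store_worker_pids(sample)
print(len(pids), "store worker processes:", pids)
```

Each mounted database or copy gets its own worker process, so the count should match the number of copies the server is expected to host.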

We confirmed that the server was running only 50 store worker processes, not the 60 we expected. We then ran an LDP dump of the server’s InformationStore object in AD, and this is where we had our aha moment. The setting in error is the msExchMaxStoresTotal attribute in the dump below. This attribute is the total number of databases allowed per server, and it should not equal 50, but rather 100.

Dn: CN=InformationStore,CN=BrokeServer,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=Cox,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=domain,DC=com
adminDisplayName: InformationStore;
cn: InformationStore;
distinguishedName: CN=InformationStore,CN=BrokeServer,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=Cox,CN=Microsoft Exchange,CN=Services,CN=Configuration,DC=domain,DC=com;
instanceType: 0x4 = ( WRITE );
msExchESEParamCircularLog: 0;
msExchESEParamCommitDefault: 0;
msExchESEParamDbExtensionSize: 256;
msExchESEParamEnableIndexChecking: TRUE;
msExchESEParamEnableOnlineDefrag: TRUE;
msExchESEParamLogFileSize: 5120;
msExchESEParamPageFragment: 8;
msExchESEParamPageTempDBMin: 0;
msExchESEParamZeroDatabaseDuringBackup: 0;
msExchMaxRestoreStorageGroups: 1;
msExchMaxStorageGroups: 100;
msExchMaxStoresPerGroup: 5;
msExchMaxStoresTotal: 50;
msExchMinAdminVersion: -2147453113;
msExchVersion: 4535486012416;
name: InformationStore;
objectCategory: CN=ms-Exch-Information-Store,CN=Schema,CN=Configuration,DC=COX,DC=com;
objectClass (3): top; container; msExchInformationStore;
objectGUID: 732ccbb0-e55c-4e71-924b-baa15c221108;
showInAdvancedViewOnly: TRUE;
uSNChanged: 886324806;
uSNCreated: 886292709;
whenChanged: 7/22/2013 3:31:02 PM Eastern Daylight Time;
whenCreated: 7/22/2013 12:00:13 PM Eastern Daylight Time;
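
With 20 servers to check, reading each dump by hand is error prone. As a sketch only, assuming the LDP output is saved as text in the "name: value;" form shown above, the attribute can be pulled out and compared against the CU2 limit of 100 like this. The function names and the EXPECTED_MAX_STORES constant are hypothetical helpers, not part of any Exchange tooling.

```python
EXPECTED_MAX_STORES = 100  # per-server database limit expected after CU2

def parse_ldp_dump(text):
    """Parse 'attribute: value;' lines from an LDP dump into a dict of strings."""
    attributes = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            attributes[name.strip()] = value.strip().rstrip(";").strip()
    return attributes

def max_stores_is_correct(text, expected=EXPECTED_MAX_STORES):
    """True when msExchMaxStoresTotal in the dump equals the expected limit."""
    value = parse_ldp_dump(text).get("msExchMaxStoresTotal")
    return value is not None and int(value) == expected

# A few lines copied from the dump above.
dump = """\
msExchMaxStorageGroups: 100;
msExchMaxStoresPerGroup: 5;
msExchMaxStoresTotal: 50;
"""

print(parse_ldp_dump(dump)["msExchMaxStoresTotal"])
print(max_stores_is_correct(dump))
```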

If you are a GUI person, then open ADSI Edit and go to Configuration > Services > Microsoft Exchange > Domain > Administrative Groups > Exchange Administrative Group (FYDIBOHF23SPDLT) > Servers > ServerName > InformationStore. On the properties of the InformationStore object, you will see the following.

[Screenshot: InformationStore properties in ADSI Edit before the fix, with msExchMaxStoresTotal set to 50]

All 20 of my production mailbox servers had 50 as the value, and all 20 were CU1 to CU2 upgrades. The 6 mailbox servers I have for my staging environment were direct CU2 installs and showed the attribute at its expected value. I confirmed my results by installing a new Exchange 2013 server with CU1: prior to applying the enterprise product key the msExchMaxStoresTotal value was the expected 5, the value changed to 50 after setting the key, and after the installation of CU2 the attribute kept the incorrect value of 50.

The fix for this issue was easy once we located the fault: just increase the value to 100 on each of your servers in the DAG, as seen below, and reboot. Once your servers are online, resume your database copies and, if needed, restart the search service to clear the suspended indexes.

[Screenshot: InformationStore properties in ADSI Edit after the fix, with msExchMaxStoresTotal set to 100]

Since this is my first technical blog post, I welcome and appreciate your feedback.


New Blog

2013 has been a really busy year for me. I’m continuing to take courses towards my college degree, I renovated the basement, built two pieces of furniture for the house and deployed Exchange 2013 in my company’s environment. Needless to say, I have a hard time catching my breath with the above items coupled with family time. I’ve been so busy that I failed to use some PTO, and when I say some I mean just about all of it, so I have to use almost a month in the next 10 weeks. With all this spare time I decided it’s about time I begin a blog to discuss one of my favorite subjects: Microsoft Exchange. In this blog I hope to pass on information I’ve learned over the course of my IT career and, hopefully, learn from all of you in kind. I hope you find this blog informative as well as entertaining. When I’m not writing in-depth technical articles I will be using this forum to vent my frustrations with just about anything that rubs me the wrong way. I’m sure there will be no shortage of criticism, and I welcome it when done constructively. I have a lot to learn about blogging and writing in general, but I am an eager student and look forward to the challenge. Thanks for your support!