Recently I have encountered a critical incident with database restoration failure. I decided to restore a database for DR (Backup-Restore) Testing. As a Scary DBA we should do random database restore testing on to Test Server prior any disaster strike on our real Production Server. So that we can ensure our backup is valid and restorable without any issues.
My "Test" Database has memory optimized table and my DB size is 575 GB. This is fairly a new topic for me how to troubleshoot memory optimized table database. A memory optimized table, starting in SQL Server 2014, is simply a table that has two copies, one is active and one durable on disk whether that includes data or just schema only. Since memory is flushed upon restart of SQL Services, SQL Server keeps a physical copy of the table that is recoverable.
Issue:--Restoration of "XYZ" database failed on our Test Server with the following error:--
Msg 41379, Level 16, State 0, Line 0
Restore operation failed for database 'XYZ' due to insufficient memory in the resource pool 'default'. Close other applications to increase the available memory, ensure that both SQL Server memory configuration and resource pool memory limits are correct and try again. See 'http://go.microsoft.com/fwlink/?LinkID=507574' for more information.
Msg 3456, Level 16, State 1, Line 2
Could not redo log record (310751:4331:4), for transaction ID (2:-741982062), on page (1:229325), allocation unit 72057637921488896, database 'XYZ' (database ID 7). Page: LSN = (310735:185062:258), allocation unit = 72057637921488896, type = 1. Log: OpCode = 2, context 2, PrevPageLSN: (310749:76109:7). Restore from a backup of the database, or repair the database.
Expression: db->TransactionMap->ComputeCount() == 0
Process ID: 7392
Restore was successful but deferred transactions remain. These transactions cannot be resolved because there are data that is unavailable. Either use RESTORE to make that data available or drop the filegroups if you never need this data again. Dropping the filegroup results in a defunct filegroup.
RESTORE WITH CONTINUE_AFTER_ERROR was successful but some damage was encountered. Inconsistencies in the database are possible.
RESTORE DATABASE successfully processed 15513188 pages in 1867.005 seconds (64.915 MB/sec).
Msg 3313, Level 16, State 1, Line 2
During redoing of a logged operation in database 'XYZ', an error occurred at log record ID (310751:4331:4). Typically, the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup, or repair the database.
This is completely a bogus error message i believe strongly which is indicating that restore the database a full backup or repair the database and also some damage was encountered.
Well, Even I ran restore verifyonly command and it returned my backup was valid/good. No doubt....
Let's do more troubleshoot.....while restoring
- Error Log suggested that restoration failed because of insufficient memory to SQL Server
- From metadata information of backup file received by "Restore HeaderOnly" have shown in memory file stream table.
- Memory allocated to SQL Server got fully utilized before restoration was failed.
- On Production checked the memory usage of "memory optimize tables" feature, which came to 70 GB
- Test Server OS memory was 60 GB
Let's find out memory usage table, basically i will look for very interesting thing which is Memory Optimized Table using GUI...
Even Not much clear.. for me. Let's fo a further investigate ....usnig query
Resolution: I have allocated 60 GB+40 GB =100 GB memory to SQL Server and the database restored successfully but i requested SySOps for aditional 5 GB more total 105 GB. The Server, I restore a database to must have enough available memory for the memory-optimized tables in the database backup, otherwise the database will not come online, and will be marked as Suspect mode/Recovery Pending. After all this work, I was finally able to recover the "Test" database on our DBA QA/Test Server
What's your thought? Feel free to comments!!! In the next article i will touch base on how i handled TB over Database in Recovery Pending State....Yes i had this horrible situation on my production box and i spent almost 11 hours, you may encounter this similar issues on your Production env, who knows btw i took this challenge and brought online this culprit db anyway...:)