Data is your business, and maintaining a healthy backup and recovery plan is vital to protecting your data.
SQL Server's transaction logs, which bridge the most recent changes to the last backup, should be a part of that backup and recovery plan. Without a healthy, well-maintained transaction log, you can recover old data but your users will still have to re-enter all of their changes since the last backup. Fail to maintain transaction logs and you may fail to keep your job. Fortunately, SQL Server's transaction logs are easy to maintain.
How SQL Server Transaction Logs Work
Most client/server databases offer a transaction log, which is simply a separate file where the server tracks operations in the following sequence:
- The log notes that the server expects a change.
- The log notes that the server made a change.
- The log notes that the server committed the change to the data file.
When users change data, SQL Server doesn't write that change directly to the data. Rather, SQL Server locates the appropriate data and then loads it into a special area of RAM called the data cache. Changes are made in RAM. Then, SQL Server copies changes waiting in RAM to the transaction log. Only then does SQL Server write changes to the actual data file.
This is called a write-ahead log because SQL Server writes changes to the log before it writes changes to the actual data file. This approach is quite a bit faster than writing directly to the data file.
Perhaps more important than performance is the transaction log's role in data recovery. Thanks to the transaction log, you can recover changes right up to the error from which you're recovering. During the recovery process, SQL Server scans the log for changes that weren't committed. That way, the database can finish what it started.
The log stores changes in three parts:
Backed-up: This section contains changes that were committed the last time you backed up the database.
Inactive: This section contains committed changes that haven't been backed-up yet.
Active: This section contains committed and uncommitted changes (depending on their sequence and relation to other changes).
SQL Server identifies each event with a log sequence number (LSN) as follows:
|101||Begin transaction 1|
|102||Update transaction 1|
|103||Begin transaction 2|
|104||Update transaction 2|
|105||Commit transaction 1|
|107||Commit transaction 2|
When SQL Server begins a backup, it records the current LSN. For instance, from the checkpoint at LSN 106, SQL Server searches for the oldest open transaction. In this case, that's 103 because that transaction is uncommitted at the checkpoint. Therefore, transactions 103 and higher represent transactions that occurred during the actual backup process. When the backup is complete, SQL Server backs up the transactions from 103 to the most current transaction.
Avoid a Full Transaction Log at All Costs
As great as full transaction logs are, they'll work for you only if you maintain them properly. A full transaction log file grinds production to a halt. SQL Server will simply refuse to write changes, leaving your users unable to work. You'll probably suffer this error only once, because once is enough.
When it does happen, you must truncate the log by hand using BACKUP LOG as follows:
BACKUP LOG databasename WITH TRUNCATE_ONLY
Executing this statement forces SQL Server to dump the inactive area of the log, which gets your database running and lets your users get back to work. However, truncating the log has a downside (you just knew there would be): you lose the history necessary to restore the database if it crashes. To protect the data, run a BACKUP DATABASE statement on the entire database immediately following the BACKUP LOG statement. Your users can continue to work, and although the unexpected backup might be a bit of a pain, it's necessary.
The key is to avoid a quickly filling transaction log altogether. Fortunately, you can easily do that by heeding the following administrative guidelines:
Avoid large transactions. A large transaction can be adding or editing several million records with one INSERT or UPDATE.
- Avoid long-running BEGIN TRANSACTION blocks. Everything between BEGIN TRANSACTION and COMMIT remains active.
Avoid using the KILL statement or canceling a transaction from Management Studio. Either action will render a statement active--forever. (An occasional KILL is sometimes necessary and harmless; just don't make a habit of it.)
Shrink the Log
Truncating the log frees up space inside the log for new transactions, but it doesn't reclaim disk space. As a matter of habit, check the log's size. If it's truly large, shrink it after truncating it as follows:
DBCC SHRINKFILE (databasename_log, targetsize)
Targetsize represents, as an integer, the size that you want the file to be in megabytes. If you omit this value, SQL Server reduces it to the default file size. In addition, if the log is already larger than targetsize, SQL Server shrinks the file to the size needed to store the current records.
Although this command will free up some space, it'll also play havoc with file fragmentation at the disk level, so use it infrequently. Perform this action manually only when necessary. Or, if you're specific about conditions, you can execute this statement via an alert script.
Back It Up
SQL Server offers simple, full, and bulk-logged recovery models. For the most part, you should choose full, which allows you to back up both the database and the transaction log. You can back up a transaction log quickly and frequently; every few minutes isn't too often if data is critical.
If the worst happens, back up the current transaction log first. Then, restore the last full database backup, and all subsequent transaction log backups. For instance, suppose you adhere to the following backup schedule and a failure occurs at 9:00 PM:
|8:00 AM||Back up database|
|10:00 AM||Back up transaction log|
|12:00 PM||Back up database|
|2:00 PM||Back up transaction log|
|4:00 PM||Back up transaction log|
|6:00 PM||Back up database|
|8:00 PM||Back up transaction log|
First, you'd back up the 8:00 PM transaction log. Then, you'd restore the database using the last database backup from 6:00 PM. Finally, you'd apply the 8:00 PM transaction log backup and the active transaction log. (Differential backups are a bit more complex.)
After backing up a transaction log, SQL Server truncates the log's inactive section and reuses it to store new transactions. That way, the log doesn't grow uncontrollably large. Remember, SQL Server doesn't use the inactive items during recovery because those transactions are already complete.
If possible, don't store a database and its backup and transaction logs on the same server. Store these files on different physical disks, ideally located in different buildings.
Warning: Simple Recovery May Not Be Enough
Some experts suggest using the simple recovery model because SQL Server truncates the transaction log at every checkpoint, which keeps the transaction log at a manageable size. If you follow this advice, however, you'll be living on the wild side. In a crash, you'll lose everything up to the last backup because the simple recovery model offers no transaction log with which to restore from the last backup to the crash. So be sure to back up the database frequently if you opt for simple recovery.
On the other hand, if you seldom change data, or change only a few items frequently, simple can be more efficient, but that's really the only good reason to consider it. Otherwise, the full model is the way to go.
About the Author
Susan Sales Harkins is an independent consultant and the author of several articles and books on database technologies. Her most recent book is Mastering Microsoft SQL Server 2005 Express, with Mike Gunderloy, published by Sybex.
This article was first published on Devx.com.