Unit-IV notes continued




Database Recovery



Database recovery is the process of restoring the database to a correct state in the event of a failure. It is a service that should be provided by the DBMS to ensure that the database is reliable and remains in a consistent state in the presence of failures.
The Need for Recovery
The storage of data generally includes four different types of media with an increasing
degree of reliability: main memory, magnetic disk, magnetic tape, and optical disk. Main
memory is volatile storage that usually does not survive system crashes. Magnetic disks
provide online non-volatile storage.
There are many different types of failure that can affect database processing, each
of which has to be dealt with in a different manner. Some failures affect main memory
only, while others involve non-volatile (secondary) storage. Among the causes of failure
are:
·         system crashes due to hardware or software errors, resulting in loss of main memory;
·         media failures, such as head crashes or unreadable media, resulting in the loss of parts
of secondary storage;
·         application software errors, such as logical errors in the program that is accessing the
database, which cause one or more transactions to fail;
·         natural physical disasters, such as fires, floods, earthquakes, or power failures;
·         carelessness or unintentional destruction of data or facilities by operators or users;
·         sabotage, or intentional corruption or destruction of data, hardware, or software
facilities.

Recovery Facilities
A DBMS should provide the following facilities to assist with recovery:
1.     A backup mechanism
2.     Log file
3.     Checkpoint
4.     Recovery manager

1)    Backup mechanism
The DBMS should provide a mechanism to allow backup copies of the database and the log file to be made at regular intervals without necessarily having to stop the system first. The backup copy of the database can be used in the event that the database has been damaged or destroyed. A backup can be a complete copy of the entire database or an incremental backup, consisting only of modifications made since the last complete or incremental backup. Typically, the backup is stored on offline storage, such as magnetic tape.

2)    Log file
To keep track of database transactions, the DBMS maintains a special file called a log
(or journal) that contains information about all updates to the database. The log may contain
the following data:
Transaction records, containing:
– transaction identifier;
– type of log record (transaction start, insert, update, delete, abort, commit);
– identifier of the data item affected by the database action (insert, delete, and update operations);
– before-image of the data item, that is, its value before change (update and delete operations only);
– after-image of the data item, that is, its value after change (insert and update operations only);
– log management information, such as a pointer to previous and next log records for that transaction (all operations).
Owing to the importance of the transaction log file in the recovery process, the log may
be duplexed or triplexed (that is, two or three separate copies are maintained) so that if one
copy is damaged, another can be used. In the past, log files were stored on magnetic tape
because tape was more reliable and cheaper than magnetic disk. However, nowadays
DBMSs are expected to be able to recover quickly from minor failures. This requires that
the log file be stored online on a fast direct-access storage device.
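
For illustration, a log record can be thought of as a small structured record. The following is a minimal sketch in Python; the class name, field names, and example values are assumptions made for illustration and do not correspond to any particular DBMS's log format.

# A minimal sketch of the information a log (journal) record might hold.
from dataclasses import dataclass
from typing import Optional

@dataclass
class LogRecord:
    transaction_id: str                      # transaction identifier
    record_type: str                         # 'start', 'insert', 'update', 'delete', 'abort', 'commit'
    data_item: Optional[str] = None          # identifier of the data item affected
    before_image: Optional[object] = None    # value before the change (update and delete only)
    after_image: Optional[object] = None     # value after the change (insert and update only)
    prev_record: Optional[int] = None        # pointer to the previous log record for this transaction
    next_record: Optional[int] = None        # pointer to the next log record for this transaction

# Example: transaction T1 updates balx from 100 to 200 and commits.
log = [
    LogRecord("T1", "start"),
    LogRecord("T1", "update", "balx", before_image=100, after_image=200),
    LogRecord("T1", "commit"),
]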
3)    Checkpointing


The information in the log file is used to recover from a database failure. One difficulty
with this scheme is that when a failure occurs we may not know how far back in the log
to search and we may end up redoing transactions that have been safely written to the
database. To limit the amount of searching and subsequent processing that we need to
carry out on the log file, we can use a technique called checkpointing.
Checkpoints are scheduled at predetermined intervals and involve the following
operations:
·         writing all log records in main memory to secondary storage;
·         writing the modified blocks in the database buffers to secondary storage;
·         writing a checkpoint record to the log file. This record contains the identifiers of all
transactions that are active at the time of the checkpoint.
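
The three checkpoint operations can be sketched as follows. This is a simplified illustration in Python: the buffer pools and the disk are simulated with ordinary lists and dictionaries, and all the parameter names are assumptions.

# A minimal sketch of the three checkpoint operations listed above.
def take_checkpoint(log_buffer, log_on_disk, dirty_db_buffers, db_on_disk, active_transactions):
    # 1. Write all log records currently in main memory to secondary storage.
    log_on_disk.extend(log_buffer)
    log_buffer.clear()

    # 2. Write the modified (dirty) blocks in the database buffers to secondary storage.
    db_on_disk.update(dirty_db_buffers)
    dirty_db_buffers.clear()

    # 3. Write a checkpoint record containing the identifiers of all active transactions.
    log_on_disk.append(("checkpoint", sorted(active_transactions)))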
4)    Recovery manager
The recovery manager allows the system to restore the database to a consistent state following a failure.

Recovery Techniques

·         If the database has been extensively damaged, for example a disk head crash has occurred and destroyed the database, then it is necessary to restore the last backup copy of the database and to reapply the update operations of committed transactions using the log file. This assumes that the log file has not been damaged as well. For this reason, the log file should be stored on a disk separate from the main database files; this reduces the risk of both the database files and the log file being damaged at the same time.
·         If the database has not been physically damaged but has become inconsistent, for example the system crashed while transactions were executing, then it is necessary to undo the changes that caused the inconsistency. Here, we do not need to use the backup copy of the database but can restore the database to a consistent state using the before- and after-images held in the log file.

1)    Recovery techniques using deferred update

Using the deferred update recovery protocol, updates are not written to the database until
after a transaction has reached its commit point. If a transaction fails before it reaches this
point, it will not have modified the database and so no undoing of changes will be necessary.
We use the log file to protect against system failures in the following way:

·         When a transaction starts, write a transaction start record to the log.
·         When any write operation is performed, write a log record containing all the log data specified previously (excluding the before-image of the update).
·         When a transaction is about to commit, write a transaction commit log record, write all the log records for the transaction to disk, and then commit the transaction. Use the log records to perform the actual updates to the database.
·         If a transaction aborts, ignore the log records for the transaction and do not perform the writes.
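
Following this protocol, recovery after a system failure only needs to redo the writes of transactions that committed; transactions without a commit record in the log are simply ignored. The sketch below illustrates this redo pass in Python, reusing the LogRecord layout sketched earlier and ignoring checkpoints for simplicity; the database is modelled as a dictionary, which is an assumption for illustration.

# Redo-only recovery for deferred update (simplified; no checkpoints).
def recover_deferred(log, database):
    committed = {r.transaction_id for r in log if r.record_type == "commit"}
    for r in log:
        if r.transaction_id not in committed:
            continue                              # uncommitted transactions made no database changes
        if r.record_type in ("insert", "update"):
            database[r.data_item] = r.after_image # redo using the after-image
        elif r.record_type == "delete":
            database.pop(r.data_item, None)       # redo the delete
    return database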

2)    Recovery techniques using immediate update

Using the immediate update recovery protocol, updates are applied to the database as they
occur without waiting to reach the commit point. As well as having to redo the updates of
committed transactions following a failure, it may now be necessary to undo the effects of
transactions that had not committed at the time of failure. In this case, we use the log file
to protect against system failures in the following way:

·         When a transaction starts, write a transaction start record to the log.
·         When a write operation is performed, write a record containing the necessary data to the log file.
·         Once the log record is written, write the update to the database buffers.
·         The updates to the database itself are written when the buffers are next flushed to secondary storage.
·         When the transaction commits, write a transaction commit record to the log.

If the system fails, recovery involves using the log to undo or redo transactions:
·         For any transaction for which both a transaction start and a transaction commit record appear in the log, we redo that transaction, using the log records to write the after-images of the updated fields.
·         For any transaction for which the log contains a transaction start record but not a transaction commit record, we need to undo that transaction. This time the log records are used to write the before-images of the affected fields.
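
A small sketch of this redo/undo pass is given below, again in Python and reusing the LogRecord layout sketched earlier; checkpoints are ignored for simplicity and the database is modelled as a dictionary, both of which are simplifying assumptions.

# Undo/redo recovery for immediate update (simplified; no checkpoints).
def recover_immediate(log, database):
    committed = {r.transaction_id for r in log if r.record_type == "commit"}

    # Redo phase: reapply the after-images of committed transactions, in log order.
    for r in log:
        if r.transaction_id in committed and r.record_type in ("insert", "update"):
            database[r.data_item] = r.after_image

    # Undo phase: restore the before-images of uncommitted transactions, in reverse order.
    for r in reversed(log):
        if r.transaction_id in committed:
            continue
        if r.record_type in ("update", "delete"):
            database[r.data_item] = r.before_image
        elif r.record_type == "insert":
            database.pop(r.data_item, None)   # undo an insert by removing the item
    return database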

3)    Shadow paging

An alternative to the log-based recovery schemes described above is shadow paging. This scheme maintains two page tables during the life of a transaction: a current page table and a shadow page table. When the transaction starts, the two page tables are the same. The shadow page table is never changed thereafter, and is used to restore the database in the event of a system failure. During the transaction, the current page table is used to record all updates to the database. When the transaction completes, the current page table becomes the shadow page table. Shadow paging has several advantages over the log-based schemes: the overhead of maintaining the log file is eliminated, and recovery is significantly faster since there is no need for undo or redo operations.
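
A minimal sketch of the idea follows, with the page tables modelled as Python dictionaries mapping page numbers to disk locations; the class and method names are assumptions for illustration only.

# A minimal sketch of shadow paging for a single transaction.
class ShadowPagingTransaction:
    def __init__(self, shadow_page_table, disk):
        self.shadow = dict(shadow_page_table)    # frozen for the life of the transaction
        self.current = dict(shadow_page_table)   # updated as the transaction writes
        self.disk = disk                         # maps disk location -> page contents
        self.next_free = max(disk) + 1 if disk else 0

    def write(self, page_no, new_contents):
        # Write the updated page to a fresh disk location and point the current
        # page table at it; the shadow page table still points at the old version.
        self.disk[self.next_free] = new_contents
        self.current[page_no] = self.next_free
        self.next_free += 1

    def commit(self):
        # On commit the current page table becomes the new shadow page table.
        return self.current

    def abort(self):
        # On failure or abort, recovery simply reinstates the shadow page table;
        # no undo or redo of individual updates is needed.
        return self.shadow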

Advanced Transaction Models

Nested Transaction Model



The nested transaction model was introduced by Moss (1981). In this model, the complete transaction forms a tree, or hierarchy, of subtransactions. There is a top-level transaction that can have a number of child transactions; each child transaction can also have nested transactions. In Moss's original proposal, only the leaf-level subtransactions (the subtransactions at the lowest level of nesting) are allowed to perform the database operations.

For example, we have a reservation transaction (T1) that consists of booking flights (T2), hotel (T5), and hire car (T6). The flight reservation booking itself is split into two subtransactions: one to book a flight from London to Paris (T3) and a second to book a connecting flight from Paris to New York (T4). Transactions have to commit from the bottom upwards. Thus, T3 and T4 must commit before parent transaction T2, and T2 must commit before parent T1.
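
The bottom-up commit rule can be seen by representing the reservation example as a small tree and committing it in post-order, as in the Python sketch below (the dictionary representation is an assumption for illustration).

# The reservation example as a tree of subtransactions; children commit before parents.
reservation = {
    "T1": ["T2", "T5", "T6"],   # top-level reservation
    "T2": ["T3", "T4"],         # book flights
    "T3": [],                   # London to Paris
    "T4": [],                   # Paris to New York
    "T5": [],                   # book hotel
    "T6": [],                   # hire car
}

def commit_order(tree, root):
    # Post-order traversal: every subtransaction commits before its parent.
    order = []
    for child in tree[root]:
        order.extend(commit_order(tree, child))
    order.append(root)
    return order

print(commit_order(reservation, "T1"))   # ['T3', 'T4', 'T2', 'T5', 'T6', 'T1']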

The main advantages of the nested transaction model are its support for:
·         Modularity A transaction can be decomposed into a number of subtransactions for the purposes of concurrency and recovery.
·         A finer level of granularity for concurrency control and recovery Occurs at the level of the subtransaction rather than the transaction.
·         Intra-transaction parallelism Subtransactions can execute concurrently.
·         Intra-transaction recovery Uncommitted subtransactions can be aborted and rolled back without any side-effects to other subtransactions.

Emulating nested transactions using savepoints



One of the objectives of the nested transaction model is to provide a unit of recovery at
a finer level of granularity than the transaction. During the execution of a transaction,
the user can establish a savepoint, for example using a SAVE WORK statement. This
generates an identifier that the user can subsequently use to roll the transaction back to,
for example using a ROLLBACK WORK <savepoint_identifier> statement.
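
The sketch below shows one way such savepoints could be emulated: the transaction keeps an undo log of its own updates, SAVE WORK remembers the current position in that log, and ROLLBACK WORK undoes every update recorded after that position. The class, method names, and dictionary database are assumptions for illustration, not an actual DBMS interface.

# A minimal sketch of emulating savepoints with a per-transaction undo log.
class TransactionWithSavepoints:
    def __init__(self, database):
        self.db = database
        self.undo_log = []       # (key, old_value) pairs in execution order
        self.savepoints = {}     # savepoint identifier -> position in the undo log

    def write(self, key, value):
        self.undo_log.append((key, self.db.get(key)))
        self.db[key] = value

    def save_work(self, name):
        # Corresponds to SAVE WORK: remember the current undo-log position.
        self.savepoints[name] = len(self.undo_log)

    def rollback_work(self, name):
        # Corresponds to ROLLBACK WORK <savepoint_identifier>: undo, in reverse
        # order, every update made after the savepoint was established.
        position = self.savepoints[name]
        while len(self.undo_log) > position:
            key, old_value = self.undo_log.pop()
            if old_value is None:
                self.db.pop(key, None)
            else:
                self.db[key] = old_value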

                                 
Serializability and Recoverability
The objective of a concurrency control protocol is to schedule transactions in such a way as to avoid any interference between them. One obvious solution is to allow only one transaction to execute at a time: one transaction is committed before the next transaction is allowed to begin. However, the aim of a multi-user DBMS is also to maximize the degree of concurrency or parallelism in the system, so that transactions that can execute without interfering with one another can run in parallel.


A schedule S is a sequence of the operations by a set of concurrent transactions that preserves the order of the operations in each of the individual transactions; that is, for each transaction Ti in schedule S, the operations of Ti must appear in S in the same order in which they occur in Ti.



In a serial schedule, the transactions are performed in serial order. For example, if we
have two transactions T1 and T2, serial order would be T1 followed by T2, or T2 followed
by T1.



Types of serializability

1)     Conflict serializability

In serializability, the ordering of read and write operations is important:
·         If two transactions only read a data item, they do not conflict and order is not important.
·         If two transactions either read or write completely separate data items, they do not conflict and order is not important.
·         If one transaction writes a data item and another either reads or writes the same data item, the order of execution is important.

Consider the schedule shown in Figure, containing operations from two concurrently executing transactions T7 and T8. The write(balx) instruction of T7 conflicts with the read(balx) instruction of T8. However, the write(balx) instruction of T8 does not conflict with the read(baly) instruction of T7, because the two instructions access different data items.
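
These rules can be captured in a few lines of code. The sketch below models an operation as a (transaction, action, data_item) tuple, which is an illustrative representation, and applies the rules to the T7/T8 example.

# Do two operations from different transactions conflict?
def conflict(op1, op2):
    t1, action1, item1 = op1
    t2, action2, item2 = op2
    if t1 == t2 or item1 != item2:
        return False    # same transaction, or different data items: no conflict
    if action1 == "read" and action2 == "read":
        return False    # two reads never conflict
    return True         # at least one write on the same data item: order matters

print(conflict(("T7", "write", "balx"), ("T8", "read", "balx")))   # True
print(conflict(("T8", "write", "balx"), ("T7", "read", "baly")))   # False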

2)      Non-conflict serializability

If we also change the order of the following non-conflicting operations, we produce the equivalent serial schedule S3 shown in Figure. Here the reordered operations of T7 and T8 do not conflict, so the schedule is equivalent to a serial schedule.

3)      View serializability

Two schedules S1 and S2 consisting of the same operations from n transactions T1, T2, . . . , Tn are view equivalent if the following three conditions hold:
·         For each data item x, if transaction Ti reads the initial value of x in schedule S1, then transaction Ti must also read the initial value of x in schedule S2.
·         For each read operation on data item x by transaction Ti in schedule S1, if the value read by Ti has been written by transaction Tj, then transaction Ti must also read the value of x produced by transaction Tj in schedule S2.
·         For each data item x, if the last write operation on x was performed by transaction Ti in schedule S1, the same transaction must perform the final write on data item x in schedule S2.
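
These three conditions can be checked mechanically. The sketch below assumes a schedule is represented as a list of (transaction, action, data_item) tuples; this representation, and the function names, are assumptions made for illustration.

from collections import defaultdict

def reads_from(schedule):
    # For each read (transaction, item, occurrence number), record which
    # transaction wrote the value it sees; None stands for the initial value.
    last_writer, occurrence, result = {}, defaultdict(int), {}
    for txn, action, item in schedule:
        if action == "read":
            occurrence[(txn, item)] += 1
            result[(txn, item, occurrence[(txn, item)])] = last_writer.get(item)
        elif action == "write":
            last_writer[item] = txn
    return result

def final_writes(schedule):
    # The transaction that performs the final write on each data item.
    last_writer = {}
    for txn, action, item in schedule:
        if action == "write":
            last_writer[item] = txn
    return last_writer

def view_equivalent(s1, s2):
    # Conditions 1 and 2: every read sees the same source (initial value or the
    # same writing transaction) in both schedules; condition 3: the final write
    # on each data item is performed by the same transaction.
    return reads_from(s1) == reads_from(s2) and final_writes(s1) == final_writes(s2)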



Recoverability


Serializability identifies schedules that maintain the consistency of the database, assuming that none of the transactions in the schedule fails. An alternative perspective examines the recoverability of transactions within a schedule. If a transaction fails, the atomicity property requires that we undo the effects of the transaction. In addition, the durability property states that once a transaction commits, its changes cannot be undone (without running another, compensating, transaction).


