Transactions and ACID Properties

Small exploration of database transactions

A transaction is a unit of work that the database has to carry out; in practice we are mainly talking about queries.

At any given moment, a transaction is in one of these states: active, partially committed, committed, failed, or aborted.

The state transitions are largely self-explanatory: a transaction begins its life in the active state. From there it either eventually reaches the fully committed state and terminates, or it fails for some reason. In that case it first enters the failed state, the progress it has already made is rolled back, and the transaction ends up aborted.
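The life cycle described above can be sketched as a tiny state machine. This is only an illustration: the names TxState, TRANSITIONS and advance are made up for this post, and the "partially committed to failed" edge is an assumption borrowed from the usual textbook diagram.

```python
from enum import Enum


class TxState(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"


# Allowed moves between states. The PARTIALLY_COMMITTED -> FAILED edge is an
# assumption taken from the usual textbook diagram.
TRANSITIONS = {
    TxState.ACTIVE: {TxState.PARTIALLY_COMMITTED, TxState.FAILED},
    TxState.PARTIALLY_COMMITTED: {TxState.COMMITTED, TxState.FAILED},
    TxState.FAILED: {TxState.ABORTED},
    TxState.COMMITTED: set(),  # terminal state
    TxState.ABORTED: set(),    # terminal state
}


def advance(current: TxState, target: TxState) -> TxState:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


# Successful path: active -> partially committed -> committed
state = advance(TxState.ACTIVE, TxState.PARTIALLY_COMMITTED)
state = advance(state, TxState.COMMITTED)

# Failure path: active -> failed -> aborted
failed = advance(TxState.ACTIVE, TxState.FAILED)
failed = advance(failed, TxState.ABORTED)
```

Once a transaction is committed or aborted, there is nowhere left to go: any further call to advance raises an error.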

Failures range from critical system failures, like an unexpected shutdown, to minor errors in the executed logic, like an operation addressing a table that does not exist.
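To make the "minor failure" case concrete, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The table names notes and missing_table are made up for the example, and other databases or connectors may report the error differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")

try:
    # The first statement succeeds and implicitly opens a transaction.
    conn.execute("INSERT INTO notes (body) VALUES ('first')")
    # The second one addresses a table that does not exist, so it fails.
    conn.execute("INSERT INTO missing_table (body) VALUES ('second')")
    conn.commit()
    outcome = "committed"
except sqlite3.OperationalError:
    # Failed state: undo the progress made so far, leaving the transaction aborted.
    conn.rollback()
    outcome = "aborted"

rows_left = conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
print(outcome, rows_left)
```

Even though the first insert worked on its own, the rollback removes it, so the notes table ends up empty.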

The only part that I feel is not self-explanatory is the "partially committed" state. It depends on the commit policy in effect, which can be defined in one of these places:

  • in the initial database connection phase

  • in the transaction itself, through the explicit COMMIT command

  • in the internal policy of the database connector (e.g. how a db.query() function works internally)

In these cases, the program may choose to commit partial results while the transaction is still executing, so that if the system fails there is a checkpoint of the work done so far, which the database system saves in its internal logs.

In most cases the default behavior is a single COMMIT at the end of the transaction. As said before, this can be changed to suit specific needs, for example by adding checkpoints to a huge transaction so that partial progress is saved.
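One way to get such checkpoints, sketched here with Python's built-in sqlite3 module, is to commit every few rows during a large load. BATCH_SIZE, load_rows and the fail_at parameter are hypothetical names for this example. Note that, strictly speaking, each batch becomes its own transaction, so the load as a whole is no longer all-or-nothing.

```python
import sqlite3

BATCH_SIZE = 4  # hypothetical checkpoint interval


def load_rows(conn, values, fail_at=None):
    """Insert values, committing every BATCH_SIZE rows as a checkpoint.

    fail_at simulates a failure just before the row at that index is inserted.
    """
    try:
        for i, value in enumerate(values):
            if i == fail_at:
                raise RuntimeError("simulated failure")
            conn.execute("INSERT INTO items (value) VALUES (?)", (value,))
            if (i + 1) % BATCH_SIZE == 0:
                conn.commit()  # checkpoint: the rows so far are saved
        conn.commit()  # save whatever is left after the last full batch
    except RuntimeError:
        conn.rollback()  # only the work since the last checkpoint is lost


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (value INTEGER)")
load_rows(conn, range(10), fail_at=9)
saved = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
```

Here the failure hits after nine rows have been inserted, but only the ninth is lost: the first eight were already committed in two batches of four.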

ACID compliance

A transaction needs to satisfy four properties, described by the ACID acronym, to guarantee data integrity and reliability:

  • A stands for Atomic: the transaction is carried out in its entirety or not at all. If for any reason it cannot complete, it undergoes what is called a "rollback": the changes already made to tables and rows are undone and the remaining portion of the transaction is not executed.

  • C stands for Consistent: a transaction must take the database from one valid state to another, respecting its constraints and rules. This matters especially under concurrency (very common in a database system), where multiple transactions are interleaved and risk operating on inconsistent data.

  • I stands for Isolated: a transaction executes as if it were alone, without depending on the intermediate results of other transactions. Again, a very useful property given the highly concurrent nature of database transactions.

  • D stands for Durable: once the database system has acknowledged that the transaction is committed, its results are permanent and recorded as such by the database internals, even if the system fails afterwards.
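As a rough illustration of atomicity and consistency working together, here is a minimal sketch with Python's built-in sqlite3 module. The accounts table, the transfer and balances helpers and the sample names are all made up for the example; isolation and durability are not demonstrated, since they would need several connections and an on-disk database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is the consistency rule: no balance may go negative.
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts (name, balance) VALUES (?, ?)",
        [("alice", 100), ("bob", 50)],
    )


def transfer(conn, sender, receiver, amount):
    """Move amount between two accounts; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint was violated; the credit above is undone too.
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dict."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


ok = transfer(conn, "alice", "bob", 30)         # valid transfer
rejected = transfer(conn, "alice", "bob", 500)  # would overdraw alice
final = balances(conn)
```

The second transfer credits bob first and only then fails on alice's debit, yet bob's balance is not changed by it: the whole transaction is rolled back, not just the statement that failed.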