In the fast-paced world of data management, the seamless flow of information is paramount. Behind every powerful application and website lies a network of databases working tirelessly to process, store, and retrieve data. But as multiple users and applications access these databases simultaneously, a critical challenge arises: how do we ensure data integrity and prevent conflicts that could corrupt information? Enter database locking, a fundamental mechanism that lets databases handle concurrent requests while maintaining order and consistency. In this article, we will demystify database locking, exploring why it matters, the different types of locks, and best practices for using them. So buckle up and get ready to unlock this crucial aspect of data management that keeps our digital world running smoothly.
Understanding types of locks
There are two broad classes of locks used in databases: read locks and write locks.
A read lock allows other transactions to read the locked data but prohibits them from updating it. A write lock is more restrictive: it allows other transactions neither to read nor to update the locked data.
Now, when we talk about locking data, it actually means locking one or more rows, a set of index records, or in some cases an entire table. Let's understand how locks are acquired by different SQL statements.
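To make the read/write distinction concrete, here is a minimal sketch using MySQL's table-level LOCK TABLES statement, assuming a hypothetical accounts table (with InnoDB you would usually rely on row locks inside transactions instead, as we will see below):

```sql
-- A READ lock: other sessions can still read the table,
-- but any session that tries to write to it must wait.
LOCK TABLES accounts READ;
SELECT balance FROM accounts WHERE id = 42;
UNLOCK TABLES;

-- A WRITE lock: other sessions can neither read nor write
-- the table until the lock is released.
LOCK TABLES accounts WRITE;
UPDATE accounts SET balance = balance - 100 WHERE id = 42;
UNLOCK TABLES;
```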
Digging deeper into locks
According to the official documentation of MySQL,
"A locking read, an UPDATE, or a DELETE generally set record locks on every index record that is scanned in the processing of an SQL statement. It does not matter whether there are WHERE conditions in the statement that would exclude the row."
"If you have no indexes suitable for your statement and MySQL must scan the entire table to process the statement, every row of the table becomes locked, which in turn blocks all inserts by other users to the table. It is important to create good indexes so that your queries do not scan more rows than necessary."
Let's unpack this. The documentation uses the term locking read. The ordinary SELECT ... FROM that we use every day is just a consistent read, which reads from a snapshot of the database without setting any locks. InnoDB provides SELECT ... FOR SHARE and SELECT ... FOR UPDATE for making locking reads, which lock the rows that are processed while executing the query. This is where indexes become really important. If your query filters on columns that are not indexed and the database engine must scan the entire table to fetch the results, then the entire table ends up locked against updates. With a suitable index, InnoDB only needs to lock the matching index records rather than every row it scans. So it is very important to have proper indexes in place before using such locking reads.
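The effect of indexing on lock scope can be sketched like this, assuming a hypothetical orders table with a status column:

```sql
-- Without an index on `status`, InnoDB must scan the whole table,
-- and every scanned row is record-locked -- effectively locking
-- the entire table against concurrent writes:
BEGIN;
SELECT * FROM orders WHERE status = 'PENDING' FOR UPDATE;
COMMIT;

-- With an index, only the index records matching the condition
-- are locked, so unrelated rows stay writable:
CREATE INDEX idx_orders_status ON orders (status);

BEGIN;
SELECT * FROM orders WHERE status = 'PENDING' FOR UPDATE;
COMMIT;
```

You can verify which rows a statement would touch with EXPLAIN before relying on it in a locking read.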
Locking Reads
As discussed earlier, InnoDB provides two types of locking reads. These are needed because plain consistent reads offer no protection against concurrent writers. The two differ in the strength of the lock taken. SELECT ... FOR SHARE takes shared-mode locks: other transactions can read the locked rows, and even take their own shared locks on them, but cannot update them. SELECT ... FOR UPDATE takes exclusive locks: other transactions can neither lock-read nor update the locked rows until the transaction ends (plain consistent reads still see a snapshot). SELECT ... FOR UPDATE comes in handy in transactional workloads where multiple transactions work on the same set of rows and consistency must be preserved. I have given a practical example of this in my post on Concurrency Control, so you can check it out to see locks at work in a highly concurrent ticket booking system.
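The difference between the two lock modes shows up when two sessions touch the same rows. A sketch, assuming a hypothetical seats table:

```sql
-- Session A: take a shared lock on the rows for show 7
BEGIN;
SELECT * FROM seats WHERE show_id = 7 FOR SHARE;
-- (transaction left open while Session B runs)

-- Session B, concurrently:
SELECT * FROM seats WHERE show_id = 7;             -- consistent read: succeeds
SELECT * FROM seats WHERE show_id = 7 FOR SHARE;   -- shared lock: succeeds
SELECT * FROM seats WHERE show_id = 7 FOR UPDATE;  -- exclusive lock: blocks
UPDATE seats SET booked = 1 WHERE show_id = 7;     -- write: blocks

-- Session B unblocks once Session A commits or rolls back:
COMMIT;
```

Had Session A used FOR UPDATE instead, Session B's FOR SHARE would also have blocked, since shared and exclusive locks conflict.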
Avoiding Locks
Whenever a transaction has acquired a lock on certain rows, other transactions normally have to wait until the lock is released before they can access those rows. In some cases this is not acceptable: the transaction should either exit with an error or simply skip the locked rows and move on. The NOWAIT and SKIP LOCKED options can be used with SELECT ... FOR SHARE and SELECT ... FOR UPDATE to avoid waiting for other transactions to release row locks. NOWAIT returns an error immediately if any row in its result set is locked by another transaction, while SKIP LOCKED removes locked rows from the result set and returns only the unlocked ones. A very common example is a ticket booking system. Let's say we want to book a ticket for the movie Oppenheimer. On the opening weekend you would expect a lot of people to be booking seats simultaneously. When a person proceeds to book a seat, that seat is locked for them for some time so that they can successfully make the payment. But others who are still browsing for their favourite seats shouldn't be kept waiting until that person has finished booking. To avoid waiting for the other transaction to complete and release its lock, we can use SKIP LOCKED to display only the available seats that are not locked.
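The booking scenario above can be sketched as follows, again with the hypothetical seats table (NOWAIT and SKIP LOCKED require MySQL 8.0 or later):

```sql
-- Session A: a customer holds seat 12 while completing payment
BEGIN;
SELECT * FROM seats
WHERE show_id = 7 AND seat_no = 12 AND booked = 0
FOR UPDATE;
-- (payment in progress; transaction stays open)

-- Session B: show other customers only the seats nobody holds.
-- Seat 12 is silently omitted instead of blocking the query.
SELECT seat_no FROM seats
WHERE show_id = 7 AND booked = 0
FOR UPDATE SKIP LOCKED;

-- Alternatively, fail fast when a specific seat is contested:
SELECT * FROM seats
WHERE show_id = 7 AND seat_no = 12
FOR UPDATE NOWAIT;   -- raises a lock-wait error immediately
```

SKIP LOCKED trades completeness for latency, so it suits queue-like workloads where any unlocked row will do; NOWAIT suits cases where the caller wants to retry or report the conflict itself.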
Conclusion
The careful orchestration of locks provides a delicate balance between concurrency and consistency. By applying the insights gained here, you can navigate the complexities of data locking, leveraging SELECT ... FOR SHARE and SELECT ... FOR UPDATE to synchronize concurrent operations effectively. Moreover, NOWAIT and SKIP LOCKED offer a pragmatic solution for scenarios where waiting is not an option.
In the next part of this post I will dive into the practical implementation of locking by actually locking rows through raw SQL queries, which will give you a better understanding of how to use these locking concepts in real life.