Re: A few beginner's questions concerning concurrency control

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Yoram Biberman <yoramb(at)hadassah-col(dot)ac(dot)il>
Cc: "'pgsql-general(at)postgresql(dot)org'" <pgsql-general(at)postgresql(dot)org>
Subject: Re: A few beginner's questions concerning concurrency control
Date: 2004-06-29 14:04:14
Message-ID: 28784.1088517854@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Yoram Biberman <yoramb(at)hadassah-col(dot)ac(dot)il> writes:
> ... But we cannot say that the schedule is equivalent
> nor to the serial schedule that run T1 first (as in this schedule T2 would
> read the value of B after it was modified by T1), neither to the schedule
> that run T2 first (from a symmetric argument concerning item B). So the
> schedule is not serializable in the sense of the theory of database systems
> (e.g. Ullman=92s Principles of Database Systems book). Am I right?

Right. To make the behavior mathematically serializable we would have
to add predicate locking --- that is, when T2 reads A it would have to
take out a read-lock on A (in fact, on all rows potentially matching
the WHERE condition it used for its SELECT). This would be amazingly
expensive and it wouldn't actually improve functionality for most
applications :-(

> I was not able to understand the difference between all the lock modes, when
> would a transaction (or the db system) use each lock, and which data
> structures each lock locks.

All the locks are table locks, and the only differences between them are
which other lock types are blocked by a given lock type. The reason for
having so many is to allow fine-grained control of what sorts of things
can be done to a table concurrently. For instance, we can allow reads
and writes to occur concurrently, but we can't allow reads or writes to
occur concurrently with a table schema modification (such as DROP INDEX).

> In some places it is said that a transaction that only reads does not lock
> any table or row, and is never blocked. But if a transaction T1 modifies a
> row r, and at the same time transaction T2 selects r, then T2 need to wait
> until T1 finishes (as T1 might have deleted the row, or modified it in a way
> that would cause T1 not to need it, as the row does not satisfy T2=92s WHERE
> clause). Am I right?

No, because T2 is selecting the row or not based on its state at T2's
snapshot time. What T1 did to it immediately after that time is not
interesting.

> On the other hand in order to read a table T2 gets an
> ACCESS SHARE lock on the table, so it blocks transactions that want to drop
> the table (and I do not understand why it does not block transactions that
> want to add/delete/update rows of the table).

We do that mainly because the physical act of dropping the table (ie,
removing the storage file) isn't transactional. We could make this work
if we found a way to postpone the file unlink until after the table is
no longer visible to any running transaction --- but that seems like
more trouble than it's really worth. In practice, concurrent read and
write is what you want for real applications; concurrent schema changes
are not important enough to install a large amount of mechanism to
support. (IMHO anyway.)

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2004-06-29 14:10:18 Re: dup(0) failed after 3195 successes: Bad file descriptor
Previous Message Együd Csaba 2004-06-29 13:55:11 Re: Performance problem on RH7.1