Re: Update blocking a select count(*)?

From: Benedict Holland <benedict(dot)m(dot)holland(at)gmail(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Update blocking a select count(*)?
Date: 2012-06-15 19:45:48
Message-ID: CAD+mzoxCUbQJb4XMuNizFXU9mVYcUg9myGomKXW1TwXuKnauTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

You were completely correct. I stopped the Alter Table and the select is
now running. Is it a bug that the blocking process reported is the finial
process but really the process blocking the intermediate? If alter table
can block a select but the update can't, then I personally would consider
this a rather large bug because from, the DB perspective, the wrong
information is being presented. This also means I am now very skeptical
that the blocking processes are correct in these sorts of situations. I
can't be the first person to discover this and thank you for bearing with
me.

> In PostgreSQL SELECT count(*) must scan the table to see which rows
> are visible to the executing database transaction. Without that, it
> can't give a completely accurate count from a transactional
> perspective. If you can settle for a non-transactional
> approximation, select the reltuples value from the pg_class row for
> the table.

I agree with you somewhat. I would assume that "select count(*)" is special
in the sense that it is table schema independent. I would actually hope
that this is an order 1 operation since the total table length should be
stored somewhere as it's a reasonably useful source of information. Because
it is schema independent, even an alter table shouldn't block it as why
should it? The transaction comes in when you are adding more data to the
end of the table so the select count(*) needs a transaction to guarantee a
finish only. This should not block or be blocked on anything. The where
clause, a group by or a distinct clause etc. should block on an Alter
table. Is this just an edge case which is not worth looking at?

Thank you so much for your help.
~Ben

On Fri, Jun 15, 2012 at 3:32 PM, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov
> wrote:

> Benedict Holland <benedict(dot)m(dot)holland(at)gmail(dot)com> wrote:
>
> > Yes. I needed to do a full vacuum. Again, the database is very
> > large. I batch inserted quite a lot of data and then modified that
> > data. The vacuum isn't blocking anything. It was blocking other
> > tables (as expected) but continues to run and clean. My tables in
> > general are around 10GB, each update seems to nearly double the
> > size of it so I required a full vacuum.
>
> I was trying to suggest techniques which would prevent that bloat
> and make the VACUUM FULL unnecessary. But, now that I've had a
> chance to format the attachment into a readable format, I agree that
> it isn't part of the problem. The iceberg in this case is the ALTER
> TABLE, which is competing with two other queries.
>
> > The blocked statements are the select count(*) and the alter
> > table.
>
> OK.
>
> > Both are blocked on the update table command.
>
> Not directly. The lock held by the UPDATE would *not* block the
> SELECT; but it *does* block the ALTER TABLE command, which can't
> share the table while it changes the structure of the table. The
> SELECT is blocked behind the ALTER TABLE.
>
> > The alter table command SHOULD be blocked and that is fine.
>
> I'm glad we're on the same page there.
>
> > The select count(*) should never be blocked as that is the whole
> > point of running an MVCC operation at least to my understanding. I
> > can even accept the use case that the select should block with an
> > Alter Table operation if data is retrieved from the table, but a
> > select count(*) only returns the number of rows and should be
> > table space independent.
>
> In PostgreSQL SELECT count(*) must scan the table to see which rows
> are visible to the executing database transaction. Without that, it
> can't give a completely accurate count from a transactional
> perspective. If you can settle for a non-transactional
> approximation, select the reltuples value from the pg_class row for
> the table.
>
> > I also don't understand why a select count(*) requires an
> > AccessShareLock. I don't understand why a select should lock
> > anything at all.
>
> So that the table isn't dropped or truncated while the count is
> scanning the table.
>
> -Kevin
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2012-06-15 19:51:26 Re: Update blocking a select count(*)?
Previous Message Kevin Grittner 2012-06-15 19:32:10 Re: Update blocking a select count(*)?