Re: Piggybacking vacuum I/O

From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Piggybacking vacuum I/O
Date: 2007-01-24 04:02:24
Message-ID: 2e78013d0701232002u1b4a192fqe5422ff3d1fdc259@mail.gmail.com
Lists: pgsql-hackers

On 1/23/07, Heikki Linnakangas <heikki(at)enterprisedb(dot)com> wrote:
>
> Pavan Deolasee wrote:
> > Another source of I/O is perhaps the CLOG read/writes for checking
> > transaction status. If we are talking about large tables like accounts in
> > pgbench or customer/stock in DBT2, the tables are vacuumed much later than
> > the actual UPDATEs. I don't have any numbers to prove yet, but my sense is
> > that CLOG pages holding the status of many of the transactions might have
> > been already flushed out of the cache and require an I/O. Since the default
> > CLOG SLRU buffers is set to 8, there could be severe CLOG SLRU thrashing
> > during VACUUM as the transaction ids will be all random in a heap page.
>
> 8 log pages hold 8*8192*4=262144 transactions. If the active set of
> transactions is larger than that, the OS cache will probably hold more
> clog pages. I guess you could end up doing some I/O on clog on a vacuum
> of a big table, if you have a high transaction rate and vacuum
> infrequently...

On a typical desktop-class 2-CPU Dell machine, we have seen pgbench clock
more than 1500 tps. That implies the eight CLOG buffers would be cycled
through in less than 262144/1500 = 174 seconds, while VACUUM on the accounts
table takes much longer than that to trigger.
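The arithmetic above can be checked in a few lines (a sketch; the constants
mirror the defaults discussed in the thread: 8 SLRU buffers, 8 KB CLOG
blocks, and 2 status bits per transaction, i.e. 4 transactions per byte):

```python
# Back-of-the-envelope check of the CLOG capacity and fill-time figures.
CLOG_BUFFERS = 8        # default number of CLOG SLRU buffers
BLCKSZ = 8192           # CLOG block size in bytes
XACTS_PER_BYTE = 4      # 2 status bits per transaction

# Transactions whose status fits in the in-memory CLOG buffers
xacts_in_cache = CLOG_BUFFERS * BLCKSZ * XACTS_PER_BYTE
print(xacts_in_cache)               # 262144

# At the observed pgbench throughput, time to burn through that window
tps = 1500
print(xacts_in_cache / tps)         # ~174 seconds
```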

> > Would it help to set the status of the XMIN/XMAX of tuples early enough
> > such that the heap page is still in the buffer cache, but late enough
> > such that the XMIN/XMAX transactions are finished ?
>
> Yeah, we could do that. First I'd like to see some more evidence that
> clog trashing is a problem, though.
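The thrashing claim can be illustrated with a toy model (this is not
PostgreSQL's SLRU code, just a plain LRU cache of CLOG pages; the page and
cache sizes follow the defaults mentioned above, and the two xid
distributions are made up for illustration):

```python
import random
from collections import OrderedDict

XACTS_PER_PAGE = 8192 * 4   # transactions whose status fits on one CLOG page
CACHE_PAGES = 8             # default CLOG SLRU buffer count

def clog_page_faults(xids, cache_pages=CACHE_PAGES):
    """Count page faults for a sequence of transaction-status lookups,
    modelling the SLRU as a simple LRU cache of CLOG pages."""
    cache = OrderedDict()
    faults = 0
    for xid in xids:
        page = xid // XACTS_PER_PAGE
        if page in cache:
            cache.move_to_end(page)           # mark as recently used
        else:
            faults += 1
            if len(cache) >= cache_pages:
                cache.popitem(last=False)     # evict least recently used
            cache[page] = True
    return faults

random.seed(1)
# xids scattered across a 10M-transaction history, as on the heap pages
# of a long-lived, heavily updated table...
random_xids = [random.randrange(10_000_000) for _ in range(100_000)]
# ...versus the same number of lookups in xid order
sequential_xids = list(range(100_000))

print(clog_page_faults(random_xids))      # almost every lookup faults
print(clog_page_faults(sequential_xids))  # only a handful of faults
```

With only 8 of roughly 300 live CLOG pages cached, random lookups miss
almost every time, while sequential ones fault once per page.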

Here are some numbers from a 4 hour DBT2 run with 270 warehouses and 50
connections.

2007-01-23 07:40:30 PST_17428 LOG: vacuuming "public.warehouse"
2007-01-23 07:40:30 PST_17428 LOG: CLOG r(1), w(0)- vacuum start
2007-01-23 07:40:30 PST_17428 LOG: CLOG r(1), w(0)- vacuum end
2007-01-23 07:40:30 PST_17428 LOG: "warehouse": found 1214 removable,
2275 nonremovable row versions in 111 pages

<snip>

2007-01-23 11:11:43 PST_30356 LOG: vacuuming "public.stock"
2007-01-23 11:11:43 PST_30356 LOG: CLOG r(545323), w(91)- vacuum start
2007-01-23 12:03:14 PST_30356 LOG: CLOG r(1181851), w(133) - vacuum end
2007-01-23 12:03:14 PST_30356 LOG: "stock": found 5645264 removable,
27003788 nonremovable row versions in 1554697 pages

I have just counted the number of read/write calls on the CLOG blocks. As
you can see, the total number of CLOG reads jumped from 545323 to 1181851,
i.e. 1181851 - 545323 = 636528 CLOG block reads for the 1554697 pages of the
stock table. This figure is only indicative, since it also includes CLOG
block reads made by other backends while VACUUM was running (VACUUM took
almost 55 minutes to complete). Still, in the first 210 minutes of the run
the total was only 545323 reads, so most of the 636528 reads in the next 55
minutes can be attributed to VACUUM.
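Restating that attribution arithmetic, with the numbers copied from the log
excerpts above:

```python
# CLOG read counters around the VACUUM of "stock"; the delta is only
# indicative, since concurrent backends also bump the counter.
reads_at_vacuum_start = 545323
reads_at_vacuum_end = 1181851
stock_heap_pages = 1554697

vacuum_clog_reads = reads_at_vacuum_end - reads_at_vacuum_start
print(vacuum_clog_reads)                      # 636528
print(vacuum_clog_reads / stock_heap_pages)   # ~0.41 CLOG reads per heap page
```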

The number of writes is very small though, maybe because most of the CLOG
pages are accessed read-only. A simple patch that I used to gather these
numbers is attached.

Thanks,
Pavan

EnterpriseDB http://www.enterprisedb.com

Attachment Content-Type Size
CLOG-stats.patch.gz application/x-gzip 1.9 KB
