Re: Deleting bytea, autovacuum, and 8.2/8.4 differences

From: Dave Crooke <dcrooke(at)gmail(dot)com>
To: "fkater(at)googlemail(dot)com" <fkater(at)googlemail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Deleting bytea, autovacuum, and 8.2/8.4 differences
Date: 2010-03-14 03:04:41
Message-ID: ca24673e1003131904p1e17e607o17eb67f08df1db30@mail.gmail.com
Lists: pgsql-performance

Hi there

I'm not an expert on PG's TOAST system, but here are a couple of thoughts
inline below.

Cheers
Dave

On Sat, Mar 13, 2010 at 3:17 PM, fkater(at)googlemail(dot)com <
fkater(at)googlemail(dot)com> wrote:

> Hi all,
>
> my posting on 2010-01-14 about the performance when writing
> bytea to disk caused a longer discussion. While the fact
> still holds that the overall postgresql write performance is
> roughly 25% of the serial I/O disk performance this was
> compensated for my special use case here by doing some other
> non-postgresql related things in parallel.
>
> Now I cannot optimize my processes any further, however, now
> I am facing another quite unexpected performance issue:
> Deleting rows from my simple table (with the bytea column)
> having 16 MB data each, takes roughly as long as writing
> them!
>
> Little more detail:
>
> * The table just has 5 unused int columns, a timestamp,
> OIDs, and the bytea column, *no indices*; the bytea storage
> type is 'extended', the 16 MB are compressed to approx. the
> half.
>

Why no indices?

>
> * All the usual optimizations are done to reach better
> write through (pg_xlog on another disk, much tweaks to the
> server conf etc), however, this does not matter here, since
> not the absolute performance is of interest here but the
> fact that deleting roughly takes 100% of the writing time.
>
> * I need to write 15 rows of 16 MB each to disk in a maximum
> time of 15 s, which is performed here in roughly 10 seconds,
> however, now I am facing the problem that keeping my
> database tidy (deleting rows) takes another 5-15 s (10s on
> average), so my process exceeds the maximum time of 15s for
> about 5s.
>
> * Right now I am deleting like this:
>
> DELETE FROM table WHERE (CURRENT_TIMESTAMP -
> my_timestamp_column) > interval '2 minutes';
>

You *need* an index on my_timestamp_column.
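
A rough sketch of what that could look like, assuming the table is called
my_table and the index my_table_ts_idx (both names are made up for
illustration). The predicate is rewritten so that the bare column sits on one
side of the comparison; a plain btree index on the column cannot be used for
the (CURRENT_TIMESTAMP - my_timestamp_column) form as written. Whether the
planner actually picks the index still depends on table size and statistics,
so it is worth checking with EXPLAIN first.

```sql
-- Hypothetical names: my_table, my_table_ts_idx
CREATE INDEX my_table_ts_idx ON my_table (my_timestamp_column);

-- Same cutoff as the original DELETE, but comparing the bare column.
-- EXPLAIN only shows the plan; it does not delete anything.
EXPLAIN
DELETE FROM my_table
WHERE my_timestamp_column < CURRENT_TIMESTAMP - interval '2 minutes';
```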

>
> while it is planned to have the interval set to 6 hours in
> the final version (thus creating a FIFO buffer for the
> latest 6 hours of inserted data; so the FIFO will keep
> approx. 10.000 rows spanning 160-200 GB data).
>

That's not the way to keep a 6 hour rolling buffer ... what you need to do
is run the delete frequently, with *interval '6 hours'* in the SQL acting
as the cutoff.
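
Something along these lines, run every minute or so from the application or a
scheduled job (my_table is a placeholder name, and the schedule is only an
assumption):

```sql
-- Each run removes only the rows that have aged past the
-- 6 hour cutoff since the previous run.
DELETE FROM my_table
WHERE my_timestamp_column < CURRENT_TIMESTAMP - interval '6 hours';
```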

If you really do want to drop the entire table contents before refilling it,
do a *DROP TABLE* and recreate it.

> * This deletion SQL command was simply repeatedly executed
> by pgAdmin while my app kept adding the 16 MB rows.
>

Are you sure you are timing the delete, and not pgAdmin re-populating some
kind of buffer?

>
> * Autovacuum is on; I believe I need to keep it on,
> otherwise I do not free the disk space, right? If I switch
> it off, the deletion time reduces from the average 10s down
> to 4s.
>

You may be running autovacuum too aggressively; it may be interfering with
I/O to the tables.
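
To see how it is currently configured, something like the following should
work from psql. This is only a way to inspect the settings, not a
recommendation for particular values; autovacuum_naptime and
autovacuum_vacuum_cost_delay are the knobs I would look at first.

```sql
-- List the autovacuum and cost-based vacuum delay settings in effect
SELECT name, setting
FROM pg_settings
WHERE name LIKE 'autovacuum%'
   OR name LIKE 'vacuum_cost%'
ORDER BY name;
```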

Plain Postgres vacuuming generally does not free disk space (in the sense of
returning it to the OS); it removes old versions of rows that have been
UPDATEd or DELETEd and makes that space in the table file available for new
writes.
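
One way to observe this, as a sketch (my_table is a placeholder name): compare
the on-disk size before and after a manual vacuum. I would expect the number
to stay roughly the same, with later inserts reusing the reclaimed space
rather than growing the files.

```sql
-- Total on-disk size, including the TOAST table and any indexes
SELECT pg_size_pretty(pg_total_relation_size('my_table'));

-- Reports how many dead row versions were removed
VACUUM VERBOSE my_table;

SELECT pg_size_pretty(pg_total_relation_size('my_table'));
```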

> * I am using server + libpq version 8.2.4, currently on
> WinXP. Will an upgrade to 8.4 help here?
>

8.4 has a lot of performance improvements. It's definitely worth a shot. I'd
also consider switching to another OS where you can use a 64-bit version of
PG and a much bigger buffer cache.

> Do you have any other ideas to help me out?
> Oh, please...
>
> Thank You
> Felix
