Re: Autovacuum - what does it actually do?

From: "Kasia Tuszynska" <ktuszynska(at)esri(dot)com>
To: "Brad Nicholson" <bnichols(at)ca(dot)afilias(dot)info>
Cc: <pgsql-novice(at)postgresql(dot)org>
Subject: Re: Autovacuum - what does it actually do?
Date: 2007-11-12 23:23:54
Message-ID: D7BFFE348C53EF4E8AA0698B1E395FA90B061BA2@flybywire.esri.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

-----Original Message-----
From: Brad Nicholson [mailto:bnichols(at)ca(dot)afilias(dot)info]
Sent: Friday, November 09, 2007 12:23 PM
To: Kasia Tuszynska
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: [NOVICE] Autovacuum - what does it actually do?

On Fri, 2007-11-09 at 09:06 -0800, Kasia Tuszynska wrote:
> Hello Everybody,
> I am trying to establish what does Autovaccum actually do, as it is
causing data load failures?

Autovacuum is a daemon that triggers vacuums when enough dead tuples are
generated in a table. The idea is that if it set up properly, the
database can automatically vacuum the tables when they need to get
vacuumed as opposed to the administrator having to do it manually.
There is still some tuning involved, and it's not perfect, but that's
another story.

There is no difference in the work that is actually done between a
vacuum triggered by autovacuum and one triggered manually.

> I ask because I am running into a weird scenario with large dataloads
and the autovacuum process.
> My setup:
> Windows 2003 server, sp2
> PostgreSQL 8.2.4 installed from the gui (autovacuum on in
postmaster.conf by default, no changes to postmaster.conf at all)
> Type of data loaded: large spatial (non postgis) -> lots of blob
data(4 gigs)
>
> This is my scenario:
> I am loading various sets of data, on several ones (that happen to be
very large)it seems like the autovacuum threshold is triggered and on
these select few datasets it crashes the postmaster:
> With code -1073741819, which I looked up to mean an access violation
to an address space, it was apparently bugged:

Can you provide a bit more info on this data load procedure? Are you
deleting from a table then loading the data?
Are you loading into a table that already has data that is being
changed? How are you loading it? Is it all in
one transaction?

A data load itself should not trigger a vacuum. Dead tuples are created
by updating or deleting tuples, not by inserting
them. You will need to analyze the table afterwards, it's possibe that
autovac is triggering that analyze.

> BUG #3427: Autovacuum crashed server
>
> http://archives.postgresql.org/pgsql-bugs/2007-07/msg00016.php
>
> The bug does not explain what is actually happening, what is
autovacuum doing to get an access violation, why would it happen on very
specific data everytime ?

I really don't know anything about Windows, but on UNIX, if a process
dies a hard enough death, it can corrupt the shared memory segment and
bring the whole postmaster down. That's what it looks like is
happening. I have no why though.
--
Brad Nicholson 416-673-4106
Database Administrator, Afilias Canada Corp.

Hi Brad,
Thank you for taking the time to answer my question.
A little bit more about what is happening.

"Can you provide a bit more info on this data load procedure? Are you
deleting from a table then loading the data?
Are you loading into a table that already has data that is being
changed? How are you loading it? Is it all in
one transaction?"
- this is a fresh load, the table is created and data is inserted, there
is no pre- exiting data to be considered for a delete or an update.
- potentially a gist index would be created on the table as the table
is created and than, it is immediately dropped to be recreated after the
bulk load has completed.
- the data is has a spatial component stored in a user defined spatial
type, similar in usage and construction to the postgis spatial type, it
stored data in binary and it is what utilizes the gist
- we are loading data in many transactions with an autocommitt interval
set on the client app that is loading.
- turning autovacuum off in postmaster.conf solves the issue and data is
loaded without a problem.

Thank you,
Kasia

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message Devrim GÜNDÜZ 2007-11-13 00:44:02 Re: postgresql 8.1.10-1PGDG (32 bit) installaion on RHEL 4
Previous Message gokul 2007-11-12 22:20:35 postgresql 8.1.10-1PGDG (32 bit) installaion on RHEL 4