Periodically slow inserts

From: Gael Le Mignot <gael(at)pilotsystems(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Periodically slow inserts
Date: 2010-10-21 12:25:44
Message-ID: plop8762wvspfb.fsf@aoskar.kilobug.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


Hello,

We are using PostgreSQL for storing data and full-text search indexes
for the webiste of a daily newspaper. We are very happy overall with the
results, but we have one "weird" behaviour that we would like to solve.

The problem is when we index objects into the full-text search part of
the database (which a DELETE and then an INSERT into a specific table),
the INSERT sometimes take a long time (from 10s to 20s), but the same
insert (and many other similar ones) are fast (below 0.2s).

This slowness comes regularly, about every 200 objects indexed,
regardless of the frequency of the inserts. If I reindex one object
every 5 seconds for one hour, or one object every second for 10 minutes,
I've the same kind of results : around 0.5% of the time, indexing took
more than 10s.

The positive point is that this slowness doesn't block the rest of
queries to the database, but it's still painful to have to wait (even if
only once in a while) for 10s or 20s when the end-user hits the "save"
button.

This slowness is associated with very high IO load on the operating
system. I tried playing with checkpoint parameters (making them more
frequent or less frequent, but I didn't notice any siginificant
difference).

Do you have any hint on how to smooth the process, so we don't have this
regular huge slowdown ?

If you want more details about the setup :

- server is a Xen virtual machine with 8Gb of memory, disks being 15000
rpm SAS disks on RAID 1, and CPU being one core of a Nehalem processor
(but CPU load is low anyway).

- the database schema is like :

CREATE TABLE sesql_index (
classname varchar(255),
id integer,
created_at timestamp,
modified_at timestamp,
created_by integer,
modified_by integer,
workflow_state integer,
site_id integer,
title_text text,
title_tsv tsvector,
subtitle_text text,
subtitle_tsv tsvector,
fulltext_text text,
fulltext_tsv tsvector,
authors integer[],
folders integer[],
[...]
indexed_at timestamp DEFAULT NOW(),
PRIMARY KEY (classname, id)
);

CREATE TABLE sesql_author (CHECK (classname = 'Author'),
PRIMARY KEY (classname, id)) INHERITS (sesql_index);

CREATE TABLE sesql_program (CHECK (classname = 'Program'),
PRIMARY KEY (classname, id)) INHERITS (sesql_index);

CREATE TABLE sesql_default (CHECK (classname = 'Slideshow' OR classname
= 'Book' OR classname = 'Article' OR classname = 'Publication' OR
classname = 'Forum'), PRIMARY KEY (classname, id)) INHERITS (sesql_index);

(with a few other similar tables for different objects).

Inserts/deletes are done directly into the child tables, searches are
done either on the master table (sesql_index) or on the child tables
depending of the use case (but search works fine anyway).

In addition to that we have several indexes, created on each child
tables :

CREATE INDEX sesql_default_classname_index ON sesql_default (classname);
CREATE INDEX sesql_default_id_index ON sesql_default (id);
CREATE INDEX sesql_default_created_at_index ON sesql_default (created_at);
CREATE INDEX sesql_default_modified_at_index ON sesql_default (modified_at);
CREATE INDEX sesql_default_created_by_index ON sesql_default (created_by);
CREATE INDEX sesql_default_modified_by_index ON sesql_default (modified_by);
CREATE INDEX sesql_default_workflow_state_index ON sesql_default (workflow_state);
CREATE INDEX sesql_default_site_id_index ON sesql_default (site_id);
CREATE INDEX sesql_default_publication_date_index ON sesql_default (publication_date);
CREATE INDEX sesql_default_authors_index ON sesql_default USING GIN (authors);
CREATE INDEX sesql_default_folders_index ON sesql_default USING GIN (folders);

And the heavy ones, for each fulltext field, we have two columns, the
text and the tsv, with an index on the tsv, and the tsv itself is
updated via a trigger :

CREATE INDEX sesql_default_fulltext_index ON sesql_default USING GIN (fulltext_tsv);

CREATE TRIGGER sesql_default_fulltext_update BEFORE INSERT OR UPDATE
ON sesql_default FOR EACH ROW EXECUTE PROCEDURE
tsvector_update_trigger(fulltext_tsv, 'public.lem_french', fulltext_text);

Thanks a lot for reading me until here ;)

Regards,

--
Gaël Le Mignot - gael(at)pilotsystems(dot)net
Pilot Systems - 9, rue Desargues - 75011 Paris
Tel : +33 1 44 53 05 55 - www.pilotsystems.net
Gérez vos contacts et vos newsletters : www.cockpit-mailing.com

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Kenneth Marshall 2010-10-21 13:03:08 Re: Periodically slow inserts
Previous Message Scott Marlowe 2010-10-21 07:54:00 Re: Index scan is not working, why??