Re: 8.3rc1 Out of memory when performing update

From: "Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 8.3rc1 Out of memory when performing update
Date: 2008-01-25 03:20:49
Message-ID: F0238EBA67824444BC1CB4700960CB480482B5B7@dmpeints002.isotach.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> "Stephen Denne" <Stephen(dot)Denne(at)datamail(dot)co(dot)nz> writes:
> > A simple update query, over roughly 17 million rows,
> > populating a newly added column in a table, resulted in an
> > out of memory error when the process memory usage reached
> > 2GB. Could this be due to a poor choice of some configuration
> > parameter, or is there a limit on how many rows I can update
> > in a single statement?
>
> Do you have any triggers or foreign keys on that table? For that
> matter, let's see its whole schema definition.
>
> regards, tom lane

No triggers on that table, one primary key, one foreign key, two indexes.
The foreign key references a primary key which is also an integer.
No other tables which reference document_file.
No inherited tables.
There are as many document_file rows as there are rows in the document table,
document_file.document_id is unique, though not constrained.
(Designed as a one to many relationship, but only ever used as one to one.)

I altered the update statement slightly, and reran the query.

I disabled autovacuum after a while and cancelled the autovacuum process that was trying to vacuum analyze document_file.

The altered query has been running over 3 hours now,
without using lots of memory (38M private bytes).
2046 temp files were created (2.54GB worth),
which have recently changed from slowly growing in size
to very very slowly reducing in number.

Altered query that has not crashed:
UPDATE ONLY document_file AS df SET document_type_id = d.document_type_id FROM document AS d WHERE d.id = document_id;

Hash Join (cost=674810.80..6701669.63 rows=16972702 width=621)
Hash Cond: (df.document_id = d.id)
-> Seq Scan on document_file df (cost=0.00..750298.65 rows=27702365 width=617)
-> Hash (cost=396352.02..396352.02 rows=16972702 width=8)
-> Seq Scan on document d (cost=0.00..396352.02 rows=16972702 width=8)

c.f. original (re-explained):
UPDATE document_file SET document_type_id = (SELECT document_type_id FROM document d where d.id = document_id);

Seq Scan on document_file (cost=0.00..281183329.64 rows=27702834 width=617)
SubPlan
-> Index Scan using pk_document_id on document d (cost=0.00..10.12 rows=1 width=4)
Index Cond: (id = $0)

Schema as reported by pgadmin:

CREATE TABLE document_file
(
id integer NOT NULL DEFAULT nextval(('document_file_seq'::text)::regclass),
document_id integer NOT NULL,
archive_directory_location character varying(255) NOT NULL,
mime_type character varying(255),
file_name character varying(255) NOT NULL,
life_cycle_status character varying(255),
version integer DEFAULT 0,
is_current boolean DEFAULT true,
file_size integer NOT NULL,
document_type_id integer,
CONSTRAINT pk_document_file_id PRIMARY KEY (id),
CONSTRAINT fk_document_id FOREIGN KEY (document_id)
REFERENCES document (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE CASCADE
)
WITH (OIDS=FALSE);
ALTER TABLE document_file OWNER TO postgres;
GRANT ALL ON TABLE document_file TO postgres;
GRANT ALL ON TABLE document_file TO vapps;
GRANT ALL ON TABLE document_file TO vrconfig;

CREATE INDEX location_ix
ON document_file
USING btree
(archive_directory_location);

CREATE INDEX tc_file_document
ON document_file
USING btree
(document_id);

Disclaimer:
At the Datamail Group we value team commitment, respect, achievement, customer focus, and courage. This email with any attachments is confidential and may be subject to legal privilege. If it is not intended for you please advise by reply immediately, destroy it and do not copy, disclose or use it in any way.

__________________________________________________________________
This email has been scanned by the DMZGlobal Business Quality
Electronic Messaging Suite.
Please see http://www.dmzglobal.com/services/bqem.htm for details.
__________________________________________________________________

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2008-01-25 04:50:25 Re: 8.3rc1 Out of memory when performing update
Previous Message Tom Lane 2008-01-25 02:31:14 Re: 8.3rc1 Out of memory when performing update