BUG #1541: Unusually long INSERT times after fresh clean/CREATE TABLES

From: "John Engelhart" <johne(at)zang(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #1541: Unusually long INSERT times after fresh clean/CREATE TABLES
Date: 2005-03-13 23:13:49
Message-ID: 20050313231349.A512FF0F3E@svr2.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 1541
Logged by: John Engelhart
Email address: johne(at)zang(dot)com
PostgreSQL version: 8.0.1
Operating system: FreeBSD 5.3-p5
Description: Unusually long INSERT times after fresh clean/CREATE
TABLES
Details:

Recently upgraded from 7.4.7 to 8.0.1 and have discovered an oddity.

Since I'm developing an SQL based application, I routinely "start from
scratch" with a script that deletes all the tables in my database and
rebuilds them. A problem started when I upgraded from 7.4.7 to 8.0.1 in
that the first run after the clean takes an UNUSUALLY long time to complete,
on the order of a few hundred inserts/sec. A normal run takes ~30 seconds.
A from scratch run takes ~15 minutes, with the next run completing in 30
seconds with zero changes.

I believe I've eliminated any host or build anomolies due to a simple fact:
Starting fresh, running the data inserter, and then aborting a few seconds
later... then re-running the data-inserter runs at normal speed. It's as if
it need a few records in the DB in order to work at normal speeds. It
almost seems as if it's doing a worst case INSERT update if there are no
primary keys when the BEGIN transaction starts?

I have also dropped all CREATE INDEX statements from my rebuild script and
it behaves exactly the same. The only other 'unusual' thing remaining would
be the various SERIAL NOT NULL UNIQUE PRIMARY KEY statements.

The specific tables and data are quite lengthy, and not appropriate for
pasting to fully recreate the problem. I can send a tarball if requested.

As an example of the times involved (same average times, always
repeatable):

[johne(at)new] simp_sql> time simp_sql <test8.cpp
INFO: Opening database connection @136/initialize/main_sql.c
INFO: Begining parse @47/main/main_sql.c
INFO: GC- After parse: Heap 348Kb Free 144Kb @49/main/main_sql.c
INFO: Performing house keeping and index rebuild @58/main/main_sql.c
INFO: Marking all blocks @61/main/main_sql.c
INFO: Performing house keeping and index rebuild @76/main/main_sql.c
INFO: GC- Before exiting: Heap 348Kb Free 320Kb @79/main/main_sql.c
INFO: Closing database connection @95/cleanup/main_sql.c
3.861u 3.138s 17:21.25 0.6% 83+4732k 0+0io 0pf+0w
[johne(at)new] simp_sql>

[IMMEDIATLY after, no accesses to the database anywhere inbetween]

[johne(at)new] simp_sql> time simp_sql < test8.cpp
INFO: Opening database connection @136/initialize/main_sql.c
INFO: Begining parse @47/main/main_sql.c
INFO: GC- After parse: Heap 348Kb Free 24Kb @49/main/main_sql.c
INFO: Performing house keeping and index rebuild @58/main/main_sql.c
INFO: Marking all blocks @61/main/main_sql.c
INFO: Performing house keeping and index rebuild @76/main/main_sql.c
INFO: GC- Before exiting: Heap 348Kb Free 324Kb @79/main/main_sql.c
INFO: Closing database connection @95/cleanup/main_sql.c
2.990u 2.200s 0:40.00 12.9% 82+4398k 0+0io 0pf+0w
[johne(at)new] simp_sql>

FYI, my inserter uses the boehm garbage collection library, so the heap/free
numbers are non deterministic between runs. This should not be a
contributing issue at all. The "performing house cleaning" is a "vacuum
full analyze" command since it's just inserted a large amount data both
times.

All the INSERTS are wrapped in a BEGIN/COMMIT pair (per run, not per
statement).

Now a clean refresh, run the inserter with a ^C break, and immediatly re-run
it:

[johne(at)new] simp_sql> reload [drops the database, creates it, then recreates
the tables/functions]
[johne(at)new] simp_sql> time simp_sql < test8.cpp
INFO: Opening database connection @136/initialize/main_sql.c
INFO: Begining parse @47/main/main_sql.c
^C0.094u 0.075s 0:01.04 15.3% 36+361k 0+0io 0pf+0w
[johne(at)new] simp_sql> time simp_sql < test8.cpp
INFO: Opening database connection @136/initialize/main_sql.c
INFO: Begining parse @47/main/main_sql.c
INFO: GC- After parse: Heap 348Kb Free 136Kb @49/main/main_sql.c
INFO: Performing house keeping and index rebuild @58/main/main_sql.c
INFO: Marking all blocks @61/main/main_sql.c
INFO: Performing house keeping and index rebuild @76/main/main_sql.c
INFO: GC- Before exiting: Heap 348Kb Free 324Kb @79/main/main_sql.c
INFO: Closing database connection @95/cleanup/main_sql.c
2.943u 2.218s 0:33.20 15.5% 90+5051k 0+0io 0pf+0w
[johne(at)new] simp_sql>

Hopefully this is enough info to track down and recreate the problem
independently. If not, I can send you the program, db dumps, etc to help
you recreate it, it's a bit to big to include here.

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Qingqing Zhou 2005-03-14 02:23:00 random FlushRelationBuffers() failed on Pg8.0.1 Win32
Previous Message Richard Neill 2005-03-13 07:51:38 BUG #1540: Enhancement request: 'ambiguous' column reference in psql