Re: Slowdown problem when writing 1.7million records

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Stephen Livesey" <ste(at)exact3ex(dot)co(dot)uk>
Cc: pgsql-general(at)postgreSQL(dot)org, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Subject: Re: Slowdown problem when writing 1.7million records
Date: 2001-05-10 23:22:36
Message-ID: 5312.989536956@sss.pgh.pa.us
Lists: pgsql-general

This is a followup to a problem report Stephen Livesey made back in
February, to the effect that successive insertions got slower and
slower. At the time we speculated that btree indexes had a problem
with becoming out-of-balance when fed steadily increasing data values.
I have now tried to reproduce that effect --- and failed, in both
current sources and 7.0.2.

I did
create table foo (f1 serial primary key);
and then ran a process that just did
insert into foo default values;
over and over (each insert as a separate transaction). This will result
in inserting a continually increasing sequence of key values into the
pkey index.
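
For anyone who wants to repeat the experiment, a quick-and-dirty driver
along these lines should set up the same conditions (the database name
"test" is just a placeholder, and since psql commits each statement by
itself when there's no explicit BEGIN, every insert runs as its own
transaction):

perl -e 'print "insert into foo default values;\n" for (1..3400000);' \
    | psql -q test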

I ran this out to about 3.4 million records (twice the number of records
Stephen used) on both 7.0.2 and current sources. I do not see any real
slowdown in the insertion rate, and certainly not the drastic slowdown
that Stephen observed: he said

> I am now writing 1.7million records to this file.
>
> The first 100,000 records took 15mins.
> The next 100,000 records took 30mins
> The last 100,000 records took 4hours.
>
> In total, it took 43 hours to write 1.7million records.

It took me about 140 minutes to write 3.4 million records, on a not
particularly fast machine; the insertion rate held pretty steady at
around 400 records/sec (fsync off).
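
(In case anyone wants to duplicate the fsync-off condition: if memory
serves, under 7.0 you turn fsync off by starting the postmaster with the
-o -F switch, and 7.1 also lets you set

    fsync = false

in postgresql.conf. Check the docs for your version, since I'm quoting
the exact spelling from memory.)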

So I no longer think that the problem was with the btree index.
Other possibilities are:

* If Stephen had fsync on, and his kernel was using a particularly
stupid fsync algorithm, the time might all have been going into
kernel searches of its buffer lists --- the files being fsync'd
would've been growing and growing, and some fsync routines take
time proportional to the number of buffered blocks.

* If Stephen had any relevant foreign-key checks (though he said not)
and was inserting all the records in one transaction, then the known
O(N^2) behavior in 7.0.*'s handling of deferred triggers could've
been the culprit; there's a sketch of that scenario just below this
list. If so, this is fixed in 7.1.

* The problem might have been on the client side. I'm not sure
whether JDBC on Windows might suffer from memory leaks or anything
like that.
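
To spell out the foreign-key scenario a little: a schema sketch along
these lines (table and column names invented for the example) would queue
a deferred FK-check trigger event for every row inserted during the
transaction, and it's the management of that per-transaction queue that
goes O(N^2) in 7.0.*:

create table parent (f1 integer primary key);
create table child (f1 integer primary key,
                    f2 integer references parent
                       deferrable initially deferred);

insert into parent values (1);

begin;
insert into child values (1, 1);
-- ... a couple million more inserts like this ...
commit;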

Anyway: Stephen, if you haven't lost interest, we need to take another
look at your problem and figure out where the issue really is.

Bruce, please remove the TODO item
* Be smarter about insertion of already-ordered data into btree index
It seems to have been a false alarm.

regards, tom lane
