Re: Berkeley DB...

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Berkeley DB...
Date: 2000-05-28 05:28:04
Message-ID: 2130.959491684@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM> writes:
>>>> Well, I've dropped index but INSERTs still take 70 sec and
>>>> COPY just 1sec -:(((

Mebbe so, but you can't blame it all on parse/plan overhead.

I did some experimentation on this with current sources, using a test
case of inserting 100,000 rows of 16 columns (the regression "tenk1"
table's contents repeated 10 times). Each test was started with a
freshly created empty table. The initial runs were done with all
postmaster options except -F defaulted. All numbers are wall-clock time
in seconds; the "+" column is the time increase from the previous case:

load via COPY, fsync off:
0 indexes 24.45s
1 index 48.88s + 24.43
2 indexes 62.65s + 13.77
3 indexes 96.84s + 34.19
4 indexes 134.09s + 37.25

load via INSERTs, fsync off, one xact (begin/end around all inserts):
0 indexes 194.95s
1 index 247.21s + 52.26
2 indexes 269.69s + 22.48
3 indexes 307.33s + 37.64
4 indexes 352.72s + 45.39

load via INSERTs, fsync off, separate transaction for each insert:
0 indexes 236.53s
1 index 295.96s + 59.43
2 indexes 323.40s + 27.44
[ got bored before doing 3/4 index cases ... ]

load via INSERTs, fsync on, separate transactions:
0 indexes 5189.99s
[ don't want to know how long it will take with indexes :-( ]

So while the parse/plan overhead looks kinda bad next to a bare COPY,
it's not anything like a 70:1 penalty. But an fsync per insert is
that bad and worse.

I then recompiled with -pg to learn more about where the time was going.
One of the useful places to look at is calls to FileSeek, since mdread,
mdwrite, and mdextend all call it. To calibrate these numbers, the
table being created occupies 2326 pages and the first index is 343
pages.

Inserts (all in 1 xact), no indexes:
0.00 0.00 1/109528 init_irels [648]
0.00 0.00 85/109528 mdread [592]
0.01 0.00 2327/109528 mdextend [474]
0.01 0.00 2343/109528 mdwrite [517]
0.23 0.00 104772/109528 _mdnblocks [251]
[250] 0.0 0.24 0.00 109528 FileSeek [250]
Inserts (1 xact), 1 index:
0.00 0.00 1/321663 init_irels [649]
0.00 0.00 2667/321663 mdextend [514]
0.10 0.00 55478/321663 mdread [277]
0.11 0.00 58096/321663 mdwrite [258]
0.38 0.00 205421/321663 _mdnblocks [229]
[213] 0.1 0.60 0.00 321663 FileSeek [213]
COPY, no indexes:
0.00 0.00 1/109527 init_irels [431]
0.00 0.00 84/109527 mdread [404]
0.00 0.00 2327/109527 mdextend [145]
0.00 0.00 2343/109527 mdwrite [178]
0.07 0.00 104772/109527 _mdnblocks [77]
[83] 0.0 0.07 0.00 109527 FileSeek [83]
COPY, 1 index:
0.00 0.00 1/218549 init_irels [382]
0.00 0.00 2667/218549 mdextend [220]
0.07 0.00 53917/218549 mdread [106]
0.08 0.00 56542/218549 mdwrite [99]
0.14 0.00 105422/218549 _mdnblocks [120]
[90] 0.0 0.30 0.00 218549 FileSeek [90]

The extra _mdnblocks() calls for the inserts/1index case seem to be from
the pg_index scans in ExecOpenIndices (which is called 100000 times in
the inserts case but just once in the COPY case). We know how to fix
that. Otherwise the COPY and INSERT paths seem to be pretty similar as
far as actual I/O calls go. The thing that jumps out here, however, is
that it takes upwards of 50000 page reads and writes to prepare a
343-page index. Most of the write calls turn out to be from
BufferReplace, which is pretty conclusive evidence that the default
setting of -B 64 is not enough for this example; we need more buffers.

At -B 128, inserts/0index seems about the same, inserts/1index traffic is
0.00 0.00 1/270331 init_irels [637]
0.01 0.00 2667/270331 mdextend [510]
0.06 0.00 29798/270331 mdread [354]
0.06 0.00 32444/270331 mdwrite [277]
0.40 0.00 205421/270331 _mdnblocks [229]
[223] 0.1 0.52 0.00 270331 FileSeek [223]
At -B 256, inserts/1index traffic is
0.00 0.00 1/221849 init_irels [650]
0.00 0.00 2667/221849 mdextend [480]
0.01 0.00 5556/221849 mdread [513]
0.01 0.00 8204/221849 mdwrite [460]
0.37 0.00 205421/221849 _mdnblocks [233]
[240] 0.0 0.40 0.00 221849 FileSeek [240]
At -B 512, inserts/1index traffic is
0.00 0.00 1/210788 init_irels [650]
0.00 0.00 25/210788 mdread [676]
0.00 0.00 2667/210788 mdextend [555]
0.00 0.00 2674/210788 mdwrite [564]
0.27 0.00 205421/210788 _mdnblocks [248]
[271] 0.0 0.28 0.00 210788 FileSeek [271]

So as long as the -B setting is large enough to avoid thrashing, there
shouldn't be much penalty to making an index. I didn't have time to run
the COPY cases but I expect they'd be about the same.

Bottom line is that where I/O costs are concerned, the parse/plan
overhead for INSERTs is insignificant except for the known problem
of wanting to rescan pg_index for each INSERT. The CPU overhead is
significant, at least if you're comparing no-fsync performance ...
but as I commented before, I doubt we can do a whole lot better in
that area for simple INSERTs.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2000-05-28 05:30:30 Re: Regression test failure on 7.0-STABLE
Previous Message Tatsuo Ishii 2000-05-28 04:56:22 Regression test failure on 7.0-STABLE