Re: tableam vs. TOAST

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: tableam vs. TOAST
Date: 2019-08-01 16:23:42
Message-ID: CA+TgmoZW-KL7fTyTHwF+BR48ussFY3BFPDx8GWCOnZn-21fWMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jul 9, 2019 at 12:40 AM Prabhat Sahu
<prabhat(dot)sahu(at)enterprisedb(dot)com> wrote:
>> Did you run each test just once? How stable are the results?
>
> No, I have executed the test multiple times(7times each) and calculated the result as the median among those,
> and the result looks stable(with v3 patches).

I spent some time looking at your SCC test today. I think this isn't
really testing the code that actually got changed in the patch: a
quick CPU profile shows that your SCC test is bottlenecked on
pg_lzcompress, which spends a huge amount of time compressing the
gigantic string of 'a's you've constructed, and that code is exactly
the same with the patch as it in master. So, I think that any
fluctuations between the patched and unpatched results are just random
variation. There's no reason the patch should be slower with one row
count and faster with a different row count, anyway.

I tried to come up with a better test case that uses a more modest
amount of data, and ended up with this:

-- Setup.
CREATE OR REPLACE FUNCTION randomish_string(integer) RETURNS text AS $$
SELECT string_agg(random()::text, '') FROM generate_series(1, $1);
$$ LANGUAGE sql;

CREATE TABLE source_compressed (a int, b text);
INSERT INTO source_compressed
SELECT g, repeat('a', 2000) FROM generate_series(1, 10000) g;
CREATE TABLE sink_compressed (LIKE source_compressed);

CREATE TABLE source_external (a int, b text);
INSERT INTO source_external
SELECT g, randomish_string(400) FROM generate_series(1, 10000) g;

CREATE TABLE sink_external (LIKE source_external);
CREATE TABLE source_external_uncompressed (a int, b text);
ALTER TABLE source_external_uncompressed ALTER COLUMN b SET STORAGE EXTERNAL;
INSERT INTO source_external_uncompressed
SELECT g, randomish_string(400) FROM generate_series(1, 10000) g;
CREATE TABLE sink_external_uncompressed (LIKE source_external_uncompressed);
ALTER TABLE sink_external_uncompressed ALTER COLUMN b SET STORAGE EXTERNAL;

-- Test.
\timing
TRUNCATE sink_compressed, sink_external, sink_external_uncompressed;
CHECKPOINT;
INSERT INTO sink_compressed SELECT * FROM source_compressed;
INSERT INTO sink_external SELECT * FROM source_external;
INSERT INTO sink_external_uncompressed SELECT * FROM
source_external_uncompressed;

Roughly, on both master and with the patches, the first one takes
about 4.2 seconds, the second 7.5, and the third 1.2. The third one
is the fastest because it doesn't do any compression. Since it does
less irrelevant work than the other two cases, it has the best chance
of showing up any performance regression that the patch has caused --
if any regression existed, I suppose that it would be an increased
per-toast-fetch or per-toast-chunk overhead. However, I can't
reproduce any such regression. My first attempt at testing that case
showed the patch about 1% slower, but that wasn't reliably
reproducible when I did it a bunch more times. So as far as I can
figure, this patch does not regress performance in any
easily-measurable way.

Barring objections, I plan to commit the whole series of patches here
(latest rebase attached). They are not perfect and could likely be
improved in various ways, but I think they are an improvement over
what we have now, and it's not like it's set in stone once it's
committed. We can change it more if we come up with a better idea.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
0004-Rename-attribute-detoasting-functions.patch application/octet-stream 13.3 KB
0002-Create-an-API-for-inserting-and-deleting-rows-in-TOA.patch application/octet-stream 30.5 KB
0003-Allow-TOAST-tables-to-be-implemented-using-table-AMs.patch application/octet-stream 36.7 KB
0001-Split-tuptoaster.c-into-three-separate-files.patch application/octet-stream 164.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Arne Roland 2019-08-01 16:29:21 Re: Partial join
Previous Message Brandur Leach 2019-08-01 15:34:06 Re: Patch for SortSupport implementation on inet/cdir