Re: [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org, Aleksander Alekseev <a(dot)alekseev(at)postgrespro(dot)ru>
Subject: Re: [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)
Date: 2016-07-29 23:46:45
Message-ID: 39b0f423-a4df-3564-3b4a-b1de6f9eea80@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 07/29/2016 01:15 PM, Aleksander Alekseev wrote:
> Hello
>
> Some time ago we discussed an idea of "fast temporary tables":
>
> https://www.postgresql.org/message-id/20160301182500.2c81c3dc%40fujitsu
>
> In two words the idea is following.
>
> <The Idea>
>
> PostgreSQL stores information about all relations in pg_catalog. Some
> applications create and delete a lot of temporary tables. It causes a
> bloating of pg_catalog and running auto vacuum on it. It's quite an
> expensive operation which affects entire database performance.
>
> We could introduce a new type of temporary tables. Information about
> these tables is stored not in a catalog but in backend's memory. This
> way user can solve a pg_catalog bloating problem and improve overall
> database performance.
>
> </The Idea>

Great! Thanks for the patch, this is definitely an annoying issue worth
fixing. I've spent a bit of time looking at the patch today, comments
below ...

>
> I took me a few months but eventually I made it work. Attached patch
> has some flaws. I decided not to invest a lot of time in documenting
> it or pgindent'ing all files yet. In my experience it will be rewritten
> entirely 3 or 4 times before merging anyway :) But it _works_ and
> passes all tests I could think of, including non-trivial cases like
> index-only or bitmap scans of catalog tables.
>

Well, jokes aside, that's a pretty lousy excuse for not writing any
docs, and you're pretty much asking the reviewers to reverse-engineer
your reasoning. So I doubt you'll get many serious reviews without
fixing this gap.

> Usage example:
>
> ```
> CREATE FAST TEMP TABLE fasttab_test1(x int, s text);
>
> INSERT INTO fasttab_test1 VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
>
> UPDATE fasttab_test1 SET s = 'ddd' WHERE x = 2;
>
> DELETE FROM fasttab_test1 WHERE x = 3;
>
> SELECT * FROM fasttab_test1 ORDER BY x;
>
> DROP TABLE fasttab_test1;
> ```
>
> More sophisticated examples could be find in regression tests:
>
> ./src/test/regress/sql/fast_temp.sql
>
> Any feedback on this patch will be much appreciated!
>
>

1) I wonder whether the FAST makes sense - does this really change the
performance significantly? IMHO you only move the catalog rows to
memory, so why should the tables be any faster? I also believe this
conflicts with SQL standard specification of CREATE TABLE.

2) Why do we need the new relpersistence value? ISTM we could easily got
with just RELPERSISTENCE_TEMP, which would got right away of many
chances as the steps are exactly the same.

IMHO if this patch gets in, we should use it as the only temp table
implementation (Or can you think of cases where keeping rows in pg_class
has advantages?). That'd also eliminate the need for FAST keyword in the
CREATE TABLE command.

The one thin I'm not sure about is that our handling of temporary tables
is not standard compliant - we require each session to create it's own
private temporary table. Moving the rows from pg_class into backend
memory seems to go in the opposite direction, but as no one was planning
to fix this, I don't think it matters much.

3) I think the heapam/indexam/xact and various other places needs a
major rework. You've mostly randomly sprinkled the code with function
calls to make the patch work - that's fine for an initial version, but a
more principled approach is needed.

4) I'm getting failures in the regression suite - apparently the patch
somehow affects costing of index only scans, so that a several queries
switch from index only scans to bitmap index scans etc. I haven't
investigated this more closely, but it seems quite consistent (and I
don't see it without the patch). It seems the patch delays building of
visibility map, or something like that.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2016-07-30 00:18:38 Re: [BUGS] BUG #14244: wrong suffix for pg_size_pretty()
Previous Message Michael Paquier 2016-07-29 22:57:14 Re: [sqlsmith] Failed assertion in joinrels.c