|From:||Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>|
|To:||PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>|
|Subject:||Global temporary tables|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
Current Postgres implementation of temporary table causes number of
1. Catalog bloating: if client creates and deletes too many temporary
tables, then autovacuum get stuck on catalog.
2. Parallel queries: right now usage of temporary tables in query
disables parallel plan.
3. It is not possible to use temporary tables at replica. Hot standby
configuration is frequently used to run OLAP queries on replica
and results of such queries are used to be saved in temporary tables.
Right now it is not possible (except "hackers" solution with storing
results in file_fdw).
4. Temporary tables can not be used in prepared transactions.
5. Inefficient memory usage and possible memory overflow: each backend
maintains its own local buffers for work with temporary tables.
Default size of temporary buffers is 8Mb. It seems to be too small for
modern servers having hundreds of gigabytes of RAM, causing extra
copying of data
between OS cache and local buffers. But if there are thousands of
backends, each executing queries with temporary tables, then total
memory used for temporary buffers can exceed several tens of gigabytes.
6. Connection pooler can not reschedule session which has created
temporary tables to some other backend
because it's data is stored in local buffers.
There were several attempts to address this problems.
For example Alexandr Alekseev has implemented patch which allows to
create fast temporary tables without accessing system catalog:
Unfortunately this patch was too invasive and rejected by community.
There was also attempt to allow under some condition use temporary
tables in 2PC transactions:
Them were also rejected.
I try to make yet another attempt to address this problems, first of all
1), 2), 5) and 6)
To solve this problems I propose notion of "global temporary" tables,
similar with ones in Oracle.
Definition of this table (metadata) is shared by all backends but data
is private to the backend. After session termination data is obviously lost.
Suggested syntax for creation of global temporary tables:
create global temp table
create session table
Once been created it can be used by all backends.
Global temporary tables are accessed though shared buffers (to solve
Cleanup of temporary tables data (release of shared buffer and deletion
of relation files) is performed on backend termination.
In case of abnormal server termination, files of global temporary tables
are cleaned-up in the same way as of local temporary tables.
Certainly there are cases were global temporary tables can not be used,
i.e. when application is dynamically constructed name and columns of
Also access to local buffers is more efficient than access to shared
buffers because it doesn't require any synchronization.
But please notice that it is always possible to create old (local)
temporary tables which preserves current behavior.
The problem with replica is still not solved. But shared metadata is
step in this direction.
I am thinking about reimplementation of temporary tables using new table
access method API.
The drawback of such approach is that it will be necessary to
reimplement large bulk of heapam code.
But this approach allows to eliminate visibility check for temporary
table tuples and decrease size of tuple header.
I still not sure if implementing special table access method for
temporary tables is good idea.
Patch for global temporary tables is attached to this mail.
The known limitation is that now it supports only B-Tree indexes.
Any feedback is welcome.
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
|Next Message||Tom Lane||2019-07-31 15:07:17||Re: How to retain lesser paths at add_path()?|
|Previous Message||Rafia Sabih||2019-07-31 14:59:22||Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.|