GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement

From: Fabrízio de Royes Mello <fabriziomello(at)gmail(dot)com>
To: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: GSoC 2015 proposal: Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED” statement
Date: 2015-03-26 01:46:40
Message-ID: CAFcNs+r6+p_U-W7TJjHi2mU-wvmrNJcshMiGMfLB1bqBiSEmvQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi all,

Below I written my proposal idea to this GSoC.

*** Improve the performance of “ALTER TABLE .. SET LOGGED / UNLOGGED”
statement ***

Last year during the GSoC2014 I implemented the feature to allow an
unlogged table to be changed to logged [1], but the desing chosen was to
rewrite the entire heap in a new relfilenode with a new relpersistence
because some troubles pointed here [2].

The project was successfully finished and got committed [3] into PostgreSQL
to be released this year in the 9.5 version.

However this design lead us to performance problems with large relations
because we need to rewrite the entire content of the relation twice, one
into a new heap and other into the WAL, so this project will change the
current desing of the mechanism of change an unlogged table to logged
without the need to rewrite the entire heap, but just by removing the init
forks and if the wal_level != minimal we’ll write the contents to the WAL
too.

** Benefits to the PostgreSQL Community **

The “unlogged” tables feature was introduced by 9.1 version, and provide
better write performance than regular tables (logged), but are not
crash-safe. Their contents are automatically discarded (cleared) if the
server crashes. Also, their contents do not propagate to standby servers.

We already have a way to change an unlogged table to logged using “ALTER
TABLE name SET LOGGED” developed last year during the GSoC2014, now during
the GSoC2015 we’ll redesing the internals to improve the I/O performance of
this feature removing the need of rewrite the entire heap into a new
relfilenode.

The are a good idea about the desing here [4], but I’ll discuss the design
with my mentor to implement this improvement.

** Additional Goals **

The main goal of this project is improve the performance of the ALTER TABLE
name SET {LOGGED|UNLOGGED}, but we can expand this propostal to more
related goals.

* Allow unlogged materialized views
. ALTER MATERIALIZED VIEW name SET { UNLOGGED | LOGGED }
* Allow unlogged indexes on logged tables
. ALTER INDEX name SET { UNLOGGED | LOGGED }

** Deliverables **

This project has just one deliverable at the end. The deliverable will be
the improvement of the routines that transform an “unlogged” table to
“logged” and “logged” to “unlogged”, without the need to create a new
“relfilenode” with a different “relpersistence”.

** Project Schedule **

until May 25:
* create a website to the project (wiki.postgresql.org)
* create a public repository to the project (github.com/fabriziomello)
* read what has already been discussed by the community about the project
[4]
* learn about some PostgreSQL internals:
. control data (src/include/catalog/pg_control.h)
. storage (src/backend/storage/*)
* discuss the additional goals with community

May 26 - June 21
* implementation of the first prototype:
. implement the change of unlogged table to logged without rewrite the
entire heap when “wal_level = minimal”
. at this point when “wal_level != minimal” we use the current
implementation
* write documentation and the test cases
* submit this first prototype to the commitfest 2015/06 (
https://commitfest.postgresql.org/5/)

June 22 - June 26
* mentor review the work in progress

June 27 - August 17
* do the adjustments based on the community feedback during the commitfest
2015/06
* implementation of the second prototype:
. when “wal_level != minimal” we’ll remove the init fork (first
prototype) and write relation pages to the WAL.
. implement “ALTER MATERIALIZED VIEW .. SET LOGGED / UNLOGGED”
* submit to the commitfest 2015/09 for final evaluation and maybe will be
committed to 9.6 version (webpage don’t created yet)

August 18 - August 21
* do the adjustments based on the community feedback during the commitfest
2015/09
* final mentor review

** About the proponent **

Fabrízio de Royes Mello

e-mail: fabriziomello(at)gmail(dot)com
twitter: @fabriziomello
github: http://github.com/fabriziomello
linkedin: http://linkedin.com/in/fabriziomello

Currently I help people and teams to take the full potential of relational
databases, especially PostgreSQL, helping teams to design the structure of
the database (modeling), build physical architecture (database schema),
programming (procedural languages), SQL (usage, tuning, best practices),
optimization and orchestration of instances in production too. I perform a
volunteer work for Brazilian Community of PostgreSQL (www.postgresql.org.br),
supporting mailing lists, organizing events (pgbr.postgresql.org.br) and
some admin tasks. And also I help a little the PostgreSQL Global
Development Group (PGDG) in the implementation of some features and review
of patches (git.postgresql.org).

Links

[1]
https://wiki.postgresql.org/wiki/Allow_an_unlogged_table_to_be_changed_to_logged_GSoC_2014
[2]
http://www.postgresql.org/message-id/CA+Tgmob44LNwwU73N1aJsGQyzQ61SdhKJRC_89wCm0+aLg=x2Q@mail.gmail.com
[3]
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f41872d0c1239d36ab03393c39ec0b70e9ee2a3c
[4]
http://www.postgresql.org/message-id/CA+TgmoZM+-0R7h0eDPzZjbokVVQ+gAVKChmno4fypVEccW-EqA@mail.gmail.com

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2015-03-26 01:50:54 Re: Moving on to close the current CF 2015-02
Previous Message Tatsuo Ishii 2015-03-26 01:46:34 Re: Why SyncRepWakeQueue is not static?