RE: Disable WAL logging to speed up data loading

From: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
To: "'David G(dot) Johnston'" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "ashutosh(dot)bapat(dot)oss(at)gmail(dot)com" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Disable WAL logging to speed up data loading
Date: 2020-11-17 02:49:17
Message-ID: TYAPR01MB299099E8FEBE79F6D2054194FEE20@TYAPR01MB2990.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

# It'd be helpful if you could send mails in text format, not HTML.

From: David G. Johnston <david(dot)g(dot)johnston(at)gmail(dot)com>
> For this case the fundamental feature that would seem to be required is an ability for a transaction commit to return only after the system has ensured that all of the new pages added to the relation during the scope of the transaction have made it to disk. Something like:
>
> BEGIN UNLOGGED TRANSACTION FOR table1, table2;
> -- locking probably allows reads, definitely disallows concurrent writes, to the named tables
> -- Disallow updates and deletes, do not use dead tuple space, for the tables named. Should be able to do normal stuff for other tables?
> -- Always create new pages
> COPY TO table1;
> COPY TO table2;
> COMMIT; -- wait here until data files for table1 and table2 are completely written and the transaction alive flag is committed to the WAL.
>
> I suppose the above could be written "BEGIN UNLOGGED TRANSACTION FOR ALL TABLES" and you'd get the initial database population optimization capability.
>
> If the commit doesn't complete all of the newly created pages are junk. Otherwise, you have a crash-recoverable state for those tables as regards those specific pages.

As Steven-san said, I don't want to go this complicated direction. Plus, putting my feet in the user's shoes, I want to try to avoid introducing a new SQL syntax for this kind of performance boost, which requires applications and maintenance scripts and testing.

Regards
Takayuki Tsunakawa

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2020-11-17 02:53:45 Re: Add statistics to pg_stat_wal view for wal related parameter tuning
Previous Message Fujii Masao 2020-11-17 02:46:33 Re: Add statistics to pg_stat_wal view for wal related parameter tuning