Re: [HACKERS] COPY FREEZE and PD_ALL_VISIBLE

From: Darafei "Komяpa" Praliaskouski <me(at)komzpa(dot)net>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] COPY FREEZE and PD_ALL_VISIBLE
Date: 2019-01-15 15:17:59
Message-ID: CAC8Q8tJhCO7-2A+Gtv8ymtfMpOyrkyqyEuv=FPuv5xNRavxmLQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

Today I bumped into need to limit first VACUUM time on data import.
I'm using utility called osmium together with COPY FREEZE to import
openstreetmap data into database.

osmium export -c osmium.config -f pg belarus-latest.osm.pbf -v --progress
| psql -1 -c 'create table byosm(geom geometry, osm_type text, osm_id
bigint, tags jsonb);copy byosm from stdin freeze;'

However, first pass of VACUUM rewrites the whole table. Here is two logs of
VACUUM VERBOSE in a row:

https://gist.github.com/Komzpa/e765c1c5e04623d83a6263d4833cf3a5

In Russian Postgres Telegram group I've been recommended this thread.
Can the patch be revived? What is needed to get it up for 12?

On Sun, Aug 14, 2016 at 10:37 PM Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Tue, Nov 3, 2015 at 6:37 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> > On 3 November 2015 at 15:23, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
> wrote:
> >>
> >> On Fri, Oct 23, 2015 at 6:29 AM, Simon Riggs <simon(at)2ndquadrant(dot)com>
> >> wrote:
> >>>
> >>> On 21 October 2015 at 13:31, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> >>>
> >>>> Index-only scans will visit the heap for each tuple until the first
> >>>> VACUUM is done.
> >>>>
> >>>> The first vacuum will read the entire table, but not need to write it
> >>>> anymore. And will create the _vm file.
> >>>>
> >>>> I think we really want to create _vm file as well as set
> PD_ALL_VISIBLE,
> >>>> but I don't know the best way to do that. Set a flag somewhere and
> then
> >>>> create it in bulk at the end of the transaction? Set it bit by bit
> as the
> >>>> pages are extended and initialized?
> >>>
> >>>
> >>> Easy enough to do it at the end of the COPY FREEZE in one step.
> >>
> >>
> >> Here, we might want to consider that setting bit in visibility map
> >> will generate WAL log whereas Copy Freeze otherwise skip WAL
> >> when wal_level is less than archive. This can lead to extra disk
> >> writes which can slow down Copy Freeze, but OTOH that might
> >> be acceptable.
> >
> >
> > I'm building the map as I go, in the latest version of this patch I'm
> > working on.
>
> Hi Simon,
>
> Is this still on your radar? If you would like someone else to pick
> it up, can you post the WIP patch you have?
>
> Thanks,
>
> Jeff
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-01-15 15:53:30 Re: current_logfiles not following group access and instead follows log_file_mode permissions
Previous Message Peter Eisentraut 2019-01-15 14:56:49 Re: insensitive collations