Re: Passing values to a dynamic background worker

From: Keith Fiske <keith(at)omniti(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Passing values to a dynamic background worker
Date: 2017-04-18 14:14:09
Message-ID: CAG1_KcBj52LpvVaFT4aBfttuf2i4DscBVPpOwwbNqsY8pzqQ_g@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 18, 2017 at 5:40 AM, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> On 2017/04/18 18:12, Kyotaro HORIGUCHI wrote:
> > At Mon, 17 Apr 2017 16:19:13 -0400, Keith Fiske wrote:
> >> So after reading a recent thread on the steep learning curve for PG
> >> internals [1], I figured I'd share where I've gotten stuck with this in a
> >> new thread vs hijacking that one.
> >>
> >> One of the goals I had with pg_partman was to see if I could get the
> >> partitioning python scripts redone as C functions using a dynamic
> >> background worker to be able to commit in batches with a single call. My
> >> thinking was to have a user-function that can accept arguments for things
> >> like the interval value, batch size, and other arguments to the python
> >> script, then start/stop a dynamic bgw for each batch so it can commit
> >> after each one. The dynamic bgw would essentially just have to call the
> >> already existing partition_data() plpgsql function, but I have to be able
> >> to pass the argument values that the user gave down into the dynamic bgw.
> >>
> >> I've reached a roadblock in that bgw_main_arg can only accept a single
> >> argument that must be passed by value for a dynamic bgw. I already worked
> >> around this for passing the database name to my existing use of a bgw
> >> for partition maintenance (pass a simple integer to use as an array
> >> index). But I'm not sure how to do this for passing multiple values
> >> in. I'm assuming this would be the place where I'd see about storing values
> >> in shared memory to be able to re-use later? I'm not even sure if that's
> >> the right approach, and if it is, where to even start to understand how to
> >> do that.
> >
> > On the other hand, AFAICS, DSM doesn't seem well documented. I
> > managed to find a related document in the Postgres Wiki, but it seems
> > a bit old.
> >
> > https://wiki.postgresql.org/wiki/Parallel_Internal_Sort
> >
> > This is a little more complex than static shared memory, and it is
> > *not* guaranteed to be mapped at the same address among workers. You
> > will see an example in LaunchParallelWorkers() and the related
> > functions in parallel.c. The basics of its usage are as follows.
> >
> > - Create a segment:
> >     dsm_segment *seg = dsm_create(size, 0);
> > - Send its handle via bgw_main_arg (which is a Datum):
> >     worker.bgw_main_arg = UInt32GetDatum(dsm_segment_handle(seg));
> > - Attach the memory on the other side:
> >     dsm_segment *seg = dsm_attach(DatumGetUInt32(main_arg));
> >
> > On both sides, the address of the attached shared memory is
> > obtained using dsm_segment_address(seg).
> >
> > dsm_detach(seg) detaches the segment. Once all users of the segment
> > have detached, it is destroyed.
>
> Perhaps, the more modern DSA mechanism could be applicable here, too.
>
> Some recent commits demonstrate examples of DSA usage, such as the BRIN
> autosummarization commit (7526e10224f) and the commit adding shared
> iteration support to tidbitmap.c (98e6e89040a05).
>
> Thanks,
> Amit
>
>
Thank you both very much for the suggestions!

Keith
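
The create / pass-handle / attach / detach sequence quoted above can be tried outside the server with a standalone analogue. The sketch below is not PostgreSQL's DSM API: it swaps in System V shared memory (shmget, shmat, shmdt, shmctl) and fork() to show the same shape, where only a small integer handle travels by value and the receiver attaches to find a struct holding the real arguments. The WorkerArgs struct, its field names and values, and the functions worker_main() and launch_with_args() are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical argument block; the fields are illustrative only. */
typedef struct WorkerArgs
{
    int  batch_size;
    char interval[32];
    char parent_table[64];
} WorkerArgs;

/* "Worker" side: it is given nothing but the integer handle, by value. */
static int
worker_main(int handle)
{
    /* Attach; the mapping address may differ from the launcher's. */
    WorkerArgs *args = shmat(handle, NULL, 0);
    int         ok;

    if (args == (void *) -1)
        return 1;

    ok = args->batch_size == 10 &&
         strcmp(args->interval, "1 day") == 0 &&
         strcmp(args->parent_table, "public.events") == 0;

    shmdt(args);                /* detach when done */
    return ok ? 0 : 1;
}

/* "Launcher" side: returns 0 if the worker saw the expected values. */
static int
launch_with_args(void)
{
    int         handle = shmget(IPC_PRIVATE, sizeof(WorkerArgs),
                                IPC_CREAT | 0600);
    WorkerArgs *args;
    pid_t       pid;
    int         status = 0;

    if (handle < 0)
        return -1;

    args = shmat(handle, NULL, 0);
    if (args == (void *) -1)
    {
        shmctl(handle, IPC_RMID, NULL);
        return -1;
    }

    /* Fill in the values the user supplied. */
    args->batch_size = 10;
    snprintf(args->interval, sizeof(args->interval), "%s", "1 day");
    snprintf(args->parent_table, sizeof(args->parent_table), "%s",
             "public.events");

    pid = fork();
    if (pid == 0)
        _exit(worker_main(handle));     /* child plays the worker */

    if (pid > 0)
        waitpid(pid, &status, 0);

    shmdt(args);
    shmctl(handle, IPC_RMID, NULL);     /* remove once everyone has detached */

    if (pid < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling launch_with_args() from a main() should return 0 when the child reads back the values the parent stored. Inside a real extension the same struct would presumably live at dsm_segment_address(seg), with the handle carried in bgw_main_arg as described in the quoted steps. Fixed-size character arrays are used rather than pointers because, as noted above, the segment is not guaranteed to be mapped at the same address in every process.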
