Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Erik Rijkers <er(at)xs4all(dot)nl>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Date: 2020-06-10 09:00:15
Message-ID: CAFiTN-tPgcds9cRi1z47YpH+yvKOUZkQPELPdye_jUvEmZJ4ZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jun 7, 2020 at 5:06 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Thu, Jun 4, 2020 at 2:05 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Wed, Jun 3, 2020 at 2:43 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Jun 2, 2020 at 7:53 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > >
> > > > On Tue, Jun 2, 2020 at 4:56 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Tue, Jun 2, 2020 at 3:59 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > I thin for our use case BufFileCreateShared is more suitable. I think
> > > > > > we need to do some modifications so that we can use these apps without
> > > > > > SharedFileSet. Otherwise, we need to unnecessarily need to create
> > > > > > SharedFileSet for each transaction and also need to maintain it in xid
> > > > > > array or xid hash until transaction commit/abort. So I suggest
> > > > > > following modifications in shared files set so that we can
> > > > > > conveniently use it.
> > > > > > 1. ChooseTablespace(const SharedFileSet fileset, const char name)
> > > > > > if fileset is NULL then select the DEFAULTTABLESPACEOID
> > > > > > 2. SharedFileSetPath(char path, SharedFileSet fileset, Oid tablespace)
> > > > > > If fileset is NULL then in directory path we can use MyProcPID or
> > > > > > something instead of fileset->creator_pid.
> > > > > >
> > > > >
> > > > > Hmm, I find these modifications a bit ad-hoc. So, not sure if it is
> > > > > better than the patch maintains sharedfileset information.
> > > >
> > > > I think we might do something better here, maybe by supplying function
> > > > pointer or so, but maintaining sharedfileset which contains different
> > > > tablespace/mutext which we don't need at all for our purpose also
> > > > doesn't sound very appealing.
> > > >
> > >
> > > I think we can say something similar for Relation (rel cache entry as
> > > well) maintained in LogicalRepRelMapEntry. I think we only need a
> > > pointer to that information.
> >
> > Yeah, I see.
> >
> > > > Let me see if I can not come up with
> > > > some clean way of avoiding the need to shared-fileset then maybe we
> > > > can go with the shared fileset idea.
> > > >
> > >
> > > Fair enough.
> >
> > While evaluating it further I feel there are a few more problems to
> > solve if we are using BufFile, First thing is that in subxact file we
> > maintain the information of xid and its offset in the changes file.
> > So now, we will also have to store 'fileno' but that we can find using
> > BufFileTell. Yet another problem is that currently, we don't
> > have the truncate option in the BufFile, but we need it if the
> > sub-transaction gets aborted. I think we can implement an extra
> > interface with the BufFile and should not be very hard as we already
> > know the fileno and the offset. I will evaluate this part further and
> > let you know about the same.
>
> I have further evaluated this and also tested the concept with a POC
> patch. Soon I will complete and share, here is the scatch of the
> idea.
>
> As discussed we will use SharedBufFile for changes files and subxact
> files. There will be a separate LogicalStreamingResourceOwner, which
> will be used to manage the VFD of the shared buf files. We can create
> a per stream resource owner i.e. on stream start we will create the
> resource owner and all the shared buffiles will be opened under that
> resource owner, which will be deleted on stream stop. We need to
> remember the SharedFileSet so that for subsequent stream for the same
> transaction we can open the same file again, for this we will use a
> hash table with xid as a key and in that, we will keep stream_fileset
> and subxact_fileset's pointers as payload.
>
> +typedef struct StreamXidHash
> +{
> + TransactionId xid;
> + SharedFileSet *stream_fileset;
> + SharedFileSet *subxact_fileset;
> +} StreamXidHash;
>
> We have to do some extension to the buffile modules, some of them are
> already discussed up-thread but still listing them all down here
> - A new interface BufFileTruncateShared(BufFile *file, int fileno,
> off_t offset), for truncating the subtransaction changes, if changes
> are spread across multiple files those files will be deleted and we
> will adjust the file count and current offset accordingly in BufFile.
> - In BufFileOpenShared, we will have to implement a mode so that we
> can open in write mode as well, current only read-only mode supported.
> - In SharedFileSetInit, if dsm_segment is NULL then we will not
> register the file deletion on on_dsm_detach.
> - As usual, we will clean up the files on stream abort/commit, or on
> the worker exit.

Currently, I am done with a working prototype of using the BufFile
infrastructure for the tempfile. Meanwhile, I want to discuss a few
interface changes required for the BufFIle infrastructure.

1. Support read-write mode for "BufFileOpenShared", Basically, in
workers we will be opening the xid's changes and subxact files per
stream, so we need an RW mode even in the open. I have passed a flag
for the same.

2. Files should not be closed at the end of the transaction:
Currently, files opened with BufFileCreateShared/BufFileOpenShared are
registered to be closed on EOXACT. Basically, we need to open the
changes file on the stream start and keep it open until stream stop,
so we can not afford to get it closed on the EOXACT. I have added a
flag for the same.

3. As. discussed above we need to support truncate for handling thee
subtransaction abort so I have added a new interface for the same.

4. After every time we open the changes file, we need to seek to the
end, so I have supported SEEK_END.

Attached is the WIP patch for describing my changes.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
buffile_change.patch application/octet-stream 14.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2020-06-10 09:00:43 Re: Is it useful to record whether plans are generic or custom?
Previous Message Kyotaro Horiguchi 2020-06-10 08:38:44 Re: Physical replication slot advance is not persistent