Re: logical decoding : exceeded maxAllocatedDescs for .spill files

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Subject: Re: logical decoding : exceeded maxAllocatedDescs for .spill files
Date: 2019-12-11 10:46:43
Message-ID: CAJ3gD9c9s6m1KYfM9w6a52NE6kh=pWyMoEv9Dqnjt+DRTqD24A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 7 Dec 2019 at 11:37, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Fri, Dec 6, 2019 at 5:00 PM Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> >
> > On Fri, 6 Dec 2019 at 15:40, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > >
> > > Few comments:
> > > ----------------------
> > >
> > > 1.
> > > + /* Now that the state fields are initialized, it is safe to return it. */
> > > + *iter_state = state;
> > > +
> > > /* allocate heap */
> > > state->heap =
> > > binaryheap_allocate(state->nr_txns,
> > > ReorderBufferIterCompare,
> > >
> > > Is there a reason for not initializing iter_state after
> > > binaryheap_allocate? If we do so, then we don't need additional check
> > > you have added in ReorderBufferIterTXNFinish.
> >
> > If iter_state is initialized *after* binaryheap_allocate, then we
> > won't be able to close the vfds if binaryheap_allocate() ereports().
> >
>
> Is it possible to have vfds opened before binaryheap_allocate(), if so how?
No it does not look possible for the vfds to be opened before
binaryheap_allocate(). But actually, the idea behind placing the
iter_state at the place where I put is that, we should return back the
iter_state at the *earliest* place in the code where it is safe to
return.

> > I couldn't reproduce the original problem (on HEAD) reported with the
> > test case in the patch. So, I can't verify the fix. I think it is
> > because of recent commits cec2edfa7859279f36d2374770ca920c59c73dd8 and
> > 9290ad198b15d6b986b855d2a58d087a54777e87. It seems you need to either
> > change the value of logical_decoding_work_mem or change the test in
> > some way so that the original problem can be reproduced.
Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> Yeah, it does seem like the commit cec2edfa78592 must have caused the
> test to not reproduce on your env, although the test does fail for me
> still. Setting logical_decoding_work_mem to a low value does sound
> like a good idea. Will work on it.

I checked that setting logical_decoding_work_mem to its min value
(64KB) causes early serialization. So I think if you set this, you
should be able to reproduce with the spill.sql test that has the new
testcase. I will anyway set this value in the test. Also check below
...

>> 4. I think we should also check how much time increase will happen for
>> test_decoding regression test after the test added by this patch?
Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> Yeah, it currently takes noticeably longer compared to the others.
> Let's see if setting logical_decoding_work_mem to a min value allows
> us to reproduce the test with much lesser number of inserts.

The test in the patch takes around 20 seconds, as compared to the max
time of 2 seconds any of the other tests take in that test suite.

But if we set the max_files_per_process to a very low value (say 26)
as against the default 1000, we can reproduce the issue with as low as
20 sub-transactions as against the 600 that I used in spill.sql test.
And with this, the test runs in around 4 seconds, so this is good. But
the problem is : max_files_per_process needs server restart. So either
we have to shift this test to src/test/recovery in one of the
logical_decoding test, or retain it in contrib/test_decoding and let
it run for 20 seconds. Let me know if you figure out any other
approach.

>
> > >
> > > 5. One naive question about the usage of PathNameOpenFile(). When it
> > > reaches the max limit, it will automatically close one of the files,
> > > but how will that be reflected in the data structure (TXNEntryFile)
> > > you are managing. Basically, when PathNameOpenFile closes some file,
> > > how will the corresponding vfd in TXNEntryFile be changed. Because if
> > > it is not changed, then won't it start pointing to some wrong
> > > filehandle?
> >
> > In PathNameOpenFile(), excess kernel fds could be closed
> > (ReleaseLruFiles). But with that, the vfds themselves don't get
> > invalidated. Only the underlying kernel fd gets closed, and the
> > vfd->fd is marked VFD_CLOSED. The vfd array element remains valid (a
> > non-null vfd->fileName means the vfd slot is valid; check
> > FileIsValid). So later, when FileRead(vfd1) is called and that vfd1
> > happens to be the one that had got it's kernel fd closed, it gets
> > opened again through FileAccess().
> >
>
> I was under impression that once the fd is closed due to excess kernel
> fds that are opened, the slot in VfdCache array could be resued by
> someone else, but on closer inspection that is not true. It will be
> only available for reuse after we explicitly call FileClose, right?

Yes, that's right.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2019-12-11 11:06:45 Re: get_database_name() from background worker
Previous Message Konstantin Knizhnik 2019-12-11 09:37:57 Re: Session WAL activity