Re: logical decoding : exceeded maxAllocatedDescs for .spill files

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
Cc: Juan José Santamaría Flecha <juanjo(dot)santamaria(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Alvaro Herrera from 2ndQuadrant <alvherre(at)alvh(dot)no-ip(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: logical decoding : exceeded maxAllocatedDescs for .spill files
Date: 2019-11-22 10:56:34
Message-ID: CAA4eK1LbjMgPd9-qF1VRaRWyAdjFw6OK86STPBFyza+T6tXUFg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Nov 22, 2019 at 11:00 AM Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
>
> On Fri, 22 Nov 2019 at 09:08, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > Have you tried it before that fix? If not, can you try once by
> > temporarily reverting that fix in your environment and share the
> > output of each step? After you get the error due to EOF, check that
> > you have .spill files in pg_replslot/<slot_name>/ and then again try
> > to get changes by pg_logical_slot_get_changes(). If you want, you
> > can use the test provided in Amit Khandekar's patch.
>
> On my Linux machine, I added elog() in ReorderBufferRestoreChanges(),
> just after FileRead() returns 0. This results in an error. But the thing is, in
> ReorderBufferCommit(), the error is already handled using PG_CATCH:
>
> PG_CATCH();
> {
> .....
> AbortCurrentTransaction();
> .......
> if (using_subtxn)
> RollbackAndReleaseCurrentSubTransaction();
> ........
> ........
> /* remove potential on-disk data, and deallocate */
> ReorderBufferCleanupTXN(rb, txn);
> }
>
> So ReorderBufferCleanupTXN() removes all the .spill files using unlink().
>
> And on Windows, what should happen is: unlink() should succeed
> because the file is opened using FILE_SHARE_DELETE. But the files
> should still remain there because they are still open. A file is just
> marked for deletion until no one has it open. That
> is my conclusion from running the attached sample program test.c.
>

I think this is exactly the reason for the problem. In my test [1],
the "Permission denied" error occurred the second time I executed
pg_logical_slot_get_changes(). That means that on the first execution
the unlink succeeded, but the files were not actually removed because
they were still open. Then, on the second execution, it gets a
"Permission denied" error when it again tries to unlink the files via
ReorderBufferCleanupSerializedTXNs().

> But what you are seeing is "Permission denied" errors. Not sure why
> unlink() is failing.
>

In your test program, if you try to unlink the file a second time, you
should see the error "Permission denied".

[1] - https://www.postgresql.org/message-id/CAA4eK1%2Bcey6i6a0zD9kk_eaDXb4RPNZqu4UwXO9LbHAgMpMBkg%40mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
