Re: drop/truncate table sucks for large values of shared buffers

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: drop/truncate table sucks for large values of shared buffers
Date: 2015-07-12 12:16:38
Message-ID: CAA4eK1L=HDFy_Uwv2J9DQ0tFedDxxLuG8N+mis3kq2NEeLDQUw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jul 2, 2015 at 7:33 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > On 2 July 2015 at 14:08, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> >> I'm marking this as "returned with feedback" in the commitfest.
>
> > There are no unresolved issues with the approach, nor is it true it is
> > slower. If you think there are some, you should say what they are, not
> > act high handedly to reject a patch without reason.
>
> Have you read the thread?

I have read the thread again with a focus on the various problems
(objections) raised for this patch, and here is a summary of
my reading.

1. lseek can lie about EOF, which could lead to a serious problem
during checkpoint if drop table fails to remove the shared buffers
belonging to the table being dropped.

This problem can be solved by maintaining a list of dropped relations
and then, during checkpoint, cleaning such buffers as part of the
buffer-pool scan. Something similar is already used to avoid similar
problems during FSync.
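
As a rough illustration of the idea above (not the patch itself), here
is a toy model in Python. The Buffer class, the checkpoint function and
the write_block callback are hypothetical names invented for this
sketch; locking and concurrency are ignored entirely. The checkpoint
scan discards any buffer whose relation is in the dropped list instead
of trying to write it out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Buffer:
    # Hypothetical, simplified buffer header; None means "unused".
    rel_id: Optional[int]
    block_no: int = 0
    dirty: bool = False


def checkpoint(pool, dropped_rels, write_block):
    """Scan the whole pool once.

    Buffers of dropped relations are discarded without being written;
    other dirty buffers are flushed through write_block(rel_id, block_no).
    Returns (flushed, discarded).
    """
    flushed = 0
    discarded = 0
    for buf in pool:
        if buf.rel_id is None:
            continue
        if buf.rel_id in dropped_rels:
            # Leftover buffer that DROP TABLE missed: invalidate it.
            buf.rel_id = None
            buf.dirty = False
            discarded += 1
        elif buf.dirty:
            write_block(buf.rel_id, buf.block_no)
            buf.dirty = False
            flushed += 1
    # In this toy model a full scan leaves no stale buffers behind,
    # so the list of dropped relations can be emptied.
    dropped_rels.clear()
    return flushed, discarded
```

The point of the sketch is only that a missed buffer becomes harmless:
the checkpoint never attempts to write a block of a file that no longer
exists.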

2. Patch doesn't cater to DROP TABLESPACE and DROP DATABASE
operations.

It would have been better if there were a simpler solution for these
operations as well. But even if we come up with something generic
enough to avoid the problems for these operations, I see no reason why
it can't easily be adopted for the Drop Table operation, as the changes
proposed by this patch are very much localized to one function (which
we would have to change anyway, even without the patch, if we come up
with a generic mechanism). The other changes required to avoid
problem 1 (the lseek problem) would still be required even when we
have a patch for the generic approach ready. As mentioned by Andrew,
another thing to note is that these operations are much less used
compared to Drop/Truncate Table, so I think optimizing them
is of somewhat lower priority.

3. Can't close-and-open a file (to avoid lseek lying about EOF or
otherwise), as that might lead to a failure if there is a flush
operation for the file in parallel.

I haven't checked this, but I think we can find some way
to check whether the vm and fsm files exist before checking the number
of blocks for those files.

Apart from the above, Heikki mentioned an overflow in the calculation
of the total number of blocks, which I think is a relatively simple
problem to fix.
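
To make the overflow concern concrete, here is a small sketch in
Python. It assumes the per-fork block counts are 32-bit unsigned values
(as PostgreSQL's BlockNumber is) and that the total is obtained by
adding the forks together; the function names are hypothetical and the
32-bit wraparound is simulated with a mask. It is only meant to show
the failure mode and one possible fix (a wider accumulator), not the
code in the patch.

```python
UINT32_MASK = 0xFFFFFFFF


def total_blocks_32bit(fork_blocks):
    """Sum block counts in a simulated 32-bit unsigned accumulator."""
    total = 0
    for n in fork_blocks:
        # The mask mimics the wraparound of unsigned 32-bit arithmetic.
        total = (total + n) & UINT32_MASK
    return total


def total_blocks_wide(fork_blocks):
    """Sum block counts in an accumulator wide enough not to wrap."""
    return sum(fork_blocks)


# Hypothetical sizes for the main, fsm and vm forks of a huge relation.
forks = [0xFFFFFFF0, 0x20, 0x10]
wrapped = total_blocks_32bit(forks)
exact = total_blocks_wide(forks)
```

With these made-up sizes the 32-bit sum wraps around to a tiny number,
so any decision based on comparing the total against a threshold would
be made on the wrong value.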

> There were plenty of objections, as well as
> a design for a better solution.

I think by better solution here you mean the radix-based approach or
something similar. First, I don't see any clear design for it, and
second, even if tomorrow we have a patch for it ready, it's not very
difficult to replace the proposed solution even after it is committed,
as the changes are very much localized to one function.

> In addition, we should think about
> more holistic solutions such as Andres' nearby proposal (which I doubt
> will work as-is, but maybe somebody will think of how to fix it).
> Committing a band-aid will just make it harder to pursue real fixes.
>

Right, but OTOH waiting for a long time to have something much
more generic doesn't sound wise either, especially when we can replace
the proposed solution with the generic one without much difficulty.

Having said the above, I am not wedded to working on this idea, so if
you and/or others have no inclination for work in this direction, then
I will stop, and let's wait for the day when we have a clear idea for
some generic way.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
