Re: Implementing cost limit/delays for insert/delete/update/select

From: Peter Schuller <peter(dot)schuller(at)infidyne(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Implementing cost limit/delays for insert/delete/update/select
Date: 2008-08-25 21:59:37
Message-ID: 20080825215936.GA2658@hyperion.scode.org
Lists: pgsql-hackers

Btw, I forgot to mention in my original post that one interesting use
case which is not application-specific is applying cost limits to
pg_dump. That would be one of the final goals for me.

> I think the experience with vacuum was that cost_delay was a mistake. The only
> parameter users really ought to be messing with is cost_limit. Every time a
> user has posted about vacuum taking interminably long it was because they set
> *_cost_delay to something unreasonable. I suppose this could be selection bias
> since we would never hear about users who didn't set it unreasonably high.
>
> But I think we should consider removing the {auto,}vacuum_cost_delay parameter
> or at least hiding and undocumenting it. It's a foot-gun and serves no useful
> purpose that merely lowering the {auto,}vacuum_cost_limit can't serve equally
> well.

Sounds sensible to me. I included nice_cost_delay in this case to
remain consistent with the others.

> > DELETE NICELY FROM large_table WHERE id < 50000000
>
> Why not just have the GUC and leave it at that?
>
> SET nice_cost_limit = ...
> DELETE FROM ...
> SET nice_cost_limit = ...
> UPDATE ...
> ...

Sounds a lot cleaner than introducing new syntax, yes.

Leaving it as a GUC only does mean the submitter must choose a
concrete value and cannot simply indicate "whatever the administrator
chose as sensible". Perhaps a separate boolean cost_limit flag could
allow just turning throttling on, without specifying actual limits?
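
For concreteness, a completely hypothetical sketch of what I mean
(none of these GUCs exist; the names are only the ones used in this
thread, not existing PostgreSQL code):

    #include <stdbool.h>

    /* Hypothetical GUCs, names from this thread only. */
    bool nice_enabled = false;           /* the proposed boolean: "just turn it on" */
    int  nice_cost_limit = 0;            /* optional explicit per-session limit */
    int  default_nice_cost_limit = 200;  /* value the administrator deemed sensible */

    static int
    effective_nice_cost_limit(void)
    {
        if (nice_cost_limit > 0)
            return nice_cost_limit;          /* submitter picked an explicit value */
        if (nice_enabled)
            return default_nice_cost_limit;  /* "whatever the administrator chose" */
        return 0;                            /* 0 = no throttling at all */
    }

Whatever charges the costs would then consult
effective_nice_cost_limit() rather than the raw per-session setting.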

> I think we would be better off with something like a vacuum_io_bandwidth_cap
> or something like that. Then the user has a hope of understanding what kind of
> numbers make sense.

Another option might be to give page_miss and friends an actual
meaningful unit, such as the expected worst-case "device time"
required to perform the I/O operation (tweaked on a per-device
basis). One could then specify the maximum vacuum cost as a
percentage of real time spent on vacuum-related I/O.
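
As a made-up example of how the arithmetic would work: if a page miss
is charged, say, 8 ms of worst-case device time and the administrator
allows 10% of real time for vacuum I/O, then after accumulating C ms
of device time the backend would sleep C * (1/0.10 - 1) ms. A small
sketch with invented numbers:

    #include <stdio.h>

    int
    main(void)
    {
        double page_miss_ms = 8.0;   /* assumed worst-case device time per page miss */
        double io_fraction  = 0.10;  /* administrator's cap: 10% of real time on I/O */
        double cost_ms = 100 * page_miss_ms;  /* device time charged since last sleep */

        /* Sleep long enough that I/O stays at io_fraction of wall-clock time. */
        double sleep_ms = cost_ms * (1.0 / io_fraction - 1.0);

        printf("charged %.0f ms of device time -> sleep %.0f ms\n",
               cost_ms, sleep_ms);
        return 0;
    }

So 100 page misses (800 ms of assumed device time) would be followed
by a 7200 ms sleep to stay within the 10% budget.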

> ExecutePlan? That's not often enough. You can easily construct plans that do
> massive sequential scans on the inner side of a join or in a subquery -- all
> of which happens before a single record is returned from ExecutePlan.
> You would have to test for whether it's time to sleep much more often.
> Possibly before every ExecProcNode call would be enough.
>
> Even then you have to worry about the i/o and cpu resources used by
> tuplesort. And there are degenerate cases where a single ExecProcNode could do
> a lot of i/o such as a large scan looking for a single matching record.

Ok - I obviously need to look at these parts more carefully. Thanks
for the feedback!
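
For my own understanding, I imagine the check you describe would look
roughly like vacuum_delay_point(), called before each ExecProcNode()
invocation (and presumably also from tuplesort and similar
long-running loops). A very rough sketch, entirely hypothetical and
not existing code:

    #include "postgres.h"
    #include "miscadmin.h"          /* for pg_usleep() */

    /* Hypothetical counters; the buffer manager would charge into the
     * balance, much like VacuumCostBalance is charged today. */
    static int nice_cost_balance  = 0;
    static int nice_cost_limit    = 200;  /* 0 would mean "no throttling" */
    static int nice_cost_delay_ms = 20;

    /* To be called before every ExecProcNode() call, and from other
     * potentially long-running loops such as tuplesort. */
    static void
    nice_delay_point(void)
    {
        if (nice_cost_limit > 0 && nice_cost_balance >= nice_cost_limit)
        {
            pg_usleep(nice_cost_delay_ms * 1000L);
            nice_cost_balance = 0;
        }
    }

Presumably the degenerate cases you mention (a long scan looking for
a single matching record) are only covered if the charging and the
check also happen below the executor node level.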

--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter(dot)schuller(at)infidyne(dot)com>'
Key retrieval: Send an E-Mail to getpgpkey(at)scode(dot)org
E-Mail: peter(dot)schuller(at)infidyne(dot)com Web: http://www.scode.org
