Re: Implementing cost limit/delays for insert/delete/update/select

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Peter Schuller" <peter(dot)schuller(at)infidyne(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Implementing cost limit/delays for insert/delete/update/select
Date: 2008-08-25 21:39:54
Message-ID: 87iqtoai51.fsf@oxford.xeocode.com
Lists: pgsql-hackers

"Peter Schuller" <peter(dot)schuller(at)infidyne(dot)com> writes:

> Hello,
>
> I'd like to have a stab at implementing cost delays, for regular
> INSERT/DELETE/UPDATE/SELECT. The motivation is roughly the same as for
> VACUUM and the autovacuum limits; one may have application specific
> bulk operations that need executing without adversely affecting
> latency/throughput of other operations.
>
> I tentatively call this executing statements "nicely". A better naming
> scheme might be a good idea...
>
> The idea would be to introduce two GUC variables:
>
> - nice_cost_limit
> - nice_cost_delay

I think the experience with vacuum was that cost_delay was a mistake. The only
parameter users really ought to be messing with is cost_limit. Every time a
user has posted about vacuum taking interminably long it was because they set
*_cost_delay to something unreasonable. I suppose this could be selection bias
since we would never hear about users who didn't set it unreasonably high.

But I think we should consider removing the {auto,}vacuum_cost_delay parameter
or at least hiding and undocumenting it. It's a foot-gun and serves no useful
purpose that merely lowering the {auto,}vacuum_cost_limit can't serve equally
well.
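
To put rough numbers on that (assuming the stock cost factors and counting
every access as a miss): with a cost limit of 200, a 20ms delay, and
vacuum_cost_page_miss = 10, the ceiling is 20 page reads per nap, i.e. about
1,000 reads a second, or on the order of 8MB/s with 8kB blocks. Halving the
limit to 100 or doubling the delay to 40ms both land you at roughly 4MB/s --
the second knob buys nothing except a longer stall each time we do sleep.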

> Which would be equivalent to their vacuum_* counterparts.
>
> Upon executing an INSERT, UPDATE, DELETE or SELECT, one would
> optionally specify a "NICELY" modifier to enable nice cost limits for
> that statement. For example:
>
> DELETE NICELY FROM large_table WHERE id < 50000000

Why not just have the GUC and leave it at that?

SET nice_cost_limit = ...
DELETE FROM ...
SET nice_cost_limit = ...
UPDATE ...
...
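
(And since it's just a GUC you get SET LOCAL for free: a bulk job can run
SET LOCAL nice_cost_limit = ... inside its transaction and have the setting
revert automatically at commit, with no cleanup needed.)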

> In the future I foresee also specifying a nice multiplier of some
> kind, thus supporting variable niceness on a per-statement basis.

Yeah, actually I think both the vacuum parameters and these proposed ones need
some further thought. They don't represent any real-world quantity the user
has any hope of knowing how to set except by trial and error.

I think we would be better off with something like a vacuum_io_bandwidth_cap.
Then the user has a hope of understanding what kind of numbers make sense.
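
Just to convince myself the two are interconvertible, here is a toy
translation from a bandwidth cap back into the existing cost machinery (the
name and the every-access-is-a-miss assumption are made up for the example;
this is not code from the tree):

#include <stdio.h>

#define BLCKSZ          8192    /* bytes per page */
#define COST_PAGE_MISS  10      /* vacuum_cost_page_miss default */

/* cost budget per nap that yields roughly the requested bandwidth */
static int
cost_limit_for_bandwidth(double mb_per_sec, int delay_ms)
{
    double pages_per_sec = mb_per_sec * 1024.0 * 1024.0 / BLCKSZ;
    double pages_per_nap = pages_per_sec * delay_ms / 1000.0;

    return (int) (pages_per_nap * COST_PAGE_MISS + 0.5);
}

int
main(void)
{
    /* e.g. cap a bulk job at 4MB/s with a 20ms nap */
    printf("cost_limit = %d\n", cost_limit_for_bandwidth(4.0, 20));
    return 0;
}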

> * Adding the GUC variables

> * Modifying the parser slightly to support the NICELY "modifier"
> (terminology?)

As I mentioned I don't think this is necessary.

> * Modify ExecPlan in backend/executor/execMain.c to contain accounting
> initialization and cleanup like backend/commands/vacuum.c's vacuum().
> * Create an equivalent of the vacuum_delay_point() and call it in each
> loop iteration in ExecPlan().

ExecutePlan? That's not often enough. You can easily construct plans that do
massive sequential scans on the inner side of a join or in a subquery -- all
of which happens before a single record is returned from ExecutePlan.
You would have to test for whether it's time to sleep much more often.
Possibly before every ExecProcNode call would be enough.

Even then you have to worry about the i/o and cpu resources used by tuplesort.
And there are degenerate cases where a single ExecProcNode call could do a lot
of i/o, such as a large scan looking for a single matching record.
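
To make the placement concrete, here is a standalone toy of what the
accounting might look like (the nice_* names are Peter's proposal / invented
here; the real thing would use pg_usleep() and the GUC machinery, the
charging would happen in the buffer manager, and the delay point itself would
have to be reachable from inside the scan and sort loops, like vacuum's is,
or the degenerate cases above never nap):

#include <stdio.h>
#include <unistd.h>

static int nice_cost_limit   = 200; /* budget per nap, in cost units */
static int nice_cost_delay   = 20;  /* nap length in ms */
static int nice_cost_balance = 0;   /* accumulated since the last nap */

/* charge for one page access; would be called from the buffer manager */
static void
nice_cost_charge(int cost)
{
    nice_cost_balance += cost;
}

/*
 * The analogue of vacuum_delay_point(), called before each ExecProcNode.
 * Returns whether it slept, just so the toy below can report.
 */
static int
nice_delay_point(void)
{
    if (nice_cost_balance < nice_cost_limit)
        return 0;
    usleep(nice_cost_delay * 1000L);
    nice_cost_balance = 0;
    return 1;
}

int
main(void)
{
    int i, naps = 0;

    /* pretend each tuple costs one page miss (10 units) */
    for (i = 0; i < 100; i++)
    {
        naps += nice_delay_point();
        nice_cost_charge(10);
    }
    printf("processed 100 tuples, napped %d times\n", naps);
    return 0;
}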

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
