Re: Truncate if exists

From: Christopher Browne <cbbrowne(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Truncate if exists
Date: 2012-10-10 15:47:04
Message-ID: CAFNqd5XYkjq5T5YQXxw=B76qmSnps_h5Hqb7KdBeNKN2V1rNMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 10, 2012 at 3:32 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 10 October 2012 02:10, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Tue, Oct 9, 2012 at 4:18 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> The second is for making deployment scripts idempotent. For example,
>>> say you have script A which creates table "josh", and script B which
>>> needs table "josh" to be empty, if present. Since the two scripts are
>>> tied to different database features, and you don't know which one will
>>> be deployed first, it's useful to have TRUNCATE IF EXISTS. Yes, you can
>>> solve that problem with DO, but why make users go to the extra effort?
>>
>> Hmm. That's an interesting point. I think we're currently in
>> somewhat of a limbo zone about where we ought to have IF EXISTS and IF
>> NOT EXISTS options, and where we should not. Really, I'd like to
>> figure out what policy we want to have, and then go make everything
>> work that way. I don't exactly know what the policy should be, but if
>> we don't have one then we're going to have to argue about every patch
>> individually, which is already getting to be more than tedious. At
>> the one extreme, you have Tom, who probably would not have added any
>> of these given his druthers; at the other extreme, there are probably
>> some people who would say we ought to have this for every command in
>> the book, right down to INSERT IF EXISTS (and, hey, why not INSERT OR
>> CREATE for good measure?). I'm not sure what the right thing to do
>> is... but we should probably come up with some consensus position we
>> can all live with, and then go make this uniform[1].
>
> Damn it, now I have an opinion.
>
> I would say two things:
>
> 1) Consistency for DDL syntax is important. Sometimes humans still
> write SQL and often, ORMs generate SQL. Asking poeple to guess what
> our syntax is from release to release is a good way to have people not
> bother to support us properly. As Peter says, Truncate is not DDL (and
> argument I have used), but it is often used alongside DDL and does
> have many of the same characteristics. INSERT IF EXISTS is simply an
> argument ad absurdum, not a requirement that needs to be addressed.

I think I agree. We should not go down the "well, we haven't got
UPSERT yet, and that's why we shouldn't do this one" road.

> Clearly, rollout scripts benefit from not throwing errors.
> Personally I would prefer setting SET ddl_abort_on_missing_object =
> false; at the top of a script than having to go through every SQL
> statement and add extra syntax. That might even help people more than
> littering SQL with extra clauses.

Here, I'm rather less comfortable.

I could easily take the opposite tack, that rollout scripts benefit
from yes, indeed, throwing errors, so that inconsistencies get
rectified. I don't want to take that argument *too* far, mind you.

Doing things that "avoid throwing errors" isn't purely a good thing.
If a DDL script is doing genuinely different things when running in
different environments, it's difficult to be confident that the result
is correct in all cases. Hiding errors might lead to ignoring
important differences.

Given two further bits of "processing model," I might be made more
comfortable...

1. A direction we're trying to go is to have good comparison tools to
see where schemas differ between environments. (I need to poke at
getting a tool I call "pgcmp" released publicly.) If you have the
capability to compare the starting schema against what you imagined it
was supposed to be, as well as to compare the post-rollout schema
against what it was supposed to become, then that keeps things
relatively under control. If you can quickly determine divergence
from expected schema, then you can more easily keep on track.

2. [More on the SQL syntax/clauses front] In order to NOT litter the
SQL with extra clauses, I expect that there needs to be something of a
model of How You SHOULD Update Your Schema, in effect, some idiomatic
'best practice' that tends to cut risk and diminish the need for IF
EXISTS/IF NOT EXISTS. I'd be interested to see an outline of that
model.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2012-10-10 15:55:12 Re: September 2012 commitfest
Previous Message Alvaro Herrera 2012-10-10 15:44:47 Re: [bugfix] sepgsql didn't follow the latest core API changes