Re: Is there a way to run heap_insert() AFTER ExecInsertIndexTuples() ?

From: "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>
To: Zoltan Boszormenyi <zboszor(at)dunaweb(dot)hu>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Is there a way to run heap_insert() AFTER ExecInsertIndexTuples() ?
Date: 2007-03-01 11:13:04
Message-ID: 45E6B540.8060904@phlo.org
Lists: pgsql-hackers

Zoltan Boszormenyi wrote:
> Florian G. Pflug wrote:
>> Zoltan Boszormenyi wrote:
>>> The GENERATED column is an ease-of-use feature
>>> that possibly means less work, whereas the IDENTITY
>>> column is mandatory for some applications (e.g. accounting
>>> and billing are stricter in some countries) where you simply
>>> cannot skip a value in the sequence; strict monotonicity is
>>> not enough.
>>
>> But just postponing nextval() until after the uniqueness checks
>> only decreases the *probability* of non-monotonic values, and
>> *does not* prevent them. Consider two transactions
>>
>> A: begin ;
>> B: begin ;
>> A: insert ... -- IDENTITY generates value 1
>> B: insert ... -- IDENTITY generates value 2
>> A: rollback ;
>> B: commit ;
>
> I can understand that. But your example is faulty:
> you can't have a transaction inside a transaction.
> Checkpoints are another story. 8-)

A: and B: are meant to denote *two* *different*
transactions running concurrently.
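
To make the interleaving concrete, here is a minimal Python sketch of it.
It is not PostgreSQL code: it only assumes that the IDENTITY value comes
from a counter that, like nextval() on a sequence, is not rolled back
together with the transaction. The names Sequence, Transaction and table
are hypothetical and exist only for this illustration.

```python
class Sequence:
    """Counter that is never rolled back (assumed to behave like nextval())."""

    def __init__(self):
        self.last = 0

    def nextval(self):
        self.last += 1
        return self.last


class Transaction:
    """Buffers generated values; they reach the table only on commit."""

    def __init__(self, seq, table):
        self.seq = seq
        self.table = table
        self.pending = []

    def insert(self):
        value = self.seq.nextval()   # IDENTITY value is drawn here
        self.pending.append(value)
        return value

    def commit(self):
        self.table.extend(self.pending)
        self.pending = []

    def rollback(self):
        self.pending = []            # the drawn value is not given back


seq, table = Sequence(), []
a = Transaction(seq, table)          # A: begin
b = Transaction(seq, table)          # B: begin
a.insert()                           # A: insert, gets value 1
b.insert()                           # B: insert, gets value 2
a.rollback()                         # A: rollback
b.commit()                           # B: commit
print(table)                         # [2]
```

Only the value 2 is committed, so the value 1 is missing no matter how
late within its own statement A drew it.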

> You can use some application tricks to
> get a continuous sequence today with regular
> serials, but only if you don't have a unique index
> that doesn't use the serial column. You insert
> a record into that table outside the transaction,
> making note of the serial value.
>
> If subsequent processing fails (because of a unique or
> check constraint, etc.) you have to go back to the main
> table and modify the record, indicating that the record
> doesn't represent valid data. But you must keep it with
> the serial value it was assigned. I have seen systems
> requiring this. My point is that with the identity
> column, you will be able to define a unique index
> on the table that excludes the identity column.

Yes, of course you can prevent gaps by just filling them
with garbage/invalid records or whatever. But I don't see
why this is useful - either you want, say, your invoice
numbers to be continuous because it's required by law - or
you don't. But if the law requires your invoice numbers to be
continuous, surely just filling the gaps with fake invoices
is just as illegal as having gaps in the first place.

>> I agree that it'd be nice to generate the identity columns as late as
>> possible to prevent needless gaps, but not if the price is a far more
>> intrusive patch, or much higher complexity.
>
> Intrusive, hm? The catalog has to indicate that the column
> is IDENTITY, otherwise you cannot know it.
>
> The cost I am thinking of now is an extra heap_update()
> after doing heap_insert() without generating the identity value
> and inserting index tuples into the indexes that don't
> contain the identity column.

I'll have to admit that I haven't actually looked at your patch -
so sorry if I misunderstood things. I got the impression that
Tom's main complaint was that you are shuffling too much existing code
around in your patch, and I figured that this is partly because you
try to generate the IDENTITY value as late as possible. Since doing
this won't prevent gaps, but just reduces the probability of creating
them, I thought that a way around Tom's concerns might be to drop
that requirement.
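
The "reduces the probability, but does not prevent" point can be sketched
as a small, purely sequential Python model. This is an illustration under
stated assumptions, not the patch's logic: the function run(), its
parameters and the workload are hypothetical, each insert is its own
transaction, and a rolled-back transaction never returns its value to the
counter.

```python
import itertools


def run(inserts, late):
    """Return the committed identity values for a list of (key, commits) pairs.

    late=False draws the value before the uniqueness check,
    late=True draws it only after the check has passed.
    """
    counter = itertools.count(1)
    seen_keys = set()
    committed = []
    for key, commits in inserts:
        if not late:
            ident = next(counter)    # drawn early, lost on a unique violation
        if key in seen_keys:
            continue                 # unique violation: the insert fails
        if late:
            ident = next(counter)    # drawn only once the check has passed
        if commits:
            seen_keys.add(key)
            committed.append(ident)
        # on rollback the drawn value is lost in both modes
    return committed


# One duplicate key ("a" twice) and one transaction that rolls back ("b").
workload = [("a", True), ("a", True), ("b", False), ("c", True)]
print(run(workload, late=False))     # [1, 4]
print(run(workload, late=True))      # [1, 3]
```

Drawing the value late removes the gap caused by the duplicate key, but
the gap caused by the rolled-back transaction remains in both modes.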

I will shut up now, at least until I have read the patch ;-)

greetings, Florian Pflug
