From: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>
To: "'Chris Bitmead'" <chrisb(at)nimrod(dot)itg(dot)telstra(dot)com(dot)au>
Cc: pgsql-hackers(at)hub(dot)org
Subject: RE: postgres 7.2 features.
Date: 2000-07-11 18:19:05
Message-ID: 8F4C99C66D04D4118F580090272A7A23018C50@SECTORBASE1
Lists: pgsql-hackers
> The bottom line is that the original postgres time-travel
> implementation was totally cost-free.
I disagree. I can't consider an additional 8+ bytes per tuple, plus
pg_time (4 bytes per transaction; remember that people complain
even about pg_log, at 2 bits per transaction), as
totally cost-free for a half-useful built-in feature used
by maybe 10% of users.
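The per-item sizes above (8 bytes per tuple, 4 bytes per transaction in pg_time, 2 bits per transaction in pg_log) come straight from the mail; a back-of-envelope sketch with made-up workload numbers shows why the cost is not negligible:

```python
# Back-of-envelope for the storage costs under discussion.
# The per-item sizes are from the mail; the workload numbers
# (tuple and transaction counts) are invented for illustration.
tuples = 100_000_000        # hypothetical table size
transactions = 50_000_000   # hypothetical transaction count

tuple_overhead = tuples * 8              # 8 extra bytes per tuple (two commit times)
pg_time_size   = transactions * 4        # 4 bytes per transaction
pg_log_size    = transactions * 2 // 8   # 2 bits per transaction

mib = 1024 * 1024
print(f"tuple headers: {tuple_overhead // mib} MiB")  # 762 MiB
print(f"pg_time:       {pg_time_size // mib} MiB")    # 190 MiB
print(f"pg_log:        {pg_log_size // mib} MiB")     # 11 MiB
```

Note the asymmetry Vadim points at: the pg_log cost that users already complain about is two orders of magnitude smaller than the per-tuple commit-time overhead.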
Note that I'm not talking about overwriting/non-overwriting smgr at all!
That's not the issue. There are no problems with keeping dead tuples in files
as long as required. When I talked about a new smgr I meant the ability to re-use
space without vacuum and to store more than one table per file.
But I object to storing transaction commit times in the tuple header and
to the old pg_time design. If you want to do TT - welcome... but make
it optional, without affecting those who don't need TT.
> Actually it may have even speeded things up since vacuum would have
> less work to do.
This would make only *TT users* happy -:)
> Can you convince me that triggers can compare anywhere near for
> performance?
No, they can't. But this is bad only for *TT users* -:)
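The trigger-based alternative Vadim keeps referring to can be modeled in miniature: on every UPDATE, a "trigger" archives the old row version with a validity interval, and a time-travel query scans current rows plus history. This is only a toy sketch of the idea in Python; a real implementation would be a PostgreSQL trigger, and all names here are invented.

```python
import time

# Toy model of trigger-based time travel: a per-table "trigger"
# copies the old version of a row into a history list on every
# UPDATE, stamping a validity interval [valid_from, valid_to).
class TTTable:
    def __init__(self):
        self.rows = {}        # pk -> current value
        self.history = []     # (pk, value, valid_from, valid_to)
        self.valid_from = {}  # pk -> when the current version became valid

    def update(self, pk, value, now=None):
        now = time.time() if now is None else now
        if pk in self.rows:
            # the "trigger": archive the old version before overwriting
            self.history.append((pk, self.rows[pk], self.valid_from[pk], now))
        self.rows[pk] = value
        self.valid_from[pk] = now

    def as_of(self, pk, t):
        """Time-travel query: the value of pk as of time t."""
        if pk in self.rows and self.valid_from[pk] <= t:
            return self.rows[pk]
        for p, v, t0, t1 in self.history:
            if p == pk and t0 <= t < t1:
                return v
        return None

t = TTTable()
t.update("emp1", "salary=100", now=10.0)
t.update("emp1", "salary=120", now=20.0)
print(t.as_of("emp1", 15.0))  # salary=100
print(t.as_of("emp1", 25.0))  # salary=120
```

The performance cost Chris objects to is visible even here: every update does extra writes to the history store, which is exactly the price Vadim says only TT users should pay.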
> I can't see how. All I'm asking is don't damage anything that is in
> postgres now that is relevant to time-travel in your quest for WAL....
It's not related to WAL!
Though... with WAL, pg_log is not required to be permanent: we could re-use
transaction IDs after a db restart... Well, it seems we can handle this.
> > With the original TT:
> >
> > - you are not able to use indices to fetch tuples on time base;
>
> Sounds not very hard to fix..
Really? Commit time is unknown until commit, so you would have to insert
index tuples just before commit... and how would you know what to insert?
> > - you are not able to control tuples life time;
>
> From the docs... "Applications that do not want to save
> historical data can specify a cutoff point for a relation.
> Cutoff points are defined by the discard command"
I meant another thing: when I have to deal with history,
I sometimes need to change historical dates (c) -:))
Probably we can handle this as well, just some additional
complexity -:)
> > - you have to store commit time somewhere;
>
> Ok, so?
Space.
> > - you have to store additional 8 bytes for each tuple;
>
> A small price for time travel.
Not for those who aren't going to use TT at all.
The lower performance of a trigger implementation is a smaller price for me.
> > - 1 sec could be tooo long time interval for some uses of TT.
>
> So someone in the future can implement finer grains. If time travel
> disappears this option is not open.
It stays open, with triggers -:)
As well as Colour-Travel and all other travels -:)
> > And, btw, what could be *really* very useful is TT +
> > the referential integrity feature. How could it be implemented without
> > triggers?
>
> In what way does TT not have referential integrity? As long as the
> system assures that every transaction writes the same timestamp to all
> tuples then referential integrity continues to exist.
The same tuple of a table with a PK may be updated many times, by many
transactions, within 1 second. With a 1-sec grain you would read *many*
historical tuples with the same PK, all valid at the same time. So we
need "finer grains" right now...
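The ambiguity Vadim describes is easy to demonstrate: with whole-second commit timestamps, several versions of the same primary key committed within one second become indistinguishable to an "as of" query. A small sketch (all values invented):

```python
# The 1-second-granularity problem: if commit time is stored with
# whole-second resolution, several versions of the same primary key
# committed inside one second get identical validity stamps, and a
# time-travel query can no longer pick a unique version.
versions = []
for i, commit_time in enumerate([100.2, 100.5, 100.9]):  # 3 updates, same second
    stamp = int(commit_time)  # 1-second grain: all three stamps become 100
    versions.append((f"v{i}", stamp))

as_of = 100
matching = [v for v, s in versions if s <= as_of]
print(matching)  # ['v0', 'v1', 'v2'] -- three versions "valid" at once
```

With sub-second (or transaction-ordered) stamps, each version would carry a distinct validity point and the query would be unambiguous, which is why finer grains matter for referential integrity over history.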
> > Imho, triggers can give you much more flexible and useful TT...
> >
> > Also note that TT was removed from Illustra and authors wrote that
> > built-in TT could be implemented without non-overwriting smgr.
>
> Of course it can be, but can it be done anywhere near as efficiently?
But without losing efficiency where TT is not required.
> > > > It was mentioned here that triggers could be used for async
> > > > replication, as well as WAL.
> > >
> > > Same story. Major inefficiency. Replication is tough enough without
> > > mucking around with triggers. Once the trigger executes you've got
> > > to go and store the data in the database again anyway. Then figure
> > > out when to delete it.
> >
> > What about reading WAL to get and propagate changes? I
> > don't think that reading tables will be more efficient and, btw,
> > how to know what to read (C) -:) ?
>
> Maybe that is a good approach, but it's not clear that it is the best.
> More research is needed. With the no-overwrite storage manager there
> exists a mechanism for deciding how long a tuple exists and this
> can easily be tapped into for replication purposes. Vacuum could
This "mechanism" (just an additional field in pg_class) can be used
for WAL-based replication as well.
> serve two purposes of vacuum and replicate.
Vacuum is already slow; better to make it faster, not even slower...
I see vacuum as an *optional* command someday... when we'll be able to
re-use space.
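The WAL-reading approach Vadim proposes for replication can also be sketched in miniature: instead of firing triggers or re-scanning tables, a replicator filters the log stream for committed changes to subscribed tables. The record layout and names below are invented for illustration, not real PostgreSQL structures.

```python
# Toy sketch of WAL-based change propagation: replication reads the
# write-ahead log rather than firing triggers or scanning tables.
from collections import namedtuple

WalRecord = namedtuple("WalRecord", "xid table op row")

wal = [
    WalRecord(1, "emp", "INSERT", {"id": 1, "name": "ann"}),
    WalRecord(1, "emp", "UPDATE", {"id": 1, "name": "anna"}),
    WalRecord(2, "dept", "INSERT", {"id": 10, "name": "r&d"}),
]
committed = {1}  # xid 2 never committed, so its changes must be skipped

def changes_to_replicate(wal, committed, tables):
    """Propagate only committed changes for the subscribed tables."""
    return [r for r in wal if r.xid in committed and r.table in tables]

out = changes_to_replicate(wal, committed, {"emp"})
print([r.op for r in out])  # ['INSERT', 'UPDATE']
```

This also answers the "how to know what to read" question for the table-scanning alternative: the WAL is an ordered stream of exactly the changes that happened, with no deletion bookkeeping of the kind trigger-maintained queues need.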
Vadim