Re: update on TOAST status'

From: JanWieck(at)t-online(dot)de (Jan Wieck)
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Jan Wieck <JanWieck(at)Yahoo(dot)com>, PostgreSQL HACKERS <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: update on TOAST status'
Date: 2000-07-07 00:05:07
Message-ID: 200007070005.CAA20517@hot.jw.home
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
> Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> writes:
> >> - VACUUM doesn't propagate ANALYZE to TOAST tables.
> >> Statistics for them are needless because the toast access
> >> is allways hardcoded indexed.
>
> > I don't think statistics are insignificant for TOASTed columns.
>
> He didn't say that! I think what he meant is that there's no need for
> statistics associated with the TOAST table itself, and AFAICS that's OK.
>
> BTW, I have thought of a potential problem with indexes on toasted
> columns. As I understand Jan's current thinking, the idea is
>
> 1. During storage of the tuple in the main table, any oversize fields
> get compressed/moved off.
>
> 2. The toasted item in the finished main tuple gets handed to the index
> routines to be stored in the index.

Right.

> Now, storing the toasted item in the index tuple seems fine, but what
> I do not like here is the implication that all the comparisons needed
> to find where to *put* the index tuple are done using a pretoasted
> value. That seems to imply dozens of redundant decompressions/fetches,
> another one for each key comparison we have to do.

Dozens - right.

I just did a little gdb session tracing a

SELECT ... WHERE toasted = 'xxx'

The table has 151 rows and an index on 'toasted'. It needed 6
fetches of the attribute. Better than good, because 2^6 is
only 64, so btree did a perfect job. Anyhow, in the case of a
real TOASTed (read burned) value, it'd mean 6 index scans to
recreate the on disk stored representation plus 6
decompression loops to get the plain one to compare against.
What the hell would an "IN (SELECT ...)" cause?

> Jan, do you have a way around this that I missed?
>
> One simple answer that might help for other scenarios too is to keep
> a small cache of the last few values that had to be untoasted. Maybe
> we only need it for moved-off values --- it could be that decompression
> is fast enough that we should just do it over rather than trying to
> cache.

I'm still argueing that indexing huge values is a hint for a
misleading schema. If this is true, propagating toasted
attributes into indices is a dead end street and I'd have to
change the heap-access<->toaster interface so that the
modified (stored) main tuple isn't visible to the following
code (that does the index inserts).

What is the value of supporting index tuples >2K? Support of
braindead schemas? I can live withoout it!

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Sand 2000-07-07 00:15:48 Re: Lessons learned on how to build 7.0.2 on AIX 4.x
Previous Message Ed Loehr 2000-07-06 23:37:37 Re: Memo on coding practices: strcmp() does not yield bool