Re: estimating # of distinct values

From: Jim Nasby <jim(at)nasby(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: estimating # of distinct values
Date: 2011-01-18 17:23:29
Message-ID: E132F02B-FB5C-48C4-B52D-EA018D9636F4@nasby.net
Lists: pgsql-hackers

On Jan 17, 2011, at 8:11 PM, Robert Haas wrote:
> On Mon, Jan 17, 2011 at 7:56 PM, Jim Nasby <jim(at)nasby(dot)net> wrote:
>> - Forks are very possibly a more efficient way to deal with TOAST than having separate tables. There's a fair amount of overhead we pay for the current setup.
>
> That seems like an interesting idea, but I actually don't see why it
> would be any more efficient, and it seems like you'd end up
> reinventing things like vacuum and free space map management.

The FSM would take some effort, but I don't think vacuum would be that hard to deal with; you'd just have to free up the space in any referenced toast forks at the same time that you vacuumed the heap.

>> - Dynamic forks would make it possible to do a column-store database, or at least something approximating one.
>
> I've been wondering whether we could do something like this by
> treating a table t with columns pk, a1, a2, a3, b1, b2, b3 as two
> tables t1 and t2, one with columns pk, a1, a2, a3 and the other with
> columns pk, b1, b2, b3. SELECT * FROM t would be translated into
> SELECT * FROM t1, t2 WHERE t1.pk = t2.pk.

Possibly, but you'd be paying the per-tuple overhead twice (once in t1 and once in t2), which is what I was looking to avoid with forks.
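
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes a 64-bit build (MAXALIGN of 8), a 23-byte heap tuple header with no null bitmap spilling past the alignment padding, a 4-byte line pointer per tuple, and a 4-byte integer pk. The function and variable names are made up for illustration, and it ignores data alignment padding and the extra index you'd need on the second table's pk.

```python
# Rough per-row overhead, assuming a 64-bit build (MAXALIGN = 8).
TUPLE_HEADER = 23   # bytes in the heap tuple header before any null bitmap
LINE_POINTER = 4    # bytes for the item pointer stored in the page
MAXALIGN = 8


def maxalign(n):
    """Round n up to the next multiple of MAXALIGN."""
    return (n + MAXALIGN - 1) // MAXALIGN * MAXALIGN


def per_row_overhead(n_tables, pk_bytes):
    """Overhead bytes per logical row when the row is split across n_tables
    heap tables, each of which carries its own copy of the primary key."""
    per_tuple = maxalign(TUPLE_HEADER) + LINE_POINTER
    extra_pk = (n_tables - 1) * pk_bytes  # pk copies beyond the first
    return n_tables * per_tuple + extra_pk


single = per_row_overhead(1, 4)  # one table t
split = per_row_overhead(2, 4)   # t1 and t2 joined on pk
print(single, split)
```

Under those assumptions the split layout costs a bit more than twice the overhead of the single table for every logical row, before counting the join itself.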
--
Jim C. Nasby, Database Architect jim(at)nasby(dot)net
512.569.9461 (cell) http://jim.nasby.net
