Re: CUBE_MAX_DIM

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Devrim Gündüz <devrim(at)gunduz(dot)org>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: CUBE_MAX_DIM
Date: 2020-06-25 15:03:20
Message-ID: 2271927.1593097400@sss.pgh.pa.us
Lists: pgsql-hackers

Devrim Gündüz <devrim(at)gunduz(dot)org> writes:
> Someone contacted me about increasing CUBE_MAX_DIM
> in contrib/cube/cubedata.h (in the community RPMs). The current value
> is 100 with the following comment:

> * This limit is pretty arbitrary, but don't make it so large that you
> * risk overflow in sizing calculations.

> They said they use 500, and never had a problem.

I guess I'm wondering what's the use-case. 100 already seems an order of
magnitude more than anyone could want. Or, if it's not enough, why does
raising the limit just 5x enable any large set of new applications?

The practical issue here is that, since the data requires 16 bytes per
dimension (plus a little bit of overhead), we'd be talking about
increasing the maximum size of a cube field from ~1600 bytes to ~8000
bytes. And cube is not toastable, so that couldn't be compressed or
shoved out-of-line. Maybe your OP never had a problem with it, but
plenty of use-cases would have "tuple too large" failures due to not
having room on a heap page for whatever other data they want in the row.
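
As a rough illustration of the arithmetic above, here is a minimal sketch
in C. The 8-byte header figure and the helper names are assumptions made
for illustration only (the message says just "a little bit of overhead");
the 8192-byte page size is the PostgreSQL default block size. Page and
tuple headers are ignored, so the real headroom is smaller still.

```c
#include <stddef.h>

/* Assumption: fixed per-value overhead in bytes. Illustrative only. */
#define ASSUMED_HEADER_BYTES 8

/* PostgreSQL's default block size in bytes. */
#define DEFAULT_PAGE_BYTES 8192

/*
 * Approximate on-disk size of a cube value: two float8 coordinates
 * (16 bytes) per dimension, plus the assumed header.
 */
static size_t
approx_cube_bytes(size_t ndims)
{
    return ASSUMED_HEADER_BYTES + ndims * 2 * sizeof(double);
}

/*
 * Bytes left on a default-sized page after one cube of ndims dimensions,
 * ignoring page and tuple headers. Negative means it cannot fit at all.
 */
static long
approx_page_bytes_left(size_t ndims)
{
    return (long) DEFAULT_PAGE_BYTES - (long) approx_cube_bytes(ndims);
}
```

Under these assumptions, approx_cube_bytes(100) comes to 1608 bytes and
approx_cube_bytes(500) to 8008 bytes, which leaves under 200 bytes on a
default page for every other column in the row.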

Even a non-toastable 2KB field is going to give the tuple toaster
algorithm problems, as it'll end up shoving every other toastable field
out-of-line in an ultimately vain attempt to bring the tuple size below
2KB. So I'm really quite hesitant to raise CUBE_MAX_DIM much past where
it is now without any other changes.

A more credible proposal would be to make cube toast-aware and then
raise the limit to ~1GB ... but that would take a significant amount
of work, and we still haven't got a use-case justifying it.

I think I'd counsel storing such data as plain float8 arrays, which
do have the necessary storage infrastructure. Is there something
about the cube operators that's particularly missing?

regards, tom lane
