GiST limits on contrib/cube with dimension > 100?

From: Siarhei Siniak <siarheisiniak(at)yahoo(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: GiST limits on contrib/cube with dimension > 100?
Date: 2019-06-09 18:05:20
Message-ID: 631800776.720455.1560103520960@mail.yahoo.com
Lists: pgsql-hackers

I've been using the cube extension recompiled with #define MAX_DIM 256.
But with version 11.3 I'm getting the following error:
failed to add item to index page in <index_name>
There's a regression unit test in contrib/cube/expected/cube.out:
CREATE TABLE test_cube (c cube);
\copy test_cube from 'data/test_cube.data'
CREATE INDEX test_cube_ix ON test_cube USING gist (c);
SELECT * FROM test_cube WHERE c && '(3000,1000),(0,0)' ORDER BY c;
I've created the GiST index in the same way, i.e. create index <index_name> on <table_name> using gist(<column_name>);
If MAX_DIM equals 512, a btree index complains:
index row size 4112 exceeds maximum 2712 for index <index_name>
HINT:  Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.

That's why MAX_DIM has been set to 256.
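
For what it's worth, the 4112 in that message is consistent with the NDBOX layout quoted further below: a cube stores an 8-byte header plus 8 bytes per coordinate, so a 256-dimensional box (or a 512-dimensional point) takes 4104 bytes, and with roughly 8 bytes of index tuple header that gives 4112, well over the 2712-byte limit (about 1/3 of an 8 kB page). A minimal sketch of that arithmetic, with size macros written along the lines of CUBE_SIZE/POINT_SIZE in contrib/cube/cubedata.h (the tuple-header overhead is an assumption):

#include <stdio.h>
#include <stddef.h>

/* Stand-in for contrib/cube's NDBOX: 4-byte varlena length word, 4-byte
 * header word, then the coordinate array. */
typedef struct NDBOX
{
    int          vl_len_;
    unsigned int header;
    double       x[];
} NDBOX;

/* A box stores both corners, a point only one. */
#define CUBE_SIZE(d)  (offsetof(NDBOX, x) + sizeof(double) * (d) * 2)
#define POINT_SIZE(d) (offsetof(NDBOX, x) + sizeof(double) * (d))

int main(void)
{
    int dims[] = {100, 128, 256, 512};

    for (int i = 0; i < 4; i++)
        printf("dim=%3d  box=%4zu bytes  point=%4zu bytes\n",
               dims[i], CUBE_SIZE(dims[i]), POINT_SIZE(dims[i]));

    /* dim=256 box (or dim=512 point) -> 4104 bytes; with ~8 bytes of index
     * tuple header that is the 4112 from the btree error message. */
    return 0;
}
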
But GiST doesn't provide any explanation with its error.
These are the places where the message might have been generated:
src/backend/access/gist/gist.c:418:    elog(ERROR, "failed to add item to index page in \"%s\"", RelationGetRelationName(rel));
src/backend/access/gist/gist.c:540:    elog(ERROR, "failed to add item to index page in \"%s\"",

The question is: what prevents setting MAX_DIM larger than 100 in a custom recompiled version of the cube extension? In practice the error messages are too cryptic.
contrib/cube/cube.c has the following methods regarding GiST:
/*
** GiST support methods
*/

PG_FUNCTION_INFO_V1(g_cube_consistent);
PG_FUNCTION_INFO_V1(g_cube_compress);
PG_FUNCTION_INFO_V1(g_cube_decompress);
PG_FUNCTION_INFO_V1(g_cube_penalty);
PG_FUNCTION_INFO_V1(g_cube_picksplit);
PG_FUNCTION_INFO_V1(g_cube_union);
PG_FUNCTION_INFO_V1(g_cube_same);
PG_FUNCTION_INFO_V1(g_cube_distance);

g_cube_compress has the following body:
    PG_RETURN_DATUM(PG_GETARG_DATUM(0));

Does it just return a pointer to the underlying x array?
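
As far as I understand it, yes: the argument is the GISTENTRY datum handed in by the GiST machinery, and it is returned unchanged, so index tuples carry the full NDBOX value and no size reduction happens at compress time. Written out explicitly, such a pass-through compress function would look roughly like this (a sketch of the equivalent form, not a quote of cube.c; the function name is just for illustration):

#include "postgres.h"
#include "fmgr.h"
#include "access/gist.h"

PG_FUNCTION_INFO_V1(g_cube_compress_sketch);

/* Sketch: a no-op GiST compress method.  The GISTENTRY is passed through
 * untouched, so the key stored on the index page is the full cube value. */
Datum
g_cube_compress_sketch(PG_FUNCTION_ARGS)
{
    GISTENTRY  *entry = (GISTENTRY *) PG_GETARG_POINTER(0);

    PG_RETURN_POINTER(entry);
}
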
The cube data structure:
typedef struct NDBOX
{
    /* varlena header (do not touch directly!) */
    int32        vl_len_;

    /*----------
     * Header contains info about NDBOX. For binary compatibility with old
     * versions, it is defined as "unsigned int".
     *
     * Following information is stored:
     *
     *    bits 0-7  : number of cube dimensions;
     *    bits 8-30 : unused, initialize to zero;
     *    bit  31   : point flag. If set, the upper right coordinates are not
     *                stored, and are implicitly the same as the lower left
     *                coordinates.
     *----------
     */
    unsigned int header;

    /*
     * The lower left coordinates for each dimension come first, followed by
     * upper right coordinates unless the point flag is set.
     */
    double        x[FLEXIBLE_ARRAY_MEMBER];
} NDBOX;

Could it be a problem of not fitting within some limit when building or updating a GiST index for cubes with MAX_DIM > 100?
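
My back-of-the-envelope guess (the page and tuple overheads below are assumptions, not the exact PostgreSQL figures): a non-point cube tuple grows by ~16 bytes per dimension, so somewhere past dim=100 only a handful of tuples fit on an 8 kB GiST page, around dim=256 a single tuple nearly fills a page, and at dim=512 one tuple no longer fits at all, which is where a page split can no longer make room and gist.c would report "failed to add item to index page". A rough sketch of the capacity arithmetic:

#include <stdio.h>

#define BLCKSZ          8192    /* default PostgreSQL page size */
#define PAGE_OVERHEAD   40      /* assumed: page header + GiST special space */
#define TUPLE_OVERHEAD  12      /* assumed: IndexTuple header + line pointer */

/* Bytes of cube payload: 8-byte NDBOX header plus the coordinates. */
static long cube_bytes(int dim, int is_point)
{
    return 8 + 8L * dim * (is_point ? 1 : 2);
}

int main(void)
{
    int dims[] = {50, 100, 128, 256, 512};

    for (int i = 0; i < 5; i++)
    {
        long tup = cube_bytes(dims[i], 0) + TUPLE_OVERHEAD;

        printf("dim=%3d  tuple=%5ld bytes  tuples per 8 kB page=%ld\n",
               dims[i], tup, (BLCKSZ - PAGE_OVERHEAD) / tup);
    }
    return 0;
}
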
