TOAST issue on custom index access method

From: Carsten Kropf <ckropf2(at)fh-hof(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: TOAST issue on custom index access method
Date: 2010-06-24 07:24:02
Message-ID: 9944BC88-834C-452D-85CD-403BDC6052E8@fh-hof.de
Lists: pgsql-hackers

Hi *,
I have a strange issue with a custom-built index structure. My index access method supports a document type composed of words (a tsvector) and points (a one-dimensional array of point values). For internal reasons, I have to store the documents as a whole inside my structure (for proper reorganisations).
So I form the tuples using index_form_tuple with the proper tuple descriptor. Everything works fine as long as the documents are quite small. However, once the tsvector becomes too large, I can no longer store the documents, because (obviously) the tsvector is too large for one page.
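Simplified, the tuple construction in my access method looks like the following sketch (the function and variable names here are placeholders for illustration, not my real code):

#include "postgres.h"
#include "access/itup.h"
#include "utils/rel.h"

/* Sketch: build an index tuple from the two document attributes. */
static IndexTuple
form_document_tuple(Relation indexRel, Datum words, Datum points)
{
    Datum   values[2];
    bool    isnull[2];

    values[0] = words;    /* the tsvector attribute */
    values[1] = points;   /* the point[] attribute */
    isnull[0] = isnull[1] = false;

    /* this is the call that fails for large, untoasted tsvectors */
    return index_form_tuple(RelationGetDescr(indexRel), values, isnull);
}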
Everything works fine as long as I use an INSERT ... (SELECT ...) statement, i.e. some other table feeds data into the table on which the index is defined, e.g.:
insert into documents(words, points) (select words, points from docs);

This query works in every case, because the words values that arrive in my index have already been toasted, so the size of the documents is quite limited. The document type is defined as:

CREATE TYPE document AS (words tsvector, points _point);
However, I also tried to insert a document containing a large words vector directly (the tsvector has a length of 770; the points don't matter much here), using:

INSERT INTO documents (words, points) VALUES (to_tsvector('english', '<extremely long string constant.....>'), '<points array>');

If I execute the statement above, i.e. insert the large vector directly without pulling it out of a feeding table (I can also extract the data from my feeding table 'docs' and use it as literal input, with the same effect), index_form_tuple fails with the following output:

ERROR: index row requires 12320 bytes, maximum size is 8191
So creating my tuple with index_form_tuple fails because the tsvector has not been toasted before being handed to my index.
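For illustration, the state of the incoming datum could be inspected with a minimal check like this (again just a sketch, reusing the placeholder names from above):

#include "postgres.h"

/* Sketch: report whether a varlena datum is external, compressed, or plain. */
static void
report_datum_state(Datum words)
{
    struct varlena *v = (struct varlena *) DatumGetPointer(words);

    if (VARATT_IS_EXTERNAL(v))
        elog(DEBUG1, "words is toasted out of line");
    else if (VARATT_IS_COMPRESSED(v))
        elog(DEBUG1, "words is inline-compressed");
    else
        elog(DEBUG1, "words is plain, %u bytes", (unsigned) VARSIZE(v));
}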
To solve this, I tried extracting the words from the document (inside my index) and calling Datum toast_compress_datum(Datum value) in order to compress the tsvector. Unfortunately, this function always returned NULL, meaning the vector could not be compressed, so I ended up with the same error.
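The compression attempt looked roughly like this sketch (toast_compress_datum returns a NULL pointer when it cannot produce a smaller copy):

#include "postgres.h"
#include "access/tuptoaster.h"

/* Sketch: try to compress the words datum before forming the index tuple. */
static Datum
try_compress_words(Datum words)
{
    Datum   compressed = toast_compress_datum(words);

    if (DatumGetPointer(compressed) != NULL)
        return compressed;  /* compression succeeded */

    /* incompressible: index_form_tuple will fail with the same size error */
    return words;
}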

At the moment I don't know how to proceed. (Of course, one possibility would be to forbid direct insertion and instead create a staging table with a trigger that inserts each row into the real, indexed data table afterwards, but I don't consider that a good solution.)
Does anybody know why the index receives untoasted values when the data is inserted directly, but toasted values when the data is copied from another table?
I hope somebody can help me here so that my access method can run properly.

Thanks in advance

Best regards
Carsten Kropf

P.S.: The PostgreSQL version I am developing against is 8.4.2
