Feature Request: bigtsvector

From: CPT <cpt(at)novozymes(dot)com>
To: <pgsql-general(at)postgresql(dot)org>
Subject: Feature Request: bigtsvector
Date: 2015-06-17 05:58:21
Message-ID: 55810C7D.6000801@novozymes.com
Lists: pgsql-general pgsql-hackers

Hi all;

We are running a multi-TB bioinformatics system on PostgreSQL and in places
use a denormalized schema with many tsvectors aggregated together for
centralized searching. This is very important to the performance of the
system. These aggregates combine many documents (sometimes tens of
thousands), many of which contain large numbers of references to other
documents, so it isn't uncommon to end up with tens of thousands of lexemes.
The tsvectors hold mixed document-id and natural-language search information
(all of which comes in from the same documents).

Recently we have started hitting the 1MB limit on tsvector size. We have
found it possible to patch PostgreSQL to make tsvectors larger, but this
changes the on-disk layout. How likely is it that the tsvector size could be
increased in future versions to allow for vectors up to toastable size (1GB
logical)? I can't imagine we are the only ones with such a problem. Since, I
think, changing the on-disk layout might not be such a good idea, maybe it
would be worth considering a new bigtsvector type instead?
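For context on how quickly aggregation reaches the limit, here is a rough
back-of-the-envelope sketch. The 1MB cap comes from the 20-bit string-offset
field in the on-disk WordEntry struct (so the lexeme string area cannot
exceed 2^20 - 1 bytes). The estimator below is an approximation of that
layout, not an exact reimplementation of ts_type.h; the constants and the
sample lexeme shapes are assumptions for illustration only.

```python
def estimate_tsvector_bytes(lexemes):
    """Approximate on-disk payload of a tsvector.

    lexemes: iterable of (lexeme_string, position_count) pairs.
    Layout sketched: varlena header + entry count, one packed 4-byte
    WordEntry per lexeme, then the lexeme string; if positions are
    present, the string is padded to 2-byte alignment and followed by
    a uint16 count plus one 2-byte position word per position.
    """
    HEADER = 4 + 4        # varlena length word + int32 entry count
    WORDENTRY = 4         # packed haspos/len/pos bitfields
    total = HEADER
    for lex, npos in lexemes:
        total += WORDENTRY
        data = len(lex.encode("utf-8"))
        if npos:
            data += data % 2        # pad string to 2-byte alignment
            data += 2 + 2 * npos    # uint16 count + positions
        total += data
    return total

MAXSTRPOS = (1 << 20) - 1  # ~1MB cap on the string area

# Hypothetical workload: 50,000 distinct 12-byte lexemes, 3 positions each.
size = estimate_tsvector_bytes([("lexeme%06d" % i, 3) for i in range(50_000)])
print(size, size > MAXSTRPOS)
```

Even this modest synthetic workload overshoots the cap, which matches the
experience described above: tens of thousands of lexemes with positions is
enough to exceed 1MB well before TOAST limits come into play.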

Btw, we've been very impressed with the extent to which PostgreSQL has
tolerated all the kinds of loads we have thrown at it.

Regards,
CPT
