Re: Slicing TOAST

From: Pavel Golub <pavel(at)microolap(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-hackers(at)postgresql(dot)org, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: Slicing TOAST
Date: 2013-05-15 09:01:51
Message-ID: 1886757050.20130515120151@gf.microolap.com
Lists: pgsql-hackers pgsql-students

Hello, Heikki.

You wrote:

HL> On 14.05.2013 21:36, Josh Berkus wrote:
>>
>>> I'm proposing this now as a possible GSoC project; I don't propose to
>>> actively work on it myself.
>>
>> The deadline for submitting GSOC projects (by students) was a week ago.
>> So is this a project suggestion for next year ...?

HL> I've been thinking, we should already start collecting ideas for next
HL> year, and collect them throughout the year. I know I come up with some
HL> ideas every now and then, but when it's time for another GSoC, I can't
HL> remember any of them.

HL> I just created a GSoC2014 ideas pages on the wiki, for collecting these:
HL> https://wiki.postgresql.org/wiki/GSoC_2014. Let's keep the ideas coming,
HL> throughout the year.

Good idea! It reminds me of a feature proposed by Pavel Stehule a while
ago here: http://www.postgresql.org/message-id/BANLkTini+ChGKfnyjkF1rsHSQ2kMktSDjg@mail.gmail.com

It's about streaming functionality for the BYTEA type. But I think
streaming should be added not only for BYTEA, but also for TEXT and
VARCHAR without a length specifier.

As Pavel stated: "A very large bytea are limited by
query size - processing long query needs too RAM". This is the holy
truth, and it came up suddenly in a project of one of my clients,
because he used bytea for storing images and the text format in
PQexec, which as you know doubles or triples the size of the data.
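
To make the "doubles or triples" figure concrete, here is a rough
sketch that only counts characters in the two text representations of
bytea (hex and escape) as the server prints them. The function names
and the sample data are my own, purely for illustration; quoting inside
an SQL literal or client-side buffering may add more on top of this.

```python
def hex_format_size(data: bytes) -> int:
    """Length of the bytea hex text form: a leading \\x plus two hex digits per byte."""
    return 2 + 2 * len(data)


def escape_format_size(data: bytes) -> int:
    """Length of the bytea escape text form as printed on output."""
    size = 0
    for b in data:
        if b == 0x5C:
            # A backslash is written as two backslashes.
            size += 2
        elif b < 0x20 or b > 0x7E:
            # Non-printable octets are written as a backslash and three octal digits.
            size += 4
        else:
            # Printable ASCII is written unchanged.
            size += 1
    return size


if __name__ == "__main__":
    # Hypothetical sample: 1 MiB with every byte value equally frequent,
    # a rough stand-in for compressed image data.
    sample = bytes(range(256)) * 4096
    n = len(sample)
    for name, size in (("hex", hex_format_size(sample)),
                       ("escape", escape_format_size(sample))):
        print(f"{name}: {size} chars for {n} raw bytes ({size / n:.2f}x)")
```

For evenly distributed byte values this gives about 2x for hex and a
bit under 3x for escape, while the binary format would stay at 1x.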

Some more details from Pavel:
<quote>
There is a few disadvantages LO against bytea, so there are requests
for "smarter" API for bytea.

Significant problem is different implementation of LO for people who
have to port application to PostgreSQL from Oracle, DB2. There are
some JDBC issues too.

For me - main disadvantage of LO in one space for all. Bytea removes
this disadvantage, but it is slower for lengths > 20 MB. It could be
really very practical have a possibility insert some large fields in
second NON SQL stream. Same situation is when large bytea is read.
</quote>

I'm not sure if the whole project is simple enough for GSoC, but I
suppose it could be split up somehow.

PS Should we start a separate thread for proposals? It took me an
hour to find the mention of the GSoC 2014 wiki page.

HL> - Heikki

--
With best wishes,
Pavel mailto:pavel(at)gf(dot)microolap(dot)com
