From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Andy Colson <andy(at)squeakycode(dot)net>
Cc: Stefan Keller <sfkeller(at)gmail(dot)com>, pgsql-general List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Call for Google Summer of Code (GSoC) 2012: Project ideas?
Date: 2012-03-09 17:06:04
Message-ID: CAHyXU0zdvbnP39p1LJU5T1dPy0ZZLDT0dx3hTfCptCm1goxLzQ@mail.gmail.com
Lists: pgsql-general
On Fri, Mar 9, 2012 at 10:19 AM, Andy Colson <andy(at)squeakycode(dot)net> wrote:
> On 3/9/2012 9:47 AM, Merlin Moncure wrote:
>>
>> On Thu, Mar 8, 2012 at 2:01 PM, Andy Colson<andy(at)squeakycode(dot)net> wrote:
>>>
>>> I know toast compresses, but I believe it's only one row. Page-level
>>> would compress better because there is more data, and it would also
>>> decrease the amount of IO, so it might speed up disk access.
>>
>>
>> er, but when data is toasted it already spans pages. page-level
>> compression is a super complicated problem.
>>
>> something that is maybe more attainable on the compression side of
>> things is a userland api for compression -- like pgcrypto is for
>> encryption. even if it didn't make it into core, it could live on
>> reasonably as a pgfoundry project.
>>
>> merlin
>
>
> Agreed, it's probably too difficult for a GSoC project. But a userland API
> would still be row-level, which, in my opinion, is useless. Consider rows
> from my apache log that I'm dumping to the database:
It's useless for what you're trying to do, but it would be useful to
people trying to compress large datums (data, I know) before storage
using algorithms that postgres can't support, like lzo.
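As a minimal sketch of that idea, assuming a client-side workflow: the application compresses a large datum before handing it to Postgres as a bytea. Python's zlib stands in here for lzo (which core Postgres cannot ship); the payload is purely illustrative:

```python
import zlib

# Hypothetical userland compression before storage: compress a large,
# repetitive text datum client-side, then store the resulting bytes
# as a bytea. zlib stands in for lzo or any other algorithm the
# application prefers over Postgres's built-in TOAST compression.
datum = ("some repetitive application payload " * 1000).encode()
compressed = zlib.compress(datum, 6)

# The compressed form round-trips losslessly, so decompression can
# happen in the application (or in a userland SQL function) on read.
assert zlib.decompress(compressed) == datum
print(len(datum), "->", len(compressed))
```

The point is only that the choice of algorithm moves out of the server and into userland, the same way pgcrypto moves cipher choice out of core.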
> date, url, status
> 2012-3-9 10:15:00, '/index.php?id=4', 202
> 2012-3-9 10:15:01, '/index.php?id=5', 202
> 2012-3-9 10:15:02, '/index.php?id=6', 202
>
> That won't compress at all on a row level. But it'll compress 99% on a
> "larger" (page/multirow/whatever/?) level.
sure, but you can only get those rates by giving up the segmented view
of the data that postgres requires. your tuples are very small, and I
only see compression happening on the userland side through tricks
tailored to your particular dataset (like using "char" to map the
status, url mapping, etc).
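The row-vs-page tradeoff is easy to demonstrate. The sketch below, using zlib on synthetic rows shaped like the apache log quoted above, compresses each tiny row independently (row level) and then the whole batch at once (page/multirow level); the per-row overhead of the compression format swamps any savings on small tuples:

```python
import zlib

# Synthetic rows mimicking the quoted apache-log shape.
rows = [f"2012-3-9 10:15:{i:02d}, '/index.php?id={i}', 202"
        for i in range(60)]

# Row level: each ~45-byte tuple compressed on its own.
row_level = sum(len(zlib.compress(r.encode())) for r in rows)

# Page/multirow level: the same 60 rows compressed together, so the
# shared structure across rows is available to the compressor.
page_level = len(zlib.compress("\n".join(rows).encode()))

print("row level:", row_level, "page level:", page_level)
```

Batch compression wins by a wide margin here, but, as noted above, only because it gives up the per-tuple view of the data.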
merlin