Skip site navigation (1) Skip section navigation (2)

Re: [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

From: Benjamin Arai <me(at)benjaminarai(dot)com>
To: Joshua D(dot)Drake <jd(at)commandprompt(dot)com>
Cc: Brandon Shalton <brandon(at)cydataservices(dot)com>,PostgreSQL <pgsql-general(at)postgresql(dot)org>,pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] Partioning tsearch2 a table into chunks and accessing via views
Date: 2007-08-25 05:34:06
Message-ID: F6B156AD-BE6D-4300-871A-E3DE0491AA11@benjaminarai.com (view raw or flat)
Thread:
Lists: pgsql-generalpgsql-performance
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

This kind of disappointing, I was hoping there was more that could be  
done.

There has to be another way to do incremental indexing without  
loosing that much performance.

Benjamin

On Aug 24, 2007, at 6:29 PM, Joshua D. Drake wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Brandon Shalton wrote:
>> Benjamin,
>>
>>
>>>
>>> In order to avoid the re-indexing I was thinking of instead creating
>>> a new
>>> table each month (building its indexes and etc) and accessing  
>>> them all
>>> through a view. This way I only have to index the new data each  
>>> month.
>>>
>>
>> Take a look at bizgres.org (based on postgres).
>>
>> They have a parent-child structure.
>>
>> The way i use it, is I have about 30M records a day that are inserted
>> into the database.
>>
>> Each day is a "child" table to the "parent".
>>
>> so example:
>>
>> the parent table is called  "logfile"
>>
>> each day, is a child, with the structure like  "logfile_YYMMDD"
>>
>> the "child" inherits the table structure of the parent, such that you
>> could query the child table name directly, or you run the query  
>> against
>> the parent (ie. logfile table) and get all the data.
>>
>> the indexes are done on a per table basis, so new data that comes  
>> in, is
>> a lesser amount, and doesn't require re-indexing.
>
>
> PostgreSQL can do all of this too.
>
> Sincerely,
>
> Joshua D. Drake
>
>>
>>
>> example:
>>
>> select * from logfile_070825 where datafield = 'foo'
>>
>> if i knew i wanted to specifically go into that child, or:
>>
>> select * from logfile where datafield = 'foo'
>>
>> and all child tables are searched and results merged.  You can  
>> perform
>> any kind of sql query and field structures are you normally do.
>>
>> the downside is that the queries are run sequentially.
>>
>> so if you had 100 child tables, each table is queried via indexes,  
>> then
>> results are merged.
>>
>> but, this approach does allow me to dump alot of data in, without  
>> having
>> the re-indexing issues you are facing.
>>
>> at some point, you could roll up the days, in to weekly child tables,
>> then monthly tables, etc.
>>
>> I believe Bizgres has a new version of their system that does  
>> parallel
>> queries which would certainly speed things up.
>>
>> For your documents, you can do it by the day it was checked in, or  
>> maybe
>> you have some other way of logically grouping, but the parent/child
>> table structure really helped to solve my problem of adding in  
>> millions
>> of records each day.
>>
>> The closest thing in mysql is using merge tables, which is not really
>> practical when it comes time to do the joins to the tables.
>>
>> -brandon
>>
>> http://www.t3report.com - marketing intelligence for online marketing
>> and affiliate programs
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> ---------------------------(end of  
>> broadcast)---------------------------
>> TIP 6: explain analyze is your friend
>>
>
>
> - --
>
>       === The PostgreSQL Company: Command Prompt, Inc. ===
> Sales/Support: +1.503.667.4564   24x7/Emergency: +1.800.492.2240
> PostgreSQL solutions since 1997  http://www.commandprompt.com/
> 			UNIQUE NOT NULL
> Donate to the PostgreSQL Project: http://www.postgresql.org/about/ 
> donate
> PostgreSQL Replication: http://www.commandprompt.com/products/
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.6 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iD8DBQFGz4XuATb/zqfZUUQRAukhAJ9b2x4PLPZsoPmtm3O/Ze4AobDXngCgq+rl
> X2j2ePDyjYxRajfGCVmjnYU=
> =pIjb
> -----END PGP SIGNATURE-----
>

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQIVAwUBRs+/UfyqRf6YpodNAQL6Xg//eEqR0UQ4I/snn7Dtmkru40jCuECGeG8g
XoxLWEa+bumVgwrEYbjKTBp3KP6OEKz9VV4xHQROTtqxh+rg0hdoc0kWxSyquCm8
VljL24ykvBmRmjhacwi8FKp092zwRcLrbkzTxIr90q8u008aVPWxQCBtmfL6QVTv
I9AyN0kb00ypx+B9I2ySugYzBerVCMUiKUeXplHWn1loSSm1w+5CzXY8gtvivFEV
YspS1Fk2rxjnjlPE/FTGUiwJrdWZTJrd3BuSVbH5DWBoCjz9gzq0NyNZAtESWX2H
oGwlWBEJNFTtoHnK4iTMS+CzKHQQQZ9ZuQcHy84SlXYUo9n0/NCIeabu2xaj44Fs
LFq8jBCH3ebAkD/hQOgk1H05ljbfX8A/u2zz75W1NbD0xTB/sAljWqhypz2x7pOo
sUJF9MQ7DwVG8JitUAAc5fuGpLLR4WxF68YdkgycaCNknP7IATeD2ecqJkC26Av+
GHHci2ct5ypVq9Qq8OuesYSox7XpO2+E+Y5DtgBo+/R7eOJRLA3Z0FDXFLGsdFxy
0OKoew1MN79jP+KMZFJwvddH/TrkZBdIKlkacXYwUHU3c1ATwne6WteKTnEmr2aP
99oQgfmNDyQgTeEL20jokF4YZOdm1UO3Cc7wTi2QlwyqUDbUmYtWzgbS9QbnaGGA
58XdVacGznw=
=Hst4
-----END PGP SIGNATURE-----

In response to

Responses

pgsql-performance by date

Next:From: Kevin KempterDate: 2007-08-25 05:34:23
Subject: Re: significant vacuum issues - looking for suggestions
Previous:From: Brandon ShaltonDate: 2007-08-25 03:54:24
Subject: Re: [PERFORM] Partioning tsearch2 a table into chunks and accessing via views

pgsql-general by date

Next:From: Jaime CasanovaDate: 2007-08-25 05:40:44
Subject: Re: Add Column BEFORE/AFTER another column
Previous:From: Erik JonesDate: 2007-08-25 04:38:26
Subject: Re: Add Column BEFORE/AFTER another column

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group