Re: Is there any method to keep table in memory at startup

From: Vinay Jain <vinayj(at)sarathi(dot)ncst(dot)ernet(dot)in>
To: Andrew Hammond <ahammond(at)ca(dot)afilias(dot)info>
Cc: vinayj(at)ncst(dot)ernet(dot)in, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Is there any method to keep table in memory at startup
Date: 2004-05-06 08:47:49
Message-ID: 4099FBB5.7070809@sarathi.ncst.ernet.in
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

hi
ya function is declared immutable and strict... also made btree index..
one mistake that i was doing that I was using libpq rather than
SPI..but even after using it though no doubt the performance is
increased still not up to the mark
if i use index than it's explain analyze is like this :

// For 11 rows

template1=# explain ANALYZE select * from temp_hindi_copy3 order by name;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using temp_hindi_copy3_pkey on temp_hindi_copy3
(cost=0.00..52.00 rows=1000 width=36) (actual time=0.06..0.12 rows=11
loops=1)
Total runtime: 0.24 msec
which is good and obvious but when i have to use other functions with
it like

// 11 Rows

template1=# explain ANALYZE select name, length(name), substr(name, 3,
2) from temp_hindi_copy3 order by name;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using temp_hindi_copy3_pkey on temp_hindi_copy3
(cost=0.00..52.00 rows=1000 width=32) (actual time=88.72..920.18 rows=11
loops=1)
Total runtime: 920.37 msec
(2 rows)
which is not good
similarly for Non Key order by:

//136 Rows
template1=# explain ANALYZE select * from temp_hindi_copy order by name;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Sort (cost=69.83..72.33 rows=1000 width=36) (actual
time=26282.57..26282.68 rows=136 loops=1)
Sort Key: name
-> Seq Scan on temp_hindi_copy (cost=0.00..20.00 rows=1000
width=36) (actual time=0.22..0.60 rows=136 loops=1)
Total runtime: 26282.92 msec
(4 rows)

Which takes a lot of time the reason is that in every indchar_lt
function call SPI connection is made and destroyed...
and order by clause calls it n (> no. of rows) times..
I tried a more than 10 times still the output was same.......... also in
one order by statement the same query is called more than n times....
but performance is same....
so is there any way to just make connection once and keep data in
memory...and when psql is exited the connection is closed...

When we start psql it makes connection with database is correct...
but not from program ....so have to Make SPI connection...

Andrew Hammond wrote:

> Vinay Jain wrote:
>
>> Hi
>> thanx and sorry that I asked such a simple question in
>> postgres-hackers list....
>> but the complexity which i feel on that basis ....please allow me to
>> explain my problem further.....
>> As i am working on sorting order , length and substring functions for
>> Hindi text(Indian Language)...
>> Here is the problem which i found in postgresql...
>> after setting collating sequence in proper way(i.e. C) the order was
>> on basis of unicode values...but in Hindi Language some of combined
>> unicode values makes a single character
>> similarly length is not appropriate for these reasons & hence
>> substring operations
>> so i designed a customized data type called IndChar....and operations
>> on it
>> in order by statement the only function called is indchar_lt(defined
>> for < operator)......
>
>
is your indchar_lt function declared IMMUTABLE? That would allow it's
results to be cached instead of re-calculated every time.

>

>
>> Now please guide me where is starting(where i can open connection to
>> database) and ending of my program....I feel only in indchar_lt
>> function which will be called many times in order by statement
>> causing performance degradation..
>
>
> Have you created an index on that column? That would be a usual way to
> speed up an ORDER BY. NB, the function involved must be IMMUTABLE to
> be used in an index.
>
>> as i am not much experienced this assumption may be wrong...
>
>
> My professor at University used to always say "measure measure
> measure". Postgres makes it easy to measure. Try putting EXPLAIN
> ANALYZE before your SELECT statement. Run the SELECT a couple of times
> first so that the OS can get stuff cached, then:
>
> do an EXPLAIN ANALYZE on the query, save the results
>
> then ANALYZE the tables involved and to another EXPLAIN ANALYZE on the
> query, save the results
>
> add the necessary index, ANALYZE then EXPLAIN ANALYZE.
>
>> so my question remains as it is that is there any such thing which
>> can be called at startup of psql.........to make connection to database
>
>
> I'm really not sure what you mean by this. psql connects to the
> database on startup.
>
>> regards
>> Vinay
>>
>>
>>
>>
>> Andrew Hammond wrote:
>>
>>> Vinay Jain wrote:
>>>
>>>> Hi
>>>> thank you for such a useful information...
>>>> but actually in my case if i keep table in disk it
>>>> significantly degrades performance and even for a table of 10 rows
>>>> it takes 1-2 minutes I think u r not beliving it ! am i right
>>>> for example
>>>> I create a table in which i use my customized data type say student
>>>> create table student
>>>> (Name INDCHAR //INDCHAR is customized data type
>>>> age integer);
>>>> now i give query like this
>>>> select * from student order by name;
>>>> it will search for it's comparator operator (<) and related
>>>> function...
>>>> in that function there is one lookup table if that table is in
>>>> memory no problem! (oh but it can't be) if it is in disk my
>>>> program makes connection to database and execute query which is
>>>> just a select statement on a simple where condition of equality.
>>>> then closes connection
>>>
>>>
>>>
>>>
>>> There's your problem. Creating database connections is an expensive
>>> operation. They are not intended to be opened and closed often or
>>> quickly. Open your database connection at the beginning of your
>>> program, and close it at the end.
>>>
>>> You could also throw an index on the column you're using in your
>>> order by clause, but that won't make a difference until your table
>>> get a little bigger.
>>>
>>> Please take further questions of this nature to the pgsql-novice list.
>>>
>>>> so every time less than operator(<) is called it does the same task..
>>>> what i feel in table of 10 rows how many times the < operator will
>>>> be called(NO idea but must be > 10 times)
>>>> is there any solution..
>>>> thanks in advance
>>>> regards
>>>> vinay
>>>
>>>
>>>
>>>
>>>
>>> ---------------------------(end of
>>> broadcast)---------------------------
>>> TIP 5: Have you checked our extensive FAQ?
>>>
>>> http://www.postgresql.org/docs/faqs/FAQ.html
>>>
>>
>>
>>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2004-05-06 11:50:09 Re: Function to do runtime relative directory mapping
Previous Message Gaetano Mendola 2004-05-06 08:35:00 Re: Postgres Optimization: IO and Data Organization