Skip site navigation (1) Skip section navigation (2)

Re: Speeding up query, Joining 55mil and 43mil records.

From: nicky <nicky(at)valuecare(dot)nl>
To: Sven Geisler <sgeisler(at)aeccom(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Speeding up query, Joining 55mil and 43mil records.
Date: 2006-06-22 12:10:50
Message-ID: 449A88CA.3000108@valuecare.nl (view raw or flat)
Thread:
Lists: pgsql-performance
Hello Sven,

We have the following indexes on src_faktuur_verrsec
/
    CREATE INDEX src_faktuur_verrsec_idx0
      ON src.src_faktuur_verrsec
      USING btree
      (id);

    CREATE INDEX src_faktuur_verrsec_idx1
      ON src.src_faktuur_verrsec
      USING btree
      (substr(code::text, 1, 2));

    CREATE INDEX src_faktuur_verrsec_idx2
      ON src.src_faktuur_verrsec
      USING btree
      (substr(correctie::text, 4, 1));/

and another two on src_faktuur_verricht

/    CREATE INDEX src_faktuur_verricht_idx0
      ON src.src_faktuur_verricht
      USING btree
      (id);

    CREATE INDEX src_faktuur_verricht_idx1
      ON src.src_faktuur_verricht
      USING btree
      (date_part('year'::text, datum))
      TABLESPACE src_index;/

PostgreSQL elects not to use them. I assume, because it most likely 
needs to traverse the entire table anyway.

if i change: /              substr(t0.code,1,2) not in 
('14','15','16','17')/
to (removing the NOT): /    substr(t0.code,1,2) in ('14','15','16','17')/

it uses the index, but it's not the query that needs to be run anymore.

Greetings,
Nick




Sven Geisler wrote:
> Hi Nicky,
>
> Did you tried to create an index to avoid the sequential scans?
>
> Seq Scan on src_faktuur_verrsec t0...
>
> I think, you should try
>
> CREATE INDEX src.src_faktuur_verrsec_codesubstr ON 
> src.src_faktuur_verrsec (substr(src.src_faktuur_verrsec.code,1,2))
>
> Cheers
> Sven.
>
> nicky schrieb:
>> Hello again,
>>
>> thanks for all the quick replies.
>>
>> It seems i wasn't entirely correct on my previous post, i've mixed up 
>> some times/numbers.
>>
>> Below the correct numbers
>>
>> MSSQL:      SELECT COUNT(*) from JOIN (without insert)   17 minutes
>> PostgreSQL: SELECT COUNT(*) from JOIN (without insert)   33 minutes
>> PostgreSQL: complete query                               55 minutes
>
>  <snip snip snip>
>>
>> A lot of improvement also in the select count: 33 minutes vs 10 minutes.
>>
>>
>> To us, the speeds are good. Very happy with the performance increase 
>> on that select with join, since 90% of the queries are SELECT based.
>>
>> The query results in 7551616 records, so that's about 4500 inserts 
>> per second. I'm not sure if that is fast or not. Any further tips 
>> would be welcome.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>       choose an index scan if your joining column's datatypes do not
>       match
>
>

In response to

Responses

pgsql-performance by date

Next:From: Magnus HaganderDate: 2006-06-22 12:17:29
Subject: Re: Speeding up query, Joining 55mil and 43mil records.
Previous:From: Sven GeislerDate: 2006-06-22 11:29:41
Subject: Re: Speeding up query, Joining 55mil and 43mil records.

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group