Quick Links

PostgreSQL - Help Optimizing performance - full text search on Heroku

From:	xlash <solutiondb(at)gmail(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	PostgreSQL - Help Optimizing performance - full text search on Heroku
Date:	2012-04-13 16:14:47
Message-ID:	1334333687863-5638777.post@n5.nabble.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

I have a big table ~15M records called entities. I want to find top 5
entities matching "hockey" in their name.

I have a Full text index built for that, which is used :
gin_ix_entity_full_text_search_name, which indexes the name.

Query

SELECT "entities".*,
ts_rank(to_tsvector('english', "entities"."name"::text),
to_tsquery('english', 'hockey'::text)) AS "rank0.48661998202865475"
FROM "entities"
WHERE "entities"."place" = 'f'
AND (to_tsvector('english', "entities"."name"::text) @@
to_tsquery('english', 'hockey'::text))
ORDER BY "rank0.48661998202865475" DESC LIMIT 5
Duration 25,623 ms

QUERY PLAN

Limit (cost=4447.28..4447.29 rows=5 width=3116) (actual
time=18509.274..18509.282 rows=5 loops=1)
-> Sort (cost=4447.28..4448.41 rows=2248 width=3116) (actual
time=18509.271..18509.273 rows=5 loops=1)
Sort Key: (ts_rank(to_tsvector('english'::regconfig, (name)::text),
'''hockey'''::tsquery))
Sort Method: top-N heapsort Memory: 19kB
-> Bitmap Heap Scan on entities (cost=43.31..4439.82 rows=2248
width=3116) (actual time=119.003..18491.408 rows=2533 loops=1)
Recheck Cond: (to_tsvector('english'::regconfig, (name)::text) @@
'''hockey'''::tsquery)
Filter: (NOT place)
-> Bitmap Index Scan on gin_ix_entity_full_text_search_name
(cost=0.00..43.20 rows=2266 width=0) (actual time=74.093..74.093 rows=2593
loops=1)
Index Cond: (to_tsvector('english'::regconfig,
(name)::text) @@ '''hockey'''::tsquery)
Total runtime: 18509.381 ms
(10 rows)

Is it because of my boolean condition (not Place?) If so, I should add it to
my index and I should get a very fast query? Or is it the sorting condition
which makes it very long?

Thanks helping me understand the Query plan and how to fix my 25 seconds
query!

Here are my DB parameters . It is an online DB hosted by Heroku, on Amazon
services. They describe it as having 1.7GB of ram, 1 processing unit and a
DB of max 1TB.

name | current_setting

version | PostgreSQL 9.0.7 on i486-pc-linux-gnu,
compiled by GCC gcc-4.4.real (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 32-bit
archive_command | test -f
/etc/postgresql/9.0/main/wal-e.d/ARCHIVING_OFF || envdir
/etc/postgresql/9.0/resource29857_heroku_com/wal-e.d/env wal-e wal-push %p
archive_mode | on
archive_timeout | 1min
checkpoint_completion_target | 0.7
checkpoint_segments | 40
client_min_messages | notice
cpu_index_tuple_cost | 0.001
cpu_operator_cost | 0.0005
cpu_tuple_cost | 0.003
effective_cache_size | 1530000kB
hot_standby | on
lc_collate | en_US.UTF-8
lc_ctype | en_US.UTF-8
listen_addresses | *
log_checkpoints | on
log_destination | syslog
log_line_prefix | %u [YELLOW]
log_min_duration_statement | 50ms
log_min_messages | notice
logging_collector | on
maintenance_work_mem | 64MB
max_connections | 500
max_prepared_transactions | 500
max_stack_depth | 2MB
max_standby_archive_delay | -1
max_standby_streaming_delay | -1
max_wal_senders | 10
port |
random_page_cost | 2
server_encoding | UTF8
shared_buffers | 415MB
ssl | on
syslog_ident | resource29857_heroku_com
TimeZone | UTC
wal_buffers | 8MB
wal_keep_segments | 127
wal_level | hot_standby
work_mem | 100MB
(39 rows)

I tried playing with the work_mem, setting it as high as 1.5GB, with no
success. I believe it is heroku reading speed that is abysmal in this case.
But I'd like to confirm that. Or is it the function that I'm calling in my
SELECT clause?

Thanks for help
Also posted on
http://dba.stackexchange.com/questions/16437/postgresql-help-optimizing-sql-performance-full-text-search

--
View this message in context: http://postgresql.1045698.n5.nabble.com/PostgreSQL-Help-Optimizing-performance-full-text-search-on-Heroku-tp5638777p5638777.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

Responses

Re: PostgreSQL - Help Optimizing performance - full text search on Heroku at 2012-04-13 17:09:10 from Tomas Vondra

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Steve Crawford	2012-04-13 16:36:20	Re: scale up (postgresql vs mssql)
Previous Message	Claudio Freire	2012-04-13 15:04:39	Re: scale up (postgresql vs mssql)