A problem with the IN clause

From: Sean Shanny <shannyconsulting(at)earthlink(dot)net>
To: pgsql-general(at)postgresql(dot)org
Cc: Nick Shanny <nshanny(at)tripadvisor(dot)com>, Sean Shanny <shannyconsulting(at)earthlink(dot)net>
Subject: A problem with the IN clause
Date: 2004-05-18 06:59:07
Message-ID: 40A9B43B.6070306@earthlink.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

To all,

We are running postgresql 7.4.1 on an G5 with dual procs, OSX 10.3.3
server, 8GB mem, attached to a fully configured 3.5TB XRaid box via
fibre channel.

I think we have run into this issue before but I thought the code was
fixed. :-(

I have the following SQL:

SELECT subage, COUNT( DISTINCT (t1.session_key)), COUNT(
DISTINCT(t1.userid_key))
FROM f_pageviews t1 JOIN segmented_sub t0 ON (t1.userid_key = t0.id)
WHERE t1.date_key BETWEEN 496 AND 502
AND t1.newsletterid_key IN (SELECT newsletterid FROM t_newscontentstatic)
GROUP BY 1

When I run this against our warehouse instance I get an out of memory
error. If I remove the

AND t1.newsletterid_key IN (SELECT newsletterid FROM t_newscontentstatic)

portion if runs fine. We had a problem in the past with the IN clause
having a small number of values and some sort of hash table size
allocation being called with an enormous value which caused the
individual postgresql process to run out of memory. In this case we have
11 strings, each 8 characters in length, in the SELECT newsletterid FROM
t_newscontentstatic result set. This query has been running fine for
weeks. The only change has been the increase in the number of values
returned in the IN clause result set, it goes up by 1 or 2 each week. I
ran this query by hand populating the IN clause with 4 out of the 11
strings and it ran just fine.

I have included the relevant postgresql logfile and successful and
unsuccessful explain analyze outputs below.

I know Tom Lane was kind enough to take a look last time we had this
issue and we would be more than happy to let him into our system again
to debug. :-)

Thanks.

--sean

=============================================================
=============================================================

explain analyze for the failed run:

tripmaster=# explain analyze SELECT subage, COUNT( DISTINCT
(t1.session_key)), COUNT( DISTINCT(t1.userid_key))
tripmaster-# FROM f_pageviews t1 JOIN segmented_sub t0 ON (t1.userid_key
= t0.id)
tripmaster-# WHERE t1.date_key BETWEEN 496 AND 502
tripmaster-# AND t1.newsletterid_key IN (SELECT newsletterid FROM
t_newscontentstatic)
tripmaster-# GROUP BY 1;

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=585501.75..585501.77 rows=1 width=12) (actual
time=192050.482..192055.137 rows=14 loops=1)
-> Sort (cost=585501.75..585501.75 rows=1 width=12) (actual
time=192046.408..192047.231 rows=3214 loops=1)
Sort Key: t0.subage
-> Nested Loop (cost=585495.82..585501.74 rows=1 width=12)
(actual time=189303.056..192042.130 rows=3214 loops=1)
-> Hash Join (cost=585495.82..585495.87 rows=1 width=8)
(actual time=189253.209..189384.639 rows=6691 loops=1)
Hash Cond: ("outer".newsletterid =
"inner".newsletterid_key)
-> HashAggregate (cost=9.10..9.10 rows=8
width=12) (actual time=61.695..61.726 rows=11 loops=1)
-> Seq Scan on t_newscontentstatic
(cost=0.00..9.08 rows=8 width=12) (actual time=16.280..61.660 rows=11
loops=1)
-> Hash (cost=585486.72..585486.72 rows=1
width=24) (actual time=189190.574..189190.574 rows=0 loops=1)
-> Seq Scan on f_pageviews t1
(cost=0.00..585486.72 rows=1 width=24) (actual
time=60502.415..-463715.543 rows=24422838 loops=1)
Filter: ((date_key >= 496) AND
(date_key <= 502))
-> Index Scan using idx_segmented_sub on segmented_sub
t0 (cost=0.00..5.85 rows=1 width=8) (actual time=0.391..0.394 rows=0
loops=6691)
Index Cond: ("outer".userid_key = t0.id)
Total runtime: 194187.416 ms
(14 rows)

=============================================================
=============================================================

explain analyze from a successful run:

tripmaster=# explain analyze SELECT subage, COUNT( DISTINCT
(t1.session_key)), COUNT( DISTINCT(t1.userid_key))
tripmaster-# FROM f_pageviews t1 JOIN segmented_sub t0 ON (t1.userid_key
= t0.id)
tripmaster-# WHERE t1.date_key BETWEEN 496 AND 502
tripmaster-# AND t1.newsletterid_key IN ( '20040429', '20040330',
'20040506', '20040520')
tripmaster-# GROUP BY 1;

QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
GroupAggregate (cost=876145.08..876145.09 rows=1 width=12) (actual
time=194282.825..194283.361 rows=11 loops=1)
-> Sort (cost=876145.08..876145.08 rows=1 width=12) (actual
time=194281.394..194281.579 rows=999 loops=1)
Sort Key: t0.subage
-> Nested Loop (cost=0.00..876145.07 rows=1 width=12) (actual
time=66567.739..194279.092 rows=999 loops=1)
-> Seq Scan on f_pageviews t1 (cost=0.00..876139.20
rows=1 width=8) (actual time=66124.874..192232.450 rows=1961 loops=1)
Filter: ((date_key >= 496) AND (date_key <= 502)
AND ((newsletterid_key = '20040429'::text) OR (newsletterid_key =
'20040330'::text) OR (newsletterid_key = '20040506'::text) OR
(newsletterid_key = '20040520'::text)))
-> Index Scan using idx_segmented_sub on segmented_sub
t0 (cost=0.00..5.85 rows=1 width=8) (actual time=1.031..1.033 rows=1
loops=1961)
Index Cond: ("outer".userid_key = t0.id)
Total runtime: 194283.802 ms
(9 rows)

=============================================================
=============================================================

capture of relevant part of postgresql logfile

2004-05-18 01:47:43 LOG: recycled transaction log file "0000041800000007"
2004-05-18 01:48:15 LOG: recycled transaction log file "0000041800000008"
2004-05-18 01:48:47 LOG: recycled transaction log file "0000041800000009"
2004-05-18 01:48:47 LOG: recycled transaction log file "000004180000000A"
2004-05-18 01:49:52 LOG: recycled transaction log file "000004180000000B"
2004-05-18 01:50:24 LOG: recycled transaction log file "000004180000000C"
2004-05-18 01:50:24 LOG: recycled transaction log file "000004180000000D"
2004-05-18 01:50:56 LOG: recycled transaction log file "000004180000000E"
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=8388608) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=4194304) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=2097152) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
*** malloc: vm_allocate(size=1048576) failed (error code=3)
*** malloc[3646]: error: Can't allocate region
TopMemoryContext: 40960 total in 4 blocks; 10336 free (7 chunks); 30624 used
TopTransactionContext: 8192 total in 1 blocks; 8136 free (0 chunks); 56 used
DeferredTriggerXact: 0 total in 0 blocks; 0 free (0 chunks); 0 used
MessageContext: 57344 total in 3 blocks; 4688 free (3 chunks); 52656 used
PortalMemory: 8192 total in 1 blocks; 8040 free (0 chunks); 152 used
PortalHeapMemory: 1024 total in 1 blocks; 912 free (0 chunks); 112 used
ExecutorState: 24576 total in 2 blocks; 8552 free (2 chunks); 16024 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
HashBatchContext: -978331468 total in 414 blocks; 44200 free (788
chunks); -978375668 used
HashTableContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
HashBatchContext: 364112 total in 1 blocks; 0 free (0 chunks); 364112 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
AggContext: 8192 total in 1 blocks; 7968 free (0 chunks); 224 used
DynaHashTable: 8192 total in 1 blocks; 6112 free (0 chunks); 2080 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
AggContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
CacheMemoryContext: 1040384 total in 7 blocks; 463824 free (2 chunks);
576560 used
t_newscontentstatic_pkey: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
idx_pageviews_primary: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
idx_segmented_sub: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_segmented_sub: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_toast_16408_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
d_userid_user_id_key: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
d_userid_pkey: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
t_user_email_key: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
t_user_pkey: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_user_country: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_user_status: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_description_o_c_o_index: 2048 total in 1 blocks; 768 free (0 chunks);
1280 used
pg_depend_depender_index: 2048 total in 1 blocks; 768 free (0 chunks);
1280 used
pg_depend_reference_index: 2048 total in 1 blocks; 768 free (0 chunks);
1280 used
d_date_pkey: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_date_5: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
idx_date_4: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_date_3: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_date_2: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
idx_date_1: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_index_indrelid_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_attrdef_adrelid_adnum_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_conversion_default_index: 2048 total in 1 blocks; 704 free (0
chunks); 1344 used
pg_group_sysid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_rewrite_rel_rulename_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_proc_proname_args_nsp_index: 2048 total in 1 blocks; 704 free (0
chunks); 1344 used
pg_namespace_nspname_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_conversion_name_nsp_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_index_indexrelid_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_trigger_tgrelid_tgname_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_inherits_relid_seqno_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_amop_opr_opc_index: 1024 total in 1 blocks; 320 free (0 chunks); 704 used
pg_amproc_opc_procnum_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_shadow_usename_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_amop_opc_strategy_index: 1024 total in 1 blocks; 320 free (0 chunks);
704 used
pg_conversion_oid_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_cast_source_target_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_class_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_language_name_index: 1024 total in 1 blocks; 640 free (0 chunks); 384
used
pg_attribute_relid_attnam_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_proc_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_statistic_relid_att_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_attribute_relid_attnum_index: 1024 total in 1 blocks; 320 free (0
chunks); 704 used
pg_class_relname_nsp_index: 1024 total in 1 blocks; 320 free (0 chunks);
704 used
pg_language_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_opclass_am_name_nsp_index: 2048 total in 1 blocks; 768 free (0
chunks); 1280 used
pg_type_typname_nsp_index: 1024 total in 1 blocks; 320 free (0 chunks);
704 used
pg_type_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_aggregate_fnoid_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
pg_namespace_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384
used
pg_group_name_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_operator_oprname_l_r_n_index: 2048 total in 1 blocks; 704 free (0
chunks); 1344 used
pg_operator_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_opclass_oid_index: 1024 total in 1 blocks; 640 free (0 chunks); 384 used
pg_shadow_usesysid_index: 1024 total in 1 blocks; 640 free (0 chunks);
384 used
MdSmgr: 8192 total in 1 blocks; 5928 free (0 chunks); 2264 used
DynaHash: 8192 total in 1 blocks; 6912 free (0 chunks); 1280 used
DynaHashTable: 8192 total in 1 blocks; 2008 free (0 chunks); 6184 used
DynaHashTable: 8192 total in 1 blocks; 5080 free (0 chunks); 3112 used
DynaHashTable: 8192 total in 1 blocks; 2008 free (0 chunks); 6184 used
DynaHashTable: 8192 total in 1 blocks; 1984 free (0 chunks); 6208 used
DynaHashTable: 8192 total in 1 blocks; 3520 free (0 chunks); 4672 used
DynaHashTable: 24576 total in 2 blocks; 13240 free (4 chunks); 11336 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
DynaHashTable: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ErrorContext: 8192 total in 1 blocks; 8176 free (0 chunks); 16 used
2004-05-18 02:04:53 ERROR: out of memory
DETAIL: Failed on request of size 77.
2004-05-18 02:04:57 ERROR: current transaction is aborted, commands
ignored until end of transaction block
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection
2004-05-18 02:05:00 LOG: unexpected EOF on client connection

Responses

Browse pgsql-general by date

  From Date Subject
Next Message rong.xie 2004-05-18 15:27:34 PostgreSQL delete the blank in the end of the String automatically. how can I avoid it?
Previous Message Scott Ribe 2004-05-18 03:39:41 Re: I want to use postresql for this app, but...