Re: Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan

From: Amit Khandekar <amit(dot)khandekar(at)enterprisedb(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan
Date: 2013-11-25 03:40:25
Message-ID: CACoZds1nh6YD_X_Kdy7Yx=QVquULTX3nvJbmz9G7uW7Zf9aGMA@mail.gmail.com
Lists: pgsql-hackers

On 1 November 2013 16:32, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> > From: Fujii Masao [mailto:masao(dot)fujii(at)gmail(dot)com]
> >
> > This is what I'm looking for! This feature is really useful for tuning
> > work_mem when using full text search with pg_trgm.
> >
> > I'm not sure if it's a good idea to show the number of the fetches,
> > because it seems difficult to tune work_mem from that number. How can
> > we calculate how much to increase work_mem to avoid a lossy bitmap
> > from the number of the fetches in the EXPLAIN output?
>
> We can calculate that from the following equation in tbm_create():
>
>     nbuckets = maxbytes /
>         (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
>          + sizeof(Pointer) + sizeof(Pointer)),
>
> where maxbytes is the size of memory used for the hashtable in a
> TIDBitmap, as designated by work_mem, and nbuckets is the estimated
> number of hashtable entries we can have within maxbytes. From this, the
> size of work_mem (in MB) within which we can have every hashtable entry
> as an exact bitmap is calculated as follows:
>
>     work_mem = (the number of exact pages + the number of lossy pages) *
>         (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
>          + sizeof(Pointer) + sizeof(Pointer)) /
>         (1024 * 1024).
>

I have yet to give more thought to the above formula (particularly
exact_pages + lossy_pages), but I was also wondering whether the user
would really be able to work out the above way of estimating the memory,
or whether EXPLAIN itself should show the estimated memory required for
the bitmap. For hash joins we already show the memory taken by the hash
table in show_hash_info(); we could likewise show the memory requirement
in addition to the number of exact/lossy pages.
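
To make the show_hash_info() comparison concrete, here is a minimal
hypothetical sketch of such a display function. This is not committed
API: the function name, the exact_pages/lossy_pages counters on
BitmapHeapScanState, and the 80-byte per-entry cost (an assumption for a
typical 64-bit build, derived from the tbm_create() expression quoted
above) all come from this thread's discussion, not from the tree.

/*
 * Hypothetical sketch, modeled on show_hash_info() in
 * src/backend/commands/explain.c.  The exact_pages/lossy_pages counters
 * on BitmapHeapScanState are assumed to be added by the patch under
 * discussion.
 */
#include "postgres.h"
#include "commands/explain.h"
#include "nodes/execnodes.h"

/*
 * Assumed per-entry cost in bytes of a TIDBitmap hashtable entry on a
 * typical 64-bit build: MAXALIGN(sizeof(HASHELEMENT)) +
 * MAXALIGN(sizeof(PagetableEntry)) + 2 * sizeof(Pointer); see
 * tbm_create().
 */
#define TBM_ENTRY_BYTES 80

static void
show_tidbitmap_info(BitmapHeapScanState *planstate, ExplainState *es)
{
    /* work_mem (in kB) that would have kept the whole bitmap exact */
    long        est_kb = (planstate->exact_pages + planstate->lossy_pages) *
        TBM_ENTRY_BYTES / 1024;

    if (es->format != EXPLAIN_FORMAT_TEXT)
    {
        ExplainPropertyLong("Exact Heap Blocks", planstate->exact_pages, es);
        ExplainPropertyLong("Lossy Heap Blocks", planstate->lossy_pages, es);
        ExplainPropertyLong("Estimated Exact Memory (kB)", est_kb, es);
    }
    else
    {
        appendStringInfoSpaces(es->str, es->indent * 2);
        appendStringInfo(es->str,
                         "Heap Blocks: exact=%ld lossy=%ld (exact bitmap needs ~%ldkB)\n",
                         planstate->exact_pages, planstate->lossy_pages,
                         est_kb);
    }
}

With the counts from the first plan quoted below (exact=11975,
lossy=46388), this would report roughly 4559kB, i.e. the ~5MB figure
derived in the quoted mail.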

> I'll show you an example. The following is the result for work_mem = 1MB:
>
> > > postgres=# EXPLAIN ANALYZE SELECT * FROM demo WHERE col2 between 0.01 and 0.02;
> > >                                  QUERY PLAN
> > > ---------------------------------------------------------------------------------
> > >  Bitmap Heap Scan on demo  (cost=2716.54..92075.46 rows=105766 width=34) (actual time=24.907..1119.961 rows=100047 loops=1)
> > >    Recheck Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
> > >    Rows Removed by Index Recheck: 5484114
> > >    Heap Blocks: exact=11975 lossy=46388
> > >    ->  Bitmap Index Scan on demo_idx  (cost=0.00..2690.09 rows=105766 width=0) (actual time=22.821..22.821 rows=100047 loops=1)
> > >          Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
> > >  Total runtime: 1129.334 ms
> > > (7 rows)
>
> So, by setting work_mem to
>
>     work_mem = (11975 + 46388) *
>         (MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(sizeof(PagetableEntry))
>          + sizeof(Pointer) + sizeof(Pointer)) /
>         (1024 * 1024),
>
> which is about 5MB, we have the following (note that there are no lossy
> heap pages!):
>
> postgres=# EXPLAIN ANALYZE SELECT * FROM demo WHERE col2 between 0.01 and 0.02;
>                                  QUERY PLAN
> ---------------------------------------------------------------------------------
>  Bitmap Heap Scan on demo  (cost=2716.54..92075.46 rows=105766 width=34) (actual time=42.981..120.252 rows=100047 loops=1)
>    Recheck Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
>    Heap Blocks: exact=58363
>    ->  Bitmap Index Scan on demo_idx  (cost=0.00..2690.09 rows=105766 width=0) (actual time=26.023..26.023 rows=100047 loops=1)
>          Index Cond: ((col2 >= 0.01::double precision) AND (col2 <= 0.02::double precision))
>  Total runtime: 129.304 ms
> (6 rows)
>
> BTW, showing the number of exact/lossy heap pages in the EXPLAIN
> ANALYZE output would be fine with me.
>
> > Anyway, could you add the patch into next CF?
>
> Done.
>
> Thanks,
>
> Best regards,
> Etsuro Fujita

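Putting the quoted formula together with the numbers from the first plan
(exact=11975, lossy=46388), here is a tiny standalone sanity check of
the suggested work_mem. The 80-byte per-entry cost is an assumption for
a typical 64-bit build with 8kB blocks, i.e.
MAXALIGN(sizeof(HASHELEMENT)) = 16, MAXALIGN(sizeof(PagetableEntry)) =
48, and two 8-byte Pointers:

#include <stdio.h>

int
main(void)
{
    long    exact_pages = 11975;           /* from the first quoted plan */
    long    lossy_pages = 46388;
    long    per_entry = 16 + 48 + 8 + 8;   /* assumed; see tbm_create() */
    double  mb = (double) (exact_pages + lossy_pages) * per_entry
                 / (1024.0 * 1024.0);

    printf("suggested work_mem: %.1f MB\n", mb);
    return 0;
}

This prints about 4.5 MB, consistent with the "about 5MB" figure quoted
above.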