Hash index not being updated

From: Justin Naifeh <jnaifeh(at)terraframe(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Hash index not being updated
Date: 2011-10-05 00:37:20
Message-ID: 4E8BA6C0.5000700@terraframe.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

In Postgres 8.4, I have a table called java_types with two columns,
package_name and class_name. There is another table called java_objects
that defines a column called type whose value matches the concatenation
of package_name and class_name. A typical join and result looks like this:

SELECT package_name, class_name, type FROM java_objects o INNER JOIN
java_types t ON (t.package_name || '.' || t.class_name) = o.type;

package_name, class_name, type
"java.lang" , "String" , "java.lang.String"

The above works, although it is slow for large data sets so I defined
the following index:

CREATE INDEX java_type_hash
ON java_types
USING hash
(((package_name::text || '.'::text) || class_name::text));

I confirmed that my new index is being used by inspecting the query plan
and finding the following:

-> Index Scan using java_type_hash on java_types (cost=0.00..1.22
rows=1 width=49)
Index Cond: ((((java_types.package_name)::text || '.'::text) ||
(java_types.class_name)::text) = (java_objects.type)::text)

This gave me a speed boost and worked initially. A day after defining
the index, however, I inserted a few rows into java_types and then many
rows into java_objects. When I ran the aforementioned query I got 0
results. A month later (without restarting postgres), I ran the same
query and still got 0 results. I suspected the index had either been
corrupted or not updated automatically. I confirmed this by running the
following:

REINDEX INDEX java_type_hash;

And then when I re-ran the same query the correct results came back! I
thought that maybe my system wasn't set to autovacuum but a query for my
current settings confirmed that it was:

autovacuum = on
autovacuum_analyze_scale_factor = 0.1
autovacuum_analyze_threshold = 256
autovacuum_freeze_max_age = 200000000
autovacuum_max_workers = 3
autovacuum_naptime = 60
autovacuum_vacuum_cost_delay = 20
autovacuum_vacuum_cost_limit = -1
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_threshold = 512
track_counts = on

Finally, this query gave me some extra information about when my system
had last been vacuumed/analyzed:

SELECT relname, last_vacuum, last_autovacuum, last_analyze,
last_autoanalyze FROM pg_stat_all_tables WHERE relname = 'java_types'

rename , last_vacuum , last_autovacuum ,
last_analyze , last_autoanalyze
"java_types", "2011-10-04 13:37:03.867-07", "" ,
"2011-10-04 13:21:22.342-07", "2011-08-30 10:15:13.018-07"

The index was created on 8/30/2011, inserts were done on java_types and
java_objects on 8/31/2011, and then 10/4/2011 is when I manually ran
REINDEX (and later VACUUM ANALYZE).

So after all that, the questions is: If I have autovacuum set to true
then I shouldn't have to worry about calling VACUUM/ANALYZE/REINDEX
manually to update my java_types_hash index, right? Maybe my settings
are wrong or I'm making an incorrect assumption about Postgres behavior.

Any pointer in the right direction helps. Thanks in advance.

--

Justin Naifeh Software Developer

Voice:
303-460-7111 x1 Toll Free:
877-444-3074 x1 Cell:
720-363-8874 AIM, Yahoo
justinnaifeh <aim:goim?screenname=justinnaifeh>

www.terraframe.com/products/runwaysdk
<http://www.terraframe.com/products/runwaysdk>
Makers of TerraFrame RUNWAY SDK^(TM), the next-generation
model-driven engineering (MDE) application toolkit for software developers

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adarsh Sharma 2011-10-05 06:19:53 Restoring 2 Tables From All Databases Backup
Previous Message Dave Page 2011-10-05 00:22:27 Re: [GENERAL] One-click installer, Windows 7 32-bit, and icacls.exe