Re: What's the best way in Postgres9 to store a bunch of arbitrary boolean values for a row?

From: Mike Christensen <mike(at)kitchenpc(dot)com>
To: Craig Ringer <ringerc(at)ringerc(dot)id(dot)au>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: What's the best way in Postgres9 to store a bunch of arbitrary boolean values for a row?
Date: 2011-12-05 07:31:58
Message-ID: CABs1bs3fdB0pjRPKAAzOJNJ4Uxr2VtRYXoeu_ACqMHoAptHn8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

>> I have a database full of recipes, one recipe per row.  I need to
>> store a bunch of arbitrary "flags" for each recipe to mark various
>> properties such as Gluton-Free, No meat, No Red Meat, No Pork, No
>> Animals, Quick, Easy, Low Fat, Low Sugar, Low Calorie, Low Sodium and
>> Low Carb.  Users need to be able to search for recipes that contain
>> one or more of those flags by checking checkboxes in the UI.
>>
>> I'm searching for the best way to store these properties in the
>> Recipes table.
>
> I'd use hstore to store them as tags. You can then use hstore's GiST index
> support to get quick lookups.
>>
>> 1. Have a separate column for each property and create an index on
>> each of those columns.  I may have upwards of about 20 of these
>> properties, so I'm wondering if there's any drawbacks with creating a
>> whole bunch of BOOL columns on a single table.
>
> It'll get frustrating as you start adding new categories, and will drive you
> insane as soon as you want to let the user define their own categories -
> which you will land up wanting to do in your problem space. I'd avoid it.
>>
>> 2. Use a bitmask for all properties and store the whole thing in one
>> numeric column that contains the appropriate number of bits.  Create a
>> separate index on each bit so searches will be fast.
>
> Same as above, it'll get annoying to manage when you want user tagging.
>>
>> 3. Create an ENUM with a value for each tag, then create a column that
>> has an ARRAY of that ENUM type.  I believe an ANY clause on an array
>> column can use an INDEX, but have never done this.
>
> Same again.
>>
>> 4. Create a separate table that has a one-to-many mapping of recipes
>> to tags.  Each tag would be a row in this table.  The table would
>> contain a link to the recipe, and an ENUM value for which tag is "on"
>> for that recipe.  When querying, I'd have to do a nested SELECT to
>> filter out recipes that didn't contain at least one of these tags.  I
>> think this is the more "normal" way of doing this, but it does make
>> certain queries more complicated - If I want to query for 100 recipes
>> and also display all their tags, I'd have to use an INNER JOIN and
>> consolidate the rows, or use a nested SELECT and aggregate on the fly.
>
> That'll get slow. It'll work and is IMO better than all the other options
> you suggested, but I'd probably favour hstore over it.

The hstore module sounds fantastic!

I'm curious as to how these columns are serialized back through the
driver, such as Npgsql. Do I get the values as strings, such as a
comma delimited key/value pair list? Or would I need to do some
custom logic to deserialize them?

Right now, I'm using Npgsql as a driver, and NHibernate/Castle
ActiveRecord as an ORM.

Mike

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Florian Schwendener 2011-12-05 08:07:20 Re: odbc_fdw
Previous Message tamanna madaan 2011-12-05 07:15:17 Re: psql query gets stuck indefinitely