Re: What's the best way in Postgres9 to store a bunch of arbitrary boolean values for a row?

From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Mike Christensen *EXTERN*" <mike(at)kitchenpc(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: What's the best way in Postgres9 to store a bunch of arbitrary boolean values for a row?
Date: 2011-12-05 09:04:45
Message-ID: D960CB61B694CF459DCFB4B0128514C207347E9A@exadv11.host.magwien.gv.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Mike Christensen wrote:
> I have a database full of recipes, one recipe per row. I need to
> store a bunch of arbitrary "flags" for each recipe to mark various
> properties such as Gluton-Free, No meat, No Red Meat, No Pork, No
> Animals, Quick, Easy, Low Fat, Low Sugar, Low Calorie, Low Sodium and
> Low Carb. Users need to be able to search for recipes that contain
> one or more of those flags by checking checkboxes in the UI.
>
> I'm searching for the best way to store these properties in the
> Recipes table. My ideas so far:
>
> 1. Have a separate column for each property and create an index on
> each of those columns. I may have upwards of about 20 of these
> properties, so I'm wondering if there's any drawbacks with creating a
> whole bunch of BOOL columns on a single table.
> 2. Use a bitmask for all properties and store the whole thing in one
> numeric column that contains the appropriate number of bits. Create a
> separate index on each bit so searches will be fast.
> 3. Create an ENUM with a value for each tag, then create a column that
> has an ARRAY of that ENUM type. I believe an ANY clause on an array
> column can use an INDEX, but have never done this.
> 4. Create a separate table that has a one-to-many mapping of recipes
> to tags. Each tag would be a row in this table. The table would
> contain a link to the recipe, and an ENUM value for which tag is "on"
> for that recipe. When querying, I'd have to do a nested SELECT to
> filter out recipes that didn't contain at least one of these tags. I
> think this is the more "normal" way of doing this, but it does make
> certain queries more complicated - If I want to query for 100 recipes
> and also display all their tags, I'd have to use an INNER JOIN and
> consolidate the rows, or use a nested SELECT and aggregate on the fly.
>
> Write performance is not too big of an issue here since recipes are
> added by a backend process, and search speed is critical (there might
> be a few hundred thousand recipes eventually). I doubt I will add new
> tags all that often, but I want it to be at least possible to do
> without major headaches.

I would use a boolean column per property and a partial index on the
ones
where the property is selective, i.e. only a small percentage of all
recipes
match the property.

Yours,
Laurenz Albe

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Filip Rembiałkowski 2011-12-05 10:48:44 Re: What's the best way in Postgres9 to store a bunch of arbitrary boolean values for a row?
Previous Message John R Pierce 2011-12-05 08:59:00 Re: Regarding licensing of Postgresql