From: | "Ilya Kazakevich" <Ilya(dot)Kazakevich(at)JetBrains(dot)com> |
---|---|
To: | "'Chris Withers'" <chris(at)simplistix(dot)co(dot)uk>, "'Postgresql General'" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: schema advice for event stream with tagging and filtering |
Date: | 2016-08-16 14:10:45 |
Message-ID: | 056201d1f7c7$fbe0a090$f3a1e1b0$@JetBrains.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
>>> An event is a row with a primary key along the lines of (colo, host,
>>> category) and an associated set of tags, where each tag has a type
>>> and a value
>>> (eg: {"color": "red", "owner": "fred", "status": "open"...}).
>>
>> What about simple table with several columns and hstore field for tags?
BTW, "pure SQL" approach here is to use separate tables: Tags(TagId, TagName) and TagValues(EventId,TagId,Value).
But in this case it will be painful to filter events by tag values directly, so only separate denormalized OLAP table should be used in "pure SQL":)
PostgreSQL, however, supports key-value based hstore.
>> You may also normalize it (move hosts and categories to separate table).
>
>Why? These form part of the primary key for the event...
Host and category could be part of PK, but it may be better to have "HostId" field and "Hosts(HostId, Host)" table than "text" field with many "www.foo.bar" values)
>What kind of index is recommended here? The kind of queries would be:
>
>- show me a list of tag types and the count of the number of events of that
>type
>
>- show me all events that have tag1=x, tag2=y and does not have tag3
Hstore supports GIN and GIST (https://www.postgresql.org/docs/9.1/static/textsearch-indexes.html) I'd start with GIN.
Do you need real-time data or, say, one-day old data is ok? In latter case it is better to load data to denormalized table to speed-up queries and use no index on normalized(OLTP) table because index update operation is not free)
Ilya Kazakevich
JetBrains
http://www.jetbrains.com
The Drive to Develop
From | Date | Subject | |
---|---|---|---|
Next Message | Jim Nasby | 2016-08-16 14:47:46 | Re: [GENERAL] C++ port of Postgres |
Previous Message | Chris Withers | 2016-08-16 13:37:56 | Re: schema advice for event stream with tagging and filtering |