Insert/Update/Select large XML files(hstore?)

From: Raju Angani <angani(at)gmail(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Insert/Update/Select large XML files(hstore?)
Date: 2012-08-16 06:25:08
Message-ID: CAJ-04OoZVyjjiuDtnzUKUiC65X4jEB=E4b--UdDXii=p8xksVQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hi pg gurus,

I'm very new into postgres, and I need some help on handling large xml
files(20k lines or 1MB size) within the postgres database. Today we are
using xDB(from EMC) to handle this data. Overall xDB is good, but it has
got some serious limitations. I would like to explore hstore or xml
datatype in postgres to do a POC, and my case is to prove postgres can
handle this kind of data seamlessly.

Could someone point me in the right direction on how to achieve my goals.
1) Fast inserts/updates
2) Select data seamlessly
3) Support adhoc queries(there could be 20k documents of size 300kb~1024kb
each),
e.g.: select all the matching attributes of device scsi from all 20k
documents.

Sample data from my xml files. I could have massive xml doc in this format.

<scsiLun xsi:type="HostScsiDisk"
qs:id="ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk4">
<deviceName>/devices/disks/disk4</deviceName>
<deviceType>scsi-disk</deviceType>
<key>host.ScsiDisk-disk4</key>
<uuid>disk4</uuid>
<canonicalName>hba4:1:1</canonicalName>
<lunType>disk</lunType>
<scsiLevel>0</scsiLevel>
<durableName>
<namespace>Unknown</namespace>
<namespaceId>0</namespaceId>
<data>9</data>
</durableName>
<queueDepth>0</queueDepth>
<operationalState>ok</operationalState>
<capacity>
<blockSize>4096</blockSize>
<block>100000000</block>
</capacity>
<devicePath>/devices/disks/disk4</devicePath>
</scsiLun>
<scsiLun xsi:type="HostScsiDisk"
qs:id="ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk10">
<deviceName>/devices/disks/disk10</deviceName>
<deviceType>scsi-disk</deviceType>
<key>host.ScsiDisk-disk10</key>
<uuid>disk10</uuid>
<canonicalName>hba10:0:0</canonicalName>
<lunType>disk</lunType>
<scsiLevel>0</scsiLevel>
<durableName>
<namespace>Unknown</namespace>
<namespaceId>0</namespaceId>
<data>9</data>
</durableName>
<queueDepth>0</queueDepth>
<operationalState>ok</operationalState>
<capacity>
<blockSize>4096</blockSize>
<block>10000000</block>
</capacity>
<devicePath>/devices/disks/disk10</devicePath>
</scsiLun>
<adapter>
<key>host.ScsiTopology.Interface-hba11</key>
<adapter xlink:type="simple"
xlink:href="#HostHostBusAdapter:key%2dvim%2ehost%2eParallelScsiHba%2dhba11"/>
<target>
<key>host.ScsiTopology.Target-hba11:0:0</key>
<target>0</target>
<lun>
<key>host.ScsiTopology.Lun-disk100</key>
<lun>0</lun>
<scsiLun xlink:type="simple"
xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk100"/>
</lun>
<lun>
<key>host.ScsiTopology.Lun-disk101</key>
<lun>1</lun>
<scsiLun xlink:type="simple"
xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk101"/>
</lun>
<lun>
<key>host.ScsiTopology.Lun-disk102</key>
<lun>2</lun>
<scsiLun xlink:type="simple"
xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk102"/>
</lun>
<lun>
<key>host.ScsiTopology.Lun-disk103</key>
<lun>3</lun>
<scsiLun xlink:type="simple"
xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk103"/>
</lun>
<lun>
<key>host.ScsiTopology.Lun-disk104</key>
<lun>4</lun>
<scsiLun xlink:type="simple"
xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk104"/>
</lun>
<lun>
<key>host.ScsiTopology.Lun-disk105</key>
<lun>5</lun>
<scsiLun xlink:type="simple"
xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk105"/>
</lun>
<transport xsi:type="HostParallelScsiTargetTransport"/>
</target>
</adapter>
<mountInfo>
<mountInfo>
<path>/volumes/f121d633-405475583f86-bcacce3ac69a</path>
<accessMode>readWrite</accessMode>
<mounted>true</mounted>
<accessible>true</accessible>
</mountInfo>
<volume xsi:type="HostVolume">
<type>XFS</type>
<name>sh-ds-3</name>
<capacity>322122547200</capacity>
<blockSizeMb>1</blockSizeMb>
<maxBlocks>63963136</maxBlocks>
<majorVersion>5</majorVersion>
<version>5.00</version>
<uuid>f121d633-405475583f86-bcacce3ac69a</uuid>
<extent>
<diskName>hba10:0:3</diskName>
<partition>1</partition>
</extent>
<xfsUpgradable>false</xfsUpgradable>
</volume>
</mountInfo>

Browse pgsql-admin by date

  From Date Subject
Next Message Saravanakumar Ramasamy 2012-08-17 08:11:48 When I executed type cast functions. The postgres normal concatenation operator query was breaking.
Previous Message Joe Tennant 2012-08-14 13:08:58 pg_dump out of memory issue...