Skip site navigation (1) Skip section navigation (2)

Insert/Update/Select large XML files(hstore?)

From: Raju Angani <angani(at)gmail(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Insert/Update/Select large XML files(hstore?)
Date: 2012-08-16 06:25:08
Message-ID: CAJ-04OoZVyjjiuDtnzUKUiC65X4jEB=E4b--UdDXii=p8xksVQ@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-admin
Hi pg gurus,

I'm very new into postgres, and I need some help on handling large xml
files(20k lines or 1MB size) within the postgres database. Today we are
using xDB(from EMC) to handle this data. Overall xDB is good, but it has
got some serious limitations. I would like to explore hstore or xml
datatype in postgres to do a POC, and my case is to prove postgres can
handle this kind of data seamlessly.

Could someone point me in the right direction on how to achieve my goals.
1) Fast inserts/updates
2) Select data seamlessly
3) Support adhoc queries(there could be 20k documents of size 300kb~1024kb
each),
e.g.: select all the matching attributes of device scsi from all 20k
documents.


Sample data from my xml files. I could have massive xml doc in this format.

      <scsiLun xsi:type="HostScsiDisk"
qs:id="ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk4">
        <deviceName>/devices/disks/disk4</deviceName>
        <deviceType>scsi-disk</deviceType>
        <key>host.ScsiDisk-disk4</key>
        <uuid>disk4</uuid>
        <canonicalName>hba4:1:1</canonicalName>
        <lunType>disk</lunType>
        <scsiLevel>0</scsiLevel>
        <durableName>
          <namespace>Unknown</namespace>
          <namespaceId>0</namespaceId>
          <data>9</data>
        </durableName>
        <queueDepth>0</queueDepth>
        <operationalState>ok</operationalState>
        <capacity>
          <blockSize>4096</blockSize>
          <block>100000000</block>
        </capacity>
        <devicePath>/devices/disks/disk4</devicePath>
      </scsiLun>
      <scsiLun xsi:type="HostScsiDisk"
qs:id="ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk10">
        <deviceName>/devices/disks/disk10</deviceName>
        <deviceType>scsi-disk</deviceType>
        <key>host.ScsiDisk-disk10</key>
        <uuid>disk10</uuid>
        <canonicalName>hba10:0:0</canonicalName>
        <lunType>disk</lunType>
        <scsiLevel>0</scsiLevel>
        <durableName>
          <namespace>Unknown</namespace>
          <namespaceId>0</namespaceId>
          <data>9</data>
        </durableName>
        <queueDepth>0</queueDepth>
        <operationalState>ok</operationalState>
        <capacity>
          <blockSize>4096</blockSize>
          <block>10000000</block>
        </capacity>
        <devicePath>/devices/disks/disk10</devicePath>
      </scsiLun>
      <adapter>
          <key>host.ScsiTopology.Interface-hba11</key>
          <adapter xlink:type="simple"
xlink:href="#HostHostBusAdapter:key%2dvim%2ehost%2eParallelScsiHba%2dhba11"/>
          <target>
            <key>host.ScsiTopology.Target-hba11:0:0</key>
            <target>0</target>
            <lun>
              <key>host.ScsiTopology.Lun-disk100</key>
              <lun>0</lun>
              <scsiLun xlink:type="simple"
xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk100"/>
            </lun>
            <lun>
              <key>host.ScsiTopology.Lun-disk101</key>
              <lun>1</lun>
              <scsiLun xlink:type="simple"
xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk101"/>
            </lun>
            <lun>
              <key>host.ScsiTopology.Lun-disk102</key>
              <lun>2</lun>
              <scsiLun xlink:type="simple"
xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk102"/>
            </lun>
            <lun>
              <key>host.ScsiTopology.Lun-disk103</key>
              <lun>3</lun>
              <scsiLun xlink:type="simple"
xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk103"/>
            </lun>
            <lun>
              <key>host.ScsiTopology.Lun-disk104</key>
              <lun>4</lun>
              <scsiLun xlink:type="simple"
xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk104"/>
            </lun>
            <lun>
              <key>host.ScsiTopology.Lun-disk105</key>
              <lun>5</lun>
              <scsiLun xlink:type="simple"
xlink:href="#ScsiLun:key%2dvim%2ehost%2eScsiDisk%2ddisk105"/>
            </lun>
             <transport xsi:type="HostParallelScsiTargetTransport"/>
          </target>
        </adapter>
      <mountInfo>
        <mountInfo>
          <path>/volumes/f121d633-405475583f86-bcacce3ac69a</path>
          <accessMode>readWrite</accessMode>
          <mounted>true</mounted>
          <accessible>true</accessible>
        </mountInfo>
        <volume xsi:type="HostVolume">
          <type>XFS</type>
          <name>sh-ds-3</name>
          <capacity>322122547200</capacity>
          <blockSizeMb>1</blockSizeMb>
          <maxBlocks>63963136</maxBlocks>
          <majorVersion>5</majorVersion>
          <version>5.00</version>
          <uuid>f121d633-405475583f86-bcacce3ac69a</uuid>
          <extent>
            <diskName>hba10:0:3</diskName>
            <partition>1</partition>
          </extent>
          <xfsUpgradable>false</xfsUpgradable>
        </volume>
      </mountInfo>

pgsql-admin by date

Next:From: Saravanakumar RamasamyDate: 2012-08-17 08:11:48
Subject: When I executed type cast functions. The postgres normal concatenation operator query was breaking.
Previous:From: Joe TennantDate: 2012-08-14 13:08:58
Subject: pg_dump out of memory issue...

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group