The Tika protocol handler uses the Apache Tika framework to parse arbitrary documents.

Options

none

Example

To access individual attribute values, use the KeyValueToTuple operator to transform the required attributes into a relational tuple.

PQL

input = ACCESS({source='source', 
                wrapper='GenericPush',
                transport='TCPClient',
                protocol='tika',
                dataHandler='KeyValueObject',
                options=[['host','192.168.1.20'],['port','2111']]
})

out = KEYVALUETOTUPLE({
          schema=[['content', 'String']],
          type='Document',
          keepinput='false' 
        },
        input
      )
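
Conceptually, the KEYVALUETOTUPLE step above picks the keys named in the schema out of a key-value object and places them into a fixed-order tuple. The following Python sketch illustrates that idea only; it is not Odysseus code. The function name `key_value_to_tuple`, the sample document, the `hypothetical_key` entry, and the choice to fill missing keys with `None` are all assumptions made for illustration. The `type` and `keepinput` parameters are not modeled.

```python
# Hypothetical stand-in for a key-value object as the tika handler might emit it.
# Only the 'content' key is taken from the PQL example; the second key is made up.
document = {
    "content": "Hello from a parsed document.",
    "hypothetical_key": "not part of the schema, so it is dropped",
}


def key_value_to_tuple(kv, schema):
    """Project the keys listed in schema into a tuple, in schema order.

    schema is a list of (key, cast) pairs. Keys missing from kv become None
    (an assumption of this sketch, not documented Odysseus behavior).
    """
    return tuple(
        None if kv.get(name) is None else cast(kv[name])
        for name, cast in schema
    )


# Mirrors schema=[['content', 'String']] from the PQL example.
schema = [("content", str)]
row = key_value_to_tuple(document, schema)
print(row)
```

Keys that are not listed in the schema are not carried over into the resulting tuple, which is why only the required attributes need to be declared.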