This document describes the options Odysseus offers for integrating external data streams.
Before external data streams can be processed, they must be registered in Odysseus. This is typically done with one of the query languages Odysseus provides. Although CQL can be used to attach data streams, the PQL approach is much more flexible. The following therefore concentrates on integration with PQL; the corresponding usage with CQL is described in The Access Operator Framework in CQL (StreamSQL).
To integrate new streams with PQL, the ACCESS-Operator is needed. For compatibility reasons, the operator still accepts a number of deprecated parameters. Only the preferred parameters are described here; the deprecated ones will be removed in a future version.
The following parameters can be used in the ACCESS-Operator:
Source
: The system-wide unique name of the source. If the source name is already in use and further parameters are given, an error is thrown. An already created source can be reused by specifying only the source parameter.
Wrapper
: Selects the wrapper that is responsible for integrating the source. The default wrappers in Odysseus are GenericPush and GenericPull. Extensions may provide further wrappers.
Schema
: Defines the output schema of the access operator and is used to create some data handlers (e.g. Tuple). For each element, a base data handler must be available. The special types StartTimestamp(String) and EndTimestamp(String) set the time metadata of the created element. Example:
[['TIMESTAMP','StartTimeStamp'], ['NAME','String'], ['TEMP','Double'], ['AccX','Double'], ['AccY','Double'], ['AccZ','Double'], ['PosX','Double']]
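As a small illustration of the reuse rule for the Source parameter, the following sketch refers to an already registered source by passing only its name. It assumes that a source named nexmark:person exists (as created by the nexmark example in this document); the view name person_reused is hypothetical. Lines starting with /// are intended as comments (the Odysseus Script comment marker) and can be dropped.

```pql
/// Assumption: 'nexmark:person' was registered earlier with its full parameter set.
/// Only the source name is given here, so the existing definition is reused.
person_reused := ACCESS({source='nexmark:person'})
```

Adding further parameters such as wrapper or schema to this call would, according to the description of the Source parameter, raise an error because the name is already taken.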
The following parameters further describe the wrappers GenericPush and GenericPull. GenericPull is needed when the data has to be fetched from the source (e.g. from a file); GenericPush is needed when the source sends the data actively. Pull requires scheduling, which is done automatically; push does not.
Each of these parameters typically needs further configuration parameters (e.g. a file name for a file wrapper). These additional parameters are set in the Options parameter, which consists of key-value pairs:
Options = [['key1', 'value1'], ['key2', 'value2'], … , ['keyN', 'valueN']]
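For illustration, here are two option lists built only from keys that appear in the examples of this document: one for a TCP-based transport and one for a file-based transport. The host, port and file path values are placeholders, and which keys a given transport or protocol actually accepts depends on the handler, so this is a sketch rather than a complete list. Lines starting with /// are intended as comments and can be dropped.

```pql
/// TCP-style transport: endpoint plus byte order (placeholder values)
options=[['host','example.com'],['port','8081'],['ByteOrder','Little_Endian']]

/// File-style transport: input file plus field delimiter (placeholder path)
options=[['filename','/path/to/input.csv'],['Delimiter',',']]
```

Keys and values are both written as quoted strings, including numeric values such as the port.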
The transport parameter selects the input type of the wrapper. The following values are currently supported for the GenericPull-Wrapper:
If the source needs login and password
The following values are currently supported for the GenericPush-Wrapper:
If the source needs login and password
The protocol parameter determines how the input from the transport is processed. The main task of this component is to identify objects in the input and to prepare them for the data handler (see next parameter).
The following protocols are currently available in Odysseus.
Finally, the dataHandler parameter defines the data handler that is responsible for creating the objects that are processed inside Odysseus. Data handlers fall into two groups: handlers for base types (like long, boolean or int) and constructors for complex types (like tuple or list). Odysseus provides data handlers for the following base data types:
Odysseus provides the following type constructors:
The following PQL commands each create a new source:
nexmark_person := ACCESS({source='nexmark:person', wrapper='GenericPush', transport='NonBlockingTcp', protocol='SizeByteBuffer', dataHandler='Tuple', options=[['host','odysseus.offis.uni-oldenburg.de'],['port','65440'],['ByteOrder','Little_Endian']], schema=[['timestamp','StartTimeStamp'], ['id','INTEGER'], ['name','String'], ['email','String'], ['creditcard','String'], ['city','String'], ['state','String'] ]})
worldBoundaries := ACCESS({Source='WorldBoundaries', Wrapper='GenericPull', Schema=[['geometry','SpatialGeometry'], ['geometry_vertex_count','Integer'], ['OBJECTID','Integer'], ['ISO_2DIGIT','String'], ['Shape_Leng','Double'], ['Shape_Area','Double'], ['Name','String'], ['import_notes','String'], ['Google requests','String'] ], InputSchema=['SpatialKML','Integer','Integer','String','Double','Double','String','String','String'], transport='File', protocol='csv', dataHandler='Tuple', Options=[['filename','C:/Users/Marco Grawunder/Documents/My Dropbox/OdysseusQuickShare/Daten/Geo/World Country Boundaries.csv'], ['Delimiter',',']]})
AccessAO, AccessAOBuilder - PQL Documentation
GenericPush and GenericPull
New Wrapper
Scai
To publish processed data with PQL, the SENDER-Operator is needed. This operator takes care of the application-dependent and transport-dependent transformation and delivery of the processed elements in the data stream.
The following parameters are required in the SENDER-Operator:
Sink
: The system-wide unique name of the sink. If the sink name is already in use and further parameters are given, an error is thrown. An already created sink can be reused by specifying only the sink parameter.
Wrapper
: Selects the wrapper that is responsible for delivering the data. The default wrappers in Odysseus are GenericPush and GenericPull. Extensions may provide further wrappers.
Transport
: The transport handler is responsible for delivering the processed data stream elements to a given endpoint.
Protocol
: The protocol handler is responsible for transforming the processed sensor data elements into an application-dependent protocol so that they can be sent over a given transport protocol to an endpoint.
DataHandler
: The data handler transforms the elements in a data stream into the right representation (e.g. String or Byte Array). Depending on the protocol handler, a specific data handler may be required. In most cases, however, the data handler Tuple should be adequate.
Options
: The options field contains additional parameters for the transport and protocol handlers.
output = SENDER({ sink='Sink', wrapper='GenericPush', transport='TCPClient', protocol='CSV', dataHandler='Tuple', options=[['host', 'example.com'],['port', '8081'],['read', '10240'],['write', '10240']] }, input)
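To show how the two operators fit together, the following sketch passes the stream produced by an ACCESS operator to SENDER as its input. It assumes that the view nexmark_person from the ACCESS example earlier in this document has been defined; the names person_out and PersonSink as well as the host and port of the sink are hypothetical placeholders. Lines starting with /// are intended as comments and can be dropped.

```pql
/// Assumption: nexmark_person is the view defined by the ACCESS example in this document.
/// 'PersonSink', host and port are placeholder values.
person_out = SENDER({sink='PersonSink', wrapper='GenericPush', transport='TCPClient', protocol='CSV', dataHandler='Tuple', options=[['host','example.com'],['port','8081']]}, nexmark_person)
```

The second argument of SENDER takes the place of input in the example above: it names the operator or view whose elements are delivered to the sink.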