Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

This document describes the basic concepts of the Procedural Query Language (PQL) of Odysseus and shows how to use the language. In contrast to languages SQL based languages like the Continuous Query Language (CQL) or StreamSQL, PQL is more procedural and functional than declarative. This document shows how to formulate queries with PQL.

 


Using PQL in Queries

PQL is an operator based language where an operator can be seen as a logical building block of the query. Thus, PQL is the connection of several operators. Since Odysseus differentiates between logical operators and their physical operators, which are the implementing counterpart, PQL is based upon logical operators. Therefore, it may happen that the query gets changed during the transformation from the logical query plan into the physical query plan. This includes also logical optimization techniques like the restructuring of the logical query plan. To avoid this, you can explicitly turn off the query optimization.

Define an Operator

An operator can be used in PQL via its name and some optional settings, which can be compared with a function and the variables for the function:

OPERATORNAME(parameter, operator, operator, ...)

The first variable (parameter) describes operator dependent parameters and is used for configuring the operator. Note, that there is only one parameter variable! The other variables (operator) are input operators, which are the preceding operators that push their data into this operator. The inputs of an operator can be directly defined by the definition of another operator:

OPERATOR1(parameter1, OPERATOR2(Parameter2, OPERATOR3(...)))

Except for source operators (usually the first operator of a query) each operator should have at least one input operator. Thus, the operator can only have parameters:

OPERATOR1(parameter1)

Accordingly, the operator may only have input operators but no parameters:

OPERATOR1(OPERATOR2(OPERATOR3(...)))

Alternatively, if the operator has neither parameters nor input operators, the operator only exists of its name (without any brackets!), so just:

OPERATORNAME

It is also possible to combine all kinds of definitions, for example:

OPERATOR1(OPERATOR2(Parameter2, OPERATOR3))

Intermediate Names, Views and Sources

Since the nesting of operators may lead to an unreadable code, it is possible to name operators to reuse intermediate result. This is done via the "=" symbol. Thus, we can temporary save parts of the query, for example (it is important to place blanks before and after the "=" symbol!) :

Result2 = OPERATOR2(Parameter2, OPERATOR3)

The defined names can be used like operators, so that we can insert them as the input for another operator, for example:

Result2 = OPERATOR2(Parameter2, OPERATOR3)OPERATOR1(Result2)

There could be also more than one intermediate result, if they have different names:

Result1 = OPERATOR1(Parameter1, …)Result2 = OPERATOR2(Parameter2, Result1)Result3 = OPERATOR3(Parameter3, Result2)

And you can use the intermediate name more than one time, e.g. if there are two or more operators that should get the same preceding operator:

Result1 = OPERATOR1(Parameter1, …)OPERATOR2(Parameter2, Result1)OPERATOR3(Parameter3, Result1)

All intermediate results that are defined via the "=" are only valid within the query. Thus, they are lost after the query is parsed and runs. This can be avoided with views.
A view is defined like the previous described intermediate results but uses ":=" instead of "=", e.g.:

Result2 := OPERATOR2(Parameter2, OPERATOR3)

Such a definition creates an entry into the data dictionary, so that the view is globally accessible and can be also used in other query languages like CQL.
Alternatively, the result of an operator can also be stored as a source into the data dictionary by using "::="

Result2 ::= OPERATOR2(Parameter2, OPERATOR3)

The difference between a view and a source is the kind of query plan that is saved into the data dictionary and is reused. If a view is defined, the result of the operator is saved as a logical query plan, which exists of logical operators. Thus, if another query uses the view, the logical operators are fetched from the data dictionary and build the lower part of the new operator plan or query. If an operator is saved as a source, the result of the operator is saved as a physical query plan, which exists of already transformed and maybe optimized physical operators. Thus, reusing a source is like a manually query sharing where parts of two or more different queries are used together. Additionally, the part of the source is not recognized if the new part of the query that uses the source is optimized. In contrast, the logical query plan that is used via the a view is recognized, but will not compulsorily lead to a query sharing.
Finally, all possibilities gives the following structure:

QUERY = (TEMPORARYSTREAM | VIEW | SHAREDSTREAM)+
TEMPORARYSTREAM = STROM "=" OPERATOR
VIEW = VIEWNAME ":=" OPERATOR
SHAREDSTREAM = SOURCENAME "::=" OPERATOR

Parameters – Configure an Operator

As mentioned before, the definition of an operator can contain a parameter. More precisely, the parameter is a list of parameters and is encapsulated via two curly brackets:

OPERATOR({parameter1, paramter2, …}, operatorinput)

A parameter itself exists of a name and a value that are defined via a "=". For example, if we have the parameter port and want to set this parameter to the 1234, we use the following definition:

OPERATOR({port=1234}, …)

The value can be one of the following simple types:

  • Integer or long: OPERATOR({port=1234}, …)
  • Double: OPERATOR({possibility=0.453}, …)
  • String: OPERATOR({host='localhost'}, …)

Furthermore, there are also some complex types:

  • Predicate: A predicate is normally an expression that can be evaluated and returns either true or false. In most cases a predicate is simple a string, e.g.:

OPERATOR({predicate='1<1234'}, …)

Hint: In some cases the predicate must be in this form PREDICATE_TYPE('1<1234'), where PREDICATE_TYPE can be something like RelationalPredicate.

  • List:It is also possible to pass a list of values. For that, the values have to be surrounded with squared brackets:

OPERATOR({color=['green', 'red', 'blue']}, …)
(Type of elements: integer, double, string, predicate, list, map).

  • Map: This one allows maps like the HashMap in Java. Thus, one parameter can have a list of key-value pairs, where the key and the value are one of the described type. So, you can use this, to define a set of pairs where the key and the value are strings using the "=" for separating the key from the value:

OPERATOR({def=['left'='green', 'right'='blue']}, …)
It is also possible that values are lists:
OPERATOR({def=['left'=['green','red'],'right'=['blue']]}, …)
Remember, although the key can be another data type than the value, all keys must have the same data type and all values must have the same data type
Notice, that all parameters and their types (string or integer or list or…) are defined by their operator. Therefore, maybe it is not guaranteed that the same parameters of different operators use the same parameter declaration – although we aim to uniform all parameters.

Ports – What if the Operator Has More Than One Output?

There are some operators that have more than one output. Each output is provided via a port. The default port is 0, the second one is 1 etc. The selection Route for example, pushes all elements that fulfill the predicate to output port 0 and all other to output port 1allows to split a stream according to predefined predicates to different output ports. So, if you want to use another port, you can prepend the port number with a colon in front of the operator. For example, if you want the second output (port 1) of the select:

PROJECT({…}, 1:SELECT({predicate='1<x'}, …))

The Full Grammar of PQL

QUERY           = (TEMPORARYSTREAM | VIEW | SHAREDSTREAM)+
TEMPORARYSTREAM = STREAM "=" OPERATOR
VIEW            = VIEWNAME ":=" OPERATOR
SHAREDSTREAM    = SOURCENAME "::=" OPERATOR
OPERATOR        = QUERY | [OUTPUTPORT ":"] OPERATORTYPE "(" (PARAMETERLIST [ "," OPERATORLIST ] | OPERATORLIST) ")"
OPERATORLIST    = [ OPERATOR ("," OPERATOR)* ]
PARAMETERLIST   = "{" PARAMETER ("," PARAMETER)* "}"
PARAMETER       = NAME "=" PARAMETERVALUE
PARAMETERVALUE  = LONG | DOUBLE | STRING | PREDICATE | LIST | MAP
LIST            = "[" [PARAMETERVALUE ("," PARAMETERVALUE)*] "]"
MAP             = "[" [MAPENTRY ("," MAPENTRY*] "]"
MAPENTRY        = PARAMETERVALUE "=" PARAMETERVALUE
STRING          = "'" [~']* "'"
PREDICATE       = PREDICATETYPE "(" STRING ")"

List of available PQL Operators

Odysseus has a wide range of operators build in and are explained here.

Base Operators

ACCESS

Description

The access operator can be used to integrate new sources into Odysseus. Further information can be found in the Documentation to the Access Operator Framework.

Remark: There is no need to define an access operator as view (:=) or source (::=). Each access operator is automatically a source with name source. For most cases the assignment is only for parsing purposes (see example below).

Parameter

  • source: The name of the access operator. Remark: This name must be different to all source names and all view or stream definitions! A new source will be added to the data dictionary automatically.
  • wrapper: In Odysseus the default wrappers are GenericPush and GenericPull
  • transport: The transport defines the transport protocol to use.
  • protocol: The protocol parameter defines the application protocol to transform the processing results.
  • datahandler: This parameter defines the transform of the single attributes of the processing results.
  • options: Transport protocol and application protocol depending options
  • schema: The output schema of the access operator (may depend on the protocol handler)

Example

PQL
Code Block
themeEclipse
languagejavascript
titleAccess Operator
linenumberstrue
input = ACCESS({source='Source',
wrapper='GenericPush',
transport='TCPClient',
protocol='CSV',
dataHandler='Tuple',
options=[['host', 'example.com'],['port', '8080'],['read', '10240'],['write', '10240']],
schema=[
['id', 'Double'],
['data', 'String']]
})
 
CQL
Code Block
themeEclipse
languagesql
titleAccess Operator
linenumberstrue
CREATE STREAM source (id Double, data STRING)
    WRAPPER 'GenericPush'
    PROTOCOL 'CSV'
    TRANSPORT 'TCPClient'
    DATAHANDLER 'Tuple'
    OPTIONS ( 'host' 'example.com', 'port' '8080', 'read' '10240', 'write' '10240')

AGGREGATE

Description

This operator is used to aggregate and group the input stream.

Parameter

  • group_by: An optional list of attributes over which the grouping should occur.
  • aggregations: A list if elements where each element contains up to four string:
    • The name of the aggregate function, e.g. MAX
    • The input attribute over which the aggregation should be done
    • The name of the output attribute for this aggregation
    • The optional type of the output
  • dumpAtValueCount: This parameter is used in the interval approach. Here a result is generated, each time an aggregation cannot be changed anymore. This leads to fewer results with a larger validity. With this parameter the production rate can be raised. A value of 1 means, that for every new element the aggregation operator receives new output elements are generated. This leads to more results with a shorter validity.
  • outputPA: This parameter allow to dump partial aggregates instead of evaluted values. The partial aggregates can be send to other aggregation operators and do a final aggregation (e.g. in case of distribution). The input schema of an aggregate operator that read partial aggregates must state a datatype that is a partial aggregated (see example below). Remark: Aggregate has one input and requires ordered input. To combine different parital aggregations e.g. a union operator is needed to reorder the input elements.

Aggregation Functions

The set of aggregate functions is extensible. The following list is in the core Odysseus:

  • MAX: The maximum element
  • MIN: The minimum element
  • AVG: The average element
  • SUM: The sum of all elements
  • COUNT: The number of elements
  • STDDEV: The standard deviation
Some nonstandard aggregations: These should only be used, if you a familiar with them:
  • FIRST: The first element
  • LAST: The last element
  • NTH: The nth element
  • RATE: The number of elements per time unit
  • NEST: Nest the attribute values in a list

Example:

PQL
Code Block
themeEclipse
languagejavascript
titleAggregate Operator
linenumberstrue
output = AGGREGATE({
                    group_by = ['bidder'], 
                    aggregations=[ ['MAX', 'price', 'max_price', 'double'] ]
                   }, input)

// Parital Aggregate example
pa = AGGREGATE({
          name='PRE_AGG',
          aggregations=[
            ['count', 'id', 'count', 'PartialAggregate'],
            ['sum', 'id', 'avgsum', 'PartialAggregate'],
            ['min', 'id', 'min', 'PartialAggregate'],
            ['max', 'id', 'max', 'PartialAggregate']
          ],
          outputpa='true'        
        },
        nexmark:person
      )
      
out = AGGREGATE({
          name='AGG',
          aggregations=[
            ['count', 'count', 'count', 'Integer'],
            ['sum', 'avgsum', 'sum', 'Double'],
            ['avg', 'avgsum', 'avg', 'Double'],
            ['min', 'min', 'min', 'Integer'],
            ['max', 'max', 'max', 'Integer']
          ]        
        },
        pa
      )
CQL
Code Block
themeEclipse
languagesql
titleAggregate Operator
linenumberstrue
SELECT MAX(price) AS max_price FROM input GROUP BY bidder

APPENDTO

Description

This operator can be used to attach a subplan to another operator with a specific id.

Parameter

  • appendTo: The id of the operator to append to.

Example

ASSUREHEARTBEAT

Description

This operator assures that there will be periodically a heartbeat to avoid blocking because of missing information about time progress. The operator guarantees, that no element (heartbeat or streamobject) is send, that is older than the last send heartbeat (i.e. the generated heartbeats are in order and indicate time progress). Heartbeats can be send periodically (sendAlwaysHeartbeats = true) or only if no other stream elements indicate time progress (e.g. in out of order scenarios) independent if a new element has been received or not.

Parameter

  • RealTimeDelay: How long should the operator wait in transaction time (real time) before it should send a punctuation
  • ApplicationTimeDelay: How long is the realTimeDelay in terms of application time (typically this should be the same, but for simulations this could be adapted)
  • TimeUnit: What is the time unit (see Java TimeUnit). Minimum Time unit is milliseconds!
  • SendAlwaysHeartbeat: If true, a heartbeat is send periodically for every realTimeDelay. This is useful for out of order processing
  • AllowOutOfOrder: If set to true, the operator allows heartbeats to be send, that lie before the last send element. In other cases this is not allowed.startAtCurrentTime
  • StartAtCurrentTime: Normally, heartbeats start at 0, however, if this parameter is set to "true", heartbeats begin at current system time in millis. This might be e.g. useful, if there is no tuple at the beginning of the process. 
  • Children Display
    depth2
    pageOperators
    excerpttrue
    excerptTypesimple




    Machine Learning / Data Mining

    Available mining or machine learning operators are described here: Machine Learning

    Example

     

    Code Block
    themeEclipse
    languagejavascript
    titleAssureHeartbeat Operator
    linenumberstrue
    output = ASSUREHEARTBEAT({realTimeDelay=5000, applicationTimeDelay=5000, sendAlwaysHeartbeat='false', allowOutOfOrder='false'}, input)
    
    output = ASSUREHEARTBEAT({realTimeDelay=5000, applicationTimeDelay=5000, sendAlwaysHeartbeat='false', allowOutOfOrder='false', startAtCurrentTime='true'}, input)

    ASSUREORDER

    Description

    The operator ensures the order of tuples. It collects all tuples in input until a heartbeat-punctuation is received, then it writes all collected tuples to output in the correct order. So the ordered output contains all tuples which were received after the last received heartbeat-punctuation. There has to be an ASSUREHEARTBEAT operator before this operator.

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleAssure Order Operator
    linenumberstrue
    ordered_output = ASSUREORDER(input) 

    BUFFER

    Description

    Typically, Odysseus provides a buffer placement strategy to place buffers in the query plan. This operator allows adding buffers by hand. Buffers receives data stream elements and stores them in an internal elementbuffer. The scheduler stops the execution here for now. Later, the scheduler resumes to execution (e.g. with an another thread).

    Parameter

    • Type: The type of the buffer. The following types are currently part of Odysseus:
      • Normal: This is the standard type and will be used if the type parameter is absent
      • Currently, no further types are supported in the core Odysseus. In the future, there will be some buffers integrated to treat prioritized elements.
    • MaxBufferSize: If this value is set, a buffer will block if it reaches its maximal capacity. No values will be thrown away; this is not a load shedding operator!

    Example

    BUFFEREDFILTER

    Description

    This operator can be used to reduce data rate. It buffers incoming elements on port 0 (left) for bufferTime and evaluates a predicate over the elements on port 1 (right). If the predicate for the current element e evaluates to true, all elements from port 0 that are younger than e.startTimeStamp()-bufferTime will be enriched with e and delivered for deliverTime. Each time the predicate evaluates to true, the deliverTime will be increased.

    Parameter

    • Predicate: The predicate that will be evaluated on element of port 1
    • BufferTime: The time in history, the elements will be kept in history
    • DeliverTime: The time in the future, the elements will be delivered

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleBufferedFilter Operator
    linenumberstrue
    output = BUFFEREDFILTER({predicate = 'id == 2 || id == 4 || id == 10', bufferTime = 5000, deliverTime = 20000}, left, right)

    CACHE

    Description

    This operator can can some stream elements. At runtime, every time a new operator is connected it will get the cached elements. This can be usefull when reading from a csv file and multiple parts of a query need this information.

    Parameter

    Example

    CHANGECORRELATE

    Operator used in DEBS Grand Challenge 2012 ... TODO: Describe?

    Description

    Parameter

    Example

    CALCLATENCY

    Description

    Odysseus has some features to measure the latency of single stream elements. This latency information is modeled as an interval. An operator in Odysseus can modify the start point of this interval. This operator sets the endpoint and determines the place in the query plan, where the latency measurement finds place. There can be multiple operators in the plan, to measure latency at different places.

    Parameter

    none

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleCalcLatency Operator
    linenumberstrue
    output = CALCLATENCY(input)

     

    CHANGEDETECT

    Description

    This operator can be used to filter out equal elements and reduce data rate.

    Parameter

    • ATTR (List<String>): only these attribute are considered for the comparison
    • group_by: An optional list of attributes. If given, the elements are comared in their group (e.g. by a sensor id)
    • Heartbeatrate (Integer): For each nth element that is filtered out, a heartbeat is generated to state the progress of time
    • deliverFirstElement (Boolean): If true, the first element will be send to the next operator.
    • tolerance (Double): This value can be used, to allow a tolerance band (histerese band). Only if the difference between the last seen value and the new value is greater than this tolerance, an object is send.
    • relativeTolerance (Boolean): If set to true, the tolerance value is treated relative to the last send value, e.g. if tolerance = 0.1, a new element is send only if the new value is at least 10 percent higher or lower.

    COALESCE

    Description

    This Operator can be used to combine sequent elements, e.g. by a set of grouping attributes or with a predicates.

    In the attributes case, the elements are merged with also given aggregations functions, as long as the grouping attributes (e.g. a sensorid) are the same. When a new group is opened (e.g. a measurement from a new sensor) the old aggregates values and the grouping attributes are created as a result.

    In the predicate case, the elements are merged as long as the predicates evaluates to false, i.e. a new tuple is created when the predicates evaluates to true.

    Parameter

    • AGGREGATIONS: The aggregate functions (see AGGREGATE for examples)
    • ATTR: The grouping attributes, cannot be used together with predicate.
    • PREDICATE: The predicate cannot be used together with ATTR

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleCoalesce Operator
    linenumberstrue
    coalesce = COALESCE({ATTR=['sensorid'], 
            AGGREGATIONS=[['AVG','temperature','temperatur']},tempSensor1)
    
    coalesce = COALESCE({predicate='temperature>=10', 
            AGGREGATIONS=[['last','temperature','temperature'], ['AVG,'temperature','avgTemp']},tempSensor1)

    CONTEXTENRICH

    Description

    This operator enriches tuples with information from the context store. Further Information can be found here. There is also an DBENRICH operator for fetching data from a database or a simple ENRICH that caches incoming streams.

    Parameter

    • ATTRIBUTES: The attributes from the store object, that should be used for enrichment
    • STORE: The name of the store
    • OUTER: Enriches with <null>, if the store is empty or (when outer is false) the input is discarded.
       

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleEnrich Operator
    linenumberstrue
    output = CONTEXTENRICH({store='StoreName', outer='true'}, input)

    CONVERTER

    Description

    This operator can be used to transform element with other protocol handler, e.g. read a complete document from a server and then parse this document with csv or xml

    Parameter

    • protocol: The protocol parameter defines the application protocol to transform the processing results.
    • inputDatahandler: This parameter defines the input format
    • outputDatahandler: This parameter defines the transform of the single attributes of the processing results.
    • options: Transport protocol and application protocol depending options
    • schema: The output schema of the access operator (may depend on the protocol handler)

    Example

    Code Block
    languagejavascript
    CONVERTER({protocol='CSV',
        inputDataHandler='tuple',
        outputDataHandler='tuple',
                  options=[
                      ['csv.delimiter',','],
                      ['csv.textDelimiter','"']
                  ],
                  schema=[['id', 'String'],['text1', 'String'],['text2','String'],['time','String']]            
                }, ACCESS({
                  source='Yahoo',
                  wrapper='GenericPull',
                  transport='HTTP',
                  protocol='Document',
                  datahandler='Tuple',
                  options=[
                    ['uri', 'http://finance.yahoo.com/d/quotes.csv?s=AAPL+MSFT&f=snat1'],
                    ['method', 'get'],
                    ['scheduler.delay','1000']
                  ],
                  schema=[['text', 'String']]                                    
                }                              
              ))   

     

    CONVOLUTION

    Description

    This operator applies a convolution filter, which is often used in electronic signal processing or in image processing to clean up wrong values like outliers. The idea behind the convultion is to correct the current value by looking at its neighbours. The number of neighbours is the size of the filter. If, for example, SIZE=3, the filter uses the three values before the current and three values after the current value to correct the current value. Therefore, the filter does not deliver any results for the first SIZE values, because it also needs additionally SIZE further values after the current one!

    The ATTRIBUTES parameter is a list of all attributes where the filter should be used. Note, this is not a projection list, so that an incoming schema of "speed, direction, size" is still "speed, direction, size" after the operator. Each attribute of the schema that is not mentioned in ATTRIBUTES is not recognized and just copied from the input element. Each mentioned element, however, is replaced by the filtered value.

    The filtered value is calculated by FUNCTION, which is a density function. Currently, the following functions are directly implemented:

    Gaussian / Normal Distribution, where Image Removed is the mean  and Image Removed the standard deviation (notice, Image Removed is the variance!)

    Image Removed

    Logarithmic Distribution, where Image Removed is the mean  and Image Removed the standard deviation (notice, Image Removed is the variance!)

    Image Removed

    Uniform Distribution (where b-a = SIZE*2+1):

    Image Removed

    Exponential Distribution (with Image Removed):

    Image Removed

    So, FUNCTION may have the following values: "gaussian", "uniform", "logarithmic", "exponential". Alternatively, you may add your own function, where you can use the parameter x to denote the value within the density function.

    The GROUP_BY is an optional list of attributes and is used like in aggregations. For example, if the attribute "id" is added to the list of GROUP_BY, the convolution only considers values that come from elements with same "id". Notice, this may increase the time of waiting, because the filter needs at least (SIZE*2)+1) values for each "id".

    The OPTIONS are used to pass the optional parameters for the standard density functions. The parameters and their defaults are as follows:

    for gaussian and logarithmic:

    • mean (default = 0.0) 
    • deviation (default = 1.0)

    for exponential:

    • alpha (default = 1.0)

    the uniform parameters (a and b) are derived from the SIZE parameter.

     

    A simple example: If SIZE=3, we look at the 6 neighbours, which are at the following indices where the current object is at index 0:

    [-3, -2, -1, 0, 1, 2, 3]

    For a simple example, we have the following values at these indices:

    [1, 2, 3, 10, 5, 6, 7]

    Thus, at this point in time, we use the values [1, 2, 3, 10, 5, 6, 7] to filter (correct, convolute) the value 10 (our current object). This is done as follows.

    The indices [-3, -2, -1, 0, 1, 2, 3] are weighted by using each index as the parameter x for the density function f(x). Which means, that we have the following weights for a uniform density function (where x is not really used and a-b = SIZE * 2 + 1 = 7):

    [1/7, 1/7, 1/7, 1/7, 1/7, 1/7, 1/7]

    This weights are used to calculate the percentage of each value from [1, 2, 3, 10, 5, 6, 7]:

    [1/7 * 1, 1/7 * 2, 1/7 * 3, 1/7 * 10, 1/7 * 5, 1/7 * 6, 1/7 * 7]

    The sum of these values is 4,857 and this is the corrected value and replaces the 10 (at index 0). Thus, in this case, the outlier of 10 is smothed to 4,857.

    Notice, this filter does not recognize any time intervals or validities!

    Parameter

    • ATTRIBUTES: The attributes where the filter should be applied on.
    • SIZE: The size of the filter (look above)
    • FUNCTION: The density function that is used as a filter. Possible values are: "gaussian", "logarithmic", "uniform", "exponential" or an expression, where x is the variable that is replaced by the value.
    • GROUP_BY: A list of attributes that are used for grouping.
    • OPTIONS: A map of options to configure the function: "mean", "deviation", "alpha".

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleEnrich Operator
    linenumberstrue
    // applies a normal distribution filter (gaussian filter) to the speed attribute and recognizes 3 values before and 3 values after the value that has to be filtered.
    output = CONVOLUTION({ATTRIBUTES=['speed'], SIZE=3, FUNCTION='gaussian'}, input)  
      
    // applies a normal distribution filter (gaussian filter) to the speed attribute with mean = 0 and standard deviation = 2
    output = CONVOLUTION({ATTRIBUTES=['speed'], SIZE=3, FUNCTION='gaussian', OPTIONS=[['mean','0.0'],['deviation','2.0']]}, input) 
    
    // applies a normal distribution filter (gaussian filter) to the speed attribute, but only considers values with same id by groupping by "id"
    output = CONVOLUTION({ATTRIBUTES=['speed'], SIZE=3, FUNCTION='gaussian', GROUP_BY=['id']}, input)
    
     // applies a log normal distribution filter to the speed attribute
    output = CONVOLUTION({ATTRIBUTES=['speed'], SIZE=3, FUNCTION='logarithmic'}, input)    
    
    // applies a exponential distribution filter to the speed attribute
    output = CONVOLUTION({ATTRIBUTES=['speed'], SIZE=3, FUNCTION='exponential'}, input)    
    
    // applies a exponential distribution filter to the speed attribute
    output = CONVOLUTION({ATTRIBUTES=['speed'], SIZE=3, FUNCTION='exponential'}, input)    
    
    // applies a uniform distribution filter to the speed attribute
    output = CONVOLUTION({ATTRIBUTES=['speed'], SIZE=3, FUNCTION='uniform'}, input)    
    
    // applies a self-defined density function (use x to get the value)
    output = CONVOLUTION({ATTRIBUTES=['speed'], SIZE=3, FUNCTION='x+2/7'}, input)  

     

    DIFFERENCE

    Description

    This operator calculates the difference between two input sets.

    Parameter

    None

    Example

     

    Code Block
    themeEclipse
    languagejavascript
    titleDifference Operator
    linenumberstrue
    output = DIFFERENCE(left, right)

     

    ENRICH

    Description

    This operator enriches tuples with data that is cached, e.g. to enrich a stream with a list of categories. The first input stream, therefore, should be only stream limited data to avoid buffer overflows. The second input is the data stream that should be enriched.

    Parameter

  • PREDICATE: A predicate that is used for lookup and joining.
  • MINIMUMSIZE: Blocks the streaming input until there are at least MINIMUMSIZE elements in the cache from the static stream . Can be used to assure that all static data is loaded before it is used for enriching the other stream. This parameter is optional, the default value is 0, so that it is immediately tried to enrich.

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleEnrich Operator
    linenumberstrue
    output = ENRICH({MINIMUMSIZE = 42, PREDICATE = 'sensorid = sid'}, metadata, input)

     

    EVENTTRIGGER

    Description

    Parameter

    Example

     

    EXISTENCE

    Description

    This operator tests an existence predicate and can be used with the type EXISTS (semi join) and NOT_EXISTS (anti semi join). The predicates can be evaluated against the element from the first input and the second input.
    Semi join: All elements in the first input for which there are elements in the second input that fulfills the predicate are sent.
    Semi anti join: All elements in the first input for which there is no element in the second input that fulfills the predicate are sent.

    Parameter

    • Predicate: The predicate
    • Type: Either EXISTS or NOT_EXISTS

    Example

     

    Code Block
    themeEclipse
    languagejavascript
    titleDifference Operator
    linenumberstrue
    output = EXISTENCE({
                        type='EXISTS', 
                        predicate='auction = auction_id'
                       }, left, right)
    
    output = EXISTENCE({
                        type='NOT_EXISTS', 
                        predicate='auction = auction_id'
                       }, left, right)

     

    FILESINK

    Description

    The operator can be used to dump the results of an operator to a file.

    Deprecated: Use the Sender Operator

    Parameter

    • FILE: The filename to dump. If the path does not exist it will be created.
    • FILETYPE: The type of file, CSV: Print as CSV, if not given, the element will be dump to file a raw format (Java toString()-Method)

    Example

    JOIN

    Description

    Joins tuple from two input streams iff their timestamps are overlapping and if the optional predicate validates to true.

    Parameter

    • predicate: The predicate to evaluate over each incoming tuple from left and right

    Example

    PQL
    Code Block
    themeEclipse
    languagejavascript
    titleJoin Operator
    linenumberstrue
    output = join({predicate = 'auction_id = auction'}, left, right)
    CQL
    Code Block
    themeEclipse
    languagesql
    titleMap Operator
    linenumberstrue
    SELECT * FROM left, right WHERE auction_id = auction

    LEFTJOIN

    Description

    Parameter

    Example

     

    MAP

    Description

    Performs a mapping of incoming attributes to out-coming attributes using map functions. Odysseus also provides a wide range of mapping functions.

    Hint: Map is stateless. To used Map in a statebased fashion see: StateMap

    Parameter

    • expressions: A list of expressions to map multiple incoming attribute values to out-coming attributes. Optional each expression can have a name (in this case use ['expression', 'expressionName'])
    • threads: Number of threads used to process the expressions simultaneous. A positive number greater than 1 indicates the fixed number of threads, a value of 0 or 1 disables threading, and a negative number estimates the number of threads based on the number of expressions and the available processors.

    Example

    PQL
    Code Block
    themeEclipse
    languagejavascript
    titleMap Operator
    linenumberstrue
    output = MAP({
                  expressions = [
    								['auction_id * 5','AuctionMult5'],
    								'sqrt(auction_id)'
    							]
                 }, input)
    CQL
    Code Block
    themeEclipse
    languagesql
    titleMap Operator
    linenumberstrue
    SELECT auction_id * 5, sqrt(auction_id) FROM input

    PROJECT

    Description

    Reduces the input object to the given attributes.Makes no duplicate eleminition.

    Parameter

    • attributes: A list of attribute names to project on

    Example

    PQL
    Code Block
    themeEclipse
    languagejavascript
    titleProject Operator
    linenumberstrue
    output = PROJECT({
                      attributes = ['auction', 'bidder']
                     }, input)
    CQL
    Code Block
    themeEclipse
    languagesql
    titleProject Operator
    linenumberstrue
    SELECT auction, bidder FROM input

    RENAME

    Description

    Renames the attributes.

    Parameter

    • aliases: The list new attribute names to use from now on. If the flag pairs is set, aliases will be interpreted as pairs of (old_name, new_name). See the examples below.
    • type: By default, each schema has the name of the source from which the data comes from (i.e. it is the name of the type that is processed by this operator). With this parameter the type (source) name can be changed. This is needed in the SASE operator. 
    • pairs: Optional boolean value that flags, if the given list of aliases should be interpreted as pairs of (old_name, new_name). Default value is false.

    Example

    PQL
    Code Block
    themeEclipse
    languagejavascript
    titleRename Operator
    linenumberstrue
    // Renames the first attribute to auction_id, the second to bidder_id and the last to another_id.
    output = RENAME({
                      aliases = ['auction_id', 'bidder_id', 'another_id']
                     }, input)
    
    // Due the set flag pairs, the rename operator renames the attribute auction_id to auction and bidder_id to bidder.
    output = RENAME({
                     aliases = ['auction_id', 'auction', 'bidder_id', 'bidder'], 
                     pairs = 'true'
                    }, input)
    
    
    CQL
    Code Block
    themeEclipse
    languagesql
    titleRename Operator
    linenumberstrue
    SELECT auction_id AS auction, bidder_id AS bidder FROM input

    ROUTE

    Description

    This operator can be used to route the elements in the stream to different further processing operators, depending on the predicate.

    Parameter

    • predicates: A list of predicates
    • overlappingPredicates: If set to true, all predicates are evaluated and for each predicate evaluated to true, the element will be send. Default is false and will only send the element when the first element evaluates to true.

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleRoute Operator
    linenumberstrue
    route({predicates=['price > 200', 'price > 300', 'price > 400']}, input)

    SAMPLE

    Description

    This operator can reduce load by throwing away tuples.

    Parameter

    • sampleRate: The rate elements are send (i.e. 1 means, send every element, 10 means, send every 10th element)

    Example

    Code Block
    languagejavascript
    out = SAMPLE({sampleRate = 2}, input}

     

    SELECT

    Description

    The select operator filters the incoming data stream according to the given predicate.

    Parameter

    • predicate: The predicate to evaluate over each incoming tuple

    Example

    PQL
    Code Block
    themeEclipse
    languagejavascript
    titleSelect Operator
    linenumberstrue
    output = SELECT({ 
                     predicate='price > 100' 
                    }, input)
    CQL
    Code Block
    themeEclipse
    languagesql
    titleSelect Operator
    linenumberstrue
    SELECT * FROM input WHERE price > 100

    SENDER

    Description

    This operator can be used to publish processing results to multiple endpoints using different transport and application protocols.

    Parameter

    • wrapper: In Odysseus the default wrappers are GenericPush and GenericPull
    • transport: The transport defines the transport protocol to use.
    • protocol: The protocol parameter defines the application protocol to transform the processing results.
    • datahandler: This parameter defines the transform of the single attributes of the processing results.
    • options: Transport protocol and application protocol depending options

    Example

    PQL
    Code Block
    themeEclipse
    languagejavascript
    titleSender Operator
    linenumberstrue
    output = SENDER({sink='Sink',
                     wrapper='GenericPush',
                     transport='TCPClient',
                     protocol='CSV',
                     dataHandler='Tuple',
                     options=[['host', 'example.com'],['port', '8081'],['read', '10240'],['write', '10240']]
                    }, input)
    CQL
    Code Block
    themeEclipse
    languagesql
    titleSender Operator
    linenumberstrue
    CREATE SINK outSink (timestamp STARTTIMESTAMP, auction INTEGER, bidder INTEGER, datetime LONG, price DOUBLE)
        WRAPPER 'GenericPush'
        PROTOCOL 'CSV'
        TRANSPORT 'TCPClient'
        DATAHANDLER 'Tuple'
        OPTIONS ( 'host' 'example.com', 'port' '8081', 'read' '10240', 'write' '10240' )
    
    STREAM TO sink SELECT * FROM input 

    SOCKETSINK

    Description

    This operator can be used to send/provide data from Odysseus via a tcp socket connection. (Remark: This operator will potentially change in future)

    Deprecated: Use the Sender operator.

    Parameter

    Depending on the parameter push:Boolean (todo: change name!), the parameter have to following meaning:

    • push = false (Default): On Odysseus side a server socket connection is opened
      • host:String typically 'localhost'
      • sinkport:Integer On which port should the socket be opened
    • push = true: Odysseus connects to a server
      • host:String Server to connect to
      • port:Integer Port on host to connect to

    Example

    socketsink({host='localhost', push=false, sinkport=4712, sinkType='bytebuffer', sinkName='driverChannel'}, timestampToPayload(person))

    STATEMAP

    Description

    Performs a mapping of incoming attributes to out-coming attributes using map functions. Odysseus also provides a wide range of mapping functions.

    Hint: StateMap can use history information. To access the last n.th version of an attribute use "__last_n." Mind the two "_" at the beginning!

    Parameter

    • expressions: A list of expressions to map multiple incoming attribute values to out-coming attributes

    Example

    PQL
    output = MAP({
                  expressions = ['auction_id * 5','sqrt(auction_id)', '__last_5.auction_id]
                 }, input)

    STORE

    Description

    Transfer temporary information in a context store for use with the Enrich operator. More information about the context store can be found here.

    Parameter

    • store: The name of the store

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleStore Operator
    linenumberstrue
    STORE({store = 'StoreName'}, input)

     

    TIMESTAMP

    Description

    This Operator can be used to update the timestamp information in the meta data part. Be careful because this may lead undefined semantics.

    Parameter

    • START: use this attribute to set the start time stamp. This is the same as using STARTTIMESTAMP in schema creating a new source. The value will be interpreted as basic time unit (e.g. millisecond)
    • END: use this attribute to set the end time stamp. This is the same as using ENDTIMESTAMP in schema creating a new source. The value will be interpreted as basic time unit (e.g. millisecond)
    • clearEnd: This parameter can be used to delete the end time stamp (i.e. set to infinity). Attention: If no start attribute is given, the start time stamp will be set to system time, unless the parameter SystemTime is set to false!
    • SystemTime: If not start attribute is given, the time stamps will be set to system time (now()). Use SystemTime = 'false' to avoid using system time.
    • If the start timestamp is spread over multiple attributes, use the following parameter to set year, month, etc. individually:
    • YEAR:
    • MONTH:
    • DAY:
    • HOUR:
    • MINUTE:
    • SECOND:
    • MILLISECOND:
    • FACTOR (Integer): Multiply the input value with this factor (e.g. to allow a finer time granularity)
    • DATEFORMAT: If set, the start attribute value will be interpreted as date string (Java SimpleDateFormat)
    • TIMEZONE (TimeZone): Set the timezone. Will only be used, if YEAR (etc) or DateFormat is set.

    Example

    PQL
    Code Block
    themeEclipse
    languagejavascript
    titleTimestamp Operator
    linenumberstrue
    /// Example using attributes for timestamp
    output = Timestamp({year='year', month='month', day='day', hour='hour', minute='minute', second='second', millisecond='millisecond'}, input)
    
    /// Example using date format
    output = Timestamp({start='timestamp', dateformat='EEE MMM dd HH:mm:ss zzz yyyy'}, input)

     

    TIMESTAMPTOPAYLOAD

    Description

    This operator is needed before data is send to another system (e.g. via a socket sink) to keep the time meta information (i.e. start and end time stamp). The input object gets two new fields with start and end timestamp. If this output is read again by (another) Odysseus instance, the following needs to be attached to the schema:

    ['start', 'StartTimestamp'], ['end', 'EndTimestamp']

    Parameter

    The operator is parameterless.

    Example

    driverChannel = socketsink({sinkport=4712, sinkType='bytebuffer', sinkName='driverChannel'}, timestampToPayload(person))

    UNION

    Description

    This operator calculates the union of two input sets

    Parameter

    none

    Example

    PQL
    Code Block
    themeEclipse
    languagejavascript
    titleUnion Operator
    output = UNION(left, right)
    CQL
    Code Block
    themeEclipse
    languagesql
    titleUnion Operator
    linenumberstrue
    SELECT * FROM left UNION SELECT * FROM rigtht

    UNNEST

    Description

    The UnNest operator unpacks incoming tuple with a multi value attribute to create multiple tuples

    Parameter

    • attribute: The attribute that should be unpack.

    Example

     

    Code Block
    themeEclipse
    languagejavascript
    titleUnNest Operator
    linenumberstrue
    output = UNNEST({
                     attribute='myAttribute'
                    },input)

    UDO

    Description

    The UDO operator calls a user defined operator.

    Parameter

    • class: The name of the user defined operator
    • init: Additional parameter for the user defined operator

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleUDO Operator
    linenumberstrue
    output = UDO({
                  class='MyUDO', 
                  init='some parameter'
                 }, input)

    WINDOW

    Description

    The window sets – dependent on the used parameters – the validity of the tuple. If a time based window is used, the default time granularity is in milliseconds. So, if you have another time granularity, you may use the unit-parameter (e.g. use 5 for size and SECONDS for the unit parameter) or you have to adjust the arity (e.g. use 5000 for size without the unit parameter)

    Parameter

    • size: The size of the window
    • advance: The advance the window moves forward
    • slide: The slide of the window. When using this parameter all elements in the windows will have the same starttimestamp (e.g. helful for aggregations), while advance will not change the starttimestamp.
    • type: The type of the window. The possible values are Time, Tuple, Predicate, and Unbound
    • partition: The partition attribute of the window
    • start: The start condition for a predicate window. If the condition evaluates to true, the windows is opened until the end predicate evaluates to true (or if not given the start predicate evaluates to false). Note, that all elements that are not inside a window are send to ouput port 1
    • end: The end condition for a predicate window
    • sameStartTime: For predicate windows: If set to true, all produced elements get the same start timestamp
    • unit: The unit for the time granularity - Possible values are one of TimeUnit like SECONDS, NANOSECODS etc. - default time

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleWindow Operator
    linenumberstrue
    //sliding time window (notice: size and advance is directly based on the used timestamps. 
    //if they are in milliseconds (which is default), size and advance are in milliseconds too.
     output = WINDOW({
                      size = 5, 
                      advance = 1, 
                      type = 'time'
                     }, input)
    
    //sliding time window with another time unit for size and advance. 
    //size and advance are converted from seconds into milliseconds (since this is the default time granularity). This means, size will be 5000 and advance will be 1000
     output = WINDOW({
                      size = 5, 
                      advance = 1, 
                      type = 'time',
    				  unit = 'SECONDS'
    				 }, input)
    
     //sliding tuple window partitioned over bidder
     output = WINDOW({
                      size = 5, 
                      advance = 1, 
                      type = 'tuple', 
                      partition=['bidder']
                     }, input) 
    
     //unbounded window
     output = WINDOW({
                      type = 'unbounded'
                     }, input) 
    
     //now window (size = 1, advance = 1)
     output = WINDOW({
                      type = 'time'
                     }, input)
    
     //sliding delta window, reduces time granularity to value of slide
     output = WINDOW({
                      size = 5, 
                      type = 'time', 
                      slide = 5
                     }, input)
    
     // Predicate window
     output = WINDOW({
                      start = 'a > 10 and a < 20', 
    				  type = 'Predicate'
    				 }, input)
     output = WINDOW({
                      start = 'a = 11',
    				  end = 'a = 20',
    				  type = 'Predicate'
    				 }, input)

     

    Pattern Matching

    TODO: TRANSLATE!

    PATTERN

    Description

    This generic operator allows the definition of different kinds of pattern (e.g. all, any). For sequence based patterns see SASE operator (below). In the following the implemented pattern types are desribed.

    Parameter

    • type: What kind of pattern should be detected. See below for all supported types and examples
    • eventTypes: Describes the types of the input ports
    • time: If there should be a temporal context (window) this states the time and
    • timeUnit is the unit of this time
    • size: For element based windows, this is the count of elements that are treated together, size and time can be used together
    • assertions: Predicate over the input data that must be fullfilled to create an output
    • outputmode: states, what the operator should deliver:
      • EXPRESSION: use the return parameter to create the output
      • INPUT: Deliver events from input port 0, can be changed with parameter inputPort
      • TUPLE_CONTAINER: Deliver all events that are related to this matching
      • SIMPLE
    • return: see outputmode/EXPRESSION
    • inputPort: see outputmode/INPUT

    Logical

    ALL
    Beschreibung

    Die Anfrage wählt zu einem Gebot, mit einemWert höher als 200, und zu der Person, die das Gebot abgegeben hat, die ID und den Namen der Person und den Preis des Gebots aus. Berücksichtigt werden nur Personen und Gebote, die nicht älter als 10 min sind. Zusammenfassend könnte man also sagen, dass die Anfrage die Personen auswählt, die innerhalb von 10 min nach ihrem Erscheinen bereits ein Gebot mit einem Wert über 200 abgeben.

    Possible Parameter

    assertions, time, timeUnit, size, outputMode, return, inputPort

    Besonderheiten

    time und size legen fest, wie lange bzw. wie viele Events zwischengespeichert werden

    Code Block
    languagejavascript
    PATTERN({type = 'ALL', eventTypes = ['person', 'bid'],
    time = 10, timeUnit = 'MINUTES',
    assertions = ['person.id = bid.bidder && bid.price > 200'],
    outputmode = 'EXPRESSIONS',
    return = ['person.id', 'person.name', 'bid.price']},
    person, bid)
    ANY
    Beschreibung

    Die Anfrage wählt jedes Gebot mit einem Wert höher als 200 aus.

    Mögliche Parameter

    assertions, outputMode, return, inputPort

    Besonderheiten

    Die i-te Assertion gilt jeweils nur für Events des Typs, der an i-ter Stelle der relevanten Event-Typ-Liste steht.

    Code Block
    languagejavascript
    PATTERN({type = 'ANY', eventTypes = ['bid'],
    assertions = ['bid.price > 220'],
    outputMode = 'INPUT'}, bid)
    ABSENCE
    Beschreibung

    Die Anfrage erkennt, wenn 400 Millisekunden kein Gebot abgegeben wurde.

    Mögliche Parameter

    assertions, time, timeUnit, outputMode

    Besonderheiten

    Erfolg und Genauigkeit ist abhängig von den Heartbeats. Als Ausgabemodus ist nur SIMPLE möglich.

    Code Block
    languagejavascript
    PATTERN({type = 'ABSENCE', eventTypes = ['bid'],
    outputMode = 'SIMPLE', time = 400}, bid)

    Threshold

    COUNT
    Beschreibung

    Die Anfrage ist erfüllt, sobald mehr als 20 Gebote abgegeben wurden.

    Mögliche Parameter

    assertions, outputMode, return, inputPort

    Besonderheiten

    Beruht momentan auf dem Any-Pattern, das als Eingabe eine Aggregation von außen bekommt.

    Code Block
    languagejavascript
    PATTERN({type = 'FUNCTOR', eventTypes = ['aggr'],
    assertions = ['count_price > 20']},
    AGGREGATE({aggregations = [['COUNT', 'price',
    'count_price', 'double']]}, bid))
    VALUE-MAX
    Beschreibung

    Die Anfrage ist erfüllt, sobald der maximale Wert eines Gebots 300 übersteigt.

    Mögliche Parameter

    assertions, outputMode, return, inputPort

    Besonderheiten

    Beruht momentan auf dem Any-Pattern, das als Eingabe eine Aggregation von außen bekommt.

    Code Block
    languagejavascript
    PATTERN({type = 'FUNCTOR', eventTypes = ['bid'],
    assertions = ['max_price > 300']},
    AGGREGATE({aggregations=[['MAX', 'price', 'max_price',
    'double']]}, bid))
    VALUE-MIN
    Beschreibung

    Die Anfrage ist erfüllt, solange der minimale Wert eines Gebots größer als 50 und kleiner als 100 ist.

    Mögliche Parameter

    assertions, outputMode, return, inputPort

    Besonderheiten

    Beruht momentan auf dem Any-Pattern, das als Eingabe eine Aggregation von außen bekommt.

    Code Block
    languagejavascript
    PATTERN({type = 'FUNCTOR', eventTypes = ['bid'],
    assertions = ['min_price > 50 && min_price < 100']},
    AGGREGATE({aggregations=[['MIN', 'price', 'min_price',
    'double']]}, bid))
    VALUE-AVERAGE
    Beschreibung

    Die Anfrage ist erfüllt, wenn das arithmetische Mittel eines Gebotes kleiner als 140 ist.

    Mögliche Parameter

    assertions, outputMode, return, inputPort

    Besonderheiten

    Beruht momentan auf dem Any-Pattern, das als Eingabe eine Aggregation von außen bekommt.

    Code Block
    languagejavascript
    PATTERN({type = 'FUNCTOR', eventTypes = ['bid'],
    assertions = ['avg_price < 140']},
    AGGREGATE({aggregations=[['AVG', 'price', 'avg_price',
    'double']]}, bid))

    Subset Selection

    RELATIVE-N-HIGHEST
    Beschreibung

    Die Anfrage wählt alle sechs Sekunden die drei höchsten Gebote aus.

    Benötigte Parameter

    attribute, count, time oder size

    Mögliche Parameter

    assertions, outputMode, return, inputPort, timeUnit

    Besonderheiten

    Der Ausgabemodus SIMPLE ist zwar möglich, macht aber nicht soviel Sinn.

    Code Block
    languagejavascript
    PATTERN({type = 'RELATIVE_N_HIGHEST', eventTypes = ['bid'],
    attribute = 'price', count = 3,
    time = 6, timeUnit = 'SECONDS',
    outputmode = 'expressions',
    return = ['bid.timestamp', 'bid.bidder',
    'bid.price']}, bid)
    RELATIVE-N-LOWEST
    Beschreibung

    Die Anfrage wählt alle sechs Sekunden aus den Geboten, die höher als 80 sind, die drei niedrigsten Gebote aus.

    Benötigte Parameter

    attribute, count, time oder size

    Mögliche Parameter

    assertions, outputMode, return, inputPort, timeUnit

    Besonderheiten

    Der Ausgabemodus SIMPLE ist zwar möglich, macht aber in diesem Kontext normalerweise keinen Sinn.

    Code Block
    languagejavascript
    PATTERN({type = 'RELATIVE_N_LOWEST', eventTypes = ['bid'],
    assertions = ['price > 80'],
    attribute = 'price', count = 3,
    time = 10, timeUnit = 'SECONDS',
    outputmode = 'TUPLE_CONTAINER'}, bid)

    Modale

    ALWAYS
    Beschreibung

    Wenn innerhalb dem festen Intervall von drei Sekunden alle Gebote größer als 140 sind, werden diese von dem Pattern ausgegeben.

    Benötigte Parameter

    time oder size

    Mögliche Parameter

    assertions, outputMode, return, inputPort, timeUnit

    Besonderheiten

    Ist der Ausgabemodus nicht SIMPLE, werden bei der Erfüllung des Patterns alle relevanten Events ausgegeben, die die Assertions erfüllen.

    Code Block
    languagejavascript
    PATTERN({type = 'ALWAYS', eventTypes = ['bid'],
    time = 3, timeUnit = 'SECONDS',
    assertions = ['bid.price > 140'],
    outputMode = 'INPUT'}, bid)
    SOMETIMES
    Beschreibung

    Das Pattern ist erfüllt, wenn innerhalb dem festen Intervall von zehn Sekunden mindestens ein Gebot größer als 280 ist.

    Benötigte Parameter

    time oder size

    Mögliche Parameter

    assertions, outputMode, return, inputPort, timeUnit

    Besonderheiten

    Ist der Ausgabemodus nicht SIMPLE, werden bei der Erfüllung des Patterns alle relevanten Events ausgegeben, die die Assertions erfüllen.

    Code Block
    languagejavascript
    PATTERN({type = 'SOMETIMES', eventTypes = ['bid'],
    time = 10, timeUnit = 'SECONDS',
    assertions = ['bid.price > 280']}, bid)

    Temporal Order

    SEQUENCE
    Beschreibung

    Die Anfrage wählt Attribute von Personen und den Geboten der jeweiligen Personen aus, bei denen die Person vor den Gebot auftritt und sein Gebot größer als 200 ist.

    Besonderheiten

    Die Anfrage basiert auf dem SASE-Operator. Der Parameter query erwartet eine Anfrage, die in der SASE-Anfragesprache formuliert ist.

    Code Block
    languagejavascript
    SASE({query = 'PATTERN SEQ(person p, bid b)
    WHERE skip_till_next_match(p,b)
    {p.id = b.bidder, b.price > 200}
    RETURN p.id, p.name, b.price', schema=[['id','Integer'],'name','String'], type='PersonEvent1'} , person, bid) 
    FIRST-N
    Beschreibung

    Die Anfrage wählt alle zehn Sekunden die ersten drei Gebote aus, die größer als 100 sind und gibt die angegebenen Attribute aus.

    Benötigte Parameter

    count, time oder size

    Mögliche Parameter

    assertions, outputMode, return, inputPort, timeUnit

    Besonderheiten

    Der Ausgabemodus SIMPLE ist zwar möglich, macht aber in diesem Kontext normalerweise keinen Sinn.

    Code Block
    languagejavascript
    PATTERN({type = 'FIRST_N', eventTypes = ['bid'],
    time = 10, timeUnit = 'SECONDS',
    count = 3,
    assertions = ['bid.price > 100'],
    outputmode = 'EXPRESSIONS',
    return = ['bid.timestamp', 'bid.bidder',
    'bid.price']}, bid)
    LAST-N
    Beschreibung

    Die Anfrage wählt alle zehn Sekunden die letzten drei relevanten Events aus. Dies können Gebote und Auktionen sein.

    Benötigte Parameter

    count, time oder size

    Mögliche Parameter

    assertions, outputMode, return, inputPort, timeUnit

    Besonderheiten

    Der Ausgabemodus SIMPLE ist zwar möglich, macht aber in diesem Kontext normalerweise keinen Sinn. Beinhaltet die Ausgabe verschiedene Event-Typen macht der Ausgabemodus TUPLE_CONTAINER Sinn, da bei dort das Schema der Daten keine Rolle spielt. Bei anderen Ausgabemodi entstehen unter Umständen null-Werte oder ähnliches.

    Code Block
    languagejavascript
    PATTERN({type = 'LAST_N', eventTypes = ['person',
    'auction'],
    time = 10, timeUnit = 'SECONDS',
    count = 3,
    outputmode = 'TUPLE_CONTAINER'}, auction, person)

    Trend

    INCREASING
    Beschreibung

    Das Pattern ist erfüllt, wenn die Werte der Gebote innerhalb des festen Zeitintervalls von 2 Sekunden streng monoton steigen.

    Benötigte Parameter

    attribute, time oder size

    Mögliche Parameter

    assertions, outputMode, return, inputPort, timeUnit

    Besonderheiten

    Ist der Ausgabemodus nicht SIMPLE, werden bei der Erfüllung des Patterns alle relevanten Events ausgegeben, die die Assertions erfüllen.

    Code Block
    languagejavascript
    PATTERN({type = 'INCREASING', eventTypes = ['bid'],
    attribute = 'price',
    time = 2, timeUnit = 'SECONDS'}, bid)
    DECREASING
    Beschreibung

    Das Pattern ist erfüllt, wenn die Werte der Gebote innerhalb des festen Zeitintervalls von 2 Sekunden streng monoton fallen.

    Benötigte Parameter

    attribute, time oder size

    Mögliche Parameter

    assertions, outputMode, return, inputPort, timeUnit

    Besonderheiten

    Ist der Ausgabemodus nicht SIMPLE, werden bei der Erfüllung des Patterns alle relevanten Events ausgegeben, die die Assertions erfüllen.

    Code Block
    languagejavascript
    PATTERN({type = 'DECREASING', eventTypes = ['bid'],
    attribute = 'price',
    time = 2, timeUnit = 'SECONDS'}, bid)
    STABLE
    Beschreibung

    Das Pattern ist erfüllt, wenn sich die Werte der Gebote innerhalb des festen Zeitintervalls von 2 Sekunden nicht ändern.

    Benötigte Parameter

    attribute, time oder size

    Mögliche Parameter

    assertions, outputMode, return, inputPort, timeUnit

    Besonderheiten

    Ist der Ausgabemodus nicht SIMPLE, werden bei der Erfüllung des Patterns alle relevanten Events ausgegeben, die die Assertions erfüllen.

    Code Block
    languagejavascript
    PATTERN({type = 'STABLE', eventTypes = ['bid'],
    attribute = 'price',
    time = 2, timeUnit = 'SECONDS'}, bid)
    NON-INCREASING
    Beschreibung

    Das Pattern ist erfüllt, wenn die Werte der Gebote innerhalb des festen Zeitintervalls von 2 Sekunden monoton fallen.

    Benötigte Parameter

    attribute, time oder size

    Mögliche Parameter

    assertions, outputMode, return, inputPort, timeUnit

    Besonderheiten

    Ist der Ausgabemodus nicht SIMPLE, werden bei der Erfüllung des Patterns alle relevanten Events ausgegeben, die die Assertions erfüllen.

    Code Block
    languagejavascript
    PATTERN({type = 'NON_INCREASING', eventTypes = ['bid'],
    attribute = 'price',
    time = 2, timeUnit = 'SECONDS'}, bid)
    NON-DECREASING
    Beschreibung

    Das Pattern ist erfüllt, wenn die Werte der Gebote innerhalb des festen Zeitintervalls von 2 Sekunden monoton steigen.

    Benötigte Parameter

    attribute, time oder size

    Mögliche Parameter

    assertions, outputMode, return, inputPort, timeUnit

    Besonderheiten

    Ist der Ausgabemodus nicht SIMPLE, werden bei der Erfüllung des Patterns alle relevanten Events ausgegeben, die die Assertions erfüllen.

    Code Block
    languagejavascript
    PATTERN({type = 'NON_DECREASING', eventTypes = ['bid'],
    attribute = 'price',
    time = 2, timeUnit = 'SECONDS'}, bid)
    NON-STABLE
    Beschreibung

    Das Pattern ist das Gegenstück zum Stable-Pattern. Es ist erfüllt, wenn sich die Werte von drei aufeinanderfolgenden Gebote ändern.

    Benötigte Parameter

    attribute, time oder size

    Mögliche Parameter

    assertions, outputMode, return, inputPort, timeUnit

    Besonderheiten

    Ist der Ausgabemodus nicht SIMPLE, werden bei der Erfüllung des Patterns alle relevanten Events ausgegeben, die die Assertions erfüllen.

    Code Block
    languagejavascript
    PATTERN({type = 'NON_STABLE', eventTypes = ['bid'],
    attribute = 'price', size = 3}, bid)
    MIXED
    Beschreibung

    Das Pattern ist erfüllt, wenn die Werte der Gebote innerhalb des festen Zeitintervalls von 2 Sekunden mindestens einmal streng monoton steigen und mindestens einmal streng monoton fallen.

    Benötigte Parameter

    attribute, time oder size

    Mögliche Parameter

    assertions, outputMode, return, inputPort, timeUnit

    Besonderheiten

    Ist der Ausgabemodus nicht SIMPLE, werden bei der Erfüllung des Patterns alle relevanten Events ausgegeben, die die Assertions erfüllen.

    Code Block
    languagejavascript
    PATTERN({type = 'MIXED', eventTypes = ['bid'],
    attribute = 'price',
    time = 2, timeUnit = 'SECONDS'}, bid)

    Spatial Pattern

    MIN-DISTANCE
    MAX-DISTANCE
    AVERAGE-DISTANCE
    RELATIVE-MIN-DISTANCE
    RELATIVE-MAX-DISTANCE
    RELATIVE-AVERAGE-DISTANCE

    Spatial Temporal Pattern

    MOVING-IN-A-CONSTANT-DIRECTION
    MOVING-IN-A-MIXED-DIRECTION
    STATIONARY
    MOVING-TOWARD

    SASE

    Description

    This operator allows to define temporal pattern to match against the stream. For this purpose we use the SASE+ query language. The query is expressed in the parameter query. The used source has to be of the correct type (for the example s05 and s08). The order is not important. If the type of the source is not set or wrong it can be set using the Rename Operator.

    Dieser Operator ermöglicht es, Anfragen mit zeitlichen Mustern zu definieren. Zu diesem Zweck wird die SASE+ Anfragesprache verwendet und die Anfrage im Parameter query übergeben. Die angebenen Quellen müssen den passenden Typ (warning) haben (für das folgende Beispiel also s05 und s08), die Reihenfolge ist egal. Ggf. muss der Typ einer Quelle vorher noch mit Hilfe der Rename-Operation definiert werden. Der Parameter heartbeatrate legt fest, wie oft ein Hearbeat generiert werden soll, wenn ein Element verarbeitet wurde, welches aber nicht zu einem Ergebnis geführt hat.

    Parameter

    • heartbeatrate: The rate to generate heartbeats if an element was processed without given a result.
    • query: The SASE+ query
    • OneMatchPerInstance

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleSASE Operator
    linenumberstrue
    s05 = RENAME({type='s05', aliases = ['ts', 'edge']},...)
    PATTERNDETECT({heartbeatrate=1,query='PATTERN SEQ(s05 s1, s08 s2) where skip_till_any_match(s1,s2){ s1.edge=s2.edge } return s1.ts,s2.ts'}, s05, s08)

    Benchmark

     

    BATCHPRODUCER

    Description

    Parameter

    Example

     

    BENCHMARK

    Description

    Parameter

    Example

     

    PRIOIDJOIN

    Description

    Parameter

    Example

     

    TESTPRODUCER

    Description

    Parameter

    Example

     

    Machine Learning / Data Mining

    Available mining or machine learning operators are described here: Machine Learning

    Storing

     

    DATABASESOURCE

    Description

    This operator can read data from a relational database. Look at Database Feature for a detailed description.

     

    DATABASESINK

    This operator can write data to a relational database. Look at Database Feature for a detailed description.

     

    Probabilistic Processing

    LINEARREGRESSION

    Description

    This operator performs a linear regression on the given set of explanatory attributes to explain the given set of dependent attributes

    Parameter

    • dependent: List of dependent attributes
    • explanatory: List of explanatory attributes

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleLinearRegression Operator
    linenumberstrue
    output = linearRegression({dependent = ['x'], explanatory = ['y']}, input)

     

    LINEARREGRESSIONMERGE

    Description

    Parameter

    • dependent: List of dependent attributes
    • explanatory: List of explanatory attributes

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleLinearRegressionMerge Operator
    linenumberstrue
    output = linearRegressionMerge({dependent = ['x'], explanatory = ['y']}, input)

     

    EM

    Description

    This operator fits gaussian mixtures model to the input stream.

    Parameter

    • attributes: The attributes to fit
    • mixtures: Number of mixtures to fit the data

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleEM Operator
    linenumberstrue
    output = em({attributes = ['x','y'], mixtures = 2}, input)

    SampleFrom

    Description

    This operator samples from the given list of probabilistic continuous distributions.

    Parameter

    • attributes: The attributes to sample from
    • samples: Number of samples

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleSampleFrom Operator
    linenumberstrue
    output = sampleFrom({attributes = ['x','y'], samples = 50}, input)

    ExistenceToPayload

    Description

    The input object gets one new field with tuple existence.

    Parameter

    The operator is parameterless.

    Example

    Code Block
    themeEclipse
    languagejavascript
    titleExistenceToPayload Operator
    linenumberstrue
    output = existenceToPayload(input)


    Table of Contents
    maxLevel3
    outlinetrue