Joins tuples from two input streams if and only if their time intervals overlap and the optional predicate evaluates to true.

Parameters

  • predicate: The predicate to evaluate over each incoming tuple from left and right
  • card: ONE_ONE, ONE_MANY, MANY_ONE, MANY_MANY (same as leaving the parameter empty, see below)
  • SweepAreaName: Overwrites the default rule for selecting the sweep area (e.g. TIJoinSA is used if the predicate contains operations other than "==", HashJoinSA is used if the predicate only contains "==")

Parameters for the element join

  • elementsizeport0, elementsizeport1 (see below)
  • group_by_port_0, group_by_port_1 (see below)
  • keepEndTimestamp (see below)

The card parameter describes how input elements can be joined. This optional information can be used to optimize processing (for example, to use less memory, because elements can be discarded earlier).

    • ONE_ONE: Each element in one input stream has exactly one corresponding element in the other input stream. With this setting, windows can potentially be avoided.
    • ONE_MANY: Each element in the right input stream has exactly one corresponding element in the left input stream
    • MANY_ONE: Each element in the left input stream has exactly one corresponding element in the right input stream
    • MANY_MANY: For each element in both input streams there may be multiple corresponding elements.
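
To illustrate why the cardinality can save memory, here is a minimal Python sketch of a symmetric hash join on an equality key. It is not the operator's implementation: the function `hash_join`, the event format, and the discard rule are assumptions derived only from the cardinality descriptions above, and timestamps are ignored entirely.

```python
def hash_join(events, card="MANY_MANY"):
    """Join (port, key, value) events; port 0 is the left input, port 1 the right.

    Returns the list of (left_value, right_value) results and the number of
    elements that are still buffered at the end.
    """
    # Ports whose elements can be dropped right after their first match.
    # Assumption: an element with exactly one partner is never needed again.
    drop_after_match = {
        "ONE_ONE": {0, 1},
        "ONE_MANY": {1},    # each right element has exactly one left partner
        "MANY_ONE": {0},    # each left element has exactly one right partner
        "MANY_MANY": set(),
    }[card]

    state = ({}, {})  # one key -> [values] table per port
    results = []
    for port, key, value in events:
        other = 1 - port
        partners = state[other].get(key, [])
        for partner in partners:
            results.append((value, partner) if port == 0 else (partner, value))
        matched = bool(partners)
        if matched and other in drop_after_match:
            # The buffered partners have found their only match.
            del state[other][key]
        if not (matched and port in drop_after_match):
            # Keep the new element only if it may still be needed.
            state[port].setdefault(key, []).append(value)

    buffered = sum(len(values) for table in state for values in table.values())
    return results, buffered


events = [(0, "a", "L1"), (1, "a", "R1"), (0, "b", "L2"), (1, "b", "R2")]
results_one, buffered_one = hash_join(events, card="ONE_ONE")
results_many, buffered_many = hash_join(events, card="MANY_MANY")
```

In this sketch both settings produce the same results, but with ONE_ONE nothing stays buffered, while with MANY_MANY all four elements are kept because further partners could still arrive.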

Example

PQL
Join Operator
output = join({predicate = 'auction_id = auction'}, left, right)
CQL
SELECT * FROM left, right WHERE auction_id = auction
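
The join condition from the top of this page (overlapping timestamps plus an optional predicate) can be sketched in Python as a simple nested loop. This is only an illustration of the semantics, not how the operator is implemented: the `Element` class, the half-open interval convention, and the choice to give each result the intersection of both intervals are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    payload: dict
    start: int  # start timestamp (inclusive, assumed)
    end: int    # end timestamp (exclusive, assumed)


def overlaps(a, b):
    # Half-open intervals [start, end) overlap iff each starts before the other ends.
    return a.start < b.end and b.start < a.end


def join(left, right, predicate=None):
    results = []
    for l in left:
        for r in right:
            if overlaps(l, r) and (predicate is None or predicate(l.payload, r.payload)):
                # Assumption: the result is valid for the intersection of both intervals.
                results.append(
                    Element({**l.payload, **r.payload},
                            max(l.start, r.start), min(l.end, r.end))
                )
    return results


left = [Element({"auction_id": 1}, 0, 10), Element({"auction_id": 2}, 0, 5)]
right = [Element({"auction": 1, "price": 30}, 5, 15),
         Element({"auction": 2, "price": 7}, 5, 8)]

# Same predicate as in the PQL and CQL examples above.
result = join(left, right, lambda l, r: l["auction_id"] == r["auction"])
```

Here only auction 1 produces a result. The two elements for auction 2 satisfy the predicate, but their intervals do not overlap, so they are not joined.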

Element Join

Sometimes it is necessary to have an element window before a join, for example, to use only the latest element from each of the two input ports, or from just one of them. Unfortunately, an element window has a blocking behavior. The element join avoids this blocking behavior by integrating the element window into the join operator itself, so there is no need to manually add an element window operator before the join.

With the parameters elementsizeport0 and elementsizeport1, the size of the element window can be defined for each input port. Optionally, the counter can be grouped, for example by a certain id, as can be seen in the example below.

JOIN({
    elementsizeport1 = 1,
    elementsizeport0 = 1,
    group_by_port_1 = ['id_right'],
    group_by_port_0 = ['id_left']              
}, left, right)

When using an element window inside of the join, the blocking behavior is avoided, but the end timestamp of the results cannot be known. Therefore, the end timestamp is removed (set to infinity) in this case. If you want to keep the (semantically incorrect) end timestamp, you can set the keepEndTimestamp parameter to true.
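
The buffering behind the element join can be pictured with the following Python sketch. It is a rough model under stated assumptions, not the operator's code: the class `ElementJoin`, its constructor arguments, and the equality predicate in the usage part are hypothetical, and timestamps (including the end-timestamp handling described above) are left out. Each port keeps only the last n elements per group, and every arriving element is joined immediately against what the other port currently holds.

```python
from collections import defaultdict, deque


class ElementJoin:
    def __init__(self, size0, size1, group_by0=None, group_by1=None, predicate=None):
        self.group_by = (group_by0 or [], group_by1 or [])
        self.predicate = predicate or (lambda l, r: True)
        # One bounded buffer per port and group; deque(maxlen=n) evicts the oldest entry.
        self.buffers = tuple(
            defaultdict(lambda n=n: deque(maxlen=n)) for n in (size0, size1)
        )

    def process(self, port, element):
        """Store the element in its port's window and return the join results."""
        group = tuple(element[attr] for attr in self.group_by[port])
        self.buffers[port][group].append(element)
        results = []
        # Probe everything the other port currently retains; nothing blocks.
        for buffer in self.buffers[1 - port].values():
            for other in buffer:
                l, r = (element, other) if port == 0 else (other, element)
                if self.predicate(l, r):
                    results.append({**l, **r})
        return results


# Mirrors the parameters of the PQL example above, plus a hypothetical predicate.
join = ElementJoin(1, 1, ["id_left"], ["id_right"],
                   predicate=lambda l, r: l["id_left"] == r["id_right"])
first = join.process(0, {"id_left": 1, "x": 10})
join.process(0, {"id_left": 1, "x": 11})  # replaces x = 10 in the window of size 1
out = join.process(1, {"id_right": 1, "y": 5})
```

In this sketch the right element is joined only with the latest left element of the same id, and the result is produced as soon as the right element arrives.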
