Remark: This window is very complex and may sometimes not behave as expected. It can simulate any other window, but if that is not necessary, you should prefer TimeWindow or ElementWindow.

The predicate window opens and closes a window based on a start condition and an optional end condition. Elements that are not inside a window are discarded and sent to output port 1.

The operator works as follows:

  • It first checks whether maxWindowTime is reached. In this case, the internal buffers of all groups whose first element is older than the given threshold are cleared (and their content is sent to the output if outputIfMaxWindowTime is true).
  • After that, it checks whether closeWindowAfterNoupdatesFor is set and closes all buffers whose last element arrived longer ago than the given parameter.
  • The operator determines the group (partition) for the current input.


  • After that, if set, the clear condition is checked for the current group and the current input.

...

  • If the window for this group is already open (i.e. the start predicate has evaluated to true for some earlier element), the next step is to check the end condition:
    • If the end condition is true, the operator creates an output. Typically, the whole window is written and the buffer is cleared. With the clear and advanceWhen conditions, this behaviour can be changed.
    • If the end condition is false, the current element is added to the window and kept inside the operator.
  • If the window for this group is not open, the start condition is checked. If the condition is true, the operator opens a new window and adds the current element to the window.
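The open/close logic for a single group can be sketched as a small Python model. This is only an illustration under simplifying assumptions, not the Odysseus implementation: the names `Elem`, `predicate_window` and `make_stream` are hypothetical, grouping, maxWindowTime, clear and advanceWhen are left out, every window gets the timestamp of its first element as start and the timestamp of the closing element as end, and windows that are still open at the end of the input are not emitted.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Elem:
    id: str
    ts: int
    is_last: bool
    new_elem: bool


# (start timestamp, end timestamp, elements in the window)
Window = Tuple[int, int, List[Elem]]


def predicate_window(
    elems: List[Elem],
    start: Callable[[Elem], bool],
    end: Optional[Callable[[Elem], bool]] = None,
    keep_ending_element: bool = False,
) -> List[Window]:
    """Simplified single-group model of the start/end handling."""
    windows: List[Window] = []
    buffer: List[Elem] = []
    for e in elems:
        if buffer:  # a window is open
            # Without an end predicate the window closes when start is false.
            closes = end(e) if end is not None else not start(e)
            if not closes:
                buffer.append(e)
                continue
            if keep_ending_element:
                buffer.append(e)
            windows.append((buffer[0].ts, e.ts, buffer))
            buffer = []
            if keep_ending_element:
                continue  # the element was consumed by the closed window
        # No window is open (or it was just closed): check the start condition.
        if start(e):
            buffer = [e]
    return windows


def make_stream() -> List[Elem]:
    """Hypothetical input: IDs A (1-4), B (5-10), C (11-16); 4 and 10 are 'last'."""
    spec = [("A", 1, 4), ("B", 5, 10), ("C", 11, 16)]
    last = {4, 10}
    return [
        Elem(name, ts, ts in last, ts == first)
        for name, first, final in spec
        for ts in range(first, final + 1)
    ]


if __name__ == "__main__":
    for window in predicate_window(make_stream(), start=lambda e: e.new_elem):
        print(window[0], window[1], [e.ts for e in window[2]])
```

In this model, an element that closes a window can also open the next one, unless it was kept as the ending element.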

        • The end predicate is ignored in this case. You can use allowSameStartAndEndTS to allow single-element windows. Note that the produced result is then not really valid (start and end timestamp are the same), so you need to do some window processing afterwards.

      For the output there are different configurations:

      • samestarttime: Each element in the output gets the same start time, i.e. the start time of the first element. This can be used in the Aggregate (and Group) operator to get only a single result for a complete window.
      • nesting: In Odysseus, the output is typically a set of elements that are sent one after the other. If this flag is set to true, the output of the window is a single list containing all elements of the window. This can be an advantage if the subsequent processing treats the elements together (e.g. in a MAP operation). samestarttime sets the time for each element in the list; the list itself gets the union of all intervals inside the list.

      To simulate some kind of sliding window, the following parameters can be used:

      • AdvanceWhen: This condition checks whether the window should move, i.e. whether elements in the current buffer (for the current group) should be removed. If this predicate evaluates to true, the next parameter determines by how many elements the window moves.
      • AdvanceSize: If AdvanceWhen is true, this size tells the operator how many elements should be removed from the start of the current window. If the value is below 0 or the current window has fewer elements than this value, the buffer is cleared.
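The AdvanceSize rule can be written down as a small Python helper. This is a minimal sketch of the rule as described above, not the operator's actual code; the function name `advance` and the use of a plain list as buffer are assumptions made for illustration.

```python
from typing import List, TypeVar

T = TypeVar("T")


def advance(buffer: List[T], advance_size: int) -> List[T]:
    """Return the buffer after moving the window by advance_size elements.

    The buffer is cleared if advance_size is below 0 or if the buffer
    holds fewer elements than advance_size.
    """
    if advance_size < 0 or len(buffer) < advance_size:
        return []
    # Drop the oldest advance_size elements from the start of the window.
    return buffer[advance_size:]


if __name__ == "__main__":
    print(advance([1, 2, 3, 4], 2))
    print(advance([1, 2], 5))
    print(advance([1, 2, 3], -1))
```

A caller would apply `advance` to the buffer of the current group only when the AdvanceWhen predicate evaluates to true for the current element.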

      ...

      • start: The start condition for a predicate window. If the condition evaluates to true, the window is opened and stays open until the end predicate evaluates to true (or, if no end predicate is given, until the start predicate evaluates to false). Note that all elements that are not inside a window are sent to output port 1.
      • end: The end condition for a predicate window. The tuple for which this condition evaluates to true is only part of the result if keepEndingElement is set to true!
      • clear: If this parameter is set, the window will only be cleared if the condition is true. In this way, the same element can be part of multiple windows (sliding).
      • sameStartTime: For predicate windows: If set to true, all produced elements get the same start timestamp.
      • size: The maximum size of the window. Can be either a single number or a pair of a number and a time unit. Possible values for the unit are those of TimeUnit, like SECONDS, NANOSECONDS etc. The default time unit is the base time of the stream (typically milliseconds).
      • keepEndingElement: Typically, the object that fulfils the end condition will not be part of the result. If this attribute is set to true, the element will be part of the result.

      • partition: Evaluate the predicates on partitions defined by the different values of this attribute (similar to group by in aggregation).

      • useElementOnlyForStartOrEnd: Typically, an object is used to evaluate both the start and the end condition, i.e. the same element can be used to close a window and open a new window. If set to true, only the start or the end predicate will be evaluated, i.e. an element cannot be used to close an open window and also open a new window.
      • keepTimeOrder: If set to false, the output generation does not care about order. Typically, this only makes sense when using nesting=true!
      • closeWindowWithHeartbeat: If true, the window is closed when a heartbeat is received. Take a look at the session window to see how it works.
      • closeWindowAfterNoupdatesFor: A time parameter after which the window can be closed if no new element has reached the buffer for some time. This mostly makes sense for partitioned windows, but also works with heartbeats.
      • closeWindowAfterNoUpdateTimePort: If the window closes because of closeWindowAfterNoUpdateTime, this port is used for the output. Default is 0, i.e. the default output port. This can be used to handle outputs of this kind differently.

      ...

      Remark: This is a blocking operator. The operator does not write elements before it sees new elements that no longer belong to the window (similar to ElementWindow).

      Example

      NEEDS WORK!

      In the following we provide some examples and the corresponding output.

      ...

      Preprocessing

      With some preprocessing (to allow more examples)

      Code Block
      #PARSER PQL
      #ADDQUERY
      in = CSVFILESOURCE({SCHEMA = [['ID', 'String'],['pos','STARTTIMESTAMP'],['isLast','Boolean']], DELIMITER = '\t', SOURCE = 'source', FILENAME = '${PROJECTPATH}/input.csv'})
      
      map = STATEMAP({EXPRESSIONS = [['isNull(__last_1.ID) OR (__last_1.ID != ID)','newElem']], KEEPINPUT = true}, in)


      we will get:

      Code Block
      ID|TIME|ISLAST|NEWELEM
      A|1|false|true | META | 1|oo
      A|2|false|false | META | 2|oo
      A|3|false|false | META | 3|oo
      A|4|true|false | META | 4|oo
      B|5|false|true | META | 5|oo
      B|6|false|false | META | 6|oo
      B|7|false|false | META | 7|oo
      B|8|false|false | META | 8|oo
      B|9|false|false | META | 9|oo
      B|10|true|false | META | 10|oo
      C|11|false|true | META | 11|oo
      C|12|false|false | META | 12|oo
      C|13|false|false | META | 13|oo
      C|14|false|false | META | 14|oo
      C|15|false|false | META | 15|oo
      C|16|false|false | META | 16|oo

      As you can see, the new field newElem is added. It is set to true if the element is the first element or if its ID differs from the ID of the previous element.

      Using only a start predicate

      Code Block
      win = PREDICATEWINDOW({start = 'newElem', SAMESTARTTIME = true}, map)

      will result in:

      Code Block
      A|1|false|true | META | 1|2
      B|5|false|true | META | 5|6
      C|11|false|true | META | 11|12

      Here the window is opened each time the start condition evaluates to true and is closed as soon as the start condition evaluates to false (!start). All elements between these elements are discarded. They do not open a new window. This may not be what you expect!

      In this way, you can keep a window open as long as the start condition is true.

      Using a start and an end predicate

      Code Block
      win = PREDICATEWINDOW({start = 'newElem', end = 'newElem', SAMESTARTTIME = true}, map)

      will result in:

      Code Block
      A|1|false|true | META | 1|5
      A|2|false|false | META | 1|5
      A|3|false|false | META | 1|5
      A|4|true|false | META | 1|5
      B|5|false|true | META | 5|11
      B|6|false|false | META | 5|11
      B|7|false|false | META | 5|11
      B|8|false|false | META | 5|11
      B|9|false|false | META | 5|11
      B|10|true|false | META | 5|11
      C|11|false|true | META | 11|17
      C|12|false|false | META | 11|17
      C|13|false|false | META | 11|17
      C|14|false|false | META | 11|17
      C|15|false|false | META | 11|17
      C|16|false|false | META | 11|17

      Here each time a new window opens, the old window is closed, i.e. the same input element is responsible for starting and closing a window.

      Using an end predicate

      Code Block
      win = PREDICATEWINDOW({start = 'true', end = 'newElem', SAMESTARTTIME = true}, map)

      [Image: query output]

      Here each time a new window opens, the old window is closed, i.e. the same input element is responsible for starting and closing a window. The output contains two windows, one from 1 to 5 and one from 5 to 11. As the final window is not closed, the output starting from C (at 11) is discarded.

      Using a start and an end predicate and keeping the ending element:

      Code Block
      win = PREDICATEWINDOW({start = 'newElem', end = 'isLast', KEEPENDINGELEMENT = true, SAMESTARTTIME = true}, map)

      will result in:

      [Image: query output]

      ...

      Note the difference: This operator blocks only until the end predicate is reached. This only works if samestarttime is set to true. Otherwise, for example, A|4|true|false | META | 1|4 would be A|4|true|false | META | 4|4, which has no validity and will not be produced.

      ...