...
| Code Block |
|---|
#PARSER PQL
#PARALLELIZATION (type=INTER_OPERATOR) (degree=4) (buffersize=AUTO) (optimization=true)
#INTEROPERATORPARALLELIZATION aggregateId 2 GLOBAL NonGroupedAggregateTransformationStrategy ShuffleFragmentAO
/// other possible definition of parameters for this keyword
#INTEROPERATORPARALLELIZATION (id=aggregateId) (degree=2) (buffersize=GLOBAL) (strategy=NonGroupedAggregateTransformationStrategy) (fragment=ShuffleFragmentAO)
#RUNQUERY
windowBid = TIMEWINDOW({SIZE = [1, 'MINUTES'],
advance = [1, 'SECONDS']
}, bid)
windowAuction = TIMEWINDOW({SIZE = [10, 'MINUTES'],
advance = [1, 'SECONDS']
}, auction)
join = JOIN({ID = 'joinId', PREDICATE = 'bid.bidder == auction.id'}, windowBid, windowAuction)
sum_price_bidder = AGGREGATE({ID = 'aggregateId',
aggregations = [
['SUM', 'price', 'sum_price_bidder']
]
},
join
) |
Strategies and supported fragmentation types
...
JoinTransformationStrategy
...
HashFragmentAO
...
NonGroupedAggregateTransformationStrategy
...
ShuffleFragmentAO
...
GroupedAggregateTransformationStrategy
...
HashFragmentAO
...
Definition of Endpoints for parallelization
If the #INTEROPERATORPARALLELIZATION-keyword is used, it is also possible to define a start and enpoint for parallelization. The definition of the start and end point of parallelization is possible with a tuple or triple inside of the id-parameter. The following example shows how to use this feature.
| Code Block |
|---|
#PARSER PQL #PARALLELIZATION (type=INTER_OPERATOR) (degree=4) (buffersize=AUTO) (optimization=true) #INTEROPERATORPARALLELIZATION (id=(aggregateId:selectId)) (degree=24) (buffersize=GLOBAL) (strategy=GroupedAggregateTransformationStrategy) (fragment=ShuffleFragmentAOHashFragmentAO) /// avoidsassoure anno semantic change of the querychanges #INTEROPERATORPARALLELIZATION (id=(aggregateId:selectId:true)) (degree=24) (buffersize=GLOBAL) (strategy=GroupedAggregateTransformationStrategy) (fragment=ShuffleFragmentAOHashFragmentAO) #RUNQUERY windowBid = TIMEWINDOW({SIZE = [1, 'MINUTES'], advance = [1, 'SECONDS'] }, bid) sum_price_bidder = AGGREGATE({ID = 'aggregateId', aggregations = [ ['SUM', 'price', 'sum_price_bidder'] ] , GROUP_BY = ['auctionbidder'], FASTGROUPING = true }, windowBid ) selectBidder = SELECT({ID = 'selectId',PREDICATE = 'bid.bidder > 1'}, sum_price_bidder) |
In this example the aggregate operator is parallelized, but it is also needed that the following selection is also parallelized. So to do this at first the id of the aggregation is set, after that, seperated with a double quote, the second id of the selection is definied as the endpoint of parallelization. There is the possibility, that the parallelization changes the semantic of the query. If you want to avoid this append :true to the pair of start and end id. By default this option is set to false. If it is enabled, a possible semantic change results in an exception.
Strategies and supported fragmentation types
| Logical operator | Parallelization strategies | Description | Supported fragmentation types | Allows definition of endpoint |
|---|---|---|---|---|
| JoinAO | JoinTransformationStrategy | Uses an Hash-Fragmentation for both input streams. The fragmentation attributes are gathered from the join attributes. Note that only equals-predicates (which are concatenated with &&) are supported. The fragmented datastream is merged with an UNION Operator. | HashFragmentAO | |
| AggregateAO | NonGroupedAggregateTransformationStrategy
| Uses an RoundRobin or Shuffle Fragmentation for splitting the input datastream. This strategy works with partial aggregates and merges the datastream both with an union operator and an additional aggregate operator for merging the partial aggregates. This strategy works with and without grouping. Only aggregations with one input attribute are supported. | RoundRobinFragmentAO | |
ShuffleFragmentAO | ||||
GroupedAggregateTransformationStrategy | Uses a Hash-Fragmentation for the input stream. The fragmentation attributes are gathered from the grouping attributes. So this strategy only works if the aggregate operator has an grouping. The fragmented datastream is merged with an UNION Operator. | HashFragmentAO |
| Info |
|---|
Bold fragmentation types shows the preferred type if nothing is defined. |