You can use Odysseus Script to parallelize an existing script automatically. Odysseus Script provides two keywords for this purpose.

#PARALLELIZATION

This keyword tells Odysseus that the given query needs to be parallelized. It takes two mandatory parameters and one optional parameter.

  • Parallelization-Type: (mandatory) Inter-Operator or Intra-Operator. If the Inter-Operator type is selected, the given query plan is modified. If the Intra-Operator type is selected, different physical operators that provide multithreading are used (in development).
  • Parallelization degree: (mandatory) Defines the degree of parallelization to use. The constant AUTO can be given instead to detect the number of available cores and use that value.
  • Buffer-size: (optional) Defines the number of elements held in the buffers that are used. The constant AUTO can be given instead to use a default value.

 

The following example shows the usage of this keyword. It uses inter-operator parallelization with a degree of 4 and an automatic buffer size.

#PARSER PQL
#PARALLELIZATION INTER_OPERATOR 4 AUTO
#RUNQUERY
windowBid = TIMEWINDOW({SIZE = [1, 'MINUTES'],
                  advance = [1, 'SECONDS']
                  }, bid)

windowAuction = TIMEWINDOW({SIZE = [10, 'MINUTES'],
                  advance = [1, 'SECONDS']
                  }, auction)

join = JOIN({PREDICATE = 'bid.bidder == auction.id'}, windowBid, windowAuction)
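
As a variant of the example above, the following sketch lets Odysseus pick the degree from the available cores and leaves out the optional buffer size. It assumes that AUTO is written in the degree position exactly as in the parameter list, that an omitted buffer size falls back to the default, and that `///` introduces a comment line in Odysseus Script; none of this is confirmed by the example above.

```
/// Sketch (assumption): degree = AUTO, optional buffer size omitted
#PARSER PQL
#PARALLELIZATION INTER_OPERATOR AUTO
#RUNQUERY
windowBid = TIMEWINDOW({SIZE = [1, 'MINUTES'],
                  advance = [1, 'SECONDS']
                  }, bid)

windowAuction = TIMEWINDOW({SIZE = [10, 'MINUTES'],
                  advance = [1, 'SECONDS']
                  }, auction)

join = JOIN({PREDICATE = 'bid.bidder == auction.id'}, windowBid, windowAuction)
```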

If this keyword is used, every operator of the query that has a compatible parallelization strategy is transformed. If only one or a few operators should be parallelized, the following keyword needs to be used in addition.

 

#INTEROPERATORPARALLELIZATION

The #INTEROPERATORPARALLELIZATION keyword is an addition to the #PARALLELIZATION keyword. It makes it possible to select one or more operators that should be parallelized, and to configure the parallelization for each of them individually. This keyword provides the following parameters:

  • Operator-Ids: (mandatory) The ID(s) of the operator(s) that should be parallelized, as set with the ID parameter of the operator in the query.
  • Parallelization degree: (mandatory) Defines the degree of parallelization to use. The constant AUTO can be given instead to detect the number of available cores and use that value, or GLOBAL to use the value defined in the #PARALLELIZATION keyword.
  • Buffer-size: (mandatory) Defines the number of elements held in the buffers that are used. The constant AUTO can be given instead to use a default value, or GLOBAL to use the value defined in the #PARALLELIZATION keyword.
  • Parallelization strategy: (optional) The strategy used to transform the operator. It needs to fit the type of the selected operator.
  • Fragmentation type: (optional) The fragmentation used by the strategy. Not every strategy supports every fragmentation type.

 

The following code example shows the usage of this keyword. Only the aggregation is parallelized, because only its ID is given. The global parallelization degree is overridden with the value 2. With the constant GLOBAL, the buffer size is taken from the global definition. In addition to these parameters, the parallelization strategy is defined manually; in this case the AggregateMultithreadedTransformationStrategy is used. Note that the strategy needs to fit the type of the operator identified by the ID, and it also needs to be compatible with that particular operator. In some cases it is not possible to use the selected strategy, e.g. when a grouping inside the aggregation is needed. The last parameter in this example is the optional fragmentation type. Note that not every strategy supports all fragmentation types either. See the list below for all possible combinations.

#PARSER PQL
#PARALLELIZATION INTER_OPERATOR 4 AUTO
#INTEROPERATORPARALLELIZATION aggregateId 2 GLOBAL AggregateMultithreadedTransformationStrategy ShuffleFragmentAO
#RUNQUERY

windowBid = TIMEWINDOW({SIZE = [1, 'MINUTES'],
                  advance = [1, 'SECONDS']
                  }, bid)

windowAuction = TIMEWINDOW({SIZE = [10, 'MINUTES'],
                  advance = [1, 'SECONDS']
                  }, auction)

join = JOIN({ID = 'joinId', PREDICATE = 'bid.bidder == auction.id'}, windowBid, windowAuction)

sum_price_bidder = AGGREGATE({ID = 'aggregateId',
                              aggregations = [
                                ['SUM', 'price', 'sum_price_bidder']
                              ],
                              FASTGROUPING = true                                                                    
                            },
                            join
                          )
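
The same keyword can be pointed at the join instead. The following sketch selects the operator with the ID joinId, takes both the degree and the buffer size from #PARALLELIZATION via GLOBAL, and uses the join strategy together with HashFragmentAO, a combination taken from the list of strategies and fragmentation types on this page. The `///` comment line assumes Odysseus Script's comment syntax, and the exact behaviour of this combination has not been verified here.

```
/// Sketch (assumption): parallelize only the join, degree and buffer size inherited via GLOBAL
#PARSER PQL
#PARALLELIZATION INTER_OPERATOR 4 AUTO
#INTEROPERATORPARALLELIZATION joinId GLOBAL GLOBAL JoinMultithreadedTransformationStrategy HashFragmentAO
#RUNQUERY

windowBid = TIMEWINDOW({SIZE = [1, 'MINUTES'],
                  advance = [1, 'SECONDS']
                  }, bid)

windowAuction = TIMEWINDOW({SIZE = [10, 'MINUTES'],
                  advance = [1, 'SECONDS']
                  }, auction)

join = JOIN({ID = 'joinId', PREDICATE = 'bid.bidder == auction.id'}, windowBid, windowAuction)
```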

Strategies and fragmentation types

Logical operator | Parallelization strategy | Description | Supported fragmentation types
JoinAO | JoinMultithreadedTransformationStrategy | | HashFragmentAO
AggregateAO | AggregateMultithreadedTransformationStrategy | | RoundRobinFragmentAO, ShuffleFragmentAO
AggregateAO | GroupedAggregateMultithreadedTransformationStrategy | | HashFragmentAO

Bold fragmentation types show the preferred type.
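
To illustrate the last combination, the following sketch applies the GroupedAggregateMultithreadedTransformationStrategy with HashFragmentAO to an aggregation. It rests on several assumptions that this page does not confirm: that this strategy is meant for aggregations with a grouping, that the grouping is written with the group_by parameter of AGGREGATE, that the attribute bidder is available on the input, and that `///` introduces a comment line.

```
/// Sketch (assumption): grouped aggregation with the grouped strategy and hash fragmentation
#PARSER PQL
#PARALLELIZATION INTER_OPERATOR 4 AUTO
#INTEROPERATORPARALLELIZATION aggregateId GLOBAL GLOBAL GroupedAggregateMultithreadedTransformationStrategy HashFragmentAO
#RUNQUERY

windowBid = TIMEWINDOW({SIZE = [1, 'MINUTES'],
                  advance = [1, 'SECONDS']
                  }, bid)

/// group_by and the attribute name 'bidder' are assumptions for illustration
sum_price_bidder = AGGREGATE({ID = 'aggregateId',
                              group_by = ['bidder'],
                              aggregations = [
                                ['SUM', 'price', 'sum_price_bidder']
                              ]
                            },
                            windowBid
                          )
```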
