This document describes the basic concepts of the Procedural Query Language (PQL) of Odysseus and shows how to formulate queries with it. In contrast to SQL-based languages such as the Continuous Query Language (CQL) or StreamSQL, PQL is more procedural and functional than declarative.
PQL is an operator-based language in which an operator can be seen as a logical building block of a query. A PQL query is thus a connection of several operators. Odysseus distinguishes between logical operators and the physical operators that implement them, and PQL is based on the logical operators. The query may therefore change during the transformation from the logical query plan into the physical query plan, for example through logical optimization techniques such as restructuring the logical query plan. To avoid this, you can explicitly turn off the query optimization.
An operator is used in PQL via its name and some optional settings, comparable to a function and its variables:
OPERATORNAME(parameter, operator, operator, ...)
The first variable (parameter) holds operator-dependent parameters and is used to configure the operator. Note that there is only one parameter variable! The other variables (operator) are input operators, that is, the preceding operators that push their data into this operator. The inputs of an operator can be defined directly by the definition of another operator:
OPERATOR1(parameter1, OPERATOR2(Parameter2, OPERATOR3(...)))
Except for source operators (usually the first operators of a query), each operator should have at least one input operator. A source operator has no inputs, so it can have only parameters:
OPERATOR1(parameter1)
Conversely, an operator may have only input operators but no parameters:
OPERATOR1(OPERATOR2(OPERATOR3(...)))
Alternatively, if the operator has neither parameters nor input operators, it consists only of its name (without any brackets!), so just:
OPERATORNAME
It is also possible to combine all kinds of definitions, for example:
OPERATOR1(OPERATOR2(Parameter2, OPERATOR3))
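To make the nesting concrete, here is a sketch that uses operator names instead of placeholders. SELECT with a predicate parameter and PROJECT are mentioned elsewhere in this document; the source name bid, the attribute names auction and price, and the attributes parameter of PROJECT are assumptions made for illustration only.

```pql
PROJECT({attributes=['auction', 'price']}, SELECT({predicate='price > 100'}, bid))
```

Read from the inside out: bid delivers the data, SELECT filters it, and PROJECT receives the filtered elements as its input.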
Since nesting operators may lead to unreadable code, it is possible to name operators in order to reuse intermediate results. This is done via the "=" symbol. Thus, parts of the query can be saved temporarily, for example (it is important to place blanks before and after the "=" symbol!):
Result2 = OPERATOR2(Parameter2, OPERATOR3)
The defined names can be used like operators, so they can be inserted as the input of another operator, for example:
Result2 = OPERATOR2(Parameter2, OPERATOR3)
OPERATOR1(Result2)
There can also be more than one intermediate result, as long as they have different names:
Result1 = OPERATOR1(Parameter1, …)
Result2 = OPERATOR2(Parameter2, Result1)
Result3 = OPERATOR3(Parameter3, Result2)
An intermediate name can also be used more than once, e.g. if two or more operators should get the same preceding operator:
Result1 = OPERATOR1(Parameter1, …)
OPERATOR2(Parameter2, Result1)
OPERATOR3(Parameter3, Result1)
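As a sketch of how such a shared intermediate result could look with concrete operators: the source name bid, the attribute names, and the attributes parameter of PROJECT are hypothetical and only serve as an illustration.

```pql
expensive = SELECT({predicate='price > 100'}, bid)
prices = PROJECT({attributes=['price']}, expensive)
auctions = PROJECT({attributes=['auction']}, expensive)
```

Both PROJECT operators take expensive as their input, so the selection is written only once.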
All intermediate results that are defined via "=" are only valid within the query. Thus, they are lost once the query has been parsed and is running. This can be avoided with views.
A view is defined like the previously described intermediate results but uses ":=" instead of "=", e.g.:
Result2 := OPERATOR2(Parameter2, OPERATOR3)
Such a definition creates an entry in the data dictionary, so that the view is globally accessible and can also be used in other query languages like CQL.
Alternatively, the result of an operator can also be stored as a source in the data dictionary by using "::=":
Result2 ::= OPERATOR2(Parameter2, OPERATOR3)
The difference between a view and a source is the kind of query plan that is saved in the data dictionary and reused. If a view is defined, the result of the operator is saved as a logical query plan, which consists of logical operators. Thus, if another query uses the view, the logical operators are fetched from the data dictionary and form the lower part of the new operator plan or query. If an operator is saved as a source, the result of the operator is saved as a physical query plan, which consists of already transformed and possibly optimized physical operators. Thus, reusing a source is like manual query sharing, where parts of two or more different queries are used together. Additionally, the part that belongs to the source is not considered when the new part of the query that uses the source is optimized. In contrast, the logical query plan that is used via a view is considered, but it will not necessarily lead to query sharing.
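The three kinds of definitions can be put side by side as follows. This is only a sketch: the names expensiveTmp, expensiveView and expensiveSource, the source bid, and the predicate are hypothetical.

```pql
expensiveTmp = SELECT({predicate='price > 100'}, bid)
expensiveView := SELECT({predicate='price > 100'}, bid)
expensiveSource ::= SELECT({predicate='price > 100'}, bid)
```

Under the rules described above, expensiveTmp is only known inside this query, expensiveView is stored as a logical plan, and expensiveSource is stored as a physical plan.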
In summary, all these possibilities give the following structure:
QUERY           = (TEMPORARYSTREAM | VIEW | SHAREDSTREAM)+
TEMPORARYSTREAM = STREAM "=" OPERATOR
VIEW            = VIEWNAME ":=" OPERATOR
SHAREDSTREAM    = SOURCENAME "::=" OPERATOR
As mentioned before, the definition of an operator can contain a parameter. More precisely, the parameter is a list of parameters enclosed in curly brackets:
OPERATOR({parameter1, parameter2, …}, operatorinput)
A parameter itself consists of a name and a value that are connected via "=". For example, to set the parameter port to 1234, we use the following definition:
OPERATOR({port=1234}, …)
The value can be one of the following simple types:
Integer or long: OPERATOR({port=1234}, …)
Double: OPERATOR({possibility=0.453}, …)
String: OPERATOR({host='localhost'}, …)
Furthermore, there are also some complex types:
Predicate: OPERATOR({predicate='1<1234'}, …)
Hint: In some cases the predicate must be given in the form PREDICATE_TYPE('1<1234'), where PREDICATE_TYPE can be something like RelationalPredicate.
List: OPERATOR({color=['green', 'red', 'blue']}, …) (Type of elements: integer, double, string, predicate, list, map).
Map: OPERATOR({def=['left'='green', 'right'='blue']}, …)
Map values may also be lists: OPERATOR({def=['left'=['green','red'],'right'=['blue']]}, …)
Remember: although the keys can have a different data type than the values, all keys must have the same data type and all values must have the same data type.
Note that all parameters and their types (string, integer, list, …) are defined by their operator. Therefore, it is not guaranteed that the same parameter is declared in the same way by different operators, although we aim to unify all parameters.
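Several parameters of different types can be combined in one parameter list, separated by commas. The following sketch reuses only the placeholder operator and the parameter names from the examples above; it does not refer to a real operator.

```pql
OPERATOR({port=1234, host='localhost', color=['green', 'red'], def=['left'='green', 'right'='blue']}, …)
```

Here port is an integer, host a string, color a list of strings, and def a map whose keys and values are all strings.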
Some operators have more than one output. Each output is provided via a port. The default port is 0, the second one is 1, etc. The Route operator, for example, allows you to split a stream to different output ports according to predefined predicates. To use a port other than the default, prepend the port number and a colon to the operator. For example, if you want the second output (port 1) of the select:
PROJECT({…}, 1:SELECT({predicate='1<x'}, …))
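The same port syntax could be applied to the Route operator mentioned above. This is a sketch under several assumptions: that the operator is written ROUTE and takes a predicates list, that elements matching the first predicate leave on port 0 and all others on port 1, and that bid, price and the attributes parameter of PROJECT exist as named.

```pql
PROJECT({attributes=['price']}, 1:ROUTE({predicates=['price > 100']}, bid))
```

Under these assumptions, PROJECT would only see the elements that did not satisfy the predicate.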
QUERY           = (TEMPORARYSTREAM | VIEW | SHAREDSTREAM)+
TEMPORARYSTREAM = STREAM "=" OPERATOR
VIEW            = VIEWNAME ":=" OPERATOR
SHAREDSTREAM    = SOURCENAME "::=" OPERATOR
OPERATOR        = QUERY | [OUTPUTPORT ":"] OPERATORTYPE "(" (PARAMETERLIST [ "," OPERATORLIST ] | OPERATORLIST) ")"
OPERATORLIST    = [ OPERATOR ("," OPERATOR)* ]
PARAMETERLIST   = "{" PARAMETER ("," PARAMETER)* "}"
PARAMETER       = NAME "=" PARAMETERVALUE
PARAMETERVALUE  = LONG | DOUBLE | STRING | PREDICATE | LIST | MAP
LIST            = "[" [PARAMETERVALUE ("," PARAMETERVALUE)*] "]"
MAP             = "[" [MAPENTRY ("," MAPENTRY)*] "]"
MAPENTRY        = PARAMETERVALUE "=" PARAMETERVALUE
STRING          = "'" [~']* "'"
PREDICATE       = PREDICATETYPE "(" STRING ")"
Odysseus has a wide range of built-in operators, which are explained here.
Available mining or machine learning operators are described here: Machine Learning