This page lists some of the features and components provided by Odysseus. Note that this list is not complete and may contain some functions that are not working at the moment. We distinguish between core concepts, which are generally given by the framework architecture, and additional features, which use the framework to extend Odysseus with new concepts and functionality.

...

The core feature is the basis of Odysseus. It holds everything that Odysseus needs in any case, e.g. the transformation process required by query processing. Since Odysseus is a framework, its components can be extended and configured via services and interfaces. However, the core also includes the most common implementations of the components, so that there is at least one complete system configuration. In the following, we list some of the basic framework concepts of Odysseus and describe how each framework concept is implemented by a dedicated technology or concept.

Component-based Architecture

Odysseus is based on OSGi and therefore has a component-based architecture. Most of the functionality in Odysseus can be extended or configured via services provided by the components. This also allows the system to be adapted at runtime. Since components can also be substituted, default concepts can be replaced by other or custom concepts.

...

Most systems only have one processing type (e.g. relational, XML or just strings). Odysseus can handle arbitrary processing types. The default processing type is "relational".

...

Each object that is processed by the system can be enriched with metadata. Besides fixed metadata, each object can optionally be annotated with key-value pairs. The fixed metadata is not optional, because it is used, for example, for the processing itself. The default metadata is "TimeInterval" (also referred to as "interval" or "intervalapproach"). The TimeInterval metadata provides two timestamps that indicate the start and the end of the validity of the processing object. Each operator that recognizes the TimeInterval metadata uses it to process together only those processing objects that are valid at the same time. This allows the processing of a potentially unbounded data stream by windowing the stream through time intervals. Furthermore, it is possible to use latency metadata, which is used for measuring the latency of a processing object.
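The interval idea can be illustrated with a small sketch. This is not the Odysseus API: the names `TimeInterval`, `Element` and `join` are hypothetical, and the half-open interval [start, end) is an assumption made for the example. It shows a join that only pairs elements whose validity intervals overlap and gives the result the common interval.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class TimeInterval:
    """Validity of a processing object, assumed half-open: [start, end)."""
    start: int
    end: int

    def overlaps(self, other: "TimeInterval") -> bool:
        return self.start < other.end and other.start < self.end

    def intersect(self, other: "TimeInterval") -> Optional["TimeInterval"]:
        """Return the common validity, or None if there is none."""
        if not self.overlaps(other):
            return None
        return TimeInterval(max(self.start, other.start),
                            min(self.end, other.end))


@dataclass(frozen=True)
class Element:
    payload: Any
    interval: TimeInterval


def join(left, right):
    """Pair only those elements that are valid at the same time."""
    result = []
    for l in left:
        for r in right:
            common = l.interval.intersect(r.interval)
            if common is not None:
                result.append(Element((l.payload, r.payload), common))
    return result
```

With a window of fixed length assigned as the interval of each incoming element, such a join only ever has to keep the elements whose intervals can still overlap with future input.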

...

The processing objects can be described by a schema and datatypes. This mechanism is extensible and is called the simple description framework (SDF). In the relational processing, for example, the schema describes the names (attributes) and the datatypes of the tuple (which is the processing object in the relational world). The schema can be seen as a list of attributes, where each attribute has a name and a datatype. Although there are datatypes for integer, float and so on, it is also possible to introduce your own datatype. A datatype can be anything and is only a marker that can be used by operators; neither the schema nor the datatypes are normally used by the processing itself. (Some relational operators, e.g. a projection, do need the schema, so it has to fit the processing object.)
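A minimal sketch of a schema as a list of named, typed attributes, and of a projection that needs the schema to locate attribute positions. The names `Attribute` and `project` and the datatype strings are hypothetical and only illustrate the idea; they are not the SDF classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    name: str
    datatype: str  # only a marker, e.g. "Integer", "Double" or a custom name


def project(schema, tup, names):
    """Keep only the named attributes of a tuple.

    The schema is needed to map attribute names to tuple positions.
    Raises KeyError if a requested attribute is not in the schema.
    """
    index = {attr.name: i for i, attr in enumerate(schema)}
    positions = [index[name] for name in names]
    new_schema = [schema[i] for i in positions]
    new_tuple = tuple(tup[i] for i in positions)
    return new_schema, new_tuple
```

Note that the custom datatype of an attribute is never inspected here, which matches the statement that datatypes are only markers.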

...

Transformation: Logical and Physical Operators

Odysseus distinguishes between logical and physical operators. A logical operator only says "what" the operator does, not "how". Thus, the logical layer is independent of any implementation and normally also of any processing object type (see above). The transformation converts the logical representation into a physical one. This physical counterpart provides the actual implementation. Thus, it is possible to have more than one implementation for one logical representation. A set of rules and a rule engine control how a logical operator is transformed and which physical operator is used. Since these rules can be complemented and overloaded, it is also possible to register your own rules (e.g. to transform the logical operators into physical operators for a new processing type).
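The rule-based transformation can be sketched as follows. All names (`LogicalSelect`, `RelationalSelectPO`, `Rule`, `transform`, `RULES`) are hypothetical, and choosing the matching rule with the highest priority is an assumption made for this example, not a description of the actual rule engine.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LogicalSelect:
    """Logical operator: says what to do (filter), not how."""
    predicate: Callable[[tuple], bool]


class RelationalSelectPO:
    """Physical operator: one possible implementation for relational tuples."""

    def __init__(self, predicate):
        self.predicate = predicate

    def process(self, tup):
        return [tup] if self.predicate(tup) else []


@dataclass
class Rule:
    priority: int
    matches: Callable[[object, str], bool]
    build: Callable[[object], object]


# Default rule set; further rules can be appended to overload or extend it.
RULES = [
    Rule(
        priority=0,
        matches=lambda op, ptype: isinstance(op, LogicalSelect) and ptype == "relational",
        build=lambda op: RelationalSelectPO(op.predicate),
    ),
]


def transform(logical_op, processing_type, rules=RULES):
    """Apply the matching rule with the highest priority."""
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if rule.matches(logical_op, processing_type):
            return rule.build(logical_op)
    raise LookupError("no transformation rule for this operator and processing type")
```

Adding a rule with a higher priority for the same logical operator would replace the default physical operator without touching the logical plan.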

...

A given logical plan (which is a graph based on logical operators) can be optimized by restructuring it using a set of rewrite rules. For the relational processing, for example, there is a rule that pushes a selection down towards the source, so that the number of processing objects is reduced as early as possible. These rules can also be overloaded and complemented by new ones.
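A selection push-down can be sketched on a toy plan made of nested tuples. The plan encoding and the function name `push_selection_down` are assumptions for this example. The swap is valid here because a selection placed above a projection can only refer to attributes that the projection keeps.

```python
def push_selection_down(plan):
    """Rewrite a plan so that selections sit below projections.

    Plan nodes (hypothetical encoding):
      ("source", name)
      ("select", predicate, child)
      ("project", attributes, child)
    """
    kind = plan[0]
    if kind == "source":
        return plan
    if kind == "project":
        _, attrs, child = plan
        return ("project", attrs, push_selection_down(child))
    if kind == "select":
        _, pred, child = plan
        child = push_selection_down(child)
        if child[0] == "project":
            # Swap: move the selection below the projection and keep pushing.
            _, attrs, grandchild = child
            pushed = push_selection_down(("select", pred, grandchild))
            return ("project", attrs, pushed)
        return ("select", pred, child)
    raise ValueError("unknown plan node: %r" % (kind,))
```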

Creating New Operators

...

The parser transforms a query (normally a string) into a logical query plan. Multiple parsers can be available at once; for example, there are PQL and CQL. PQL is a default parser where each operator can be expressed via a procedure. CQL is based on SQL and is similar to StreamSQL. It is also possible to integrate new languages. Furthermore, PQL can easily be extended with new operators by annotating the logical operator.

Webservice and Console Executor

The executor manages everything (it installs and runs queries and adds and removes sources). The executor is therefore the interface for external access. It can be used via code, a webservice or a console. It is also possible to extend the executor to provide a new means of access. The webservice interface, for example, can be used by other applications (even non-Java ones) to access Odysseus.

...

A dedicated script language called "Odysseus Script" makes it possible to run a set of queries or set parameters through a single script file.

Access Framework

The access framework of Odysseus is responsible for creating source and sink operators. For example, a source operator is used to connect to a sensor, i.e. to open a data socket to which the sensor can push its data. The access framework provides several layers that can be combined to build a suitable access operator. For example, there is a transport layer that describes how the data is provided (e.g. via a TCP socket, as a file or through a serial port). Based on this, a protocol handler defines how the data is represented (e.g. as lines of text or as byte buffers). This protocol handler uses a set of data handlers that define how the text or the bytes are transformed into data. All these handlers can be extended as well.
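The layering can be sketched as a chain of generators. The function names (`memory_transport`, `line_protocol`, `tuple_data_handler`, `build_source`) are hypothetical stand-ins for the transport, protocol handler and data handler layers; an in-memory list of byte chunks replaces a real socket, file or serial port.

```python
def memory_transport(chunks):
    """Transport layer stand-in: delivers raw bytes in arbitrary chunks."""
    yield from chunks


def line_protocol(chunks):
    """Protocol handler: reassembles raw chunks into lines of text."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            yield line.decode("utf-8")
    if buffer:
        # Emit a trailing line that was not terminated by a newline.
        yield buffer.decode("utf-8")


def tuple_data_handler(lines, converters, sep=","):
    """Data handlers: turn each text field into a typed value."""
    for line in lines:
        fields = line.split(sep)
        yield tuple(conv(field) for conv, field in zip(converters, fields))


def build_source(chunks, converters):
    """Combine the three layers into one source."""
    return tuple_data_handler(line_protocol(memory_transport(chunks)), converters)
```

Because each layer only depends on the output of the layer below it, a different transport or protocol can be swapped in without changing the data handlers.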

...

Odysseus is a multi-user system. For this, each installed query or source is assigned to a user. Since more than one user can access the system, it is also possible to grant special rights to other users or to revoke them.

...

Punctuations / Heartbeat Mechanism

Odysseus has a built-in punctuation mechanism (in the relational processing). The window concept of the interval approach may cause operators to block, because a processing step of an operator needs further elements to produce results. If there are no further elements, the processing blocks. At this point, punctuations indicate that the stream is still "alive" but that there are currently no elements (thus, they are also called heartbeats). Punctuations are therefore used to unblock the operator earlier.
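The unblocking effect can be sketched with a toy windowed sum. The class `TumblingSum` and its methods are hypothetical, and the fixed-size, non-overlapping window is an assumption chosen to keep the example short. The operator can only emit a window once it knows that time has passed the end of that window; a punctuation delivers this knowledge without carrying data.

```python
class TumblingSum:
    """Sums values per window of fixed size; emits (window_start, sum)."""

    def __init__(self, size):
        self.size = size
        self.window_start = 0
        self.total = 0

    def _advance(self, timestamp):
        """Close every window that ends at or before the given timestamp."""
        closed = []
        while timestamp >= self.window_start + self.size:
            closed.append((self.window_start, self.total))
            self.window_start += self.size
            self.total = 0
        return closed

    def on_element(self, timestamp, value):
        closed = self._advance(timestamp)
        self.total += value
        return closed

    def on_punctuation(self, timestamp):
        # No data, only progress of time: this is what unblocks the operator.
        return self._advance(timestamp)
```

Without the punctuation, the first window would stay open until the next data element arrives, however long that takes.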

...

Additional Features

Admission Control Feature

The admission control pays attention to the system load when new queries are added. For example, if the system load is too high, a new query can be refused by the system. For this, it estimates the load of a query by measuring the selectivity and data distribution.
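A rough sketch of such a decision, under a deliberately simple cost model that is an assumption of this example and not the model used by Odysseus: each operator in a chain has a cost per element and a selectivity, and the load of a query is the sum of input rate times cost over all operators. The names `estimate_load` and `admit` are hypothetical.

```python
def estimate_load(input_rate, operators):
    """Estimate the load of a linear query plan.

    operators: list of (cost_per_element, selectivity) pairs, in plan order.
    The selectivity of an operator reduces the rate seen by the next one.
    """
    rate = input_rate
    load = 0.0
    for cost_per_element, selectivity in operators:
        load += rate * cost_per_element
        rate *= selectivity
    return load


def admit(current_load, query_load, capacity):
    """Accept the new query only if the system stays within its capacity."""
    return current_load + query_load <= capacity
```

A selective operator early in the plan lowers the estimate for everything after it, which is why the measured selectivity matters for the decision.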

Action Feature

...

Although Odysseus is designed for streaming data, there may also be some static data that is (usually) stored in a database. Therefore, this feature provides drivers and interfaces for opening connections, e.g. to a MySQL, Oracle or PostgreSQL database. On the one hand, there are a source and a sink operator to read static data from the database as a stream or to write a stream to it. On the other hand, there is an enrich operator that enriches streaming objects with static data from a database.

...

This feature provides concepts for learning from data streams. Besides clustering and classification, there is also a concept for learning frequent patterns from the stream.

...

This feature distributes Odysseus over a network of peers.

Probabilistic Feature

This feature allows the processing of discrete and continuous probabilistic values.

Prototyping Feature

With this feature, it is possible to write "user defined aggregates" and "user defined functions" in several script languages like JavaScript or Ruby.

Scheduling Feature

As mentioned above, the scheduling feature installs several scheduling mechanisms, such as priority-based or Aurora-based scheduling.

...

Security punctuations are points in a data stream that tell the processing who is allowed to read the data. Thus, the stream itself can define rights like "as from now you are not allowed to see the next 100 tuples".
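The idea can be sketched as a filter over a stream in which security punctuations and tuples are interleaved. The stream encoding, the role-based rights and the function name `filter_for` are assumptions for this example; the actual feature may express rights differently.

```python
def filter_for(user_roles, stream):
    """Yield only the tuples the given user is allowed to read.

    stream items (hypothetical encoding):
      ("sp", allowed_roles)  security punctuation, valid until the next one
      ("tuple", data)        regular stream element
    """
    allowed = False  # assumption: deny until the first security punctuation
    for kind, value in stream:
        if kind == "sp":
            allowed = bool(set(user_roles) & set(value))
        elif allowed:
            yield value
```

Since the rights travel inside the stream, they can change between any two tuples without reconfiguring the query.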

...

This feature allows the definition of service level agreements (SLAs), which are used, for example, for SLA-based scheduling.

SPARQL Feature

...

The spatial feature introduces spatial data types like points, lines, polygons etc., and defines some functions on them, e.g. covers, crosses etc. Thus, geographic processing is possible.

...