Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

The operator builds a tree from the incoming tuples and counts for each node, how often it was used. With this information, it can calculate the probability for each node in comparison to its siblings and therefore for each path in the tree. If the probability is below the value given by the user, the current tuple is marked as anomaly and will be reported.

Parameters

...

  • treeDepth The maximum depth of the tree. Default is 2.
  • minRelativeFrequencyPath The minimal relative frequency of that path. The default is 0.3 (which means 30%).
  • minRelativeFrequencyNode The minimal relative frequency of all single nodes of the path. The default is 0.3 (which means 30%)
  • firstTupleIsRoot If true, tuples which are equal to the first tuple are always the root. It could be good to set the maxdepth higher than the expected number of states between two occurences of the root state.
  • uniqueBackupId A unique ID for this operator to save and read backup data.

Example

Code Block
linenumberstrue
stateAnalysis = RARESEQUENCE({
                    treedepth = 100,
                    minrelativefrequencyPath = 0.1,
					minrelativefrequencyNode = 0.3,
					firsttupleisroot = 'true'
                  },
                  state
                )

Using the backup functionality

The operator can save the learned tree into a database via the second output port. It continuously put out a serialized version of the tree which can be used to save it in a database, e.g. a MongoDB. When the analysis is started again, the information from the database can be used. This is useful, if a lot of knowledge is learned and if it would take a while to gain this knowledge again. On the startup, the information from the database is read a single time (if an information is available) and written into the operator. The PQL query in the codeblock shows how to use the backup functionality.

Code Block
linenumberstrue
#PARSER PQL
#DEFINE BACKUPSCHEMA [['tree', 'String'],['backupId', 'String']]
#RUNQUERY
/// Backup-Daten aus der Datenbank auslesen
backupMongo = MONGODBSOURCE({
                  database = 'odysseus',
                  port = 27017,
                  host = 'localhost',
                  collectionname = 'condition'
                }
              )
/// Backup-Daten in Tupel konvertieren
backupTuple = KEYVALUETOTUPLE({
            schema=${BACKUPSCHEMA},
            TYPE = 'Backup',
            KEEPINPUT = 'false'
          },
          backupMongo
        )
stateAnalysis = RARESEQUENCE({
                    treedepth = 100,
                    minrelativefrequencyPath = 0.1,
                    firsttupleisroot = 'true',
                    UNIQUEBACKUPID = 'rareSequence_1'
                  },
                  state,
                  backupTuple
                )
                
analysis = SELECT({PREDICATE = 'anomalyScore > 0'}, stateAnalysis)
                
/// Backup-Daten von Port 1 in ein Key-Value Objekt konvertieren
keyValueOp = TUPLETOKEYVALUE({
                  type='KEYVALUEOBJECT'                            
                },
                1:stateAnalysis
              )

/// Backup-Daten in der Datenbank speichern
mongoSink = MONGODBSINK({
                database = 'odysseus',
                port = 27017,
                host = 'localhost',
                collectionname = 'condition',
                batchsize = 1,
                deletebeforeinsert = 'true',
                deleteequalattribute = 'backupId'
              },
              keyValueOp
            )