
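Two operators are used to build a recommender system with Odysseus: RECOMMENDATION_LEARN, which learns recommendation models from a stream of ratings, and RECOMMENDATION, which applies these models to a stream of users to produce recommendations.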

Example of how to use the operators

This example uses the MovieLens 100k dataset (ml-100k).

The file u_ordered.data contains the ratings ordered by timestamp. The ordering is not required, but it allows implementations to take advantage of temporal effects, e.g. concept drift.

The file unique_temporal_ordered_users.data contains only the user column of u_ordered.data, with duplicates removed.
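
The two input files described above could be produced with a short script. The following is a minimal Python sketch, not the procedure that was actually used for this example. It assumes the original MovieLens rating file is tab-separated with the columns user, item, rating, timestamp (matching the stream schema below), and that the user file lists each user in order of first appearance in the ordered ratings. The function names and the file paths passed to prepare are hypothetical.

```python
import csv


def order_ratings(rows):
    """Sort (user, item, rating, timestamp) rows by timestamp.

    sorted() is stable, so rows with equal timestamps keep their input order.
    """
    return sorted(rows, key=lambda row: int(row[3]))


def unique_users(rows):
    """Return the user ids in order of first appearance, duplicates removed.

    Assumption: this is how unique_temporal_ordered_users.data was derived.
    """
    seen = set()
    users = []
    for row in rows:
        if row[0] not in seen:
            seen.add(row[0])
            users.append(row[0])
    return users


def prepare(src_path, ordered_path, users_path):
    """Read the raw ratings file and write the two files used in the example.

    All three paths are supplied by the caller (hypothetical helper).
    """
    with open(src_path, newline="") as f:
        rows = [row for row in csv.reader(f, delimiter="\t") if row]

    ordered = order_ratings(rows)

    with open(ordered_path, "w", newline="") as f:
        csv.writer(f, delimiter="\t", lineterminator="\n").writerows(ordered)

    with open(users_path, "w", newline="") as f:
        f.writelines(user + "\n" for user in unique_users(ordered))
```

A call such as prepare('u.data', 'u_ordered.data', 'unique_temporal_ordered_users.data') would then write both files next to the raw ratings file.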

#PARSER CQL

#RUNQUERY
CREATE STREAM ml100k (userid Integer, itemid Integer, rating Double, timestamp Long)
   WRAPPER 'GenericPull'
   PROTOCOL 'CSV'
   TRANSPORT 'File'
   DATAHANDLER 'Tuple'
   OPTIONS (
      'filename' '${PROJECTPATH}/datasets/ml-100k/u_ordered.data',
      'delimiter' '\t',
      'scheduler.delay' '100'
   )

#RUNQUERY
CREATE STREAM ml100k_users (userid Integer)
   WRAPPER 'GenericPull'
   PROTOCOL 'CSV'
   TRANSPORT 'File'
   DATAHANDLER 'Tuple'
   OPTIONS (
      'filename' '${PROJECTPATH}/datasets/ml-100k/unique_temporal_ordered_users.data',
      'delimiter' '\t',
      'scheduler.delay' '1000'
   )

#PARSER PQL

#ADDQUERY
recommendationModels = RECOMMENDATION_LEARN(
   {
      item = 'itemid',
      user = 'userid',
      rating = 'rating',
      learner = 'Mahout',
      options = [
         'OptionRecommender'='SVDRecommender',
         'OptionFactorizer'='SVDPlusPlusFactorizer'
      ]
   },
   ml100k)

#ADDQUERY
recommendations = RECOMMENDATION(
   {
      recommender = 'recommender',
      user = 'userid',
      no_of_recommendations = 5
   },
   ml100k_users,
   recommendationModels)