» Documentation

RSSFeedReader

RSS feed example demonstrates how to use custom .NET plugin.

Initialize source database

Execute "solution.zip\SQLScripts\RSSFeed.sql". That's it.

The SQL script:

  • Creates a Feed sources table and inserts pre-selected feeds (Reuters, Yahoo and CNN). Reuters uses higher degree of categorization than CNN and Yahoo (the Feed column).
  • Defines 6 Log tables (Facts) with PARTITION by PublishDate field.
  • Implement SynchronizeBlock procedure. The inner procedure SynchronizeSource uses managed plugin FeadReader.dll to access RSS feed. The source code of the plugin is included in the solutions (solution.zip/FeedReader).

    To test plugin, execute SQL/DQL query:

    execute dql
    begin
        plugin([#FeedReader]:[#FeedReader.RSSReader]) 
        'http://rss.cnn.com/rss/edition.rss',
        0
        to @Results
        project by @this.title, @this.summary, @this.id, @this.publishDate
    end 
  • Registers RSSFeed source with metadata and add dimensions. AddSource procedures' parameters:

    • RSSFeed - the name of source. By default the source name binds to "EnitytPath" dimension. "EnitytPath" is a built-in dimension that optionally can be expanded for additional categorization.
    • PublishDate - the Log column name representing the time. The meatada creates the default dimension "Created" and binds facts' PublishDate with the dimension.
    • join RSSFeed.[sources] [sources] on ([log].feedId=[sources].feedId) - this is the SQL fragment used by metadata to join Log tables with additional table(s) (in this case [sources]) in order to access additional information for dimensions.

For example, we may register EntityPath dimension by projecting [sources] table's values feed & category:

execute AddDimension 'RSSFeed','EntityPath', 'concat([sources].[feed],\\\'/\\\',[sources].[category])','VARCHAR',4 

Where, for each fact, the EntityPath will represent the place in hierarchy: EntityPath='market/stocks/Reuters' or, EntityPath='market/bonds/Bloomberg', etc...

AddDimension procedure accepts 4 parameters:

  • Source name
  • Dimension name. The dimension name doesn't have to be cross Sources Unique. Metadata has built-in dimensions EntityPath and LogText:

    • LogText dimension. When you bind the facts column with LogText dimension , the content of the field will be freetext searchable.
    • By default EntityPath will be mapped to Source name. However, EntityPath, can be optionally expanded to identify the fact hierarchy.
  • SQL script to map source column to the dimension. The SQL expression can be the name of the field ([log].[myfield]), CASE expression, functions like RegEx (for ascii)/WRegex (unicode) to map sub-text, etc...

For example (AccountID dimension mapped by regex expression):

execute AddDimension 'RSSFeed','AccountID','regex(\\\'\\b[Aa]ccount[=:\\s]+([\\w|/]+)\\\',[log].[Summary])','VARCHAR',6 
  • Dimension type. For now use only "VARCHAR".
  • Dimension visibility & indexing action mask. The mask actions:
    • 1 - show the dimension in the search output
    • 2 - show as searcharble
    • 4 - index the dimension. The fact field's values will be indexed and searchable in the multidimensional index.
    • 8 - in search output mark the value with hyperlink to all the related information.