Configure a connection to a news service

This article explains how to connect Liberator to a news service integration adapter. The adapter used in this article is the TREP Adapter.

We’ve assumed that you use the Caplin Deployment Framework to manage your installation, and that you’ve already installed the Deployment Framework, Liberator and Transformer. For more information on installing the Deployment Framework and deploying Caplin components to it, see the Deployment Framework documentation.

Install the TREP Adapter

To install the TREP Adapter and connect it to a news service, follow the instructions in Installing the TREP Adapter.

Configure Liberator

To configure Liberator, work through the sections below.

Configure the news-headline cache

News headlines and news stories are handled differently by Liberator:

  • News headlines are broadcast by the TREP Adapter to Liberator, where they are cached and served on a subscription basis to clients. Liberator does not issue requests or queries for headlines. If a news headline is not in Liberator’s cache it cannot be served to clients and it cannot be searched for by clients.

  • News stories, the content behind the headlines, are requested by Liberator directly from the TREP Adapter by active subscription.

The standard configuration for Liberator’s in-memory cache of headlines should be sufficient for most purposes. By default, up to 500 headlines are cached in memory, with no purging of old headlines. When the cache is full, the next headline is accommodated by dropping the oldest headline from the cache.
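The drop-oldest behaviour described above can be pictured with a small model. This is only an illustrative Python sketch, not Liberator's implementation; the class name HeadlineCache and its methods are hypothetical.

```python
from collections import deque


class HeadlineCache:
    """Toy model of a bounded headline cache (hypothetical, for illustration)."""

    def __init__(self, max_items=500):
        # A deque with maxlen discards its oldest entry when a new one
        # is appended to a full deque.
        self._items = deque(maxlen=max_items)

    def add(self, headline):
        self._items.append(headline)

    def headlines(self):
        # Oldest first, newest last.
        return list(self._items)


# Use a tiny cache so the eviction is easy to see.
cache = HeadlineCache(max_items=3)
for text in ["A", "B", "C", "D"]:
    cache.add(text)

print(cache.headlines())  # ['B', 'C', 'D']
```

With a slow feed, headline "B" could stay in such a cache indefinitely, which is the situation the purge settings address.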

If you have a slow-moving news feed, the oldest headlines may crowd headline searches with irrelevant news. To prevent this from happening, you have the option of purging old headlines from the cache using the configuration items news-purge-days and news-purge-time. Note that Liberator’s news-headline cache and Liberator’s general object cache are two different caches and use separate sets of configuration items.

Here’s an example of how to configure purging of the news headline cache:

# Schedule a time for a daily purge of old news headlines.
# Purge headlines every morning at 4 a.m. (240 minutes after midnight).
news-purge-time   240

# Specify how many days' worth of headlines to retain when purging.
# Keep the most recent 2 days of headlines and purge the rest.
news-purge-days   2
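As a rough illustration of what this purge does, the Python sketch below keeps only headlines received within the retention window. The cutoff rule used here (the purge run time minus news-purge-days whole days) is an assumption made for the example, and the function purge_old_headlines is hypothetical; Liberator's exact cutoff may differ.

```python
from datetime import datetime, timedelta


def purge_old_headlines(headlines, purge_run_at, purge_days):
    """Return the headlines that survive a purge.

    headlines is a list of (received_at, text) tuples.
    Assumed rule: keep anything received in the last purge_days days.
    """
    cutoff = purge_run_at - timedelta(days=purge_days)
    return [(ts, text) for ts, text in headlines if ts >= cutoff]


# A purge that runs at 4 a.m. (240 minutes after midnight), keeping 2 days.
purge_run_at = datetime(2024, 1, 10, 4, 0)
headlines = [
    (datetime(2024, 1, 7, 9, 0), "three days old"),
    (datetime(2024, 1, 9, 9, 0), "yesterday"),
    (datetime(2024, 1, 10, 3, 0), "this morning"),
]

kept = purge_old_headlines(headlines, purge_run_at, purge_days=2)
print([text for _, text in kept])  # ['yesterday', 'this morning']
```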

If you have a fast-moving news feed, the default cache size may be too small to hold every current headline, and Liberator may start dropping headlines that are still relevant to users. To prevent this, you can increase the size of Liberator’s headline cache using the configuration item newsitems-saved.

# Cache 1000 headlines in memory (500 is the default)
newsitems-saved   1000

Configure the news log and news-log replay

Liberator caches a maximum of newsitems-saved headlines in memory, but by default this cache is lost when Liberator is shut down or restarted. However, you can configure Liberator to log headlines and replay their arrival on startup, returning Liberator’s headline cache to the state it was in before shutdown.

To enable replaying of logged headlines:

  1. Enable logging of news headlines by specifying a file name for the current news log (see news-log).

  2. Enable log replay on startup by specifying a point in the past from which to begin replaying news headlines (see news-replay and news-replay-days).

  3. Optionally, specify archived logs to be replayed in addition to the current news log (see news-replay-files and add-log).

Enable logging of news headlines

To enable logging of news headlines, specify a name for the current news log file using the news-log configuration item. The conventional name for the current news log is news.log.

# Enable logging of news headlines
news-log            news.log

Enable replay of the news log

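You enable log replay by specifying a point in the past relative to the time Liberator is started. Two configuration items are used: news-replay and news-replay-days. How Liberator interprets these configuration items depends on the value of news-replay. A zero or positive value is read as a time of day, in minutes after midnight, on the day that is news-replay-days days before startup. A negative value is read as a number of minutes before startup.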

Here are some examples that implement these behaviours:

# Example 1: Replay all headlines received after 12 a.m. (0 minutes after midnight) yesterday
news-replay         0
news-replay-days    1

# Example 2: Replay all headlines received after 1 a.m. (60 minutes after midnight) two days ago
news-replay         60
news-replay-days    2

# Example 3: Replay all headlines received within the last 1440 minutes (24 hours)
news-replay         -1440
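To make the arithmetic behind the three examples concrete, here is a Python sketch that computes the replay start time from a startup time. It reflects only the behaviour shown in the examples above; the function replay_start is hypothetical, and the default of 0 for news_replay_days is an assumption made for the sketch, not a documented default.

```python
from datetime import datetime, timedelta


def replay_start(startup, news_replay, news_replay_days=0):
    """Return the point in time from which headlines would be replayed."""
    if news_replay < 0:
        # Negative value: that many minutes before startup.
        return startup + timedelta(minutes=news_replay)
    # Zero or positive value: minutes after midnight, a number of days back.
    midnight = startup.replace(hour=0, minute=0, second=0, microsecond=0)
    return (midnight
            - timedelta(days=news_replay_days)
            + timedelta(minutes=news_replay))


startup = datetime(2024, 1, 10, 8, 30)

print(replay_start(startup, 0, 1))   # Example 1: 2024-01-09 00:00:00
print(replay_start(startup, 60, 2))  # Example 2: 2024-01-08 01:00:00
print(replay_start(startup, -1440))  # Example 3: 2024-01-09 08:30:00
```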

Optionally, specify archived logs to be replayed in addition to the current news log

By default, only the headlines in the current news log are replayed. The current news log is cycled daily, so it contains at most 24 hours of headlines and usually fewer. To guarantee replay of at least 24 hours’ worth of headlines, you need to replay the news log archives too.

Use the configuration item news-replay-files to specify archived news logs for replay. news-replay-files requires that news log files be listed in a fixed chronological order, which is not compatible with the default settings for log-cycling.

By default, the current news log is cycled daily at 4 a.m. to an archive suffixed with a numeral representing the day the current news log was created (Mon=1, Sun=7). For example, on Tuesday at 4 a.m. the current news log (created on Monday) is archived as news.1 and a new current news log is created as news.log. The archive naming scheme is cyclical: as one week transitions to the next, the previous week’s archives are overwritten with the archives for the new week. The age of an archive relative to other archives changes through the cycle: news.1 (Monday) will be the oldest archive on Monday after 4 a.m. and the youngest archive on Tuesday after 4 a.m.
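The Python sketch below models the default naming scheme described above, to show why the chronological order of the archives shifts from day to day. It assumes a daily cycle at 4 a.m. and the Mon=1 to Sun=7 suffixes; the function names are hypothetical and this is not Liberator code.

```python
from datetime import datetime, timedelta


def archive_name(cycle_time, base="news"):
    """Name of the archive written by the cycle at cycle_time."""
    # The suffix is the weekday on which the archived log was created,
    # which is the day before the cycle.
    created_on = cycle_time - timedelta(days=1)
    return f"{base}.{created_on.isoweekday()}"


def archives_oldest_first(now, base="news"):
    """List the seven archives in chronological order as of now."""
    last_cycle = now.replace(hour=4, minute=0, second=0, microsecond=0)
    if now < last_cycle:
        last_cycle -= timedelta(days=1)
    return [archive_name(last_cycle - timedelta(days=d), base)
            for d in range(6, -1, -1)]


# 2024-01-08 is a Monday and 2024-01-09 is a Tuesday.
print(archive_name(datetime(2024, 1, 9, 4, 0)))
# news.1
print(archives_oldest_first(datetime(2024, 1, 8, 9, 0)))
# ['news.1', 'news.2', 'news.3', 'news.4', 'news.5', 'news.6', 'news.7']
print(archives_oldest_first(datetime(2024, 1, 9, 9, 0)))
# ['news.2', 'news.3', 'news.4', 'news.5', 'news.6', 'news.7', 'news.1']
```

Because the oldest-first order changes every day, no single fixed list of these file names stays in chronological order.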

Before you can use news-replay-files to include archived news logs, you’ll need to use add-log to override the default log-cycling settings for the news log. Here’s an example configuration that lengthens the period between log-cycles and avoids a cyclical naming-scheme by specifying only one archive: news.old

# Cycle the news log every 7 days (10080 minutes) at 4 a.m. (240 minutes)
# Retain only one archive (news.old)
add-log
    name               news_log
    time               240
    period             10080
    suffix             .old
end-log

You can now specify a list of news logs, in fixed chronological order, to replay at server startup:

news-replay-files     news.old news.log
