Content and Engine Updates, and how to implement the required tasks

The SCA supports two types of updates:

Content Updates

These include updates for the URL or mail signature database, category updates and updates for other classifiers and modules. The updates are provided by our update servers.

In general, these updates are applied inside the SCA and take effect immediately.

Note:
The categories XML schemata can also be updated, for instance when new categories or locales are supported for a classifier. Never hardcode the available categories and locales in your source code. To always get the current categories XML schemata for the applied classifiers, use the function dca::DcaInstance::getCategoriesInfo.

Engine Updates

These include updates to binary modules (shared objects) and may also contain files in the init directory.
Generally, this type of update is applied only after the SCA is re-initialized.

How to implement updates in a client application

Two tasks are related to the update mechanism of the SCA: the update task and the scheduler task, both described below.

Each of these tasks is implemented independently in the SCA, and each has to be called from its own dedicated thread.

This also guarantees that smaller (but higher prioritized) updates are applied immediately, even when a large database update is currently being downloaded.

The two threads must call two different SCA functions in a loop.

If a task fails, an error code is returned to the calling thread and the SCA needs to be restarted, or at least the affected classes have to be re-initialized.
This would happen very rarely and only under exceptional circumstances, as errors related to updates and scheduling are trapped and handled internally.

We provide working samples that show how to implement this in detail. Please refer to samples/urldbsample_extended or samples/customdbsample_extended. These samples implement both tasks using two dedicated threads.
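The two-thread arrangement described above can be sketched without the SCA headers. The following is a minimal illustration only, not the implementation used in the shipped samples: WorkerState, TaskCall, runTaskLoop and runBothTasks are hypothetical names introduced here, and each TaskCall is merely a stand-in for one call into the SCA that reports success or failure.

```cpp
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-in for one SCA task call; returns false on failure.
using TaskCall = std::function<bool()>;

// Shared state between the application and the two worker threads.
struct WorkerState {
    std::atomic<bool> inShutdown{false};   // set by the application to stop both loops
    std::atomic<bool> needsReinit{false};  // set by a worker when its task call fails
    std::atomic<int>  updateCalls{0};      // number of update task calls made
    std::atomic<int>  scheduleCalls{0};    // number of scheduler task calls made
};

// Calls `task` in a loop until shutdown is requested or the task fails.
void runTaskLoop(WorkerState& state, const TaskCall& task,
                 std::atomic<int>& callCounter,
                 std::chrono::milliseconds pause)
{
    while (!state.inShutdown.load()) {
        ++callCounter;
        if (!task()) {
            // A failed task means a restart or re-initialization is required.
            state.needsReinit.store(true);
            return;
        }
        std::this_thread::sleep_for(pause);
    }
}

// Runs both loops on dedicated threads for `runFor`, then shuts them down.
// Returns true if neither task reported an error.
bool runBothTasks(WorkerState& state, const TaskCall& updateCall,
                  const TaskCall& scheduleCall,
                  std::chrono::milliseconds runFor)
{
    const std::chrono::milliseconds pause(5);
    std::thread updateThread([&] {
        runTaskLoop(state, updateCall, state.updateCalls, pause);
    });
    std::thread schedulerThread([&] {
        runTaskLoop(state, scheduleCall, state.scheduleCalls, pause);
    });

    std::this_thread::sleep_for(runFor);
    state.inShutdown.store(true);
    updateThread.join();
    schedulerThread.join();
    return !state.needsReinit.load();
}
```

In a real client the two callables would wrap the calls to performUpdate() and schedule() shown in the sample section further down, and the pause between calls would be in the range of seconds rather than milliseconds.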

Update Task

The update task checks for available updates for files related to all currently initialized classifiers and modules, and downloads any updates found. Additionally, the update task uploads the collected unknown URLs if the Feedback mechanism feature in a URL Classifier is turned on.

If an update is very large, e.g. a URL or mail signature database update, the update file is not downloaded immediately. Instead, the download is scheduled.
The scheduled download is started the next time the scheduler task is performed.

Note:
Updates available for non-initialized classifiers or modules will not be downloaded.

Database updates may become very large when the current database is very old. In this case the update server may suggest that a complete new database should be downloaded, instead of many update files. This will also be done in the background via the scheduler task.

For a regular update file, the update task downloads the file and applies it immediately. Information on downloaded and installed updates can be examined using the return structure of the API function dca::UpdateModule::performUpdate(). This structure also indicates whether the updates have been applied immediately and whether the SCA needs to be re-initialized.
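How a client might act on this information can be sketched as follows. The layout of the real result structure is not shown in this document, so UpdateEntrySketch, UpdateSummary and summarize are hypothetical names that only model the two facts mentioned above (applied immediately or not, re-initialization needed or not). Treat this as an illustration of the decision logic, not as the actual API.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of the update results. The real
// dca::UpdateResults layout is not documented here and may differ.
struct UpdateEntrySketch {
    std::string name;         // which file or module was updated
    bool appliedImmediately;  // true if the update is already in use
    bool needsReinit;         // true if it takes effect only after re-initialization
};

// Aggregated view that the update thread can act on after each call.
struct UpdateSummary {
    int applied = 0;              // updates already in effect
    int pending = 0;              // updates downloaded but not yet in effect
    bool reinitRequired = false;  // at least one update needs a re-initialization
};

// Walks all entries once and collects the counts and the re-init flag.
UpdateSummary summarize(const std::vector<UpdateEntrySketch>& entries)
{
    UpdateSummary summary;
    for (const auto& entry : entries) {
        if (entry.appliedImmediately) {
            ++summary.applied;
        } else {
            ++summary.pending;
        }
        if (entry.needsReinit) {
            summary.reinitRequired = true;
        }
    }
    return summary;
}
```

An update thread could check reinitRequired after each call and hand the re-initialization over to the part of the application that owns the SCA objects, rather than re-initializing from inside the loop.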

Scheduler Task

This task downloads the large update files and applies them in the background. The frequent database updates downloaded by the update task are installed inside the scheduler task.
Additionally, this task performs the merge process for a Custom Database (see Custom Database Module) and local databases (URL and mail).
Implementing the scheduler task is necessary whenever you want to use URL classification functions. If the scheduler task is not implemented, the signature databases will not be updated.

Schedule Subscribers

A schedule subscriber is an object which receives notifications when certain schedule tasks begin or end, or when new progress information is available. A schedule subscriber object can be passed to the dca::DcaInstance::schedule() function as an optional parameter.

To implement a schedule subscriber, derive a new class from the abstract class dca::ScheduleEventSubscriberIntf and implement the dca::ScheduleEventSubscriberIntf::onEvent() method.
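Because the scheduler task runs on its own thread, a subscriber that shares data with the rest of the application may need to protect that data. The sketch below assumes this and records events behind a mutex. It does not use the SCA headers: the sketch namespace holds stand-ins that mirror only the onEvent() signature shown in the sample section of this page, the two integer typedefs are placeholders for the real types, and RecordingSubscriber is a hypothetical class name.

```cpp
#include <mutex>
#include <string>
#include <vector>

namespace sketch {
// Placeholders: the real types are declared in the SCA headers.
typedef int ScheduleActionType;
typedef int ScheduleModuleId;

// Stand-in for the abstract subscriber interface.
class ScheduleEventSubscriberIntf {
public:
    virtual ~ScheduleEventSubscriberIntf() {}
    virtual void onEvent(ScheduleActionType actionType,
                         ScheduleModuleId moduleId,
                         const std::string& version,
                         const std::string& text) = 0;
};
} // namespace sketch

// Records every event as one text line; safe to read from another thread.
class RecordingSubscriber : public sketch::ScheduleEventSubscriberIntf {
public:
    void onEvent(sketch::ScheduleActionType actionType,
                 sketch::ScheduleModuleId moduleId,
                 const std::string& version,
                 const std::string& text) override
    {
        std::lock_guard<std::mutex> lock(mutex_);
        lines_.push_back(std::to_string(moduleId) + ":" +
                         std::to_string(actionType) + " " +
                         version + " " + text);
    }

    // Returns a copy so the caller can inspect it without holding the lock.
    std::vector<std::string> snapshot() const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        return lines_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::string> lines_;
};
```

Returning a copy from snapshot() keeps the lock short, which matters if events arrive while a large download is reporting progress.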

Sample pseudo implementation of the Update and Scheduler tasks

Sample of a worker function of a dedicated update thread

#include <chrono>   // std::chrono::milliseconds
#include <thread>   // std::this_thread::sleep_for

void updateTask( void *myData )
{
        // non-const, so that non-const member functions can be called
        MyDcaContext *myDcaContext =
                static_cast< MyDcaContext * >( myData );

        if( myDcaContext ) {
                while( !myDcaContext->inShutdown ) {
                        dca::UpdateResults myUpdateResults;

                        dca::FunctionResult fr =
                                myDcaContext->updateModule.performUpdate( false,
                                        myUpdateResults );

                        if( !fr ) {
                                // got an error... try to re-initialize the DCA
                                return;
                        }
                        // enumerate myUpdateResults...

                        // wait 1 second until next call to performUpdate()
                        std::this_thread::sleep_for(
                                std::chrono::milliseconds( 1000 ) );
                }
        }
}

Sample of a worker function of a dedicated scheduler thread

#include <chrono>   // std::chrono::milliseconds
#include <string>   // std::string
#include <thread>   // std::this_thread::sleep_for

// subscriber to catch the events of the DcaInstance::schedule function
class MyScheduleEventSubscriber : public dca::ScheduleEventSubscriberIntf
{
public:
        MyScheduleEventSubscriber() { }
        virtual ~MyScheduleEventSubscriber() { }

        virtual void onEvent( dca::ScheduleActionType actionType,
                dca::ScheduleModuleId moduleId, const std::string& version,
                const std::string& text )
        {
                // Handle the event...
        }
};

void schedulerTask( void *myData )
{
        // non-const, so that non-const member functions can be called
        MyDcaContext *myDcaContext =
                static_cast< MyDcaContext * >( myData );

        if( myDcaContext ) {
                while( !myDcaContext->inShutdown ) {
                        MyScheduleEventSubscriber mySubscriber;
                        dca::FunctionResult fr =
                                myDcaContext->dcaInstance.schedule( &mySubscriber );
                        if( !fr ) {
                                // got an error... try to re-initialize the DCA
                                return;
                        }
                        // wait 5 seconds until next call to schedule()
                        std::this_thread::sleep_for(
                                std::chrono::milliseconds( 5000 ) );
                }
        }
}

Generated on 26 Sep 2016 for dca_interface by  doxygen 1.6.1