The main goal of the SCA API is to make very complex functionality available through an easy-to-use API.

Overview of packages and modules

The SCA comes with several packages such as

: To access any of the functions of a module, the module must first be loaded by using your license (ticket data).

: After the module has been loaded you can create a classifier to analyze/classify the related data objects.

Create a dca::UrlClassification object (using license data)
Create a dca::UrlDbClassifier object
For each dca::Url to classify:
- Create a dca::Url object with your URL string
- Create a dca::UrlClassificationResults object
- Call the dca::UrlDbClassifier::classify() method
- Enumerate the dca::UrlClassificationResults

Create a dca::TextClassification object (using license data)
Create a dca::HtmlTextClassifier object
For each dca::HtmlText to classify:
- Create a dca::HtmlText object with your HTML file contents
- Create a dca::TextClassificationResults object
- Call the dca::HtmlTextClassifier::classify() method
- Enumerate the dca::TextClassificationResults

: Parsing the results of a Text Classification or a URL Classification works in much the same way.

AnyClassificationResults myResults;
dca::FunctionResult fr = myClassifier.classify(object, myResults);
if (!fr) // error occurred
    return fr.getReturnCode();
if (!myResults.isCategorized())
    return DCA_SUCCESS;
for (DCA_INDEX_TYPE i = 0; i < myResults.size(); ++i) {
    AnyClassificationResult myResult = myResults[i];
    ...
}
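
: The control flow above can be tried out without the SDK by using small stand-in types. The following is only a sketch under stated assumptions: everything in the `sketch` namespace (FunctionResult, Results, classify(), parse() and the numeric codes) is a hypothetical stand-in written for this illustration, not the real dca:: API, and the toy classifier simply flags inputs that contain "news".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {  // hypothetical stand-ins, NOT the real dca:: classes

const int SUCCESS = 0;          // stand-in for DCA_SUCCESS
const int ERR_EMPTY_INPUT = 1;  // made-up error code for this sketch

// Return-code wrapper that converts to true on success, so `!fr` means "error".
class FunctionResult {
public:
    explicit FunctionResult(int code) : code_(code) {}
    explicit operator bool() const { return code_ == SUCCESS; }
    int getReturnCode() const { return code_; }
private:
    int code_;
};

struct Result { int categoryId; };

class Results {
public:
    bool isCategorized() const { return !items_.empty(); }
    std::size_t size() const { return items_.size(); }
    const Result& operator[](std::size_t i) const {
        assert(i < items_.size());  // caller must stay within size()
        return items_[i];
    }
    void add(int categoryId) { items_.push_back(Result{categoryId}); }
private:
    std::vector<Result> items_;
};

// Toy classifier: fails on empty input, categorizes only inputs containing "news".
FunctionResult classify(const std::string& input, Results& out) {
    if (input.empty()) return FunctionResult(ERR_EMPTY_INPUT);
    if (input.find("news") != std::string::npos) out.add(42);
    return FunctionResult(SUCCESS);
}

// Same steps as the snippet above: check the call, check whether anything
// was categorized, then enumerate. Found category ids are appended to `ids`.
int parse(const std::string& input, std::vector<int>& ids) {
    Results results;
    FunctionResult fr = classify(input, results);
    if (!fr)  // error occurred
        return fr.getReturnCode();
    if (!results.isCategorized())  // nothing to enumerate
        return SUCCESS;
    for (std::size_t i = 0; i < results.size(); ++i)
        ids.push_back(results[i].categoryId);
    return SUCCESS;
}

}  // namespace sketch
```

: Calling sketch::parse() with an empty string takes the error branch, an input without "news" takes the "not categorized" branch, and only a matching input reaches the enumeration loop.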

Using native C++

: All classes of the SCA API are native C++ classes. Internally they use the SCA DLLs / Shared Objects provided in the distribution.

: No class-hierarchy has been implemented - there are no base classes of classifiers or modules, because the data types they deal with are quite different. But they all use the same look-and-feel, so once you know how to use one classification class, you will easily be able to use another.

Instances

: All instances in the SCA API are implemented as real C++ classes - pointers are not used.

: The raw pointers needed to access the DLLs and Shared Objects are invisible to the user; the API manages them internally as private smart pointers.

// myDca is a real C++ instance
dca::DcaInstance myDca = dca::DcaInstance::create(...);

Auto-Destructor-Cleanup

: Additionally, the automatic destructor cleanup of C++ is used. Whenever an SCA API object goes out of scope, all related handles, structures and memory are freed without the need for an explicit delete or cleanup call.

int main( int argc, char *argv[] )
{
    if( argc > 1 ) {
        // when going out of scope the myDca instance will be safely destructed and
        // all related resources will be freed
        dca::DcaInstance myDca = dca::DcaInstance::create(...);
    }
    return 0;
}
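
: How a value-type class can hide a native handle and release it automatically is sketched below. This is an assumption about the general technique, not a description of the SCA internals: Instance, NativeHandle, nativeOpen() and nativeClose() are hypothetical names, and the global counter exists only to make the cleanup observable.

```cpp
#include <cassert>
#include <memory>

namespace sketch {  // hypothetical stand-ins, NOT the real dca:: classes

int g_openHandles = 0;  // number of native handles that are currently open

// Stand-ins for the C-style open/close calls a DLL or shared object might export.
struct NativeHandle { int id; };
NativeHandle* nativeOpen() {
    ++g_openHandles;
    return new NativeHandle{g_openHandles};
}
void nativeClose(NativeHandle* h) {
    --g_openHandles;
    delete h;
}

// Value-type wrapper: the raw pointer lives in a private smart pointer whose
// deleter calls nativeClose(), so callers never write delete or a cleanup call.
class Instance {
public:
    static Instance create() { return Instance(nativeOpen()); }
    bool isOpen() const { return handle_ != nullptr; }
private:
    explicit Instance(NativeHandle* h) : handle_(h, &nativeClose) {}
    std::shared_ptr<NativeHandle> handle_;
};

// Records how many handles are open inside and after an inner scope.
struct ScopeReport { int inside; int after; };

ScopeReport runScope() {
    ScopeReport report{0, 0};
    {
        Instance inst = Instance::create();
        assert(inst.isOpen());
        report.inside = g_openHandles;  // the handle is alive here
    }                                   // inst is destroyed, nativeClose() runs
    report.after = g_openHandles;       // the handle has been released
    return report;
}

}  // namespace sketch
```

: In this sketch runScope() observes one open handle inside the braces and none after them, without any explicit cleanup call in the calling code.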

Operator Overloading

: Variable comparison uses the native C++ comparison operators, and assignments use the C++ assignment operator of the related classes, just as you would expect from native C++ classes.

// C++ class assignment
dca::Category myCategory = myCategories.byId( CAT_ID_URL_PORN );

// C++ class comparison
if( myCategory == NullCategory ) ...
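
: A comparison against a null sentinel like the one above could be supported by a class along the following lines. This is a minimal sketch: the Category class, the rule that two categories are equal when their ids match, the reserved id 0 and the lookup() helper are all assumptions made for illustration and are not taken from the SCA headers.

```cpp
#include <cassert>
#include <string>

namespace sketch {  // hypothetical stand-ins, NOT the real dca:: classes

class Category {
public:
    Category() : id_(0) {}  // default-constructed object is the "null" category
    Category(int id, const std::string& name) : id_(id), name_(name) {
        assert(id != 0);    // id 0 is reserved for the null category in this sketch
    }
    int id() const { return id_; }
    const std::string& name() const { return name_; }
private:
    int id_;
    std::string name_;
};

// Two categories compare equal when their ids match (an assumption of this sketch).
inline bool operator==(const Category& a, const Category& b) { return a.id() == b.id(); }
inline bool operator!=(const Category& a, const Category& b) { return !(a == b); }

const Category NullCategory;  // sentinel meaning "no such category"

// Toy lookup: knows a single id and returns NullCategory for everything else.
inline Category lookup(int id) {
    if (id == 7) return Category(7, "news");
    return NullCategory;
}

}  // namespace sketch
```

: With these operators, `if( sketch::lookup(99) == sketch::NullCategory )` reads like a comparison of built-in values, which is the effect the section describes.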

Assignment Operator

: If object A is assigned another object B of the same type (A = B), A is not a copy of B; instead, it is a reference to B.

: Therefore, if you change object A, object B changes as well, and vice versa.
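
: The reference semantics described above are what you get when copies of a handle class share one underlying state object. The sketch below shows that general technique with a std::shared_ptr; the Handle class and its members are hypothetical and only illustrate the behaviour, they are not the SCA implementation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

namespace sketch {  // hypothetical stand-in, NOT a real dca:: class

// Handle whose copies all refer to the same underlying state object.
class Handle {
public:
    static Handle create(const std::string& label) {
        std::shared_ptr<State> state = std::make_shared<State>();
        state->label = label;
        return Handle(std::move(state));
    }
    void setLabel(const std::string& label) {
        assert(state_);  // every Handle made by create() owns a state
        state_->label = label;
    }
    const std::string& label() const { return state_->label; }
    // True when both handles refer to the same underlying object.
    bool sharesStateWith(const Handle& other) const { return state_ == other.state_; }
private:
    struct State { std::string label; };
    explicit Handle(std::shared_ptr<State> s) : state_(std::move(s)) {}
    std::shared_ptr<State> state_;  // copied on assignment, the State itself is not
};

}  // namespace sketch
```

: After `sketch::Handle a = b;` a call to a.setLabel() is visible through b.label() and the other way round, because the compiler-generated copy operations copy only the smart pointer.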

Collections and Items

: Collections of items (e.g. a ClassificationResults collection) are all handled the same way.

: A collection class is named using plural notation -> Results, Categories, Groups etc.

: An item of a collection is named using singular notation -> Result, Category, Group etc.

To access an item of a collection, the C++ operator [] has been overloaded (alternatively, an at() function is available).
All collections support the size() method.
Some collections support convenience functions, e.g. byId() to look up a specific item by its (numeric) id.

: Enumeration sample:

const DCA_SIZE_TYPE countOfResults = Results.size();
for( DCA_INDEX_TYPE i = 0; i < countOfResults; ++i ) {
    Result aResult = Results[i];
}
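
: The collection conventions listed above (plural collection, singular item, operator [], at(), size() and byId()) can be sketched with a thin wrapper around std::vector. Group, Groups, add() and joinNames() are hypothetical names, and the signatures chosen here, in particular byId() reporting success through a bool and an output parameter and at() throwing on a bad index, are assumptions of this sketch rather than the documented SCA signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {  // hypothetical stand-ins, NOT the real dca:: classes

struct Group {  // singular name: one item
    int id;
    std::string name;
};

class Groups {  // plural name: the collection
public:
    void add(int id, const std::string& name) { items_.push_back(Group{id, name}); }

    std::size_t size() const { return items_.size(); }

    // operator[]: index access, checked only by an assert in debug builds.
    const Group& operator[](std::size_t i) const {
        assert(i < items_.size());
        return items_[i];
    }

    // at(): index access that throws std::out_of_range on a bad index.
    const Group& at(std::size_t i) const { return items_.at(i); }

    // byId(): convenience lookup by numeric id; returns false when not found.
    bool byId(int id, Group& out) const {
        for (std::size_t i = 0; i < items_.size(); ++i) {
            if (items_[i].id == id) {
                out = items_[i];
                return true;
            }
        }
        return false;
    }

private:
    std::vector<Group> items_;
};

// Enumerates the collection with size() and operator[] and joins the names.
std::string joinNames(const Groups& groups) {
    std::string out;
    const std::size_t count = groups.size();
    for (std::size_t i = 0; i < count; ++i) {
        if (i > 0) out += ",";
        out += groups[i].name;
    }
    return out;
}

}  // namespace sketch
```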

Error Handling

: Errors are handled in a C++ fashion, by using either return code classes or exception classes, which are easy to catch using a try...catch block.

: Functions that return a dca::FunctionResult never throw an SCA exception, but all other functions may throw an SCA exception!

try {
    dca::DcaInstance myDca;
    myDca.createXYZ(...); // assume this could raise an exception
}
catch( const dca::ExDca& ex ) {
    std::cout << "error: " << ex.getReturnCode() << " occurred" << std::endl;
}
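
: The two error-reporting styles can be contrasted in a small stand-alone sketch. All names below (FunctionResult as a plain struct, ExSketch, checkLicense(), loadModule(), tryLoad() and the error code 17) are hypothetical and chosen for illustration; only the idea that one family of functions returns a result object and never throws, while the other may throw an exception carrying a return code, comes from the text above.

```cpp
#include <cassert>
#include <exception>
#include <string>

namespace sketch {  // hypothetical stand-ins, NOT the real dca:: classes

const int SUCCESS = 0;
const int ERR_INVALID_LICENSE = 17;  // made-up error code for this sketch

// Style 1: a result object that carries the return code.
struct FunctionResult {
    int code;
    bool ok() const { return code == SUCCESS; }
};

// Style 2: an exception class that carries the same kind of return code.
class ExSketch : public std::exception {
public:
    explicit ExSketch(int code) : code_(code) {}
    int getReturnCode() const { return code_; }
    const char* what() const noexcept override { return "sketch error"; }
private:
    int code_;
};

// Returns a FunctionResult and never throws.
FunctionResult checkLicense(const std::string& ticket) noexcept {
    return FunctionResult{ticket.empty() ? ERR_INVALID_LICENSE : SUCCESS};
}

// Does not return a FunctionResult, so it reports failure by throwing.
int loadModule(const std::string& ticket) {
    if (ticket.empty()) throw ExSketch(ERR_INVALID_LICENSE);
    return static_cast<int>(ticket.size());
}

// Catches the exception at a boundary and turns it back into a return code.
int tryLoad(const std::string& ticket) {
    try {
        loadModule(ticket);
        return SUCCESS;
    } catch (const ExSketch& ex) {
        assert(ex.getReturnCode() != SUCCESS);
        return ex.getReturnCode();
    }
}

}  // namespace sketch
```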

STL string and container support

: STL strings and containers have been used wherever it is possible and useful.

std::string myStlString( "www.ibm.com" );
dca::Url myUrl = dca::Url::create( myStlString );

Threading

: Whenever it is necessary to perform an asynchronous task, the user must create and start a thread which calls the related SCA functions.

: For best scalability the user can define thread priorities, the threading model etc.

: For a first impression of how this works, take a look at the extended URL sample (samples/urldbsample_extended). It demonstrates how to create and start threads that check for SCA updates and download a URL database, an asynchronous task that needs just two SCA calls. The sample is available in two versions, one for Windows and one for Linux.

: Classes used for classification can be used from multiple threads, but the data objects to classify (URLs, emails, HTML text etc.) should not be shared among threads. They must be created inside the classifying thread itself.
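
: The rule above (share the classifier, create the data objects per thread) could look roughly like the following with std::thread. This is a sketch under assumptions: Classifier, Url and classifyInParallel() are hypothetical stand-ins, the toy classifier only checks whether a URL contains "news", and the real SDK classes may impose further requirements that are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

namespace sketch {  // hypothetical stand-ins, NOT the real dca:: classes

// Shared classifier: classify() is const and touches no mutable state,
// so one instance can be called from several threads at once.
class Classifier {
public:
    bool classify(const std::string& url) const {
        return url.find("news") != std::string::npos;
    }
};

// Data object to classify; never shared between threads in this sketch.
struct Url { std::string text; };

// Starts one worker per batch. Each worker creates its own Url objects and
// writes only to its own slot of `hits`, so no locking is needed.
std::vector<int> classifyInParallel(const Classifier& classifier,
                                    const std::vector<std::vector<std::string>>& batches) {
    std::vector<int> hits(batches.size(), 0);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < batches.size(); ++t) {
        workers.emplace_back([&classifier, &batches, &hits, t] {
            for (const std::string& raw : batches[t]) {
                Url url{raw};  // created inside the classifying thread
                if (classifier.classify(url.text)) ++hits[t];
            }
        });
    }
    for (std::thread& worker : workers) worker.join();
    assert(hits.size() == batches.size());
    return hits;
}

}  // namespace sketch
```

: On some toolchains a program that uses std::thread has to be linked with the platform thread library (for example by passing -pthread to GCC or Clang).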