url_samples: urldbsample_remote


Introduction

Shows how to perform URL Classification using a remote URL database.

Syntax:

urldbsample_remote <dca-redist-folder> <ticket> <product> 
        <hexstring-encryption-data> <encryption-key> <url-list-file> [<locale>] 
        [<log-level>]
hexstring-encryption-data
the encryption data (given as a hex-formatted string) as included in your license (see the end-user download homepage).
encryption-key
the encryption key as included in your license (see the end-user download homepage).
url-list-file
file that contains the URLs to classify (one per line)
locale
optional locale string used for category names; defaults to en_US
log-level
optional log-level, defaults to 3 (LOG_Notice)

Files

file  url_samples/urldbsample_remote/main.cpp
 

URL Classification using a remote URL database sample program.


Defines

#define DCA_BINDIR   "bin/linux"
 DCA subdirectory of the DCA binaries.
#define DCA_INITDIR   "init"
 DCA subdirectory of the DCA initialization data.
#define DCA_LOGDIR   "./logs"
 Relative directory for logfile(s).

Functions

static void SetupInitData (const std::string &redist_folder, InitData &initData)
 Sets up the given initData by substituting the given redist_folder with DCA subdirectories.
static bool StartupLibraries ()
 Initializes the third-party library libcurl and sets the OpenSSL callbacks to the standard implementation.
static void ShutdownLibraries ()
 Shuts down the third-party libraries. On Windows, WSACleanup is also called to shut down Windows sockets for this process.
static void SetupLicense (const std::string &ticket, const std::string &product, LicenseData &licenseData)
 Sets up the given licenseData by copying the given ticket and product strings.
static bool SetupConnectionData (const std::string &encData, const std::string &encKey, DbConnectionData &cData)
 Sets up the given cData to use a remote URL database.
static void PrintResults (const CategoriesInfo &catinfos, const UrlClassificationResults &cats)
 Prints out the classification results and uses the categories info for textual representation of the matched categories.
static void PrintToolHeader ()
 Prints out the name and the version of this sample.
static void PrintUsage ()
 Prints out the syntax of the sample.
static void PrintDbConnectionInfo (const DbConnection &aDbConnection)
 Prints out the version and datestamp of the remote database.
static void PrintLicenseInfo (const License &aLicense)
 Prints out the information about the provided License.
static void LoadUrlFile (const std::string &fileName, std::vector< std::string > &urlList)
 Loads the given fileName and appends each line to the given urlList, stripping trailing CR/LF characters.
void TestUrlClassification (const std::string &aUrlListFile, const DcaInstance &myDca, const UrlDbClassifier &myUrlDbClassifier, const CategoriesInfo &myCategoriesInfo)
 Performs the URL database classification with URLs contained in a given text file.
std::string HexToString (const std::string &arg)
 Takes the provided hex string and returns it as a decoded string. If you supply an ordinary string (one that does not start with 0x), it is returned unmodified.
int main (int argc, char *argv[])
 The main routine.

Variables

const std::string S_UsageString
 Usage string, displayed if a parameter is missing.

Function Documentation

static void SetupInitData ( const std::string &  redist_folder,
InitData &  initData 
) [static]

Sets up the given initData by substituting the given redist_folder with DCA subdirectories.

Parameters:
[in] redist_folder The folder where the DCA has been installed (a trailing slash is assumed)
[out] initData The InitData structure to set up
Note:
Only DCA_BINDIR differs between Windows and Linux
The directory ./logs is used for the logfile(s)

Definition at line 139 of file url_samples/urldbsample_remote/main.cpp.
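
The path substitution described above can be sketched as follows. This is a minimal illustration, not the sample's actual code: the structure ExampleInitData and its field names are hypothetical stand-ins, because the members of the real InitData are not listed in this reference. The three macros repeat the values from the Defines section.

```cpp
#include <string>

// Values as listed in the Defines section of this page.
#define DCA_BINDIR  "bin/linux"
#define DCA_INITDIR "init"
#define DCA_LOGDIR  "./logs"

// Hypothetical stand-in for InitData; the real field names are an assumption.
struct ExampleInitData {
    std::string binDir;
    std::string initDir;
    std::string logDir;
};

// Appends the DCA subdirectories to redist_folder.
// redist_folder is expected to end with a slash, as the parameter note says.
static void SetupExampleInitData(const std::string &redist_folder,
                                 ExampleInitData &initData)
{
    initData.binDir  = redist_folder + DCA_BINDIR;
    initData.initDir = redist_folder + DCA_INITDIR;
    initData.logDir  = DCA_LOGDIR;  // relative to the current working directory
}
```

Because the function concatenates without inserting a separator, a redist_folder without the trailing slash would yield a wrong path such as /opt/dcabin/linux.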

static bool StartupLibraries (  )  [static]

Initializes the third-party library libcurl and sets the OpenSSL callbacks to the standard implementation.

On Windows it is also necessary to initialize Windows sockets to support IP(v6) addresses as input data.

Returns:
true if nothing fails, false only on Windows if WSAStartup returned an error

Definition at line 158 of file url_samples/urldbsample_remote/main.cpp.

static void SetupLicense ( const std::string &  ticket,
const std::string &  product,
LicenseData &  licenseData 
) [static]

Sets up the given licenseData by copying the given ticket and product strings.

Parameters:
[in] ticket This is the ticket data as provided with your DCA license
[in] product The product shortcut, e.g. DC or MS
[out] licenseData The LicenseData structure to set up

Definition at line 206 of file url_samples/urldbsample_remote/main.cpp.

static bool SetupConnectionData ( const std::string &  encData,
const std::string &  encKey,
DbConnectionData &  cData 
) [static]

Sets up the given cData to use a remote URL database.

Parameters:
[in] encData The encryption data to be used for the remote URL database server
[in] encKey The encryption key to be used together with the encData
[out] cData The DbConnectionData structure to set up
Returns:
false if the given encKey is not convertible to an int

Definition at line 222 of file url_samples/urldbsample_remote/main.cpp.
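
The failure case in the return value (an encKey that cannot be converted to an int) can be checked along these lines. This is a sketch under assumptions: the helper name ParseEncKey is hypothetical, and the sample may use a different conversion routine or accept other number formats.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical helper: parses encKey as a decimal integer.
// Returns false for empty input, trailing non-digit characters,
// or values that do not fit into an int.
static bool ParseEncKey(const std::string &encKey, int &out)
{
    if (encKey.empty())
        return false;

    errno = 0;
    char *end = nullptr;
    const long value = std::strtol(encKey.c_str(), &end, 10);

    if (errno == ERANGE || *end != '\0')
        return false;                       // overflow or trailing garbage
    if (value < INT_MIN || value > INT_MAX)
        return false;                       // does not fit into an int

    out = static_cast<int>(value);
    return true;
}
```

A caller in the style of SetupConnectionData would return false as soon as ParseEncKey fails and leave cData untouched.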

static void PrintResults ( const CategoriesInfo &  catinfos,
const UrlClassificationResults &  cats 
) [static]

Prints out the classification results and uses the categories info for textual representation of the matched categories.

Parameters:
[in] catinfos The CategoriesInfo class associated with the given URL database
[in] cats The results of a URL classification

Definition at line 246 of file url_samples/urldbsample_remote/main.cpp.

static void PrintDbConnectionInfo ( const DbConnection &  aDbConnection  )  [static]

Prints out the version and datestamp of the remote database.

Parameters:
[in] aDbConnection The database connection for which a version should be displayed.

Definition at line 298 of file url_samples/urldbsample_remote/main.cpp.

static void PrintLicenseInfo ( const License &  aLicense  )  [static]

Prints out the information about the provided License.

Parameters:
[in] aLicense The license for which information should be displayed.

Definition at line 312 of file url_samples/urldbsample_remote/main.cpp.

static void LoadUrlFile ( const std::string &  fileName,
std::vector< std::string > &  urlList 
) [static]

Loads the given fileName and appends each line to the given urlList, stripping trailing CR/LF characters.

Parameters:
[in] fileName The file that contains the input URLs.
[out] urlList The list to be filled with the URLs found in fileName.

Definition at line 341 of file url_samples/urldbsample_remote/main.cpp.

void TestUrlClassification ( const std::string &  aUrlListFile,
const DcaInstance &  myDca,
const UrlDbClassifier &  myUrlDbClassifier,
const CategoriesInfo &  myCategoriesInfo 
)

Performs the URL database classification with URLs contained in a given text file.

The given aUrlListFile contains one URL per line. The URLs will be added to a vector and for each URL a URL database classification is invoked. The results are printed out by using the PrintResults() function.

Parameters:
[in] aUrlListFile The file that contains the input URLs
[in] myDca A valid, fully set-up DCA instance
[in] myUrlDbClassifier A valid, fully set-up UrlDbClassifier
[in] myCategoriesInfo A valid, fully set-up CategoriesInfo
Note:
A classification returns one of three results: "URL is unknown", "not categorized", or a set of matched categories.
When creating a multi-threaded application, you can include something similar to this function in each thread's worker function.

Definition at line 378 of file url_samples/urldbsample_remote/main.cpp.

std::string HexToString ( const std::string &  arg  ) 

Takes the provided hex string and returns it as a decoded string. If you supply an ordinary string (one that does not start with 0x), it is returned unmodified.

Parameters:
[in] arg The hex string (starting with 0x) to decode
Returns:
the decoded string, or arg unmodified if it does not start with 0x

Definition at line 466 of file url_samples/urldbsample_remote/main.cpp.
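
One way to implement the decoding is sketched below. It assumes that the digits after the 0x prefix form pairs, each pair encoding one byte, and that malformed input (odd length or a non-hex character) is returned unchanged; neither detail is confirmed by this reference. The function is named HexToStringSketch to keep it apart from the sample's own HexToString.

```cpp
#include <cstddef>
#include <string>

// Returns the value of one hex digit, or -1 if c is not a hex digit.
static int HexNibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decodes "0x<hex pairs>" into raw bytes; any other input is returned as is.
std::string HexToStringSketch(const std::string &arg)
{
    if (arg.compare(0, 2, "0x") != 0)
        return arg;                         // ordinary string, no prefix

    const std::string digits = arg.substr(2);
    if (digits.size() % 2 != 0)
        return arg;                         // assumption: reject odd length

    std::string decoded;
    decoded.reserve(digits.size() / 2);
    for (std::size_t i = 0; i < digits.size(); i += 2) {
        const int hi = HexNibble(digits[i]);
        const int lo = HexNibble(digits[i + 1]);
        if (hi < 0 || lo < 0)
            return arg;                     // assumption: reject bad digits
        decoded.push_back(static_cast<char>((hi << 4) | lo));
    }
    return decoded;
}
```

For example, the input 0x414243 decodes to the three bytes 0x41, 0x42 and 0x43, which is the ASCII string ABC.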

int main ( int  argc,
char *  argv[] 
)

The main routine.

Parameters:
[in] argc The count of arguments provided
[in] argv An array of provided arguments
Returns:
5 on a usage error, 10 on an exception or internal error, and 0 on success

Definition at line 493 of file url_samples/urldbsample_remote/main.cpp.
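
The documented return values and the optional arguments can be combined in a skeleton like the one below. It is only a sketch: RunSample and the constant names are hypothetical, the arguments are passed without the program name, and treating a non-numeric log level as an internal error (10) is an assumption about behavior that this reference does not specify.

```cpp
#include <exception>
#include <string>
#include <vector>

// Return values as documented for main().
const int kExitOk       = 0;
const int kExitUsage    = 5;
const int kExitInternal = 10;

// 'args' holds the command-line arguments without the program name:
// six mandatory values, then optional <locale> and <log-level>.
static int RunSample(const std::vector<std::string> &args)
{
    if (args.size() < 6)
        return kExitUsage;                  // a mandatory parameter is missing

    try {
        // Documented defaults: locale en_US, log level 3 (LOG_Notice).
        const std::string locale = args.size() > 6 ? args[6] : "en_US";
        const int logLevel = args.size() > 7 ? std::stoi(args[7]) : 3;
        (void)locale;
        (void)logLevel;

        // ... set up the DCA and classify the URLs here ...
        return kExitOk;
    } catch (const std::exception &) {
        return kExitInternal;               // exception or internal error
    }
}
```

A real main() would build the vector from argv + 1 to argv + argc and return the result of RunSample.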


Variable Documentation

const std::string S_UsageString
Initial value:
"<dca-redist-folder> <ticket> <product> <encryption-data> <encryption-key> "
"<url-list-file> [<locale>] [<log-level>]\n"
        "  dca-redist-folder - the folder where the DCA is installed to\n"
        "  ticket - a valid ticket\n"
        "  product - the product associated with your ticket\n"
        "  hex-encryption-data - the encryption data (as hex string) included in "
"your license\n"
        "  encryption-key - the encryption key included in your license\n"
        "  url-list-file - file that includes the URLs to classify\n"
        "  locale - optional locale for the categories names, default = en_US\n"
        "  log-level - optional log-level, default = 3 (LOG_Notice)\n\n"

Usage string, displayed if a parameter is missing.

Definition at line 94 of file url_samples/urldbsample_remote/main.cpp.


Generated on 26 Sep 2016 for dca_interface by  doxygen 1.6.1