Custom Categories

For the Custom Database Module you can use your own categories, groups and locales.

Custom Categories are defined in a categories XML schema.

Categories XML Schema

A schema consists of several XML files, describing the relations of categories to groups.
There is also a mechanism to provide multiple translations of categories and groups.
Note:
All references to id or groupid assume a minimum id of 0 and a maximum id of 255.
Do not delete an id that is already in use. If a group or category id is no longer used, leave its translation empty; this prevents the category from being returned in a classification result.
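The id limits from the note can be checked mechanically before a schema is deployed. The following is a minimal sketch in Python and is not part of the SDK. The helper name out_of_range_ids is hypothetical, and the sketch assumes that ids are written as plain decimal integers in the id and groupid attributes, as in the samples on this page.

```python
import xml.etree.ElementTree as ET

MIN_ID, MAX_ID = 0, 255  # limits stated in the note above


def out_of_range_ids(xml_text):
    """Return (tag, attribute, value) for every id/groupid outside 0..255.

    Hypothetical helper, not an SDK function.
    """
    problems = []
    for node in ET.fromstring(xml_text).iter():
        for attr in ("id", "groupid"):
            value = node.get(attr)
            if value is None:
                continue
            try:
                ok = MIN_ID <= int(value) <= MAX_ID
            except ValueError:
                ok = False  # non-numeric values are reported as well
            if not ok:
                problems.append((node.tag, attr, value))
    return problems


sample = """<categories>
    <category id="0" groupid="0" />
    <category id="256" groupid="1" />
</categories>"""

print(out_of_range_ids(sample))  # [('category', 'id', '256')]
```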

Categories XML: Locales

Locales define what translations for groups and categories are available for your schema.
Locales are specified in the file locales.xml located in your categories folder.
The XML file contains a single locales root node that holds any number of locale nodes.
Each locale node consists of
  • languageid
  • displayname
languageid is the international code of the language to be used, e.g. en for English, en_US for US English, de for German.
displayname is the name of the language as it should be displayed; it may contain non-ASCII characters, encoded as UTF-8.
For each languageid, the translations are loaded from a file named locale_<languageid>.xml
Examples:
locale_en.xml - Contains the English translations for the category and group names
locale_de.xml - Contains the German translations.
Sample locales.xml
<?xml version='1.0' encoding='utf-8'?>
<locales version="6.00000000" date="31:08:2009" >
        <locale languageid="de" displayname="Deutsch" />
        <locale languageid="en" displayname="English" />
        <locale languageid="en_US" displayname="English-US" />
</locales>
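The naming rule for translation files can be illustrated with a short Python sketch that reads a locales.xml and lists the file expected for each languageid. This is not SDK code; the function name locale_file_names is hypothetical, and the input simply repeats the sample above without its XML declaration.

```python
import xml.etree.ElementTree as ET


def locale_file_names(locales_xml):
    """Map each languageid to its expected translation file name.

    Hypothetical helper, not an SDK function.
    """
    root = ET.fromstring(locales_xml)
    return {
        node.get("languageid"): "locale_%s.xml" % node.get("languageid")
        for node in root.findall("locale")
    }


locales_xml = """<locales version="6.00000000" date="31:08:2009">
    <locale languageid="de" displayname="Deutsch" />
    <locale languageid="en" displayname="English" />
    <locale languageid="en_US" displayname="English-US" />
</locales>"""

for languageid, file_name in locale_file_names(locales_xml).items():
    print(languageid, "->", file_name)
```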

Categories XML: Groups

A group is a set of one or more categories.
Groups are specified in the file groups.xml located in your categories folder.
The groups.xml file contains a single groups root node that holds any number of group nodes.
Each group node consists of
  • id
id is the unique id of the group.
Sample groups.xml
<?xml version='1.0' encoding='utf-8'?>
<groups>
        <group id="0" />
        <group id="1" />
        <group id="2" />
</groups>

Categories XML: Categories

Categories are described by a set of category nodes. Each category has a unique id and belongs to exactly one group.
Categories are specified in the file categories.xml located in your categories folder.
The categories.xml file contains a single categories root node that holds any number of category nodes. Each category node consists of
  • id
  • groupid
id is the unique id of the category.
groupid is the unique id of the group this category is related to.
Sample categories.xml
<?xml version='1.0' encoding='utf-8'?>
<categories>
        <category id="0" groupid="0" />
        <category id="1" groupid="0" />
        <category id="2" groupid="1" />
        <category id="3" groupid="1" />
        <category id="4" groupid="2" />
</categories>
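The relation between categories.xml and groups.xml can be cross-checked with a small script. The Python sketch below is an illustration, not SDK code: the function name check_categories is hypothetical, and ids are compared as the literal attribute strings, which assumes they are written without leading zeros. It reports duplicate category ids, categories that refer to a group missing from groups.xml, and groups without any category.

```python
import xml.etree.ElementTree as ET


def check_categories(groups_xml, categories_xml):
    """Return a list of problems; an empty list means the files agree.

    Hypothetical helper, not an SDK function.
    """
    group_ids = {g.get("id") for g in ET.fromstring(groups_xml).findall("group")}
    problems, seen, used = [], set(), set()
    for cat in ET.fromstring(categories_xml).findall("category"):
        cat_id, group_id = cat.get("id"), cat.get("groupid")
        if cat_id in seen:
            problems.append("duplicate category id %s" % cat_id)
        seen.add(cat_id)
        if group_id in group_ids:
            used.add(group_id)
        else:
            problems.append(
                "category %s refers to unknown group %s" % (cat_id, group_id)
            )
    # Each group is expected to contain at least one category.
    for group_id in sorted(group_ids - used):
        problems.append("group %s has no categories" % group_id)
    return problems


groups_xml = """<groups>
    <group id="0" />
    <group id="1" />
    <group id="2" />
</groups>"""

categories_xml = """<categories>
    <category id="0" groupid="0" />
    <category id="1" groupid="0" />
    <category id="2" groupid="1" />
    <category id="3" groupid="1" />
    <category id="4" groupid="2" />
</categories>"""

print(check_categories(groups_xml, categories_xml))  # []
```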

Locale

Each locale XML file contains a locale root node that holds a mix of group and category nodes carrying the translations for that locale.
A locale node consists of
  • languageid
  • displayname
languageid and displayname are the same values as used in locales.xml.
A group node consists of
  • id
  • displayname
id is the unique id of the group.
displayname is the UTF-8 translation string for the given group id.
A category node consists of
  • id
  • displayname
id is the unique id of the category.
displayname is the UTF-8 translation string for the given category id.
Sample locale_en.xml
<?xml version='1.0' encoding='utf-8'?>
<locale languageid="en" displayname="English" >
   <group id="0" displayname="Pornography / Nudity" />
   <group id="1" displayname="Ordering" />
   <group id="2" displayname="Society / Education / Religion" />
   <category id="0" displayname="Pornography" />
   <category id="1" displayname="Erotic / Sex" />
   <category id="2" displayname="Shopping" />
   <category id="3" displayname="Auctions / Classified Ads" />
   <category id="4" displayname="Education" />
</locale>
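A locale file can be read into simple lookup tables as sketched below in Python. This is an illustration only and not SDK code. The function name load_translations is hypothetical, and the sketch assumes that "leaving a translation empty" means an empty displayname attribute; entries with an empty displayname are skipped to mirror the note about unused ids.

```python
import xml.etree.ElementTree as ET


def load_translations(locale_xml):
    """Return (languageid, group names by id, category names by id).

    Hypothetical helper, not an SDK function.
    """
    root = ET.fromstring(locale_xml)
    names = {"group": {}, "category": {}}
    for node in root:
        if node.tag not in names:
            continue
        display = node.get("displayname", "")
        # Assumption: an empty displayname marks an id that is no longer used.
        if display:
            names[node.tag][int(node.get("id"))] = display
    return root.get("languageid"), names["group"], names["category"]


locale_xml = """<locale languageid="en" displayname="English">
    <group id="1" displayname="Ordering" />
    <category id="2" displayname="Shopping" />
    <category id="3" displayname="" />
</locale>"""

languageid, groups, categories = load_translations(locale_xml)
print(languageid, groups, categories)  # en {1: 'Ordering'} {2: 'Shopping'}
```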

Summary

To support a complete set of custom categories (like the given samples), you will need the following files:
  • locales.xml
  • locale_en.xml
  • locale_en_US.xml
  • locale_de.xml
  • groups.xml
  • categories.xml
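Whether a categories folder is complete can be checked with a short script. The Python sketch below is not SDK code: the function name missing_schema_files is hypothetical, and it assumes that all schema files sit directly in one folder. It derives the expected locale files from locales.xml and lists every file that is absent. The demonstration builds a deliberately incomplete folder in a temporary directory.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def missing_schema_files(folder):
    """List the schema files expected in `folder` that are not present.

    Hypothetical helper, not an SDK function.
    """
    folder = Path(folder)
    expected = ["locales.xml", "groups.xml", "categories.xml"]
    locales = folder / "locales.xml"
    if locales.is_file():
        # One translation file is expected per languageid.
        for node in ET.parse(str(locales)).getroot().findall("locale"):
            expected.append("locale_%s.xml" % node.get("languageid"))
    return [name for name in expected if not (folder / name).is_file()]


# Demonstration with an incomplete folder: only two of the files exist.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "locales.xml").write_text(
        '<locales><locale languageid="en" displayname="English" /></locales>',
        encoding="utf-8",
    )
    Path(tmp, "groups.xml").write_text(
        '<groups><group id="0" /></groups>', encoding="utf-8"
    )
    missing = missing_schema_files(tmp)

print(missing)  # ['categories.xml', 'locale_en.xml']
```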
We provide the following samples for the use of the Custom Database.
customdbsamples/createdbsample shows how to start an application without an existing custom database, how to create one, and how to add URLs to it and remove them again. It also demonstrates how to use a UrlCustomDb and a UrlDbClassifier to retrieve the categorization results.
customdbsamples/customdbsample shows how to override the standard categories used internally by the SCA with a custom URL database.
customdbsamples/customdbsample_extended shows how to use a set of custom categories with a custom URL database.
See also:
Custom Database Module

Generated on 26 Sep 2016 for dca_interface by  doxygen 1.6.1