For the Custom Database Module you can use your own categories, groups and locales.
Custom Categories are defined in a categories XML schema.
Categories XML Schema
- A schema consists of several XML files, describing the relations of categories to groups.
- There is also a mechanism to provide multiple translations of categories and groups.
- Note
- All references to id or groupid assume a minimum id of 0 and a maximum id of 255.
- Do not delete any id that is already in use. To retire a group or category id, leave its translation empty; this prevents the category from being returned in a classification result.
Categories XML: Locales
- Locales define what translations for groups and categories are available for your schema.
- Locales are specified in the file locales.xml located in your categories folder.
- The XML file contains a locales root node that holds a number of locale nodes.
- Each locale node has the following attributes:
- languageid is the international code of the language to be used, e.g. en for English, en_US for US English, de for German.
- displayname is the display name of the language. It may contain non-ASCII characters, encoded as UTF-8.
- For each languageid, a file with specific translations will be loaded that is named like: locale_<languageid>.xml
- Examples:
- locale_en.xml - Contains the English translations for the category and group names
- locale_de.xml - Contains the German translations.
- Sample locales.xml
<?xml version='1.0' encoding='utf-8'?>
<locales version="6.00000000" date="31:08:2009" >
<locale languageid="de" displayname="Deutsch" />
<locale languageid="en" displayname="English" />
<locale languageid="en_US" displayname="English-US" />
</locales>
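The file naming rule above can be checked mechanically. The following is a minimal Python sketch, not part of the SDK, that parses a locales.xml document and derives the translation file name expected for each languageid. The helper name expected_locale_files is an assumption made for illustration.

```python
import xml.etree.ElementTree as ET

# Inline copy of the sample locales.xml shown above.
LOCALES_XML = b"""<?xml version='1.0' encoding='utf-8'?>
<locales version="6.00000000" date="31:08:2009" >
<locale languageid="de" displayname="Deutsch" />
<locale languageid="en" displayname="English" />
<locale languageid="en_US" displayname="English-US" />
</locales>"""


def expected_locale_files(locales_xml):
    """Return locale_<languageid>.xml for every locale node (hypothetical helper)."""
    root = ET.fromstring(locales_xml)
    return ["locale_{}.xml".format(node.get("languageid"))
            for node in root.findall("locale")]


if __name__ == "__main__":
    for name in expected_locale_files(LOCALES_XML):
        print(name)
```

For the sample, this lists locale_de.xml, locale_en.xml and locale_en_US.xml, in the order the locale nodes appear.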
Categories XML: Groups
- A group is a set of one or more categories.
- Groups are specified in the file groups.xml located in your categories folder.
- The groups.xml file contains a groups root node that holds a number of group nodes.
- Each group node has the following attribute:
- id is the unique id of the group.
- Sample groups.xml
<?xml version='1.0' encoding='utf-8'?>
<groups>
<group id="0" />
<group id="1" />
<group id="2" />
</groups>
Categories XML: Categories
- Categories are described by a set of category nodes. Each category has a unique id and belongs to exactly one group.
- Categories are specified in the file categories.xml located in your categories folder.
- The categories.xml file contains a categories root node that holds a number of category nodes. Each category node has the following attributes:
- id is the unique id of the category.
- groupid is the unique id of the group this category is related to.
- Sample categories.xml
<?xml version='1.0' encoding='utf-8'?>
<categories>
<category id="0" groupid="0" />
<category id="1" groupid="0" />
<category id="2" groupid="1" />
<category id="3" groupid="1" />
<category id="4" groupid="2" />
</categories>
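The constraints described so far (ids between 0 and 255, unique ids, and every groupid pointing at a defined group) can be verified before a schema is deployed. The following Python sketch is an illustration only and does not use any SDK API. The names read_ids and check_schema are hypothetical, and the XML strings are copies of the samples above.

```python
import xml.etree.ElementTree as ET

GROUPS_XML = b"""<?xml version='1.0' encoding='utf-8'?>
<groups>
<group id="0" />
<group id="1" />
<group id="2" />
</groups>"""

CATEGORIES_XML = b"""<?xml version='1.0' encoding='utf-8'?>
<categories>
<category id="0" groupid="0" />
<category id="1" groupid="0" />
<category id="2" groupid="1" />
<category id="3" groupid="1" />
<category id="4" groupid="2" />
</categories>"""

MIN_ID, MAX_ID = 0, 255  # id range stated in the note at the top of this page


def read_ids(nodes):
    """Collect the id attributes, rejecting out-of-range or duplicate values."""
    seen = set()
    for node in nodes:
        value = int(node.get("id"))
        if not MIN_ID <= value <= MAX_ID:
            raise ValueError("id {} is outside {}..{}".format(value, MIN_ID, MAX_ID))
        if value in seen:
            raise ValueError("id {} is defined more than once".format(value))
        seen.add(value)
    return seen


def check_schema(groups_xml, categories_xml):
    """Return a mapping of group id to its category ids, in document order."""
    group_ids = read_ids(ET.fromstring(groups_xml).findall("group"))
    categories = ET.fromstring(categories_xml).findall("category")
    read_ids(categories)  # validates range and uniqueness of category ids
    by_group = {gid: [] for gid in sorted(group_ids)}
    for node in categories:
        gid = int(node.get("groupid"))
        if gid not in group_ids:
            raise ValueError("category {} refers to unknown group {}".format(
                node.get("id"), gid))
        by_group[gid].append(int(node.get("id")))
    return by_group


if __name__ == "__main__":
    print(check_schema(GROUPS_XML, CATEGORIES_XML))
```

For the sample files, the returned mapping is {0: [0, 1], 1: [2, 3], 2: [4]}. A category that references a group id missing from groups.xml raises a ValueError.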
Locale
- Each locale XML file contains a mixed collection of group and category nodes together with their translations for that locale.
- A locale node consists of
- languageid and displayname are the same as those used in locales.xml.
- A group node consists of
- id is the unique id of the group.
- displayname is the UTF-8 translation string for the given group (id)
- A category node consists of
- id is the unique id of the category.
- displayname is the UTF-8 translation string for the given category (id)
- Sample locale_en.xml
<?xml version='1.0' encoding='utf-8'?>
<locale languageid="en" displayname="English" >
<group id="0" displayname="Pornography / Nudity" />
<group id="1" displayname="Ordering" />
<group id="2" displayname="Society / Education / Religion" />
<category id="0" displayname="Pornography" />
<category id="1" displayname="Erotic / Sex" />
<category id="2" displayname="Shopping" />
<category id="3" displayname="Auctions / Classified Ads" />
<category id="4" displayname="Education" />
</locale>
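To illustrate how the mixed group and category nodes of a locale file map to translations, here is a small Python sketch. It is not SDK code, and load_translations is a hypothetical name. As an assumption of this sketch, entries with an empty displayname are skipped, following the note that an empty translation keeps an id out of classification results. How the SCA itself handles such entries is not shown here.

```python
import xml.etree.ElementTree as ET

# Inline copy of the sample locale_en.xml shown above.
LOCALE_EN_XML = b"""<?xml version='1.0' encoding='utf-8'?>
<locale languageid="en" displayname="English" >
<group id="0" displayname="Pornography / Nudity" />
<group id="1" displayname="Ordering" />
<group id="2" displayname="Society / Education / Religion" />
<category id="0" displayname="Pornography" />
<category id="1" displayname="Erotic / Sex" />
<category id="2" displayname="Shopping" />
<category id="3" displayname="Auctions / Classified Ads" />
<category id="4" displayname="Education" />
</locale>"""


def load_translations(locale_xml):
    """Return (group names, category names), each keyed by integer id."""
    root = ET.fromstring(locale_xml)
    groups, categories = {}, {}
    tables = {"group": groups, "category": categories}
    for node in root:
        table = tables.get(node.tag)
        name = node.get("displayname", "")
        # Assumption: an empty displayname marks a retired id, so skip it.
        if table is not None and name:
            table[int(node.get("id"))] = name
    return groups, categories


if __name__ == "__main__":
    group_names, category_names = load_translations(LOCALE_EN_XML)
    print(group_names[1], "/", category_names[2])
```

Group and category ids are kept in separate tables because the two id spaces overlap: id 0 names both a group and a category in the sample.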
Summary
- To support a complete set of custom categories (like the given samples), you will need the following files:
- locales.xml
- locale_en.xml
- locale_en_US.xml
- locale_de.xml
- groups.xml
- categories.xml
- We provide the following samples for the use of Custom Database.
- customdbsamples/createdbsample shows how to start an application without an existing custom database, how to create one, and how to add URLs to it and remove them again. It also demonstrates how to use a UrlCustomDb and a UrlDbClassifier to retrieve the categorization results.
- customdbsamples/customdbsample shows how to override the standard categories used internally by the SCA with a custom URL database.
- customdbsamples/customdbsample_extended shows how to use a set of custom categories with a custom URL database.
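The list of required files in this summary can also be checked against a categories folder. The sketch below is a Python illustration under stated assumptions: missing_schema_files is a hypothetical helper, not an SDK function, and the folder path passed to it is an example. It reads locales.xml, if present, to learn which locale_<languageid>.xml files are expected.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def missing_schema_files(folder):
    """Return the names of required schema files that are absent from folder."""
    folder = Path(folder)
    required = ["locales.xml", "groups.xml", "categories.xml"]
    locales_file = folder / "locales.xml"
    if locales_file.is_file():
        root = ET.parse(str(locales_file)).getroot()
        # One translation file is expected per locale node.
        required += ["locale_{}.xml".format(node.get("languageid"))
                     for node in root.findall("locale")]
    return [name for name in required if not (folder / name).is_file()]


if __name__ == "__main__":
    # "categories" is an example folder name, not a path mandated by the SDK.
    for name in missing_schema_files("categories"):
        print("missing:", name)
```

An empty result means that every file named in locales.xml, plus groups.xml and categories.xml, is present. It does not validate the contents of those files.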
- See also
- Custom Database Module