Categorization
Categories produced by the Semantic Hacker API are based on the Open Directory Project (ODP) categorization scheme.
Content is categorized by analyzing the dimensions and weights of its Semantic Signature and identifying the dimensions that are most relevant to the content. In some cases, similar dimensions are combined to provide a broader theme, so not every category corresponds directly to a category in the ODP scheme.
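As a rough illustration of how similar dimensions could be combined into a broader theme, the sketch below treats each dimension as a slash-separated path and merges dimensions that share a common prefix, summing their weights. This is only an assumption about one possible approach, not the API's actual method; the function name roll_up, the depth parameter, and the weights are hypothetical.

```python
from collections import defaultdict


def roll_up(signature: dict[str, float], depth: int) -> dict[str, float]:
    """Merge dimensions sharing their first `depth` path segments.

    Assumption: summing weights under a common prefix is one plausible
    way to form a broader theme; the real API may work differently.
    """
    themes: defaultdict[str, float] = defaultdict(float)
    for dimension, weight in signature.items():
        prefix = "/".join(dimension.split("/")[:depth])
        themes[prefix] += weight
    return dict(themes)


# Hypothetical signature; these weights are illustrative, not API output.
signature = {
    "Sports/Football/American/NFL": 0.5,
    "Sports/Football/American/Players": 0.25,
    "Computers/Software/Operating Systems/Linux": 0.125,
}
themes = roll_up(signature, depth=3)
```

Here the two football dimensions collapse into a single "Sports/Football/American" theme, which has no fourth path segment and so differs from both of the original dimensions.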
Each dimension in a Semantic Signature represents a path in a tree of categories. For example, the dimension "Computers/Internet/On the Web/Weblogs/Resources" points to a category named "Resources" which has the categories "Weblogs", "On the Web", "Internet" and "Computers" as parent or ancestor categories.
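The path structure described above can be unpacked with ordinary string handling. The following is a minimal sketch, assuming the "/" character only ever separates path segments; the helper names category_name and ancestors are hypothetical and not part of the API.

```python
def category_name(dimension: str) -> str:
    """Return the final path segment, i.e. the category the dimension points to."""
    return dimension.rsplit("/", 1)[-1]


def ancestors(dimension: str) -> list[str]:
    """Return the ancestor categories, nearest parent first."""
    parts = dimension.split("/")
    # Skip the last segment (the category itself) and walk back to the root.
    return parts[-2::-1]


dimension = "Computers/Internet/On the Web/Weblogs/Resources"
name = category_name(dimension)   # "Resources"
parents = ancestors(dimension)    # ["Weblogs", "On the Web", "Internet", "Computers"]
```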
The category API call identifies the main topic categories for the input text or URI, ordered by weight. A category's weight indicates how relevant it is to the topics contained in the input.
Examples include:
- 'Sports/Football/American/NFL' and 'Sports/Football/American/Players' for the input URI 'http://www.nfl.com'
- 'Computers/Software/Operating Systems/Linux' for the input URI 'http://www.kernel.org'
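The ordering by weight described above can be sketched as follows. The response format of the category call is not shown here, so this example assumes the categories are already available as a mapping from category path to weight, and that the highest weight comes first. The function name top_categories and the weights are hypothetical and chosen only for illustration.

```python
def top_categories(weights: dict[str, float], limit: int = 3) -> list[tuple[str, float]]:
    """Return up to `limit` (category, weight) pairs, highest weight first."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


# Hypothetical weights; not actual output of the category call.
weights = {
    "Sports/Football/American/Players": 0.25,
    "Sports/Football/American/NFL": 0.5,
}
for category, weight in top_categories(weights):
    print(f"{weight:.2f}  {category}")
```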