API Overview

The TextWise Semantic Cloud provides an API that lets clients and users work directly with our core technology, Semantic Signatures. To see Semantic Signatures used for Similarity Search, Categorization, and Concept Tagging in action, try the demo.

The features of the latest SemanticHacker API include the following:

Similarity Search

Similarity Search is the process of matching words, phrases, paragraphs or documents to other documents via their Semantic Signatures rather than via the terms contained in them. Instead of using the terms contained in the text and their frequencies, SemanticHacker API matching uses the dimensions in each Semantic Signature and their associated weights. The operation can be viewed as determining the distance between the Semantic Signature of an item and the Semantic Signatures of an entire collection.

In summary, the process is:

  • Find the semantic dimensions that are shared between two words, phrases, paragraphs or documents.
  • For each shared semantic dimension, calculate a similarity factor by multiplying the dimension's weight in one Semantic Signature by its weight in the other.
  • Add the similarity factors of all dimensions shared by the item and a document in the collection.
  • The result is a matching score. The higher the score, the greater the similarity in the semantic space.
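The steps above can be sketched in a few lines of Python. This is an illustration only, not the SemanticHacker implementation: it assumes a Semantic Signature can be represented as a dictionary mapping dimension names to weights, and the function names `similarity_score` and `rank_collection` are hypothetical.

```python
def similarity_score(sig_a, sig_b):
    """Sum the products of weights over dimensions present in both signatures.

    Assumption: a signature is a dict of {dimension name: weight}.
    """
    shared = sig_a.keys() & sig_b.keys()           # step 1: shared dimensions
    return sum(sig_a[d] * sig_b[d] for d in shared)  # steps 2 and 3


def rank_collection(item_sig, collection):
    """Score an item against every signature in a collection, best match first.

    Assumption: the collection is a dict of {document id: signature}.
    """
    scored = [(doc_id, similarity_score(item_sig, sig))
              for doc_id, sig in collection.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up signatures for illustration only.
item = {"Computers/Internet": 0.5, "Science/Biology": 0.25}
collection = {
    "doc-1": {"Computers/Internet": 0.5, "Arts/Music": 0.75},
    "doc-2": {"Arts/Music": 1.0},
    "doc-3": {"Computers/Internet": 1.0, "Science/Biology": 1.0},
}

for doc_id, score in rank_collection(item, collection):
    print(doc_id, score)
```

Documents that share no dimension with the item score zero, so they fall to the bottom of the ranking.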

Learn more about Similarity Search.

Categorization

Categorization of content is performed by analyzing the dimensions and weights of the provided content's Semantic Signature and identifying those dimensions that are most relevant to the signature. Each dimension in a Semantic Signature represents a path in a tree of categories. For example, the dimension "Computers/Internet/On the Web/Weblogs/Resources" points to a category named "Resources" which has the categories "Weblogs", "On the Web", "Internet" and "Computers" as parent or ancestor categories. Learn more about Categorization.
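As a rough sketch of how a dimension maps to a category and its ancestors, the Python below splits a dimension path on "/" and ranks dimensions by weight. It assumes a signature is a dictionary of dimension paths to weights and that "most relevant" means highest weight; both are assumptions made for illustration, and the names `split_dimension` and `top_categories` are hypothetical rather than part of the API.

```python
def split_dimension(dimension):
    """Split a category path into (category, ancestors listed nearest first)."""
    parts = dimension.split("/")
    return parts[-1], parts[-2::-1]


def top_categories(signature, n=3):
    """Return (category, full path, weight) for the n highest-weighted dimensions.

    Assumption: relevance is approximated by the dimension's weight.
    """
    ranked = sorted(signature.items(), key=lambda item: item[1], reverse=True)
    return [(split_dimension(dim)[0], dim, weight) for dim, weight in ranked[:n]]


category, ancestors = split_dimension(
    "Computers/Internet/On the Web/Weblogs/Resources")
print(category)   # Resources
print(ancestors)  # ['Weblogs', 'On the Web', 'Internet', 'Computers']

# Made-up signature for illustration only.
signature = {
    "Computers/Internet/On the Web/Weblogs/Resources": 0.75,
    "Arts/Music": 0.25,
    "Science/Biology": 0.5,
}
print(top_categories(signature, n=2))
```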

Concept Tags

Concept Tags produced by the SemanticHacker API can be leveraged to generate metadata or tags for content. These tags are significant-term tags, which are derived from the content's Semantic Signature, a semantic dictionary and the original text. Each tag is accompanied by a weight that describes the relevance of that term to the content. Significant-term tags are generated by obtaining a list of words that appear in both the semantic dictionary and the content text for the highest-weighted semantic dimensions of the text. To create these tags, a signature is generated for the text, a list of words for that signature's top dimension is retrieved from the semantic dictionary, and this list is filtered using the words that appear in the text. Learn more about Concept Tagging.
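The filtering step described above might look roughly like the following Python sketch. Several things here are assumptions for illustration: the signature is a dictionary of dimensions to weights, the semantic dictionary is a dictionary of dimensions to word weights, and each tag simply carries the word's dictionary weight. The text does not say how tag weights are actually computed, and the function name `concept_tags` is hypothetical.

```python
import re


def concept_tags(text, signature, semantic_dictionary):
    """Return (word, weight) tags for the signature's top dimension.

    Assumptions: signature is {dimension: weight}; semantic_dictionary is
    {dimension: {word: weight}}; the tag weight is the dictionary weight.
    """
    if not signature:
        return []
    # Pick the highest-weighted dimension of the text's signature.
    top_dimension = max(signature, key=signature.get)
    # Collect the distinct lowercase words that appear in the text.
    words_in_text = set(re.findall(r"[a-z0-9']+", text.lower()))
    # Keep only dictionary words for that dimension that occur in the text.
    candidates = semantic_dictionary.get(top_dimension, {})
    tags = [(word, weight) for word, weight in candidates.items()
            if word in words_in_text]
    return sorted(tags, key=lambda tag: tag[1], reverse=True)


# Made-up data for illustration only.
text = "New blog post about RSS feeds and blog hosting"
signature = {"Computers/Internet/On the Web/Weblogs": 0.75, "Arts/Music": 0.25}
semantic_dictionary = {
    "Computers/Internet/On the Web/Weblogs": {
        "blog": 0.9, "rss": 0.6, "permalink": 0.4},
    "Arts/Music": {"guitar": 0.8},
}

print(concept_tags(text, signature, semantic_dictionary))
```

In this example "permalink" is dropped because it does not appear in the text, even though the dictionary lists it for the top dimension.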

Widget & WordPress Plugin

We've created a WordPress Plugin (for bloggers) and a Widget (for content publishers) to enable you to use Similarity Search right away on your blog or Web site, without requiring any programming skills. Learn more about the Widget & WordPress Plugin.

Other Significant Changes

Other changes include enhanced signatures, scalability improvements to the API itself, and user-level reporting on your API usage.

If you implemented the SemanticHacker API before December 31, 2008, you must read the Migration Guide to cut over to the new version of the API. The old API will remain available for a short period of time before we remove it completely. Please be sure to take the minimal steps required to update your application with the new code.