Exploring semantic map embeddings

By Johannes E. M. Mosig

How do you convey the “meaning” of a word to a computer? Nowadays, the default answer to this question is “use a word embedding”. A typical word embedding, such as GloVe or Word2Vec, represents a given word as a real vector of a few hundred dimensions.

Semantic map embeddings are inspired by Francisco Webber’s fascinating work on semantic folding. Our approach is a bit different, but as with Webber’s embeddings, our semantic map embeddings are sparse binary matrices with some interesting properties. In this post, we’ll explore those properties. Then, in Part II of this series, we’ll see how they are made.

Figure: the semantic map embedding of “family” overlaps with that of “children”.

Source: https://blog.rasa.com/exploring-semantic-map-embeddings-1/

In this post, we’ll cover:

  • Semantic Map Embeddings of Particular Words
  • Semantic Similarity and the Overlap Score
  • Merging: How to Embed Sentences and Documents

A semantic map embedding of a word is an M ⨉ N sparse binary matrix. We can think of it as a black-and-white image. Each pixel in that image corresponds to a class of contexts in which the word could appear. If the pixel value is 1 (“active”), then the word is common in its associated contexts, and if it is 0 (“inactive”), it is not. Importantly, neighboring pixels correspond to similar context classes! That is, the context class of the pixel at position (3, 3) is similar to that of the pixel at (3, 4).
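To make the representation concrete, here is a minimal NumPy sketch of a semantic map embedding as an M ⨉ N binary matrix. The overlap count used as a similarity score and the element-wise OR used to merge two word maps are illustrative assumptions for this sketch, not necessarily the exact definitions used in the post; the toy embeddings are random stand-ins for pre-trained maps.

```python
import numpy as np

# Toy grid size; real semantic maps are larger.
M, N = 8, 8
rng = np.random.default_rng(0)

def random_embedding(n_active: int = 6) -> np.ndarray:
    """Create a toy sparse binary M x N embedding with `n_active` active pixels."""
    emb = np.zeros((M, N), dtype=np.uint8)
    idx = rng.choice(M * N, size=n_active, replace=False)
    emb.flat[idx] = 1
    return emb

# Random stand-ins for the pre-trained maps of two words.
family = random_embedding()
children = random_embedding()

# Illustrative similarity: count the pixels active in both embeddings.
overlap = int(np.logical_and(family, children).sum())
print("overlap score:", overlap)

# Illustrative merge of word maps into a sentence map: element-wise OR.
sentence = np.logical_or(family, children).astype(np.uint8)
print("active pixels in merged embedding:", int(sentence.sum()))
```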

You can try our pre-trained semantic maps yourself, using our SemanticMapFeaturizer from the rasa-nlu-examples repo.
