How can I create an Artificial Intelligence system that can learn new words that are outside its initial vocabulary?

Elmarlyn_Calderon · April 9, 2018, 2:25pm

I use cortical.io and other natural language processing systems, but I cannot find a way to make my program learn new words that are outside the initial vocabulary. Learning new words in real time would give my system the advantage of being able to expand its database of concepts and apply them to new situations automatically.

rhyolight · April 9, 2018, 2:29pm

Wouldn’t it need to know something about word etymology to accomplish that? That information, as far as I know, is not part of the Cortical IO fingerprint. So you would have to encode word etymology as part of the term’s representation.

Paul_Lamb · April 9, 2018, 11:42pm

It is fairly easy to create SDRs for new words by defining a new word in terms of, say, the 20 best existing keywords. Take the SDRs for all the keywords and score all the indices (if the same index is in two words, it gets a score of 2, etc.). Then keep the highest-scoring indices, up to 2% of the total array size (using a random tie breaker), and discard the rest. The resulting indices make up your new SDR.
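A minimal Python sketch of the scoring step described above, not anyone's actual implementation. It assumes each SDR is represented as a collection of active indices; the function name `sdr_from_keywords` and its parameters are made up for illustration.

```python
import random
from collections import Counter


def sdr_from_keywords(keyword_sdrs, array_size, sparsity=0.02, seed=None):
    """Build an SDR for a new word from the SDRs of its defining keywords.

    keyword_sdrs: iterable of collections of active indices (assumed format).
    array_size:   total number of bits in an SDR.
    sparsity:     fraction of bits to keep active (2% as in the post).
    """
    rng = random.Random(seed)

    # Score each index by how many keyword SDRs contain it.
    scores = Counter()
    for sdr in keyword_sdrs:
        scores.update(set(sdr))

    # Number of active bits to keep in the new SDR.
    target = max(1, round(array_size * sparsity))

    # Highest score first; a random secondary key breaks ties.
    ranked = sorted(scores, key=lambda i: (-scores[i], rng.random()))
    return sorted(ranked[:target])


if __name__ == "__main__":
    keywords = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
    # Index 3 appears in all three keywords, so it is always kept.
    print(sdr_from_keywords(keywords, array_size=100, seed=0))
```

If the keywords together activate fewer indices than the target, the result is simply sparser than 2%.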

BTW, if you need an online learning system, you’ll need to change the semantic folding process a bit. As I understand cortical.io’s implementation, it requires knowing all source text ahead of time. I used eligibility traces to solve that problem.

Elmarlyn_Calderon · April 10, 2018, 12:03am

eligibility traces?

Paul_Lamb · April 10, 2018, 3:19am

This is a concept from reinforcement learning (Google “RL backward TD”). A common way to visualize the concept is an image of nodes and a megaphone, where the nodes depict streaming inputs and the current input is being shouted backwards in time. The closer an input is to the megaphone, the louder it hears the shout.

The basic idea here is that inputs which are nearer to a given input are more likely to have a cause/effect relationship than inputs which are further away. We assume word semantics are based largely on causality (which cortical.io’s algorithm does as well), such that words which appear near to each other often share semantics.

For example, consider the sequence A -> B -> C -> D. When forming associations, C will have more weight on D than B does, and B will in turn have more weight than A. If later we see something like X -> C -> Y -> D, the semantics between C and D can be adjusted to be a tiny bit closer. Rinse and repeat.

Over time, inputs which appear near to each other more often share more semantics than inputs which appear near to each other less often. Inputs that randomly appear near to each other simply generate noise that is overwritten by subsequent random noise. Semantic similarities only coalesce between inputs that appear near to each other consistently.
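To make the idea concrete, here is a rough Python sketch of decaying traces over a token stream. It is a simplified illustration, not the implementation mentioned above and not full TD(λ): the class name `TraceLearner`, the decay and learning-rate values, and the pairwise weight table are all assumptions.

```python
from collections import defaultdict


class TraceLearner:
    """Online association learner using decaying eligibility traces."""

    def __init__(self, decay=0.5, learning_rate=0.1, min_trace=0.01):
        self.decay = decay                  # how fast older inputs fade
        self.learning_rate = learning_rate  # size of each adjustment
        self.min_trace = min_trace          # traces below this are dropped
        self.traces = {}                    # token -> current eligibility
        self.weights = defaultdict(float)   # (earlier, later) -> strength

    def observe(self, token):
        # The current input "shouts backwards": every earlier input is
        # credited in proportion to its remaining trace.
        for earlier, trace in self.traces.items():
            if earlier != token:
                self.weights[(earlier, token)] += self.learning_rate * trace

        # Fade all existing traces, dropping those that are negligible.
        self.traces = {
            t: e * self.decay
            for t, e in self.traces.items()
            if e * self.decay >= self.min_trace
        }
        # The current input becomes fully eligible for the next inputs.
        self.traces[token] = 1.0

    def reset(self):
        """Clear traces between unrelated streams; learned weights remain."""
        self.traces.clear()


if __name__ == "__main__":
    learner = TraceLearner()
    for tok in ["A", "B", "C", "D"]:
        learner.observe(tok)
    learner.reset()
    for tok in ["X", "C", "Y", "D"]:
        learner.observe(tok)
    for pair in [("A", "D"), ("B", "D"), ("C", "D")]:
        print(pair, learner.weights[pair])
```

After the first sequence, the C-D weight is larger than B-D, which is larger than A-D; the second sequence nudges C-D up again, while one-off neighbours such as X and Y only pick up small weights.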
