Creating a support chat AI


#1

I’ve been working on an implementation of semantic folding, and I have reached the point where I can generate word SDRs with proper semantics encoded in them (crawling Wikipedia for the input). The process takes days and eats up a ton of disk space for caching while it is running, but it works. I’ve compared SDR operations on my SDRs with the same operations on cortical.io’s word fingerprints, and I get a similar level of usability from them. I am still struggling with topology (the “folding” part of semantic folding), but thought I would build an application that could make use of what I have so far.

The idea for this application will be a “support chat” type of AI, which starts by asking “How can I help you today?”. The user is free to type in whatever they want. The application would then do some type of SDR comparisons to read the semantics of what the user types in, and use that to determine what the user wants to do from a list of possible actions.

My first thought is to create a tree structure in which the outer nodes represent goals. Top level might be “Report a bug”, “Install the application”, and “Leave feedback”. Next level under “Install the application” might be an OS selection: “Windows”, “Mac”, and “Linux”. And so on down to the actual goal.

Each element in the tree structure would have an associated question that the system could ask. For example, “Which OS will you be installing the application on?” It would try to answer all of the questions itself based on the semantics of what the user typed, using an overlap threshold. For any question whose best match falls under the threshold, the tool would ask the user. The semantics of what the user types in answer to that question would then be used to answer more questions. Once all questions necessary to reach a goal have been answered, the system will perform that action (show a user guide, etc.).
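A minimal sketch of the tree walk described above, under stated assumptions: SDRs are modeled as Python sets of active bit indices, the toy SDR values are made up, and `Node`, `encode`, `resolve`, the raw-overlap threshold of 2, and the `max_asks` limit are all hypothetical names and choices for illustration, not the actual implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

SDR = Set[int]  # assumption: an SDR is the set of its active bit indices

@dataclass
class Node:
    label: str
    sdr: SDR                     # union of keyword SDRs for this option
    question: str = ""           # asked when no child clears the threshold
    children: List["Node"] = field(default_factory=list)

# Toy, hand-made word SDRs (hypothetical values).
TOY: Dict[str, SDR] = {
    "install": {1, 2, 3},
    "windows": {10, 11, 12},
    "mac": {13, 14, 15},
    "bug": {20, 21, 22},
}

def encode(text: str) -> SDR:
    """Union of the SDRs of every known word in the text."""
    out: SDR = set()
    for word in text.lower().split():
        out |= TOY.get(word, set())
    return out

def resolve(root: Node, context: SDR, encode: Callable[[str], SDR],
            ask: Callable[[str], str], threshold: int = 2,
            max_asks: int = 3) -> Node:
    """Walk down the tree, asking the user only when overlap is too low."""
    node, asks = root, 0
    while node.children:
        best = max(node.children, key=lambda c: len(c.sdr & context))
        if len(best.sdr & context) >= threshold:
            node = best          # the system answered this question itself
            continue
        if asks >= max_asks:
            raise LookupError(f"could not resolve {node.label!r}")
        # Fold the user's reply into the context so it can answer later questions too.
        context = context | encode(ask(node.question))
        asks += 1
    return node

tree = Node("root", set(), "How can I help you today?", [
    Node("Install the application", TOY["install"],
         "Which OS will you be installing the application on?", [
             Node("Windows guide", TOY["windows"]),
             Node("Mac guide", TOY["mac"]),
         ]),
    Node("Report a bug", TOY["bug"]),
])

asked: List[str] = []

def scripted_ask(question: str) -> str:
    """Stand-in for a real chat prompt: records the question, replies 'mac'."""
    asked.append(question)
    return "mac"

goal = resolve(tree, encode("I want to install it"), encode, scripted_ask)
print(goal.label)  # Mac guide
```

Here the first message is enough to pick “Install the application”, but it says nothing about the OS, so only that one question is asked.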

Thought I would post here to get some ideas and feedback from the community, and to post my progress on the application. Stay tuned for more info!


#2

Do you mean you make an SDR from the answer, and compare that bit by bit with the SDRs from several standard answers?


#3

Correct, the answers to the questions will each have an SDR representation (basically a union of the important keywords). Each of these then gets an overlap score against the SDR representing the union of the word SDRs for what the user typed in.
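The union-and-overlap comparison can be sketched roughly as follows. This assumes SDRs are stored as sets of active bit indices and uses made-up toy values; normalizing the overlap by the smaller SDR is an illustrative choice, not something stated in this thread.

```python
from typing import Dict, Iterable, Set

SDR = Set[int]  # assumption: an SDR is the set of its active bit indices

# Toy, hand-made word SDRs (hypothetical values).
TOY_SDRS: Dict[str, SDR] = {
    "install": {1, 2, 3, 4},
    "setup": {2, 3, 4, 5},
    "windows": {10, 11, 12},
    "bug": {20, 21, 22},
}

def union_sdr(words: Iterable[str], word_sdrs: Dict[str, SDR]) -> SDR:
    """Union the SDRs of all known words; unknown words are skipped."""
    result: SDR = set()
    for word in words:
        result |= word_sdrs.get(word.lower(), set())
    return result

def overlap_score(a: SDR, b: SDR) -> float:
    """Shared active bits as a fraction of the smaller SDR (assumed normalization)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

# Keyword union for one candidate answer, and the union of what the user typed.
answer = union_sdr(["install", "windows"], TOY_SDRS)
typed = union_sdr("please setup on windows".split(), TOY_SDRS)

score = overlap_score(answer, typed)
unrelated = overlap_score(union_sdr(["bug"], TOY_SDRS), typed)
print(round(score, 3), unrelated)  # 0.857 0.0
```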


#4

Are you trying to use HTM sequence memory as well or is it enough if the AI can only deal with user input where the order of the words doesn’t matter?


#5

Possibly, but that comes with the challenge of saying something in a slightly different way having a big impact on the representation.

I’ll probably explore distal input from something like part of speech (noun, verb, adjective, etc.), and use temporal pooling to form the representations. Basically, pairs of a feature (word) and a location (part of speech) representing an object (sentence).


#6

This sounds promising.


#7

Sounds cool. Have a look at question answering in traditional machine learning; you might find some interesting ideas there. In particular, I was reminded of memory networks [1], in which answers are arrived at by recurrent similarity computation. From a review that summarized it:

"Specifically, a learned, dense feature-vector representation of an input query (e.g., ‘where is the milk?’) is used to retrieve the sentence with the most similar feature vector in the database (e.g., ‘Joe left the milk’): a combined feature representation of the initial query and retrieved sentence is then used to identify similar sentences earlier in the story (‘Joe traveled to the office’); this process iterates until a response is emitted by the network (‘the office’). "


[1] Weston, Jason, Sumit Chopra, and Antoine Bordes. “Memory networks.” arXiv preprint arXiv:1410.3916 (2014).


#8

Thinking about this a little more, simply classifying the words at a high level like noun vs verb may not be much of a benefit. It would need to dig into how the words are used in the sentence. For example, take this silly example:

Question which the AI needs to answer: “Who got hit?”
What the user typed: “OMG, Bob just smacked Tom upside the head!”

Since Bob, Tom, and head are all nouns, that classification alone would not help to answer the question. On the other hand, if I were to classify Bob as the subject of the sentence, and Tom as the object of the sentence, and “head” as the object of a preposition, that plus the semantic similarity between verbs “hit” and “smacked” would be enough to answer the question.
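As a rough sketch of that idea: the grammatical roles below are hand-labeled (a real system would need a parser to supply them), the word SDRs are made-up sets of bit indices, and `who_received` and its `min_overlap` of 3 are hypothetical names and values chosen for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

SDR = Set[int]  # assumption: an SDR is the set of its active bit indices

# Toy word SDRs (hypothetical values): "hit" and "smacked" share most bits.
WORD_SDRS: Dict[str, SDR] = {
    "hit": {1, 2, 3, 4},
    "smacked": {2, 3, 4, 5},
    "hugged": {30, 31, 32, 33},
}

# (word, grammatical role) pairs; hand-labeled here instead of parsed.
Parsed = List[Tuple[str, str]]

def who_received(action: str, sentence: Parsed,
                 min_overlap: int = 3) -> Optional[str]:
    """Return the direct object if a verb in the sentence is close to `action`."""
    action_sdr = WORD_SDRS.get(action.lower(), set())
    objects = [word for word, role in sentence if role == "object"]
    for word, role in sentence:
        if role != "verb":
            continue
        verb_sdr = WORD_SDRS.get(word.lower(), set())
        if len(verb_sdr & action_sdr) >= min_overlap and objects:
            return objects[0]
    return None

sentence: Parsed = [
    ("Bob", "subject"),
    ("smacked", "verb"),
    ("Tom", "object"),
    ("head", "prep_object"),
]

print(who_received("hit", sentence))     # Tom
print(who_received("hugged", sentence))  # None
```

The noun/verb labels alone could not separate Bob, Tom, and head; the role labels plus the verb overlap do.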


#9

Quora has a problem that looks a bit like yours. They want an algorithm that checks whether two questions are the same. Questions and answers are different, but maybe you could use their dataset to test your own program. You can find it at: https://data.quora.com/First-Quora-Dataset-Release-Question-Pairs


#10

The main purpose for my project is not necessarily to solve a problem, but to utilize word semantics in a useful application. Does Quora utilize SDRs, or is it more of a lookup table type of framework?

EDIT: Actually, I get your point – this could be used to see how accurate the tool is at determining if things are the same.
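A minimal sketch of that kind of check, using a few made-up labeled pairs rather than the real dataset. The toy SDRs, the Jaccard-style score, and the names `encode` and `accuracy` are assumptions for illustration only.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

SDR = Set[int]  # assumption: an SDR is the set of its active bit indices

# Toy, hand-made word SDRs (hypothetical values).
TOY: Dict[str, SDR] = {
    "install": {1, 2, 3, 4},
    "setup": {2, 3, 4, 5},
    "windows": {10, 11, 12},
    "mac": {13, 14, 15},
    "bug": {20, 21, 22},
    "report": {21, 22, 23},
}

def encode(text: str) -> SDR:
    """Union of the SDRs of every known word in the text."""
    out: SDR = set()
    for word in text.lower().split():
        out |= TOY.get(word, set())
    return out

def accuracy(pairs: Iterable[Tuple[str, str, bool]],
             encode: Callable[[str], SDR], threshold: float) -> float:
    """Fraction of pairs where 'score >= threshold' agrees with the label."""
    correct = total = 0
    for q1, q2, same in pairs:
        a, b = encode(q1), encode(q2)
        score = len(a & b) / len(a | b) if (a | b) else 0.0
        correct += (score >= threshold) == same
        total += 1
    return correct / total if total else 0.0

# Made-up (question 1, question 2, same meaning?) examples.
PAIRS = [
    ("how do I install on windows", "windows setup steps", True),
    ("how do I report a bug", "install on mac", False),
    ("install on mac", "install on windows", False),
]

strict = accuracy(PAIRS, encode, threshold=0.5)
loose = accuracy(PAIRS, encode, threshold=0.3)
print(strict, round(loose, 3))  # 1.0 0.667
```

Sweeping the threshold this way on labeled pairs gives a rough measure of how well the overlap score separates “same” from “different”.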


#11

If you can build something with the help of HTM and/or Cortical.io, that is able to compare sentences and decide if they have the same semantic meaning, you solve a lot of problems.


#12

Sounds great! It would be even better if you could combine semantic folding with the HTM pattern sequence learner in a way that lets it predict the best AI answer.
I’m starting my Master’s thesis on comparing semantic folding with other state-of-the-art approaches (word2vec, fastText) on several NLP tasks, so it’s good to hear that someone else is working on semantic folding too.


#13

To me this is more of an “object recognition” (SMI) problem than a “sequence memory” (TM) problem. Language has a structure to it – of course order is important, but I can say grammatically the same thing in many different orders. Additionally, I can insert modifiers for expressiveness which would completely change a “sequence” in TM but would merely add additional “features” in SMI.

Anyway, I will give this some proper exploration – I have a feeling that the simple “word bag” approach will not scale very well.


#14

I agree here. I think this round of research is going to enable some cool things in the NLP space. We have to start thinking at the “object” and “thing” and “idea” level to take representations further. To understand when someone says “dog” in a sentence, you must understand the basic idea of “dog” and have a representation for “dog”. That representation comes from many different sensors. Each sensor contributes to the idea of “dog”, and when you hear a woof!, the auditory sense triggers the idea of dog to emerge because those neurons in other regions start becoming predictive.


#15

I had a chance to participate in a Hackathon this week, which was judged by some of the top brass at Ericsson, including the senior VP and head of North America. My team’s hack involved an implementation of semantic folding to learn domain-specific vocabulary, and a support chat bot with some of the ideas discussed here. We won the award for coolest hack. It was also a great opportunity for me to plug cortical.io and Numenta :)


#17

Maybe I’m missing context, but what (if anything) would you be using for that initial part-of-speech (referred to as POS) classification?

If you wouldn’t mind going down the rabbit hole, I’ve used spacy.io recently (uses deep learning) for a client’s chatbot, and it performs fairly well at classifying input paragraphs.

A pooler could then act as the semantic glue on top of the POS system? Not terribly biological, but it would be an interesting design experiment. I’d be curious as to the results of such a system.


#18

Nothing actually. This was just my initial reaction after the thought occurred to me that sentences are more like objects than they are like sequences.

This is ultimately the route I went. The grammatical use of a word in the sentence provides a lot of capability. These uses make up “locations” and the words make up “features”. This gives a sentence a useful fingerprint that can be compared with other semantically similar sentences.
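One way such a fingerprint could be sketched, purely as an illustration: each active bit of a word SDR is tagged with the word's grammatical role, and the sentence fingerprint is the set of those (role, bit) pairs. The binding scheme, the toy SDR values, the hand-labeled roles, and the Jaccard comparison are all assumptions, not the method actually used.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Toy, hand-made word SDRs (hypothetical values); "smacked" and "hit" overlap.
WORD_SDRS: Dict[str, Set[int]] = {
    "bob": {1, 2, 3},
    "tom": {4, 5, 6},
    "smacked": {10, 11, 12},
    "hit": {11, 12, 13},
}

Parsed = List[Tuple[str, str]]            # (word, grammatical role)
Fingerprint = FrozenSet[Tuple[str, int]]  # (role, active bit) pairs

def sentence_fingerprint(parsed: Parsed) -> Fingerprint:
    """Bind each word 'feature' to its role 'location' and pool the pairs."""
    return frozenset(
        (role, bit)
        for word, role in parsed
        for bit in WORD_SDRS.get(word.lower(), set())
    )

def similarity(a: Fingerprint, b: Fingerprint) -> float:
    """Jaccard similarity between two sentence fingerprints."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

bob_smacked_tom = sentence_fingerprint(
    [("Bob", "subject"), ("smacked", "verb"), ("Tom", "object")])
bob_hit_tom = sentence_fingerprint(
    [("Bob", "subject"), ("hit", "verb"), ("Tom", "object")])
tom_smacked_bob = sentence_fingerprint(
    [("Tom", "subject"), ("smacked", "verb"), ("Bob", "object")])

print(similarity(bob_smacked_tom, bob_hit_tom))      # 0.8
print(similarity(bob_smacked_tom, tom_smacked_bob))  # 0.2
```

A plain bag-of-words union would score “Bob smacked Tom” and “Tom smacked Bob” as identical; tagging bits with roles keeps them apart while still matching the paraphrase with “hit”.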


#19

Another advantage of having multi-sensory input for a concept is that, hopefully, it would allow flexible learning using fewer example objects, right? For a deep learning object classification system, it takes tens of thousands of examples to make a robust system… millions of examples are even better, but still produce brittle systems that can be easily fooled by bit-flipping. It’s also difficult to implement the different DL architecture types reliably together, in addition to their being memory- and processing-intensive.

For practical AI, we’d need systems that use a lot less energy, memory, and compute. Not everything can (or should) be cloud hosted or controlled by a single powerful entity. In my mind, flexible, noise-tolerant HTM-based systems could be a potential alternative to that.