Question about linking individual TMs

@Andrew_Stephan I believe you can get better classification results if you use MNIST images together with the corresponding words ("one", "two", etc.). Here you have two types of data and can encode them with two different encoders.
That way you fuse the invariant text with all possible image appearances, which should give a better result.
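Roughly like this minimal sketch (plain NumPy; `encode_image` and `encode_word` are hypothetical stand-ins for real encoders, not any library's API):

```python
import hashlib
import numpy as np

def encode_image(img, out_bits=1024, sparsity=0.02):
    # Hypothetical image encoder: keep the strongest pixels so the
    # output SDR has a fixed number of active bits.
    flat = img.flatten().astype(float)
    n_active = int(out_bits * sparsity)        # ~20 active bits
    sdr = np.zeros(out_bits, dtype=np.uint8)
    sdr[np.argsort(flat)[-n_active:]] = 1      # 784 pixel indices fit in 1024 bits
    return sdr

def encode_word(word, out_bits=1024, sparsity=0.02):
    # Hypothetical word encoder: a fixed random SDR per word, seeded
    # deterministically from the word so repeats are identical.
    seed = int(hashlib.sha256(word.encode()).hexdigest(), 16) % (2**32)
    rng = np.random.default_rng(seed)
    sdr = np.zeros(out_bits, dtype=np.uint8)
    sdr[rng.choice(out_bits, int(out_bits * sparsity), replace=False)] = 1
    return sdr

def fuse(img, word):
    # Fusion by concatenation: downstream, one SDR whose first half is
    # the image encoding and whose second half is the word encoding.
    return np.concatenate([encode_image(img), encode_word(word)])

fused = fuse(np.random.rand(28, 28), "one")    # stand-in for an MNIST digit
print(fused.shape, fused.sum())                # (2048,) with ~40 active bits
```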

This was actually my intention at first, but when I finished the GloVe word-vector encoder I discovered that the words "one", "two", etc. are nearly identical according to the GloVe metric. That makes it a poor representation for a task like this, where every word falls into the single narrow category of "the first 10 digits" and I never need to encode anything else. I may come back to this after finding a different word encoder that handles narrow sets better.
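For anyone who wants to check this themselves, here is a minimal sketch using gensim's downloadable pre-trained GloVe vectors (I'm assuming the `glove-wiki-gigaword-50` dataset; any of the GloVe models would do):

```python
import gensim.downloader as api

glove = api.load("glove-wiki-gigaword-50")   # downloads on first use

digits = ["one", "two", "three", "four", "five"]
for i, a in enumerate(digits):
    for b in digits[i + 1:]:
        # Pairwise cosine similarity between digit-word vectors.
        print(f"{a} / {b}: cosine similarity = {glove.similarity(a, b):.3f}")
```

The digit words score far higher against each other than typical unrelated word pairs do, which is exactly what makes the representation too narrow here.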

I did try something similar to that: combining the invariant scalar label encoding with the images. I haven't done much testing on its ability to recognize new digits, though, since what I'm actually interested in is whether the system can take the scalar label and produce a matching image. Image generation rather than image recognition. (See my last post: Input-invariant representations )
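For reference, the scalar label encoding I mean is along the lines of the classic HTM-style scalar encoder; a minimal sketch, with illustrative parameters rather than my exact ones:

```python
import numpy as np

def encode_label(label, n_labels=10, w=21, n_bits=100):
    # HTM-style scalar encoder: a contiguous run of w active bits whose
    # start position shifts with the label, so neighbouring labels share
    # bits while distant labels share none.
    sdr = np.zeros(n_bits, dtype=np.uint8)
    start = int(label * (n_bits - w) / (n_labels - 1))
    sdr[start:start + w] = 1
    return sdr

a, b, c = (encode_label(x) for x in (3, 4, 9))
print((a & b).sum(), (a & c).sum())   # neighbours overlap; 3 and 9 do not
```

Whether adjacent digits should overlap at all is a design choice; a pure category encoder with disjoint blocks would treat every label as equally dissimilar.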