Encoder for convert image to binary matrix

breznak · August 7, 2020, 7:16am

This is a very good point. The encoder we have so far is more low-level and imho only models the nerve signal at a single saccade.
It is based on cv2.bioinspired.Retina which models peripheral and foveal vision, sending spatio-temporal signals.

As you mention, and as we also have in our TODO, what is missing is the saccadic movement and actually “seeing the bigger picture”, where we need to describe an object as a (relative) set of features, i.e. “2 eyes, a nose, mouth → face”.

It is interesting to note that modern “deep learning vision models” have converged on this from the practical side: the networks train the feature extraction and the localization module (which draws a bounding box where the feature is) at the same time.

My understanding, or rather a simplified model, of how saccades could be implemented (in HTM) is as a “cropping problem”. Say we train on MNIST, but then show the digits on a larger background.
A simple model would:

crop a portion of the image
ask SP “do you recognize this?” (classify SP’s SDR output)
(randomly) repeat until found.
That is the “significant points” recognition part.
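
The cropping loop above could be sketched roughly like this. It is only a minimal illustration in plain NumPy, not our encoder code: `random_crop_search` and `make_overlap_recognizer` are hypothetical names, and the overlap check merely stands in for “run the crop through the SP and classify its SDR output”.

```python
import numpy as np

def random_crop_search(image, crop_shape, recognize, max_tries=10_000, seed=0):
    """Randomly crop windows from `image` until `recognize(crop)` is True.

    `recognize` is a hypothetical callback standing in for
    "feed the crop to the SP and classify the resulting SDR".
    Returns the (top, left) corner of the first recognized crop, or None.
    """
    rng = np.random.default_rng(seed)
    h, w = crop_shape
    H, W = image.shape
    for _ in range(max_tries):
        # Pick a random window position that fits inside the image.
        top = int(rng.integers(0, H - h + 1))
        left = int(rng.integers(0, W - w + 1))
        crop = image[top:top + h, left:left + w]
        if recognize(crop):
            return top, left
    return None

def make_overlap_recognizer(template, threshold):
    """Toy stand-in for the SP + classifier: count the bits shared with a template."""
    def recognize(crop):
        return int(np.logical_and(crop, template).sum()) >= threshold
    return recognize
```

For example, with a 4x4 “digit” pasted somewhere on an 8x8 background, `random_crop_search(canvas, (4, 4), make_overlap_recognizer(template, 16))` keeps sampling windows until one fully overlaps the template. A real version would probably want something smarter than uniform random sampling, but that is the basic idea.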

The other part would be

describing an object as a set of these features (simply a union of SDRs?)
encoding relative positions between the features (can the GridCell encoder do that?); this is what CapsuleNets do.
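
To make these two steps a bit more concrete, here is a rough NumPy sketch. Both helpers (`union_sdr`, `relative_offsets`) are hypothetical, and the plain (row, column) offsets are just a placeholder for whatever a GridCell-style encoding of relative position would produce; whether that encoder is suitable is still the open question above.

```python
import numpy as np

def union_sdr(sdrs):
    """Describe an object as the bitwise OR (union) of its feature SDRs."""
    return np.logical_or.reduce(np.asarray(sdrs, dtype=bool), axis=0)

def relative_offsets(positions):
    """Pairwise (d_row, d_col) offsets between named feature locations.

    `positions` maps a feature name to its (row, col) in the image.
    The raw offsets are an assumption standing in for a proper
    encoding of relative position.
    """
    names = sorted(positions)
    return {
        (a, b): (positions[b][0] - positions[a][0],
                 positions[b][1] - positions[a][1])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }
```

The offsets do not change when the whole object is shifted around the image, which is the property we would want from the “relative” part of the description.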
