Connecting Hinton's capsules to Numenta research

rhyolight · June 17, 2019, 2:39pm

This morning, Marcus is planning on discussing capsules on the whiteboard, connecting them to our work.

Here are 3 Hinton capsules papers and 1 talk.

2011 Paper: http://www.cs.toronto.edu/~hinton/absps/transauto6.pdf
2017 Paper: http://www.cs.toronto.edu/~hinton/absps/DynamicRouting.pdf
2018 Paper: http://www.cs.toronto.edu/~hinton/absps/EMcapsules.pdf
2014 Talk: https://www.youtube.com/watch?v=rTawFwUvnLE


Watch live at 10AM PDT on https://www.twitch.tv/rhyolight_. I will post the video here afterwards as always.

rhyolight · June 17, 2019, 5:03pm

mpihlstrom · June 18, 2019, 6:43am

Nice to see this addressed by you guys. You didn’t mention what Hinton calls “inverse graphics”; I think it’s covered in the linked talk. Hinton seems to have the idea of objects being represented in some archetypal form by incorporating the spatial prior into the network architecture. The 4x4 dimensionality of the matrices is surely inspired by the affine transformation matrices of 3D space, though it’s never motivated as such anywhere in the paper. I suppose it’s a more constrained computation than what you propose is possible with the grid-cell-like mechanisms in the minicolumns. Anyway, there are many similarities, and I think it’s exciting to see this kind of bridging between your approaches and fields, with no less than Hinton on the other end.
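To make the affine-matrix reading concrete, here is a minimal sketch, assuming the 4x4 pose matrices behave like homogeneous transforms of 3D space (that is the interpretation suggested above, not something taken from the paper's text). All helper names and numeric values are hypothetical. It shows that a fixed part-to-object transform predicts the part's pose under any viewpoint, which is the sense in which the part-whole relationship is viewpoint-invariant.

```python
import math

def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def rot_z(theta):
    """Homogeneous 4x4 rotation about the z axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def translate(x, y, z):
    """Homogeneous 4x4 translation."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]

def close(a, b, tol=1e-9):
    """Element-wise comparison of two 4x4 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol
               for i in range(4) for j in range(4))

# Hypothetical poses: the object relative to the viewer, and the part
# relative to the object. The second one is the fixed, learned relation.
object_pose = matmul(translate(2.0, 0.0, 5.0), rot_z(0.3))
part_in_object = translate(0.0, 1.0, 0.0)
part_pose = matmul(object_pose, part_in_object)

# Move the viewer: every pose in the scene is premultiplied by the same matrix.
viewpoint_change = matmul(translate(-1.0, 0.5, 0.0), rot_z(1.1))
new_object_pose = matmul(viewpoint_change, object_pose)

# The unchanged part_in_object still predicts the part's new pose.
predicted_part_pose = matmul(new_object_pose, part_in_object)
observed_part_pose = matmul(viewpoint_change, part_pose)
relation_is_invariant = close(predicted_part_pose, observed_part_pose)
print(relation_is_invariant)
```

Only the object pose changes with the viewpoint; the part-to-object matrix is reused as is.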

bkaz · June 18, 2019, 11:27am

Just came out: https://arxiv.org/abs/1906.06818
“An object can be seen as a geometrically organized set of interrelated parts. A system that makes explicit use of these geometric relationships to recognize objects should be naturally robust to changes in viewpoint, because the intrinsic geometric relationships are viewpoint-invariant. We describe an unsupervised version of capsule networks, in which a neural encoder, which looks at all of the parts, is used to infer the presence and poses of object capsules. The encoder is trained by backpropagating through a decoder, which predicts the pose of each already discovered part using a mixture of pose predictions. The parts are discovered directly from an image, in a similar manner, by using a neural encoder, which infers parts and their affine transformations. The corresponding decoder models each image pixel as a mixture of predictions made by affine-transformed parts. We learn object- and their part-capsules on unlabeled data, and then cluster the vectors of presences of object capsules. When told the names of these clusters, we achieve state-of-the-art results for unsupervised classification on SVHN (55%) and near state-of-the-art on MNIST (98.5%).”

Paul_Lamb · November 20, 2019, 7:13pm

I’m re-watching a lot of these research meetings again, and I noticed something interesting in this one that I had glossed over the first time through. Marcus talks about it a bit starting around 7:02. Hinton’s 2017 paper discusses it in section 5.

A capsule is able to learn on its own different ways of spanning the space of variations in the way a given digit is drawn (in the case of MNIST). By perturbing these different dimensions, you can see that the capsules learn interesting things like scale, thickness, skew, etc. as well as more abstract distortions.

Personally, I have been focused on the idea of pooling and making associations, but intuitively it seems to me that this extraction of properties/dimensions must also be a core part of what the cortical circuitry is doing.

I know @Bitking has mentioned something related to this as well on a few occasions. For example:

Has anyone given some thought to an HTM-compatible algorithm for this type of extraction process?

pulinagrawal · November 20, 2019, 8:12pm

Exactly, that is what HTM is doing, just unsupervised and the properties/dimensions are allowed to be more abstract and do not have to map to visual properties like scale, skew or thickness.

Paul_Lamb · November 20, 2019, 8:26pm

Could you elaborate? I am not aware of where this functionality currently exists in the HTM algorithms (or in the theories that I have seen discussed so far).

bkaz · November 20, 2019, 8:26pm

I don’t think there is a general way to define “instantiation parameters” in CapsNet; this is mostly application-specific. In my model, they are derived by cross-comparing parameters of input capsules (my patterns).

Also, their “object” is defined as a recurring configuration of different parts. But such recurrence can’t be assumed, it should be derived by cross-comparing relative position among parts of matching objects. Which can only be done after their positions are cross-compared, which is after their objects are cross-compared: two levels above the level that forms initial objects. So, objects formed by positional equivariance would be secondary. But they may be stronger, displacing initial similarity-defined objects as a primary representation of the same parts.

Paul_Lamb · November 20, 2019, 8:34pm

I am talking more from a higher level of abstraction – does the idea of “unsupervised” extraction of different dimensions/spaces of variation among concepts (I do realize there aren’t really discrete “concepts” in HTM, but it is difficult to word this in a way which takes a continuum into account) seem like a core function that the cortical circuitry should be doing? If so, has anyone thought about this from an HTM perspective?

bkaz · November 20, 2019, 8:42pm

As far as I understand, HTM and basic neuronal models only work in a positive fashion, detecting coincidences. Detecting “variation” and “equivariance” would require inverse operations: subtraction, division, etc. I think to do that you have to model deeply coupled neuron-interneuron systems.

Paul_Lamb · November 20, 2019, 9:49pm

This is a good observation. It is clear that basic SDR math supports these types of operations.

For example, I have been working with Cortical IO technology a lot recently, and one thing you can do with a word representation is extract a list of semantic categories it can exist in (which is a similar class of problem to extracting dimensions in the above scenario). You start with the closest match, subtracting those bits, then take the closest match to the resulting representation, subtracting those bits, etc. Repeat until all the bits have been subtracted, and you end up with a nice list of contexts.
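The iterative subtraction described above can be sketched roughly as follows. This is only an illustration with plain Python sets standing in for SDRs; it does not use the Cortical IO API, and the function name, the `min_overlap` parameter, and the toy fingerprints are all hypothetical.

```python
def extract_contexts(word_bits, context_fingerprints, min_overlap=1):
    """Greedily peel contexts off a word representation.

    word_bits: set of active bit indices for the word.
    context_fingerprints: dict mapping context name -> set of bit indices.
    Returns (ordered list of context names, leftover unexplained bits).
    """
    remaining = set(word_bits)
    contexts = []
    while remaining:
        # Closest match = context sharing the most bits with what is left.
        best_name, best_overlap = None, 0
        for name, bits in context_fingerprints.items():
            overlap = len(remaining & bits)
            if overlap > best_overlap:
                best_name, best_overlap = name, overlap
        if best_name is None or best_overlap < min_overlap:
            break  # the leftover bits match no known context well enough
        contexts.append(best_name)
        remaining -= context_fingerprints[best_name]  # subtract those bits
    return contexts, remaining

# Toy data (made up for illustration only).
fingerprints = {
    "fruit": {1, 2, 3, 4},
    "company": {10, 11, 12},
    "music": {20, 21},
}
apple = {1, 2, 3, 4, 10, 11, 12, 20, 99}
found, leftover = extract_contexts(apple, fingerprints)
print(found, leftover)
```

In this toy version, bits that no context explains are returned as leftovers rather than forcing the loop to run until every bit is subtracted.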

Probably need to think about biologically plausible circuitry for executing these types of subtractive operations. My initial thought is that one place this might potentially happen is around the borders of neighboring grids when they encounter each other as waves of activity spread out and re-integrate.

bkaz · November 20, 2019, 10:50pm

I think you are talking about synapse pruning in SDR; for that, anti-Hebbian learning should work fine. I meant deriving grey-scale differences or ratios between specific inputs, which could become an output. Yes, it could be part of grid interactions, where the primary output is input-driven and the secondary / delayed output is driven by lateral inhibition. The problem is that the secondary output won’t be “signed”, so you may need two of them, one for each sign.

Bitking · November 20, 2019, 10:54pm

The signed input is not that far out of line with what is actually in the sensory stream. Both light and dark spots and lines have responsive cells. With this in mind, it is not a stretch to extend that through the processing streams.
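As a rough illustration of carrying a signed difference on two unsigned outputs, one per sign, here is a minimal sketch. The function names and the center/surround framing are assumptions made for the example, loosely modeled on light-responsive and dark-responsive cells; it is not a model of any specific circuit.

```python
def split_signed(center, surround):
    """Encode the signed difference center - surround on two
    non-negative channels: 'on' fires when the center is brighter,
    'off' fires when it is darker."""
    diff = center - surround
    on = max(diff, 0.0)
    off = max(-diff, 0.0)
    return on, off

def recombine(on, off):
    """Recover the signed difference from the two channels."""
    return on - off

bright_spot = split_signed(0.75, 0.25)  # center brighter than surround
dark_spot = split_signed(0.25, 0.75)    # center darker than surround
print(bright_spot, dark_spot)
```

At most one of the two channels is active for any input, so downstream units that only detect positive activity can still tell the two signs apart.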

bkaz · November 20, 2019, 11:19pm

I guess you already worked all that out in your grid models.
