Thanks for the great questions @davidra!
Yes, we can scale arbitrarily deep. Normally we just add layers until the gain in task performance starts to saturate relative to the slowdown from the added layers. The best depth depends on the task; many tasks are fine with just two layers, most are in the 4-6 range, and a few need up to 8 layers. We'll include details of these hyperparameters with each demo when we release them.
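To make that heuristic concrete, here's a minimal sketch of the stop-when-saturated loop; the score curve and the 0.01 cutoff are made up for illustration, and `evaluate` stands in for whatever train-plus-test routine you'd use on your own task:

```python
def pick_depth(evaluate, max_depth=8, min_gain=0.01):
    """Add layers until the marginal gain in task score drops below min_gain."""
    best_score, best_depth = None, 2
    for depth in range(2, max_depth + 1):
        score = evaluate(depth)   # stand-in for train + test at this depth
        if best_score is not None and score - best_score < min_gain:
            break   # saturated: extra layers only add slowdown
        best_score, best_depth = score, depth
    return best_depth

# Toy usage with a made-up score curve that flattens out after a few layers.
print(pick_depth(lambda depth: 1.0 - 0.5 ** depth))
```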
It's a little buried in the paper, but we're really excited about using the Feynman Machine in distributed environments, where the bottom few layers run on a low-power client (phone, tablet, Raspberry Pi) and we can dynamically augment it with a stack of upper layers running in the cloud. Inter-region comms are just SDRs, so it's really robust and performant.
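Because an inter-region message is just an SDR, the over-the-wire payload is tiny: a short list of active-unit indices. A rough sketch of what that serialization could look like (the 128x128 region and ~1% density are example numbers, not anything baked into the library):

```python
import struct
import numpy as np

# Example numbers only: a 128x128 region with roughly 1% of units active.
region_units = 128 * 128
active = np.sort(np.random.choice(region_units, size=region_units // 100, replace=False))

# The SDR travels as a count followed by 16-bit active-unit indices.
payload = struct.pack("<H", len(active)) + active.astype("<u2").tobytes()
print(len(payload), "bytes per region per tick")   # a few hundred bytes
```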
In terms of width, we normally start with a guess of 128 (i.e. 128x128 units per region), and then go bigger or smaller as appropriate (on GPUs we can make pretty big regions). The heuristic is that a region is either too small or big enough, and we see a big drop in capability if we go below that boundary. We should do a paper on depth/width tradeoffs - thanks for the question.
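And a matching sketch of the width heuristic, finding the smallest "big enough" region; again the widths, scores and tolerance are illustrative, and `evaluate` stands in for your own task evaluation:

```python
def smallest_big_enough(evaluate, widths=(32, 64, 96, 128, 192, 256), tol=0.02):
    """Return the smallest region width whose score is within tol of the best.

    This captures the 'too small or big enough' boundary: below some width the
    score collapses, above it extra units buy very little.
    """
    scores = {w: evaluate(w) for w in widths}
    best = max(scores.values())
    return min(w for w, s in scores.items() if s >= best - tol)

# Toy score curve with a hard knee at width 96 (made-up numbers).
print(smallest_big_enough(lambda w: 0.9 if w >= 96 else 0.3))
```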
I believe the regions are stable because of the fixed spatial and temporal sparsity - ahem, density - of the encoder hidden outputs, and because the decoder is used only for learning in the encoder, not in the activation phase. The temporal sparsity is important, and there are a few ways to achieve it, like having activity-dependent decaying biases on each unit, or zeroing the ramping activations after firing. Also, the top-down feedback tends to smooth out the effect of the fast-changing encoder on the downward pass. Finally, the use of local receptive fields and local inhibition tends to limit instability. We've tried a number of connection schemes and rejected the ones that cause runaway activity.
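As a toy illustration of one of those temporal-sparsity mechanisms (the zero-after-firing variant, combined with k-winners-take-all for spatial sparsity), here's a numpy sketch; the sizes and constants are illustrative, not the ones we actually use:

```python
import numpy as np

def encoder_step(activations, inputs, k=16):
    """One toy update: accumulate input, pick k winners, reset the winners.

    Zeroing the ramping activation after a unit fires stops the same units
    winning every timestep, which is what keeps the codes temporally sparse.
    """
    activations = activations + inputs                  # ramping activation
    winners = np.argpartition(activations, -k)[-k:]     # spatial sparsity: k winners
    sdr = np.zeros_like(activations)
    sdr[winners] = 1.0
    activations[winners] = 0.0                          # temporal sparsity: reset after firing
    return sdr, activations

rng = np.random.default_rng(0)
acts = np.zeros(1024)
for t in range(5):
    sdr, acts = encoder_step(acts, rng.random(1024))
    print(t, int(sdr.sum()), "active units")
```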
I'll ask @ericlaukien to answer this one in depth, but briefly, it's not meant to be an exhaustive demonstration of performance on MNIST itself; it's a demo of visual odd-man-out detection, using failure to model the spatiotemporal evolution of the stimulus to indicate a visual anomaly.
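Mechanically, the odd-man-out signal is just thresholded prediction error; here's a sketch of that last step only (the hierarchy that produces the predictions isn't shown, and the 3-sigma threshold is an arbitrary choice for the example):

```python
import numpy as np

def anomaly_flags(predictions, frames, z=3.0):
    """Flag frames whose prediction error sits far above the running average."""
    errors = np.mean((np.asarray(predictions) - np.asarray(frames)) ** 2, axis=(1, 2))
    return errors > errors.mean() + z * (errors.std() + 1e-8)

# Toy usage: 100 frames the model predicts well, plus one it can't.
frames = np.random.rand(101, 28, 28)
preds = frames + 0.01 * np.random.randn(101, 28, 28)
preds[57] = np.random.rand(28, 28)                      # the odd one out
print(np.where(anomaly_flags(preds, frames))[0])        # -> [57]
```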
The training time usually refers to the network running and learning at full-speed (hundreds of fps on a consumer GPU), not the framerate in the videos, which is slowed right down so we humans can see what's going on.
We test each iteration on more than a dozen tasks to see how it rates relative to previous designs. Some of the tasks are proxies for standard benchmarks, reduced in size to speed up the workflow; others are full-size. Most designs we've tried are either really poor at most things, or pretty good at most things and great at some subset. We'll be pushing out updates about performance on standard tasks over the next few weeks and months.
I understand that point, but we really like the name for many reasons. I genuinely believe that if Feynman had lived, he would have discovered this connection years ago, as he spent the last decade of his life working with Hopfield, Wolfram and others on cellular automata, dynamical systems and neural networks. His sister, Joan, who used Takens' Theorem in one of her seminal papers, would surely have been a part of it too - if so, we might have to call it the Feynman-Feynman Machine. She's still working after retiring a few years ago, so perhaps we can ask her.
Yes, one of @ericlaukien's main quality criteria is whether something takes seconds or minutes to train. Any more than a few minutes usually indicates you've gone wrong somewhere. As I said earlier, on many tasks you need quite a big layer size, because you want very sparse SDRs but still many units on, as temporal sparsity requires more active units per SDR. And the only way to run big layers (512x512 or bigger) is on a GPU; the CPU versions are hundreds of times slower.
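The arithmetic behind that is simple enough to show; the densities here are illustrative, not the settings we ship:

```python
# Illustrative only: active units per SDR for a given region size and density.
for side in (128, 256, 512):
    for density in (0.01, 0.02, 0.05):
        print(f"{side}x{side} at {density:.0%} density -> {int(side * side * density)} active units")
```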
@riccro is looking at Matlab integration, so people will be able to plug this into their current workflow.
Thanks again for the great questions and comments. It's a pleasure to be able to talk details now after several months of radio silence!