I’m sure this has been discussed before, but it’s rather difficult to find one’s way around this forum, especially if you’re new like me. So I have two basic questions, both of which have probably come up already.
My first question is about the Spatial Pooler. Given that HTM aims to mimic the brain closely, why do columns have such a wide receptive field? I’ve noticed that each column is connected to a random subsample of the input space. I understand that this is partly to make SDR generation easy: encoders usually don’t produce SDRs themselves, so the SDRs are created at the level of the spatial pooler. My objection is to the encoders. I think they should act more like convolutional neural networks, in the sense that an encoder should screen the input holistically and create the SDR itself. It feels as if these encoders were designed to fit text processing, while in fact most of our knowledge is in the domain of visual processing.
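To make the contrast concrete, here is a rough sketch of the two kinds of connectivity I mean: a column sampling from the whole input space versus a column sampling only from a local window, which is closer to a CNN-style receptive field. The function names, the parameters (`potential_pct`, `radius`) and the sizes are my own assumptions for illustration, not any library’s API.

```python
import random

def global_pool(n_inputs, potential_pct, rng):
    """Potential inputs sampled uniformly from the WHOLE input space."""
    k = int(n_inputs * potential_pct)
    return sorted(rng.sample(range(n_inputs), k))

def local_pool(n_inputs, center, radius, potential_pct, rng):
    """Potential inputs sampled only from a window around `center`."""
    lo = max(0, center - radius)
    hi = min(n_inputs, center + radius + 1)
    window = range(lo, hi)
    k = int(len(window) * potential_pct)
    return sorted(rng.sample(window, k))

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Wide receptive field: 5% of all 1000 input bits, from anywhere.
wide = global_pool(n_inputs=1000, potential_pct=0.05, rng=rng)

# Narrow receptive field: 50% of the bits within +/- 50 of position 500.
narrow = local_pool(n_inputs=1000, center=500, radius=50,
                    potential_pct=0.5, rng=rng)

print("wide span:  ", wide[0], "to", wide[-1])
print("narrow span:", narrow[0], "to", narrow[-1])
```

Both columns end up with the same number of potential inputs here; the only difference is whether those inputs are spread over the whole input or confined to a neighbourhood.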
My second question is about temporal memory. I get the sequence memory (and the beautiful use of SDRs within the temporal memory itself), and I also see how it can be interpreted in terms of STDP, or some modified form of STDP, in the brain. But how about adding a spatial pooler and temporal pooler on top of the first layer, making it a real hierarchy? After all, in theory deep learning is so powerful because it captures hierarchical features. In a sense, this would mean adding V2, V3, and so on. Has there been an attempt to tackle this, and how hard or computationally expensive would it be? (Maybe it could be done by connecting each column of a second layer to a subsample of the columns in the first layer, and doing the learning there.)
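Here is a minimal sketch of the stacking I have in mind, just to show the wiring. It is not an HTM implementation: there is no permanence learning, no boosting and no temporal memory, only overlap scoring followed by picking the top-k columns. All names and sizes (`make_layer`, `activate`, 256 and 64 columns) are assumptions made up for this example.

```python
import random

def make_layer(n_columns, n_inputs, potential_pct, rng):
    """Each column gets a random subsample of the layer's input bits."""
    k = max(1, int(n_inputs * potential_pct))
    return [set(rng.sample(range(n_inputs), k)) for _ in range(n_columns)]

def activate(layer, active_inputs, n_winners):
    """Score columns by overlap with the active inputs, keep the top k."""
    active = set(active_inputs)
    overlaps = [len(pool & active) for pool in layer]
    # Sort by overlap (descending); break ties by column index.
    ranked = sorted(range(len(layer)), key=lambda c: (-overlaps[c], c))
    return sorted(ranked[:n_winners])

rng = random.Random(42)

# Layer 1 columns sample the raw input; layer 2 columns sample layer 1 columns.
layer1 = make_layer(n_columns=256, n_inputs=400, potential_pct=0.2, rng=rng)
layer2 = make_layer(n_columns=64, n_inputs=256, potential_pct=0.2, rng=rng)

input_sdr = rng.sample(range(400), 40)                  # 40 active bits of 400
l1_active = activate(layer1, input_sdr, n_winners=10)   # layer 1 output
l2_active = activate(layer2, l1_active, n_winners=4)    # fed into layer 2

print("layer 1 active columns:", l1_active)
print("layer 2 active columns:", l2_active)
```

The forward pass itself is cheap, since layer 2 only sees the small set of active layer 1 columns. What I can’t judge is the cost and stability of learning in the upper layer while the layer below is still changing.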