I discussed an idea here where the HTM stuff is divorced from the DL stuff. To train the DL side, you still need some sort of error measurement to find your gradients (unless you're just doing regression with an already-trained network).
What are you intending this combined system to be used for? If it's classification, you'd want to add a softmax on top of it all during training. You'd also have to consider at what point you'd update your HTM components: at the end of each forward pass (when you get your ŷ value)? At the end of each batch?
As a simple test, make your batch size one, then update your pools at that point, dependent on the error level of your ŷ value. Something like the sketch below.
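Here's a minimal, untested sketch of that batch-size-1 loop in PyTorch. `SpatialPoolerStub`, its `learn()` method, and `ERR_THRESHOLD` are all my placeholders (a real pooler, e.g. from htm.core, has a different API), and the binarized input standing in for an SDR is purely for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialPoolerStub:
    """Placeholder for a real HTM spatial pooler (e.g. htm.core's)."""
    def learn(self, sdr):
        pass  # a real pooler would adapt synapse permanences toward `sdr`

net = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 10))
opt = torch.optim.SGD(net.parameters(), lr=0.01)
pooler = SpatialPoolerStub()
ERR_THRESHOLD = 0.5  # hypothetical cutoff; you'd tune this

def train_step(x, y):
    # batch size one: x is (1, 64), y is (1,)
    logits = net(x)                    # this is your y-hat (pre-softmax)
    loss = F.cross_entropy(logits, y)  # softmax is folded in here
    opt.zero_grad()
    loss.backward()
    opt.step()
    # update the HTM pool right after the forward pass, but only when
    # the error is low enough -- one possible rule, not the only one
    if loss.item() < ERR_THRESHOLD:
        pooler.learn((x.detach() > 0).int())  # crude binarized stand-in SDR
    return loss.item()

# e.g.: train_step(torch.randn(1, 64), torch.tensor([3]))
```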
Slightly unrelated, but perhaps worth mentioning if you're feeling like mixing DL and HTM: I think Mark Brown has floated the idea before of using anomaly detection as a means of guiding DL training. That's also worth looking into; one rough reading of the idea is sketched below.
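One way to read it (my assumption, not necessarily what was originally proposed): scale each sample's DL loss by an HTM anomaly score, so surprising inputs drive larger gradient updates. `anomaly_score()` here is a constant stand-in for a real temporal-memory anomaly score in [0, 1]:

```python
import torch
import torch.nn.functional as F

net = torch.nn.Linear(64, 10)
opt = torch.optim.SGD(net.parameters(), lr=0.01)

def anomaly_score(x):
    # placeholder: a real system would run x through an HTM temporal
    # memory and return 1 - (fraction of active columns it predicted)
    return 0.5

def anomaly_weighted_step(x, y):
    a = anomaly_score(x)                   # 0 = fully predicted, 1 = novel
    loss = a * F.cross_entropy(net(x), y)  # anomalous samples weigh more
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# e.g.: anomaly_weighted_step(torch.randn(1, 64), torch.tensor([7]))
```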