Predictive coding in HTM

The way prediction is supposed to work in HTM seems to be the exact opposite of the conventional interpretation, as well as the one in On Intelligence. Prediction via apical dendrites is reinforcing: basically an AND with the basal dendrites.

In conventional predictive coding, prediction is supposed to suppress the input, with only prediction error / surprise propagating to the next level. That makes a lot more sense because it implements input compression, which I think is a core principle of cognition.
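A minimal sketch of that conventional (Rao & Ballard-style) arrangement, just to make the contrast concrete. The array names and values are made up; the point is only that the prediction subtracts from the input and nothing but the residual goes up:

```python
import numpy as np

# Hedged sketch of one level of conventional predictive coding
# (Rao & Ballard 1999 style). Names and shapes are illustrative,
# not taken from any HTM or PC codebase.
def pc_level(bottom_up_input, top_down_prediction):
    # The prediction suppresses the input; only the residual
    # (prediction error / "surprise") is passed to the next level.
    error = bottom_up_input - top_down_prediction
    return error

x = np.array([1.0, 0.5, 0.0, 0.8])     # sensory input
pred = np.array([1.0, 0.4, 0.0, 0.0])  # top-down prediction
print(pc_level(x, pred))               # -> [0.0, 0.1, 0.0, 0.8]
```

Where the prediction is good, nothing propagates; where it misses (the last element), the full input goes up.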

I guess the surprise in HTM would be when the input from basal dendrites is strong enough to burst-fire the neuron without apical input? But that requires something like a SUB or XOR of (apical input, basal input), which is contrary to the current model, and I don't know how it could be implemented?
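Putting the two rules side by side as toy booleans (my own naming, not HTM code): the current interpretation ANDs apical and basal, while the "surprise" signal would need basal without apical.

```python
# Purely illustrative contrast between the two gating rules above.
def and_gating(apical_active, basal_active):
    # Current interpretation: apical prediction reinforces basal input.
    return apical_active and basal_active

def surprise_gating(apical_active, basal_active):
    # The "burst without apical" case: basal input fires the cell
    # with no apical prediction, i.e. basal AND NOT apical.
    return basal_active and not apical_active

for apical in (False, True):
    for basal in (False, True):
        print(apical, basal, and_gating(apical, basal), surprise_gating(apical, basal))
```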

Edit: articles on the conventional interpretation:
Quanta Magazine
Predictive coding - Wikipedia

I often wonder if there's a way to combine Rao & Ballard (1999) predictive coding with HTM, using the neocortex layer interactions as a guide. I don't have any good ideas though!
Perhaps predictive coding produces a representation (SDR) and HTM rides on top of that? But HTM lacks a hierarchy and PC provides a hierarchy… so glue them together… dot dot dot… profit!
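Here's a very rough sketch of what I mean by "PC produces the SDR and HTM rides on top". Everything in it (shapes, the learning rate, the top-k binarization) is made-up glue for illustration, not an existing API:

```python
import numpy as np

def pc_encode(x, U, r, lr=0.1, steps=20):
    # One Rao & Ballard-style level: find causes r that reduce the
    # reconstruction error (x - U @ r) by gradient descent.
    for _ in range(steps):
        error = x - U @ r
        r = r + lr * (U.T @ error)
    return r

def to_sdr(r, sparsity=0.05):
    # Binarize the PC representation into a sparse code by keeping
    # only the top-k most active units.
    k = max(1, int(len(r) * sparsity))
    sdr = np.zeros_like(r, dtype=bool)
    sdr[np.argsort(r)[-k:]] = True
    return sdr

rng = np.random.default_rng(0)
x = rng.random(64)                # pretend sensory input
U = rng.random((64, 256)) * 0.1   # pretend generative weights
r = np.zeros(256)

sdr = to_sdr(pc_encode(x, U, r))
# This SDR could then be fed to an HTM temporal memory as its input,
# with the PC hierarchy supplying what HTM lacks on its own.
print(sdr.sum(), "active bits out of", sdr.size)
```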

I get confused between “supervised predictive coding” and “unsupervised predictive coding” but I imagine the unsupervised version is more compatible with HTM.

I was thinking that some apical AND (expectation) is needed to get basal input above the noise level. But when such expectation becomes strong enough, the apical input may turn negative, perhaps through those Martinotti cells (Martinotti cell - Wikipedia) that @dmac mentioned. That would block basal input when the environment is too predictable. But this doesn't account for negative surprise: strong expectation but no confirmation?
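A toy version of that nonmonotonic apical modulation, with made-up thresholds and gains just to show the shape of the idea (not a proposed biophysical model):

```python
def cell_output(basal, apical, noise_floor=0.3, suppress_above=0.8):
    if apical < suppress_above:
        # Moderate expectation boosts basal input above the noise floor.
        drive = basal * (1.0 + apical)
    else:
        # Very strong expectation turns inhibitory (Martinotti-like),
        # blocking basal input when the environment is too predictable.
        drive = basal * (1.0 - apical)
    return drive if drive > noise_floor else 0.0

print(cell_output(basal=0.25, apical=0.0))  # no expectation -> 0.0 (lost in noise)
print(cell_output(basal=0.25, apical=0.5))  # some expectation -> 0.375
print(cell_output(basal=0.25, apical=0.9))  # too predictable -> suppressed, 0.0
print(cell_output(basal=0.0,  apical=0.9))  # strong expectation, no input -> 0.0
```

The last line is exactly the gap: strong expectation with no basal confirmation produces no signal at all in this scheme, so negative surprise is never represented.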