I don't have any knowledge of neuroscience myself, but I am experimenting with this particular topic and can talk about how I am approaching the problem.
The approach I am using is to apply temporal memory to remember that a certain action in a certain context leads to a negative outcome. In my implementation, I have separated the cells which represent actions into a second layer, distinct from the layer which does normal temporal memory on sensory input. And I have separated the cells which represent reinforcement (positive or negative) into a third layer, distinct from the other two.
The second layer grows distal connections with cells in the first layer (rather than cells in its own layer), and the third layer grows distal connections with cells in the second layer. In this way, the first layer represents the sensory context and predicts future sensory input, the second layer represents motor commands in the current context and predicts the next motor commands, and the third layer represents the current reinforcement and can be used to predict the positivity/negativity of an action in the current context.
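To make the wiring concrete, here is a toy sketch of the three-layer idea. This is plain Python with dictionaries standing in for distal segments, not a real temporal memory implementation, and all the names (`ToyLayer`, `hot_stove_seen`, and so on) are invented for illustration:

```python
from collections import defaultdict

class ToyLayer:
    """Toy stand-in for one layer (not real HTM): distal "connections"
    map a frozenset of cells in another layer to the cells this layer
    predicts when that context is active."""
    def __init__(self):
        self.distal = defaultdict(set)

    def learn(self, context_cells, active_cells):
        self.distal[frozenset(context_cells)] |= set(active_cells)

    def predict(self, context_cells):
        return self.distal.get(frozenset(context_cells), set())

# Layer 1 (sensory) does ordinary temporal memory and is elided here.
# Layer 2 (motor) grows distal connections to layer-1 cells.
# Layer 3 (reinforcement) grows distal connections to layer-2 cells.
motor = ToyLayer()
reinforcement = ToyLayer()

sensory_context = {"hot_stove_seen"}                # active layer-1 cells
motor.learn(sensory_context, {"touch_stove"})       # action taken in context
reinforcement.learn({"touch_stove"}, {"negative"})  # outcome observed

# Later, the same sensory context predicts the action, and the
# predicted action in turn predicts its valence.
predicted_actions = motor.predict(sensory_context)
valence = reinforcement.predict(predicted_actions)
```

The point of the sketch is only the direction of the distal connections: layer 2 reads its context from layer 1, and layer 3 reads its context from layer 2.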
This is certainly different from how it works in neuroscience. In particular, the only way to make positivity/negativity predictions with the third layer is to manually activate columns representing motor commands in the second layer, to predict how good or bad those motor commands might be in the current context. I'm sure neuroscience will provide an elegant solution to this problem, but for now this is a relatively simple way to use pure high-order sequence memory to remember how good or bad something is.
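As a rough sketch of that manual-activation query, again with toy dictionaries standing in for distal segments: the action names and the `evaluate` helper are hypothetical, but the shape of the loop is the point, since you try each candidate action in turn and read out the third layer's prediction for it.

```python
from collections import defaultdict

# Toy third layer: distal connections from sets of layer-2 (motor)
# cells to the reinforcement cells they predict.
reinforcement_distal = defaultdict(set)

def learn_outcome(motor_cells, valence):
    reinforcement_distal[frozenset(motor_cells)].add(valence)

def evaluate(candidate_motor_cells):
    """Manually activate a candidate action's columns in layer 2 and
    read out what the third layer would predict for that activation."""
    return reinforcement_distal.get(frozenset(candidate_motor_cells), set())

learn_outcome({"touch_stove"}, "negative")
learn_outcome({"step_back"}, "positive")

# Try each candidate action and pick the one predicted to be positive.
candidates = ["touch_stove", "step_back"]
best = max(candidates, key=lambda a: "positive" in evaluate({a}))
```

This is exactly the biologically awkward part: the system has no intrinsic mechanism for proposing candidates, so something external has to iterate over them.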
If you are not interested in motor commands being part of the system (i.e. if you just want a purely predictive implementation), you could do away with the second layer and grow distal connections from the third layer to the first layer instead. This would eliminate the most biologically infeasible element of the design, but the system could no longer be used to execute actions. It could be quite useful in an anomaly detection system, though, in which the system can not only identify anomalous input, but also learn how "bad" the anomaly is and predict it in the future -- for example, are we about to have minor latency, crippling latency, or is the server about to go down?
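A sketch of that purely predictive variant, under the same toy substitution (the metric names and severity labels here are made up for illustration):

```python
from collections import defaultdict

# Toy variant without a motor layer: the reinforcement/severity layer
# grows distal connections directly to layer-1 (sensory) cells.
severity_distal = defaultdict(set)

def learn_severity(sensory_cells, severity):
    severity_distal[frozenset(sensory_cells)].add(severity)

def predict_severity(sensory_cells):
    return severity_distal.get(frozenset(sensory_cells), set())

# Train on past anomalies and their graded outcomes.
learn_severity({"latency_spike"}, "minor_latency")
learn_severity({"latency_spike", "queue_growth"}, "crippling_latency")
learn_severity({"latency_spike", "queue_growth", "disk_full"}, "server_down")

# A familiar anomalous pattern now predicts how bad things will get.
predicted = predict_severity({"latency_spike", "queue_growth"})
```

The anomaly detector itself (the first layer) is unchanged; the third layer just adds a graded "how bad" readout on top of it.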