STREAMER: Streaming Representation Learning and Event Segmentation in a Hierarchical Manner

Happened to me many times in the past; it’s only fair I do it to someone. :sweat_smile:

In all seriousness though, STREAMER is an early attempt to explore hierarchical structures in data, but extending this to a more general architecture for multimodal sensorimotor integration is the next major step. Even reinforcement learning would benefit from a hierarchical structure as it would simplify the credit assignment problem by considering higher-level state-action pairs, leading to more sample efficiency. Many routes to explore from here :grinning:.


The intensity part of any information is created by the model to remove noise (I don’t know how to do that yet; I have a theory but haven’t tested it). By removing noise, the information is reduced to its important parts (with some exceptions). These important parts are in turn reduced to the essential parts of the information (again, not tested). By storing how these parts are reduced, the model can predict incoming new information, because every piece of information shares millions of minute similarities that support the prediction mechanism.

The model will solve the millions of sub-problems of the big problem.


I see; the biggest challenge here would be how to remove the noise and redundant information in a sequence so that the model predicts future observations better. Instead of focusing on removing noise in order to predict better, perhaps we can first try to predict better, which in turn forces the encoder to remove noise in the process. The premise is that if we improve the prediction, the encoder will have learned to keep only the high-intensity parts of the input that are useful for prediction. We then send those noise-free parts to the layer above, where the same predictive process is applied at a higher order. The following is from the paper:
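To make the premise concrete, here is a minimal toy sketch (not the actual STREAMER code; the dimensions, data, and learning rate are all made up) showing that an encoder trained *only* on next-step prediction error learns to pass forward the predictive part of a noisy observation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sequence: a slowly varying low-dimensional signal (the "useful" part)
# embedded in a higher-dimensional observation with additive noise.
T, D, K = 500, 8, 2                       # timesteps, observation dim, bottleneck dim
latent = np.cumsum(rng.normal(size=(T, K)) * 0.1, axis=0)
mix = rng.normal(size=(K, D))
obs = latent @ mix + rng.normal(size=(T, D)) * 0.5

# Linear encoder E and next-step predictor P, trained only on prediction error.
E = rng.normal(size=(D, K)) * 0.1         # observation -> bottleneck
P = rng.normal(size=(K, D)) * 0.1         # bottleneck  -> next observation

def pred_mse(E, P):
    """Mean squared error of predicting obs[t+1] from obs[t]."""
    return float(np.mean((obs[:-1] @ E @ P - obs[1:]) ** 2))

mse_before = pred_mse(E, P)
lr = 1e-3
for _ in range(300):
    z = obs[:-1] @ E                      # encode the current observation
    err = z @ P - obs[1:]                 # next-step prediction error
    gP = (z.T @ err) / (T - 1)            # gradient of MSE w.r.t. the predictor
    gE = (obs[:-1].T @ (err @ P.T)) / (T - 1)  # gradient w.r.t. the encoder
    P -= lr * gP
    E -= lr * gE
mse_after = pred_mse(E, P)
```

Nothing in the loss says "remove noise"; the encoder is squeezed through the bottleneck `z` purely by the prediction objective, and the noise gets discarded because it does not help predict the future. In a hierarchy, `z` is what would be handed to the layer above.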

Theories of cognition hypothesize that the brain only extracts and selects features from the previous context that help in minimizing future prediction errors, thus making the sensory cortex optimized for prediction of future input [1]. A measure of intelligence can be formulated as the ability of a model to generate accurate, long-range future prediction [2].

It is also worth looking at theories and implementations of predictive coding, where only the prediction error is propagated to the higher layer while the top-down prediction comes from the layer above. Each layer removes some noise and sends the leftover error to the layer above. PredNet is a good implementation of this theory. This recent paper also explains it very well, covering the foundational work of Rao and Ballard as well as Karl Friston’s free energy principle. However, I have not seen a predictive coding implementation that explicitly builds a hierarchical structure as STREAMER does.
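For anyone who wants to see the mechanics, here is a minimal two-layer sketch in the Rao & Ballard style (this is my own toy illustration, not PredNet; the sizes, matrices, and step count are arbitrary). Each layer holds a state `r`, predicts the layer below through a generative matrix `U`, and only the residual error moves bottom-up while predictions move top-down:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two-layer predictive coding sketch: layer 1 predicts the input via U1,
# layer 2 predicts layer 1's state via U2. All values are illustrative.
D0, D1, D2 = 16, 8, 4                  # input, layer-1, layer-2 dimensions
U1 = rng.normal(size=(D0, D1)) * 0.1   # x_hat  = U1 @ r1
U2 = rng.normal(size=(D1, D2)) * 0.1   # r1_hat = U2 @ r2

x = rng.normal(size=D0)                # one input frame
r1 = np.zeros(D1)
r2 = np.zeros(D2)

def energy(r1, r2):
    """Total squared prediction error across both layers."""
    return float(np.sum((x - U1 @ r1) ** 2) + np.sum((r1 - U2 @ r2) ** 2))

err_before = energy(r1, r2)
lr = 0.05
for _ in range(200):                   # settle the states on this one frame
    e0 = x - U1 @ r1                   # bottom-up error at the input layer
    e1 = r1 - U2 @ r2                  # bottom-up error at layer 1
    # Each state is pushed by the error from below and pulled toward the
    # top-down prediction from above (gradient descent on the total error).
    r1 += lr * (U1.T @ e0 - e1)
    r2 += lr * (U2.T @ e1)
err_after = energy(r1, r2)
```

The key point matches the description above: the only signal that travels upward is `e0`/`e1` (the leftover the layer below could not explain), and the only signal that travels downward is the prediction `U @ r`. A full model would also learn `U1` and `U2`, but the settling dynamics alone already show the message-passing pattern.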


This is very interesting work. It immediately brought Jaynes to mind.

All the procedures of consciousness are based on such metaphors and analogies with behavior, constructing a careful matrix of considerable stability. And so we narratize the analogic simulation of actual behavior, an obvious aspect of consciousness which seems to have escaped previous synchronic discussions of consciousness.

Consciousness is constantly fitting things into a story, putting a before and an after around any event. This feature is an analog of our physical selves moving about through a physical world with its spatial successiveness which becomes the successiveness of time in mind-space. And this results in the conscious conception of time which is a spatialized time in which we locate events and indeed our lives. It is impossible to be conscious of time in any other way than as a space.