I am interested in sequence prediction as it pertains to video frames. Supposedly a solution to this problem is important for: (a) completing missing information – whether just missing or because of noise, or (b) predicting the next k frames – perhaps for surveillance? though I can see how this would be important for medical imaging… I’d love to hear from others with more experience!
There are methods that exist now that separate a frame into “motion” and “content” using CNNs and then use encoders to prepare the frames for LSTMs that are used to predict based on the “motion” information from the images. From my research on other posts, it seems that LSTMs could be replaced by HTM. And the advantage would be the sparsity, which may lead to faster processing too. I’m wondering whether someone could speak to any other advantages?
Thanks for thinking about this with me!