The features in an unordered sequence could also be encoded with positional information. As in my other thread, “Learning an ordered sequence out of order”, this would rely on the concept of a “current position” supplied by some external system or process. The current position could be any of a number of things: the slider bar on a media player, a location on an object, etc.
“Current position” could be encoded along with each feature using the spatial pooler. For example, connections could be grown from the columns to a second set of inputs that represents the current position. This would make it possible to control how much weight the current position carries in the generated SDR. The result would be not just a collection of features, but also their positions on the object.
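
To make the idea a bit more concrete, here is a rough sketch in plain Python. It is not the NuPIC spatial pooler or its API, just a toy stand-in with no learning or boosting: each column gets a random set of connections to the feature input and a second random set to the position input, and the active columns are simply the top scorers. The names `make_columns`, `encode`, and `position_weight` are made up for illustration, and treating the weight as a simple multiplier on the position overlap is my assumption about how the control could work.

```python
import random

def make_columns(n_columns, n_feature_bits, n_position_bits, fan_in, seed=0):
    """Give every column random connections to both input spaces (hypothetical layout)."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_columns):
        feature_syn = frozenset(rng.sample(range(n_feature_bits), fan_in))
        position_syn = frozenset(rng.sample(range(n_position_bits), fan_in))
        columns.append((feature_syn, position_syn))
    return columns

def encode(columns, feature_bits, position_bits, position_weight=1.0, n_active=10):
    """Return the indices of the n_active columns with the highest combined overlap.

    position_weight scales how much the position input contributes
    (assumption: a plain multiplier; 0 means position is ignored).
    """
    scores = []
    for index, (feature_syn, position_syn) in enumerate(columns):
        feature_overlap = len(feature_syn & feature_bits)
        position_overlap = len(position_syn & position_bits)
        scores.append((feature_overlap + position_weight * position_overlap, index))
    # Highest overlap first; ties are broken by lowest column index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return frozenset(index for _, index in scores[:n_active])

# Example: the same feature seen at two different positions.
columns = make_columns(n_columns=200, n_feature_bits=100, n_position_bits=100, fan_in=20)
feature = set(range(0, 10))        # active bits of some feature encoding
position_a = set(range(0, 10))     # active bits for one position
position_b = set(range(50, 60))    # active bits for another position

sdr_a = encode(columns, feature, position_a, position_weight=1.0)
sdr_b = encode(columns, feature, position_b, position_weight=1.0)
shared_columns = len(sdr_a & sdr_b)  # overlap between the two position-aware SDRs
```

With `position_weight=0` the two SDRs come out identical, since only the feature input contributes to the score. Raising the weight lets the position input influence which columns win, which is the "control the weight" knob described above.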