Sample efficiency can also be viewed as the number of datapoints needed to solve the problem.
In the cartpole case, this means the number of environment observations, and with this encoder it is sometimes lower than 30.
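To make "number of environment observations" concrete, here is a minimal sketch of how one might count them. It is not taken from the repository: `ObservationCounter`, `ToyEnv` and `run_episodes` are hypothetical names, and `ToyEnv` is only a stand-in with a simplified `reset()`/`step()` interface, not the real cartpole environment.

```python
class ObservationCounter:
    """Wraps an environment-like object and counts every observation it returns."""

    def __init__(self, env):
        self.env = env
        self.observations = 0

    def reset(self):
        obs = self.env.reset()
        self.observations += 1  # the initial observation counts too
        return obs

    def step(self, action):
        obs, reward, done = self.env.step(action)
        self.observations += 1
        return obs, reward, done


class ToyEnv:
    """Hypothetical stand-in for cartpole: every episode ends after `length` steps."""

    def __init__(self, length=5):
        self.length = length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        return float(self.t), 1.0, self.t >= self.length


def run_episodes(env, n_episodes):
    """Run episodes with a dummy policy and return the total observation count."""
    for _ in range(n_episodes):
        env.reset()
        done = False
        while not done:
            _, _, done = env.step(0)  # action 0 is a placeholder policy
    return env.observations
```

Wrapping the real environment the same way and reading `observations` at the point where the agent first balances the pole would give the sample count discussed above, assuming the environment is adapted to this simplified interface.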
Here's the updated code, with both “robust” and “brittle” solutions: GitHub - Blimpyway/CartPoleChallenge: Balancing the cartpole. Fast