I want to make a universal sequence test for TM.
My idea is to use an NLTK context-free grammar (CFG) to generate the sequences.
The problem is that I need to generate sequences of increasing complexity, but I can't wrap my head around how to come up with more and more complex grammars.
I have to do this programmatically, and I don't even know how to come up with the next complexity level. For example:
LVL1:
S -> "a"
LVL2:
S -> "a" "b"
LVL3:
S -> A B
A -> "a" "z"
B -> "b"
How do you decide what the next level's grammar should be?
Ideally I could then say:
grammar = cfg.generate_grammar(level=5)
cfg.generate_sequence(grammar)
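For concreteness, here is a minimal sketch of how the three hand-written levels above could be loaded and expanded with NLTK. It uses `CFG.fromstring` and `nltk.parse.generate.generate`, which exist in NLTK; the `LEVELS` table and the `sequences_for_level` helper are hypothetical names made up for this illustration. It does not solve the actual question of how to derive the next level automatically.

```python
from nltk import CFG
from nltk.parse.generate import generate

# The three hand-written levels from the post, in NLTK's grammar syntax.
# LEVELS is a hypothetical lookup table, not part of NLTK.
LEVELS = {
    1: "S -> 'a'",
    2: "S -> 'a' 'b'",
    3: """
        S -> A B
        A -> 'a' 'z'
        B -> 'b'
    """,
}

def sequences_for_level(level, limit=10):
    """Return up to `limit` sequences derivable from the grammar of a level.

    Hypothetical helper: it only looks up a fixed grammar, it does not
    construct a new grammar for an arbitrary level.
    """
    grammar = CFG.fromstring(LEVELS[level])
    # generate() yields each derivable sentence as a list of terminal tokens.
    return [" ".join(tokens) for tokens in generate(grammar, n=limit)]

if __name__ == "__main__":
    for level in LEVELS:
        print(level, sequences_for_level(level))
```

Each of these grammars derives exactly one sequence, so to get richer test data the higher levels would presumably need alternatives or recursion.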
As I was thinking about it, I came up with another idea: turning everything upside down.

Pick a sequence complexity measure, e.g. temporal entropy: H

Pick a sequence difference measure to compare the original sequence with the predicted one, e.g. Levenshtein distance: D
Score = H / D
So the Score indirectly measures the performance of the TM, no matter what the grammar is.
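To make the score concrete, here is a minimal sketch in plain Python. It makes two assumptions that are not in the post: H is taken to be the Shannon entropy of the symbol frequencies in the original sequence (a stand-in, since "temporal entropy" is not defined here), and a perfect prediction (D = 0) is mapped to infinity because H / D is undefined in that case. The function names are hypothetical.

```python
import math
from collections import Counter

def shannon_entropy(seq):
    """Shannon entropy in bits per symbol of the symbol frequencies in seq.

    Assumption: used as a stand-in for the complexity measure H.
    """
    counts = Counter(seq)
    total = len(seq)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def score(original, predicted):
    """Score = H / D; returns infinity when the prediction is perfect."""
    h = shannon_entropy(original)
    d = levenshtein(original, predicted)
    return math.inf if d == 0 else h / d

if __name__ == "__main__":
    print(score("abab", "abba"))
```

Note that D grows with sequence length while this H is a per-symbol value, so longer sequences are penalized unless D is normalized, for example by the sequence length.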
Maybe it needs a coefficient?
What do you think?