Encoder for sound?

Hello everyone, I am trying to build an HTM model for speech recognition, but I still can’t find a way to encode sound into SDRs. Is there any way to do that? I tried a scalar encoder, but it doesn’t seem to work.

I know of one project here, though I haven’t had a chance to play around with it yet myself.

A frequency encoder should be used for this case!

Further information can be found here:
https://github.com/rcrowder/nupic.audio/tree/SpeechRecognition/SpeechRecognition

Sorry to ask, but did you use the cochlear encoder in the analysis? I can see some examples and test code, but I can’t see the conversion between spike trains and SDRs. Is it possible to do that?

Thanks

Marian’s Frequency Encoder takes a piece of audio and, with an appropriate overlap and window function, converts each piece to an SDR using Scalar Encoders. The Cochlea Encoder, which uses the cochlea Python package, can take quite a while to simulate the inner ear, so I batch-convert entire audio samples to SDRs using the batch_encode.py script. The batch encode produces NGM files (neurogram format) that contain consecutive SDRs. These files are then loaded into each Python script, which iterates through the SDRs.
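
To make the windowed approach a bit more concrete, here is a rough sketch of the general idea (overlapping frames, a window function, then one scalar encoding per frequency band). It is not Marian's actual encoder and does not use the NuPIC ScalarEncoder API. Every function name, the Hann window, the naive DFT, and parameter values such as frame_len, hop, n_bands and max_mag are assumptions chosen for illustration only.

```python
import cmath
import math


def scalar_encode(value, min_val, max_val, n_bits, n_active):
    """Hypothetical scalar encoder: a contiguous run of n_active ones in n_bits."""
    value = min(max(value, min_val), max_val)  # clip into the encoder's range
    span = max_val - min_val
    frac = (value - min_val) / span if span > 0 else 0.0
    start = int(round(frac * (n_bits - n_active)))
    return [1 if start <= i < start + n_active else 0 for i in range(n_bits)]


def hann(n):
    """Hann window coefficients (assumed choice of window function)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def band_magnitudes(frame, n_bands):
    """Naive DFT magnitudes, averaged into n_bands equal-width bands."""
    n = len(frame)
    half = n // 2
    mags = []
    for k in range(half):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc) / n)
    width = half // n_bands
    return [sum(mags[b * width:(b + 1) * width]) / width
            for b in range(n_bands)]


def encode_audio(samples, frame_len=64, hop=32, n_bands=8,
                 bits_per_band=16, active_per_band=4, max_mag=0.25):
    """Return one SDR (list of 0/1) per overlapping, windowed frame."""
    window = hann(frame_len)
    sdrs = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [s * w for s, w in
                 zip(samples[start:start + frame_len], window)]
        sdr = []
        for mag in band_magnitudes(frame, n_bands):
            # One scalar encoding per band, concatenated into a single SDR.
            sdr.extend(scalar_encode(mag, 0.0, max_mag,
                                     bits_per_band, active_per_band))
        sdrs.append(sdr)
    return sdrs


# Example: two synthetic tones at an assumed 8 kHz sample rate.
rate = 8000
tone_1k = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(256)]
tone_3k = [math.sin(2 * math.pi * 3000 * t / rate) for t in range(256)]
sdrs_1k = encode_audio(tone_1k)
sdrs_3k = encode_audio(tone_3k)
```

Each SDR here always has the same number of active bits (n_bands * active_per_band), and tones at different frequencies move the active bits in different bands. A real encoder would most likely use an FFT library and a perceptual frequency scale rather than equal-width bands.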

I’m also curious about sound/word recognition. Can this be accomplished just by looking at the cells of a single TM being fed sequential SDRs after the final SDR comes through, or do you need a second hierarchical layer that can somehow look back at the history of inputs?