Distributed HTM?

If we’re distributing so that each column does its own work, there’s almost no need to sync threads. Each one can act, for the most part, independently of the others, as long as they all have access to the memory.

My approach would be to avoid syncing as much as possible: keep column memory in contiguous arrays, and copy only when needed (i.e. use DRDY pins on a per-area basis). In any case, RAM will always be faster than any FPGA we can reasonably get our hands on.
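
To make that concrete, here is a rough Python sketch of the idea, not a real HTM implementation. Each area keeps all of its column state in one contiguous buffer, one worker per area updates its own columns without any locks, and a per-area flag plays the role of a DRDY pin so a copy is taken only when that area has fresh data. The names (Area, step_column, snapshot_if_ready), the sizes, and the toy update rule are all made up for illustration. Python threads won't actually run this in parallel; the point is the memory layout and the copy-on-ready handshake.

```python
import threading
from array import array

# Hypothetical sizes, chosen only for illustration.
COLUMNS_PER_AREA = 4
CELLS_PER_COLUMN = 8
NUM_AREAS = 3


class Area:
    """One area: all of its column state lives in one contiguous buffer."""

    def __init__(self):
        self.cells = array("B", bytes(COLUMNS_PER_AREA * CELLS_PER_COLUMN))
        # Software stand-in for a per-area DRDY pin (an assumption).
        self.ready = threading.Event()

    def column(self, index):
        # A memoryview slice is a window onto the buffer, not a copy.
        start = index * CELLS_PER_COLUMN
        return memoryview(self.cells)[start:start + CELLS_PER_COLUMN]


def step_column(area, index, step):
    # Toy update rule: a column only ever writes to its own slice,
    # so columns never need to lock against each other.
    col = area.column(index)
    col[(index + step) % CELLS_PER_COLUMN] = 1


def run_area(area, steps):
    for step in range(steps):
        for index in range(COLUMNS_PER_AREA):
            step_column(area, index, step)
    area.ready.set()  # raise "DRDY": this area has new data


def snapshot_if_ready(area):
    # Copy only when the area has signalled fresh data.
    if not area.ready.is_set():
        return None
    area.ready.clear()
    return bytes(area.cells)


areas = [Area() for _ in range(NUM_AREAS)]
workers = [threading.Thread(target=run_area, args=(a, 2)) for a in areas]
for w in workers:
    w.start()
for w in workers:
    w.join()

snapshots = [snapshot_if_ready(a) for a in areas]
```

The only synchronisation point is the per-area flag; everything else is plain reads and writes to memory that a single worker owns.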

BTW, which FPGA are you using?

At the moment I'm using an iCE40LP8K, as provided on a TinyFPGA BX board, but I have a DE10-Nano on the way with a Cyclone V SoC (with an embedded ARM hard core), which will give a nice jump up from 8K LUTs to 110K LEs.

For FPGA implementation ideas, we should continue this discussion here.
