I was thinking of a way to represent color, since all of the neural networks I’ve seen seem to work with either greyscale or black-and-white images, and the normal sparse ways of representing color seemed… a bit much.

With images, there are already millions of inputs, and, without losing accuracy, if I represented color using a sparse encoder with roughly one column per possible channel value, I’d have about 768 columns per pixel (256 values × 3 channels), which would mean 768 million columns for a decent screen size. However, if I used dense encodings like plain binary, then adjacent numbers, like 7 and 8, could have huge differences in on/off columns (0111 and 1000: 0% in common). A learning algorithm predicting upward counting would then have to predict a change in every bit, and a single flipped bit could change the number by a large amount, no matter how many bits the representation used.
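To make the distance problem concrete, here is a quick sketch (plain Python, my own illustration rather than anything from the original) measuring how many bits flip between adjacent numbers in ordinary binary:

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions where a and b differ."""
    return bin(a ^ b).count("1")

# Adjacent numbers in plain binary can differ in many bits:
print(f"7 = {7:04b}, 8 = {8:04b}, distance = {hamming(7, 8)}")  # distance = 4
# The worst case for k bits is the midpoint: 2**(k-1) - 1 vs 2**(k-1)
# flips every single bit, regardless of k.
```

So the representation distance between neighboring values can be as large as the word size itself.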

However, this problem is actually what Gray code was created for. (Well, actually, it was made because physical switches wouldn’t all flip at exactly the same moment, so a transition like 0111 to 1000 could briefly show spurious intermediate values that downstream calculations might pick up.) 7 and 8 in reflected binary Gray code are 0100 and 1100. Incrementing or decrementing a Gray-coded number always flips exactly one bit, so small changes in the number stay small in the representation.
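The standard reflected Gray code has a one-line conversion in each direction; a small sketch to check the 7/8 example and the one-bit-per-increment property:

```python
def to_gray(n: int) -> int:
    """Standard reflected binary Gray code of n."""
    return n ^ (n >> 1)

def from_gray(g: int) -> int:
    """Invert the Gray code by cascading XORs down the bits."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n

print(f"7 -> {to_gray(7):04b}")  # 0100
print(f"8 -> {to_gray(8):04b}")  # 1100

# Every increment flips exactly one bit, and the mapping is invertible:
assert all(bin(to_gray(i) ^ to_gray(i + 1)).count("1") == 1 for i in range(15))
assert all(from_gray(to_gray(i)) == i for i in range(16))
```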

There is another problem with standard Gray code, though: it’s cyclic. 0 is 0000 and 15 is 1000, only one bit apart. (Though 0010 is an equal distance from 1010, which is another problem.) That’s fine for wraparound quantities like seconds or minutes, but for things like color, bright red shouldn’t be adjacent to black. Luckily, there are monotonic Gray codes, where the number of on bits grows as the number increases, so it will be unlikely for a neural network to confuse a large number with a small one. Also, a monotonic Gray code can be constructed for any number of bits using the Savage-Winkler algorithm, so no particular bit width is a special case. (Unfortunately, the algorithm seems to produce the entire sequence of numbers instead of mapping directly from binary to monotonic Gray code, so it’ll be necessary to create a mapping function. I was thinking I could use a Karnaugh map on the sequences for different bit counts to find a general algorithm.)
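Even without a closed-form mapping, once any generation algorithm has emitted the full sequence, the binary-to-codeword mapping is just a pair of lookup tables. This is my own sketch, and it uses the plain reflected Gray sequence as a stand-in for the monotone Savage-Winkler sequence (which I haven't implemented here); a generated monotone sequence would plug into `build_maps` the same way:

```python
def reflected_gray_sequence(bits: int) -> list[int]:
    """Full reflected Gray sequence; stand-in for a monotone sequence."""
    return [i ^ (i >> 1) for i in range(1 << bits)]

def build_maps(sequence: list[int]):
    """Turn a generated code sequence into value<->codeword lookups."""
    encode = {value: code for value, code in enumerate(sequence)}
    decode = {code: value for value, code in enumerate(sequence)}
    return encode, decode

encode, decode = build_maps(reflected_gray_sequence(4))
print(f"7 encodes to {encode[7]:04b}")  # 0100
assert decode[encode[7]] == 7
```

A table is O(2^n) space, so a derived general mapping function (the Karnaugh-map idea above) would still be worthwhile for wide representations.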

Anyway, because nearby numbers map to nearby representations, I believe Gray codes could be used for neural-network-friendly large-number representations, where the neurons can learn anything related to each different number while using the smallest number of columns possible. If a more redundant representation is needed, some of the bits could be repeated.
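As a hypothetical end-to-end sketch (names and parameters are mine, and it uses plain reflected Gray rather than a monotone code), a pixel's channels could become input columns like this, with the bit repetition giving the optional redundancy:

```python
def gray_bits(value: int, bits: int = 8, repeat: int = 1) -> list[int]:
    """Encode a 0-255 channel value as Gray-code bit columns.

    repeat > 1 duplicates each bit for a more redundant representation.
    """
    g = value ^ (value >> 1)  # reflected Gray code
    column = [(g >> i) & 1 for i in reversed(range(bits))]
    return [b for b in column for _ in range(repeat)]

# One RGB pixel -> 24 input columns instead of 768 one-hot columns:
r, g, b = 200, 30, 30  # a bright-ish red
pixel = gray_bits(r) + gray_bits(g) + gray_bits(b)
assert len(pixel) == 24
assert len(gray_bits(r, repeat=2)) == 16  # doubled redundancy
```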

[For anyone interested in monotonic Gray codes, a good description of them and the algorithm used to produce them can be found here.]