I'd like some advice. The Network API of NuPIC currently specifies the need for one model per predicted field. In HTM.java, I added (with the intention of optimizing it later) the ability to have multiple classifiers. So the NAPI creates one Classifier per input field, and you can then access the “Classification” by field name from the Inference object coming out the other end.
Obviously this is expensive to do by default, so I wanted some advice: should this be changed to mandate one predicted field per Network, mirroring what's done in NuPIC? Or should I keep the ability to have multiple classifiers at the top of each Network and just add a way to specify which fields get classified (maybe from the header of the input CSV file, or from the Parameters)? I think being able to have more than one classifier is an improvement, but maybe not? And is it urgent to stop classifying every single field by default?
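To make the second option concrete, here is a minimal sketch in Python of what "configurable classified fields" could look like. It is not the HTM.java or NuPIC API: `StubClassifier`, `build_classifiers`, and the `classified_fields` argument are hypothetical names made up for illustration, and the field names are invented.

```python
from typing import Dict, Iterable, List, Optional


class StubClassifier:
    """Hypothetical stand-in for a real classifier; it only remembers the last value."""

    def __init__(self) -> None:
        self.last_value = None

    def compute(self, value):
        self.last_value = value
        return value


def build_classifiers(header: List[str],
                      classified_fields: Optional[Iterable[str]] = None
                      ) -> Dict[str, StubClassifier]:
    """Create one classifier per requested field.

    With classified_fields=None every header field gets a classifier
    (the classify-everything default); otherwise only the named fields do.
    """
    wanted = list(header) if classified_fields is None else list(classified_fields)
    unknown = [name for name in wanted if name not in header]
    if unknown:
        raise ValueError("unknown field(s): %s" % ", ".join(unknown))
    return {name: StubClassifier() for name in wanted}


# Invented header, as it might be read from the first line of an input CSV file.
header = ["timestamp", "consumption", "temperature"]
all_fields = build_classifiers(header)                 # classify everything
only_one = build_classifiers(header, ["consumption"])  # classify a single field
print(sorted(all_fields))
print(sorted(only_one))
```

Keeping the per-field dictionary means results can still be looked up by field name, while the caller decides how many classifier instances are paid for.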
The time cost of the CLAClassifier scales linearly with the number of ON bits in the input. I have run multiple (~5) classifiers simultaneously in the past and didn't notice a dramatic slowdown, although I didn't profile my code. The reason I did this is that I found combining the outputs of multiple CLAClassifiers gave better prediction accuracy, as the outliers in the predictions tend to be eliminated.
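As an illustration of why combining can suppress outliers, here is a small Python sketch. The combination rule is an assumption: it takes the median of the individual predictions, which is one common choice and not necessarily the rule used in the experiment described above. The numbers are invented.

```python
from statistics import median


def combine_predictions(predictions):
    """Combine per-classifier predictions with the median (an assumed rule)."""
    return median(predictions)


# Invented predictions from five classifiers; the fourth one is an outlier.
predictions = [20.1, 19.8, 20.3, 35.0, 20.0]
combined = combine_predictions(predictions)
mean = sum(predictions) / len(predictions)
print("median:", combined)
print("mean:", mean)
```

The median stays close to the four predictions that agree, while a plain mean is pulled toward the outlier.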
The time cost of the new SDRClassifier depends on the product of the number of ON bits in the input and the dimension of the output (number of output categories). It is more expensive than the CLAClassifier if you have a lot of output categories.
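The two scaling rules can be compared with a back-of-the-envelope model. This is only a sketch: the function names are hypothetical, constant factors are ignored, costs are in arbitrary relative units, and the input sizes are illustrative rather than measured.

```python
def cla_cost(on_bits, num_classifiers=1):
    """Relative cost of CLAClassifier instances: linear in the number of ON bits."""
    return num_classifiers * on_bits


def sdr_cost(on_bits, num_categories, num_classifiers=1):
    """Relative cost of SDRClassifier instances: ON bits times output categories."""
    return num_classifiers * on_bits * num_categories


on_bits = 40  # illustrative number of ON bits in the input
for num_categories in (1, 10, 100):
    for num_classifiers in (1, 5):
        print(num_categories, num_classifiers,
              cla_cost(on_bits, num_classifiers),
              sdr_cost(on_bits, num_categories, num_classifiers))
```

Under this model, running one classifier per field multiplies either cost by the number of fields, and the gap between the two classifiers grows with the number of output categories.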
This is why I've observed hardly any slowdown from HTM.java's Network API default of automatically configuring a classifier for every input field. At least with the CLAClassifier, the cost isn't that great, but I am definitely looking forward to the performance of your new SDRClassifier. So I guess we need to write up an issue to make the “classified fields” configurable. I'm going to keep the multiple classifier “ability”, but when we move to the SDRClassifier we will need to do some testing so we can tell users what the impact is per number of classified fields and the corresponding number of classifier instances.