Multidimensional encodings using Java API

I’m trying to write a model that works off of up to 10 different input values on a single timestamp. To do this, I’m passing in a vector of input values with a CoordinateEncoder implementation. I am then sending it data in the form of a Tuple that includes an int[] as the first entry.

This is choking when it tries to find a bucket in the Layer. The logic appears to be looking for a DateTime or a number, and if it finds neither, it attempts to cast things to a String. Whereupon I get

java.lang.ClassCastException: org.numenta.nupic.util.Tuple cannot be cast to java.lang.String

Any help? What format should I be sending? It seems to me that getting a bucket for a Tuple should be something the CoordinateEncoder itself covers?

Moved from #nupic into #htm-java. @cogmission or others have any help?

Am I correct in thinking the encoder bucket mapping is somewhat arbitrary in this class?

The class matching exercise in Encoder appears to simply be choosing the first bucket for a given sub-encoder. Each encoder is getting an int[] with dimensions equal to the number of sub encoders it has. The code simply takes the first one. It looks like there’s a method to this - the 4 sub encoders in my DateEncoder are always ordered with “season” first - but otherwise it’s simply a number (which presumably wants to be predictable, non-negative, and “well-distributed” in some fashion):

            Object o = encoderInputMap.get(name);
            if(DateTime.class.isAssignableFrom(o.getClass())) {
                bucketIdx = ((DateEncoder)e).getBucketIndices((DateTime)o)[0];
            } else if(Number.class.isAssignableFrom(o.getClass())) {
                bucketIdx = e.getBucketIndices((double)o)[0];
            } else {
                bucketIdx = e.getBucketIndices((String)o)[0];

If this is the case, it would appear to me the solution might be something like adding a branch

            } else if(Tuple.class.isAssignableFrom(o.getClass())) {
                bucketIdx = ((CoordinateEncoder)e).getBucketIndices((Tuple)o)[0];
            } else {           

and then overriding CoordinateEncoder#getBucketIndices (which has only a single encoder). This override might return a bucket that is some measure of its Tuple's contained vector - perhaps the floor of its Manhattan length?
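A minimal sketch of the Manhattan-length idea, written against stand-in types rather than the real htm.java classes. `CoordInput` is a hypothetical substitute for the `Tuple(int[], double)` a `CoordinateEncoder` receives, and `manhattanBucket` is an illustrative name, not an existing method. Whether such a value satisfies what the classifier expects from a bucket index is an assumption here, not something established in this thread.

```java
/** Illustrative sketch only; none of these names exist in htm.java. */
final class ManhattanBucketSketch {

    /** Hypothetical stand-in for a Tuple holding (int[] coordinate, double radius). */
    static final class CoordInput {
        final int[] coordinate;
        final double radius;

        CoordInput(int[] coordinate, double radius) {
            this.coordinate = coordinate.clone();
            this.radius = radius;
        }
    }

    /** Manhattan (L1) length of the coordinate: non-negative and deterministic. */
    static int manhattanBucket(CoordInput input) {
        long sum = 0;
        for (int c : input.coordinate) {
            sum += Math.abs((long) c);
        }
        // Saturate instead of overflowing for very large coordinates.
        return (int) Math.min(sum, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        CoordInput in = new CoordInput(new int[] {3, -4, 5}, 2.0);
        System.out.println(manhattanBucket(in)); // prints 12
    }
}
```

Because the coordinates are already integers, no floor is needed. One caveat: distinct coordinates such as (1, 2) and (2, 1) share a Manhattan length, so they would land in the same bucket.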

@rhyolight @cogmission How can I bump this question? I am working on R&D right now to use HTM in a fast-paced startup environment and have significant pressure from our funding source not to devote time to “science projects.”

The theory behind HTM makes sense to me as well as its potential applicability to our use cases, and it would also appear to solve problems we have with scaling some traditional AI approaches. However, I’m blocked on this, and the longer I’m blocked the less likely I’m going to be able to convince my partners that I’m not wasting time ($).

If my guesses above are correct, I can fork the code and use it and (of course) contribute a PR. I don’t quite see how this could not be a bug - earlier methods that take this same object clearly require a Tuple - but it seems odd that this has made it past test cases (though it’s certainly an integration level bug). So perhaps it’s a matter of missing configuration?

I know that @cogmission did a pretty close port of the NuPIC Python code to Java, so the two should be close. It might help you to see the original codebase.

I suspect this is a bug in the Java implementation (and perhaps the Python one), because handling an arbitrary number of dimensions may not have been covered in the test suite.

When I compare the code, the Python version appears to benefit from the fact that it can send a map of values through to the Encoder class. It never explicitly tries to enforce a String conversion, so there’s no failure. Still, I’m not sure I see (without writing a test case, which potentially makes my time investment a bit larger ;)) how it’s calculating this bucket.

Perhaps it would help if you could explain to me what requirements this bucket has? It’s not in the comments and I’m obviously not familiar with the theory at this level. Does my list above meet them?

it’s simply a number (which presumably wants to be predictable, non-negative, and “well-distributed” in some fashion)

Thanks in advance for the help.

I have not implemented this type of encoder before, but that sounds right. This video might help:

I’ve seen the video. But the functionality seems to be something for the classifier in the Layer. The video is about the encoding itself, which has already been done by the time it reaches the method in question.

The method in question is Layer#doEncoderBucketMapping. The logic appears to be

  1. Get encoding from an Inference (already calculated)
  2. Find all encoders in the system. Foreach
    a. Get a “bucket index”
    b. Store it in the inference’s classifier using this index

Mostly I would need to know what a “bucket index” is in order to make a sensible choice. The selection of this index appears to be somewhat arbitrary (the 0th index of various int[]s, which are not ordered) so I’m hopeful it’s an easy task/small change.
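For orientation: in NuPIC-style encoders a bucket index is commonly understood as the discrete bin that an input value falls into, which the classifier then associates with the active cells. Assuming that reading, the sketch below shows a generic equal-width quantization of a scalar. The class and method names are hypothetical, and the formula is an illustration rather than the one htm.java uses.

```java
/** Generic illustration of value-to-bucket quantization; not the htm.java formula. */
final class BucketIndexSketch {

    /**
     * Maps value into one of nBuckets equal-width bins over [min, max].
     * Out-of-range values are clipped to the first or last bucket.
     */
    static int bucketIndex(double value, double min, double max, int nBuckets) {
        if (nBuckets <= 0 || !(max > min)) {
            throw new IllegalArgumentException("need nBuckets > 0 and max > min");
        }
        double clipped = Math.max(min, Math.min(max, value));
        int idx = (int) Math.floor((clipped - min) / (max - min) * nBuckets);
        // value == max would compute nBuckets, so fold it into the last bucket.
        return Math.min(idx, nBuckets - 1);
    }

    public static void main(String[] args) {
        System.out.println(bucketIndex(5.0, 0.0, 10.0, 10));  // prints 5
        System.out.println(bucketIndex(10.0, 0.0, 10.0, 10)); // prints 9
        System.out.println(bucketIndex(-3.0, 0.0, 10.0, 10)); // prints 0
    }
}
```

Under this reading the index is predictable and non-negative, and nearby values share a bucket, which matches the properties guessed at earlier in the thread.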

Hello @keith.nordstrom, I created a pull request on October 16th, 2018 fixing this bug, but it hasn’t been approved yet.

At that time I was experimenting with the FileSensor, so I’m not sure if it fixes your bug too.

Could you post the code that’s throwing the exception?

Take a look at what I’ve changed in the FieldMetaType file:

case COORD :{
            	String[] parts = input.split("[\\s]*\\;[\\s]*");
                int[] coord =new int[2];
            	return (T)new Tuple(coord, Double.parseDouble(parts[2]));


	public void encodeIntoArray(Tuple inputData, int[] output) {
//REMOVED thisLine		List<int[]> neighs = neighbors((int[])inputData.get(0), (double)inputData.get(1));
		List<int[]> neighs = neighbors((int[])inputData.get(0), Math.max((double)inputData.get(1), w)); //Added this line
		int[][] neighbors = new int[neighs.size()][];
		for(int i = 0;i < neighs.size();i++) neighbors[i] = neighs.get(i);

Why don’t you use 10 scalar encoders instead of a coordinate encoder?

A coordinate encoder is supposed to encode coordinates which are made up of only two values: longitude and latitude.

Hi @keith.nordstrom, @Matheus_Araujo, @rhyolight,

I apologize, I’ve been “away” from the maintenance of this code for a while now, and every time I think I will have time to “catch up” with the PRs and such, I have been pulled back into my day job at an intense level. I used to be afforded 30-40% of my day job allocation for working on this project, but for the last year or so my company’s focus has mandated a shift in my priorities. Also, Numenta concurrently shifted its focus to “research mode”, which means core NuPIC API development was reprioritized, and so mentally I put it on the back burner.

All that to say that I will try to give some attention, as much as I can - starting with looking at merging PRs that have been “waiting” for some lovin’. :slight_smile:

@keith.nordstrom can you send me a short test case that reproduces the error in question? Meanwhile, I will work on getting the waiting PRs settled (@Matheus_Araujo - thank you for your help and work btw!). Also, I will spend some time reviewing the code (it’s been a while), to try and gain some insight or intuition with regard to the problem you are having.


@Matheus_Araujo you or someone else should correct me if I’m wrong, but it was my impression that latitude and longitude were handled in the special case GeospatialCoordinateEncoder. I’m using the values as a vector because I’m trying to predict their behavior in the 10D phase space.

This is definitely one of my questions - if they are separate encoders does the relationship between values get recognized by the algorithm? Picture the ideal gas law in physics, PV = nRT. There are 3 degrees of freedom in this equation and predicting the movement of all three coordinates at once (a hyperplane) is the goal. Can this be accomplished using scalar encoding? If so, what is the purpose of the general CoordinateEncoder?

@Matheus_Araujo I’m also not using a FileSensor; I’m piping data directly into the HTM Layer using computeImmediate in an Akka Stream. I did see your PR and went through it the day before yesterday, but I believe it deals with very different areas of the code from my issue. I dealt with the bug you found in CoordinateEncoder with a workaround in my parameters.

Still, if you’re able to run something with a coordinate encoder using a FileSensor this way, then I wonder if there’s not some piece of missing config that’s contributing to my issue (i.e. something that implies a transformation from a Tuple to some other value before creating the “buckets”). Of course, the other encodings in my model (a string value and a DateTime value) are not modifying their data.

Thanks David, I have a short piece of code that handles this, but the bulk of it is in Scala. I’ll endeavor to pare it down so it’s more targeted and post it on GitHub later today.

You’re not wrong. The coordinate encoder is a valid way to encode multidimensional data. It should result in semantics similar to those of a grouping of scalar encoders.

Short answer: yes. But think of the entire encoding space as the input space, and of splitting it up in different ways. If you use a MultiEncoder and serially chunk up the space for different scalar values, that’s one way to do it. But the RDSE shows that we can also distribute those “buckets” across the entire space. So from the standpoint of the spatial pooling operation, it is going to recognize semantics whether they are localized, reversed, resorted, or randomly distributed throughout the space.

So it does not really matter to the SP (which is reading the input and normalizing it into minicolumn activations) how the encoding space is chunked up.
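To make the “serially chunk up the space” option concrete, here is a toy sketch in which each scalar gets its own contiguous slice of one input vector. The toy encoder (a run of `w` active bits inside `n` bits) and all names are assumptions for illustration; this is not the htm.java MultiEncoder or ScalarEncoder implementation.

```java
import java.util.Arrays;

/** Illustrative only: a toy scalar encoder and a MultiEncoder-style concatenation. */
final class ConcatEncodingSketch {

    /** Encodes value as n bits containing one contiguous run of w active bits. */
    static int[] encodeScalar(double value, double min, double max, int n, int w) {
        if (w <= 0 || n < w || !(max > min)) {
            throw new IllegalArgumentException("need 0 < w <= n and max > min");
        }
        double clipped = Math.max(min, Math.min(max, value));
        // The run can start anywhere from bit 0 to bit n - w.
        int start = (int) Math.round((clipped - min) / (max - min) * (n - w));
        int[] bits = new int[n];
        Arrays.fill(bits, start, start + w, 1);
        return bits;
    }

    /** Lays each field's encoding end to end in a single input vector. */
    static int[] encodeAll(double[] values, double min, double max, int n, int w) {
        int[] out = new int[values.length * n];
        for (int i = 0; i < values.length; i++) {
            int[] field = encodeScalar(values[i], min, max, n, w);
            System.arraycopy(field, 0, out, i * n, n);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] sdr = encodeAll(new double[] {0.0, 5.0, 10.0}, 0.0, 10.0, 20, 5);
        System.out.println(sdr.length); // prints 60
    }
}
```

With ten input values, the same approach would yield ten slices in one vector, and the spatial pooler would see all of them together on each timestep.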

Thanks @Matheus_Araujo and @rhyolight, that’s helpful to know. It seemed better to bring them together as a single representation, but I suspect that’s because my mind was (still is) conflating the encoding with the logic of the thing.

It’s also intuitive given that a second internal inconsistency for me was how the category and date fields were being treated separately and still being recognized as context if other measurements in my vector were not.

@rhyolight I’m pretty new to this stuff, so apologies for a basic question: what’s “RDSE”?

@cogmission I’ll still try to get you your test case even though this is no longer blocking me.

That’s true! Before I made that change in the FieldMetaType code, CoordinateEncoder wasn’t used and GeospatialCoordinateEncoder was used instead, even if the Sensor’s field metatype was COORD.

I don’t know if both CoordinateEncoder and GeospatialCoordinateEncoder are able to encode more than 2 dimensions.

From the Network API, using MultiEncoder, I inferred that it’s written to encode latitude, longitude and a radius.

But looking at CoordinateEncoder it seems that it is able to handle any number of dimensions…

GeospatialCoordinateEncoder, however, is clearly focused on latitude, longitude and radius. Take a look at its encodeIntoArray code, where it creates a Tuple from latitude and longitude and then calls super.encodeIntoArray (super is CoordinateEncoder):

	public void encodeIntoArray(Tuple inputData, int[] output) {
		double longitude = (double)inputData.get(0);
		double lattitude = (double)inputData.get(1);
		double speed = (double)inputData.get(2);
		int[] coordinate = coordinateForPosition(longitude, lattitude);
		double radius = radiusForSpeed(speed);
		super.encodeIntoArray(new Tuple(coordinate, radius), output);

@Matheus_Araujo yep, but it then uses that array and builds a CoordinateEncoder out of it. The signatures of the required parameters are pretty different (CoordinateEncoder uses Tuple(int[], double) rather than GeospatialCoordinateEncoder's Tuple(int, int)).

Side note: it would really be helpful to have stronger typing around requirements of parameters in this API - at least the config classes and the encoder requirements. It would certainly help with the avoidance of casting.

@cogmission For reference, the bug I am dealing with happens in the doEncoderBucketMapping method of Layer:

   private void doEncoderBucketMapping(Inference inference, Map<String, Object> encoderInputMap) {
        if(encoderTuples == null) {
            encoderTuples = encoder.getEncoders(encoder);

        // Store the encoding
        int[] encoding = inference.getEncoding();

        for(EncoderTuple t : encoderTuples) {
            String name = t.getName();
            Encoder<?> e = t.getEncoder();

            int bucketIdx = -1;
            Object o = encoderInputMap.get(name);
            if(DateTime.class.isAssignableFrom(o.getClass())) {
                bucketIdx = ((DateEncoder)e).getBucketIndices((DateTime)o)[0];
            } else if(Number.class.isAssignableFrom(o.getClass())) {
                bucketIdx = e.getBucketIndices((double)o)[0];
            } else {
                bucketIdx = e.getBucketIndices((String)o)[0]; //ERROR Here!

            int offset = t.getOffset();
            int[] tempArray = new int[e.getWidth()];
            System.arraycopy(encoding, offset, tempArray, 0, tempArray.length);

                    new NamedTuple(
                            new String[] { "name", "inputValue", "bucketIdx", "encoding" },

The system gets an Object o from the encoderInputMap, which happens to be a Tuple. The case matching below it doesn’t account for Tuples (only Number, DateTime, and String).
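A defensive version of that type dispatch could look like the sketch below. It uses stand-in types only: `int[]` takes the place of a Tuple, the DateTime branch is left out because Joda-Time is not part of the JDK, and every name is hypothetical. It also sidesteps a second hazard in the excerpt: `(double) o` on an `Object` compiles to a cast to `Double` followed by unboxing, so an `Integer` or `Long` that passes the `Number` check would still throw a ClassCastException.

```java
/** Illustrative sketch of defensive input-type dispatch; not htm.java code. */
final class InputDispatchSketch {

    enum Kind { NUMBER, TEXT, COORDINATE }

    /** Classifies an input object explicitly instead of falling back to a String cast. */
    static Kind classify(Object o) {
        if (o instanceof Number) return Kind.NUMBER;
        if (o instanceof String) return Kind.TEXT;
        if (o instanceof int[]) return Kind.COORDINATE; // stand-in for a Tuple input
        String type = (o == null) ? "null" : o.getClass().getName();
        throw new IllegalArgumentException("Unsupported encoder input type: " + type);
    }

    /** Works for Integer, Long, Float and Double alike, unlike (double) on an Object. */
    static double asDouble(Object o) {
        return ((Number) o).doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(classify(new int[] {1, 2, 3})); // prints COORDINATE
        System.out.println(asDouble(Integer.valueOf(3)));  // prints 3.0
    }
}
```

Throwing an IllegalArgumentException that names the offending type makes an unsupported input easier to diagnose than a ClassCastException raised from the String fallback.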

I see it a bit differently, since both of them use Tuple(int[], double). The difference is that GeospatialCoordinateEncoder always builds a 2D array:

int[] coordinate = coordinateForPosition(longitude, lattitude);
new Tuple(coordinate, radius);