Multidimensional encodings using Java API

Matheus_Araujo · January 10, 2019, 6:20pm

@cogmission, this thread shows that my PR must be rejected, since my code also forces only 2D arrays to be encoded using the Sensors and Network API…

keith.nordstrom · January 10, 2019, 6:20pm

Look at lines 58 through 73 in org.numenta.nupic.encoders.GeospatialCoordinateEncoder, which make the required transformation:

	public void encodeIntoArray(Tuple inputData, int[] output) {
		double longitude = (double)inputData.get(0);
		double lattitude = (double)inputData.get(1);
		double speed = (double)inputData.get(2);
		int[] coordinate = coordinateForPosition(longitude, lattitude);
		double radius = radiusForSpeed(speed);
		
		super.encodeIntoArray(new Tuple(coordinate, radius), output);
	}
	
	public int[] coordinateForPosition(double longitude, double lattitude) {
		double[] coordinate = toMercator(longitude, lattitude);
		coordinate[0] /= scale;
		coordinate[1] /= scale;
		return new int[] { (int)coordinate[0], (int)coordinate[1] };
	}

In short, I’m certain that CoordinateEncoder is designed to support an arbitrary number of dimensions (well, up to Int.Max anyway ;)).

rhyolight · January 10, 2019, 6:23pm

Good insight @Matheus_Araujo. It’s true the coordinate encoder should be able to handle arrays of any dimension.

Matheus_Araujo · January 10, 2019, 6:26pm

I’m going to do it right later and send a new PR.

Matheus_Araujo · January 10, 2019, 6:27pm

As seen above, it’s always a 2D array…

keith.nordstrom · January 10, 2019, 6:30pm

That’s in the GeospatialCoordinateEncoder and it’s explainable because the geospatial encoding only recognizes two dimensions. There are no restrictions on size I can see in the CoordinateEncoder itself - it simply takes an int[] (and a radius).

There’s also no argument I can think of from a phenomenological standpoint to make such restrictions a priori - even in the case of Geospatial encoding, at least 3 dimensions actually seems useful (e.g. altitude).

But as noted above, I am very much not an expert on this topic so if someone who is can shed light on whether or not there are additional assumptions of limits on coordinate dimensions, it would probably be useful information (though also perhaps escaping the bounds of the topic/forum a bit).
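To make the "no restriction on size" point concrete, here is a minimal standalone sketch, not HTM.Java's actual code. It assumes the encoder starts by collecting every integer point within the radius along each axis of the int[] it is given. The class and method names (NeighborSketch, neighbors) are made up for illustration. Nothing in the enumeration depends on the array having exactly two entries:

```java
import java.util.ArrayList;
import java.util.List;

public class NeighborSketch {

    /** Returns every integer point within `radius` of `coordinate` along each axis. */
    public static List<int[]> neighbors(int[] coordinate, int radius) {
        List<int[]> result = new ArrayList<>();
        fill(coordinate, radius, 0, new int[coordinate.length], result);
        return result;
    }

    // Fixes one axis per recursion level, so the dimension count is never hard-coded.
    private static void fill(int[] center, int radius, int axis, int[] current, List<int[]> out) {
        if (axis == center.length) {
            out.add(current.clone()); // all axes fixed: record a copy of this point
            return;
        }
        for (int offset = -radius; offset <= radius; offset++) {
            current[axis] = center[axis] + offset;
            fill(center, radius, axis + 1, current, out);
        }
    }

    public static void main(String[] args) {
        // 2D (e.g. x, y): (2*1 + 1)^2 = 9 points
        System.out.println(neighbors(new int[] { 10, 20 }, 1).size());
        // 3D (e.g. x, y, altitude): (2*1 + 1)^3 = 27 points
        System.out.println(neighbors(new int[] { 10, 20, 30 }, 1).size());
    }
}
```

The only practical limit in this sketch is cost: the number of neighbors grows as (2 * radius + 1) raised to the number of dimensions.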

rhyolight · January 10, 2019, 6:32pm

It’s really useful for processing XYZ points as I did here.

Matheus_Araujo · January 10, 2019, 6:37pm

What’s your stack trace?

Look at this method from Layer:

    /**
     * If this Layer has a Sensor, map its encoder's buckets
     * 
     * @param sequence
     * @return
     */
    private Observable<ManualInput> mapEncoderBuckets(Observable<ManualInput> sequence) {
        if(hasSensor()) {
            if(getSensor().getMetaInfo().getFieldTypes().stream().anyMatch(ft -> {
                return ft == FieldMetaType.SARR || ft == FieldMetaType.DARR || ft == FieldMetaType.COORD || ft == FieldMetaType.GEO;
            })) {
                if(autoCreateClassifiers) {
                    throw new IllegalStateException("Cannot autoclassify with raw array input or " + " Coordinate based encoders... Remove auto classify setting.");
                }
                return sequence;
            }
            
            sequence = sequence.map(m -> {
                doEncoderBucketMapping(m, getSensor().getInputMap());
                return m;
            });
        }

        return sequence;
    }

It seems that when you are using a coordinateEncoder you can’t auto create classifiers…
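Stripped of the Rx plumbing, the guard in that method boils down to something like the sketch below. This is a standalone illustration, not the Layer code: FieldType is a hypothetical stand-in for FieldMetaType, and AutoClassifyGuardSketch and needsBucketMapping are names I made up.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class AutoClassifyGuardSketch {

    // Hypothetical stand-in for FieldMetaType; INT, FLOAT and STRING are placeholder names.
    public enum FieldType { INT, FLOAT, STRING, SARR, DARR, COORD, GEO }

    // Field types the quoted method treats as having no encoder buckets to map.
    private static final Set<FieldType> NO_BUCKET_TYPES =
        EnumSet.of(FieldType.SARR, FieldType.DARR, FieldType.COORD, FieldType.GEO);

    /**
     * Returns true if bucket mapping should run for these fields.
     * Throws if classifiers were requested for array or coordinate input.
     */
    public static boolean needsBucketMapping(List<FieldType> fieldTypes, boolean autoCreateClassifiers) {
        boolean hasUnbucketedField = fieldTypes.stream().anyMatch(NO_BUCKET_TYPES::contains);
        if (hasUnbucketedField && autoCreateClassifiers) {
            throw new IllegalStateException(
                "Cannot autoclassify with raw array input or Coordinate based encoders");
        }
        return !hasUnbucketedField;
    }

    public static void main(String[] args) {
        // Scalar-style field: bucket mapping runs, classifiers allowed.
        System.out.println(needsBucketMapping(Arrays.asList(FieldType.FLOAT), true));
        // Coordinate field without auto classify: mapping is skipped, no error.
        System.out.println(needsBucketMapping(Arrays.asList(FieldType.COORD), false));
    }
}
```

So a single COORD or GEO field anywhere in the sensor's header is enough to skip bucket mapping for the whole sequence, and combining it with auto classify fails fast.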

keith.nordstrom · January 10, 2019, 6:43pm

Hmm thanks, that’s helpful. Because I didn’t use a sensor I didn’t get that error.

keith.nordstrom · January 10, 2019, 8:43pm

@cogmission Here’s a repo that reproduces the issue. The only catch is it will require sbt: https://github.com/keith-nordstrom/htm-stacktrace-example

cogmission · January 10, 2019, 9:04pm

What is your current status? What do you have issues with, and what do you need?

I will be reviewing the code and the current PRs toward bringing things up to date (which I will need to do before introducing any new code modifications). Then I will address whatever issues you and @Matheus_Araujo might currently have.

Also, what is your timeline for making the pitch to use NuPIC in your company?

keith.nordstrom · January 10, 2019, 9:27pm

@cogmission thanks for your attention (and everyone else’s). I am presently unblocked and proceeding. I’ll let you know if I run into anything else that appears blocking to me.

My timeline is somewhat up to me (privilege of being the CTO, I suppose). But my target is the end of the month - any longer than that and I will start having to make uncomfortable justifications to the very stern man who trickles a small portion of his large pile of money into my company each month.

My objective is to provide a viable alternative to a somewhat novel autoencoder approach created by my data science team to predict the health of certain pieces of industrial equipment. The category encoder concept alone gives a leg up, provided predictions can be evaluated with similar recall/precision. Once I can get this calculating something sensible I will likely hand the project to someone on that end of things for fine tuning and full evaluation. If/when that happens I’ll try to make sure context is maintained on forum posts.

Thanks again.

cogmission · January 11, 2019, 12:13pm

@keith.nordstrom,

You’re very welcome. Keep us up to date?

rhyolight · January 16, 2019, 3:28pm

You were asking about encoders and buckets earlier. Here is a recent post you may find informative.

keith.nordstrom · January 16, 2019, 7:02pm

Thanks, that’s definitely helpful. Is the lack of a bucketing scheme the reason why coordinate encoders don’t have classifiers?
