Where can I find a good example (or detailed documentation) of 2D coordinate-encoder?
WITHOUT the pain of ‘speed’ involved in the Geospatial coordinate encoder.
I used the CoordinateEncoder directly when processing XYZ points from Minecraft here.
If you don’t have a Z component, you can just use zero.
Hey Matt (and all).
So I’m trying to use the coordinate encoder for my scalar data here. Each is an x-y coordinate.
Before trying to use the coordinate encoder I was feeding the data in as 2 scalars each with the simple scalar encoder, though I thought the coordinate encoder could be the most apt since they are coordinates.
Here’s my attempt at emulating what Matt did in this example. First the model.run loop and then the model params file:
The params file:
I’m currently getting this error which I haven’t seen before:
AssertionError: Should be implemented in base classes that are not containers for other encoders
Any ideas or recommendations on how I might get this working? Thanks again,
One problem is that you are not sending in 3 coordinates. My code sends X, Y, and Z because the location string has x,y,z fields:
xyz = [float(n) for n in location.split(",")]
vector = np.array(xyz).astype(int)
Yours only has xy:
xy = [int(x_response*1000), int(y_response*1000)]
vector = np.array(xy)
EDIT: Coordinate encoder does handle any number of coordinates. See correction below
Also, I’m curious what this data represents. This encoder was designed to represent an object moving through space.
So the data represents the 2-dimensional movement of the mouse in an x-y space. It’s a really simple flight simulator type of game, where crosshairs are moving around horizontally and vertically and the player’s objective is to compensate for these motions to keep it in the center of the screen. Each pair of x,y values is the horizontal and vertical response of the player to the game at a given time (3-4 per second). The raw data are 3-digit float values between -1 and 1, so I multiply them all by 1000 so they’re all integers for the coordinate encoder.
Here’s a screen cap of the game, the purpose is to move the yellow crosshairs with the mouse to pull the green crosshairs into the center box:
Do I need a third dimension variable to include in the vector? It’s just a 2D game, so can 2D work for the coordinate encoder? If not, what would you recommend? In terms of the radius, I didn’t know how to apply what you’d done in the Minecraft example, so I just picked a value of 50. Most values seem to be between -300 and positive 500, so 50 was just an eyeball guess. Thanks again!
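The float-to-integer scaling described above can be sketched as follows. This is only an illustration: the helper name to_int_coords and the default scale of 1000 are assumptions taken from the post, not part of any library. It rounds before converting, because int() alone truncates toward zero and would turn a product such as 288.9999 into 288.

```python
def to_int_coords(x, y, scale=1000):
    """Scale floats in [-1, 1] to integers for a coordinate encoder.

    Hypothetical helper: rounds to the nearest integer instead of
    truncating, so small floating-point errors do not shift a value.
    """
    return [int(round(x * scale)), int(round(y * scale))]


if __name__ == "__main__":
    print(to_int_coords(0.289, -0.5))
```

The resulting list can then be wrapped in an integer array in the same way as the xy vector shown earlier in the thread.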
According to the documentation, you need not assume a fixed 3-dimensional space, so XY should be totally reasonable. Regardless of dimensionality, you do need to specify a radius, which is what GeospatialCoordinateEncoder maps “speed” onto. GeospatialCoordinateEncoder is a subclass of CoordinateEncoder and is designed to work with lat/long data.
I’m confused, however, in trying to put together a quick-and-dirty example and am running into what appears to be a bug. For example, this code snippet:
from nupic.encoders import CoordinateEncoder
import numpy

coords = numpy.array([1.0, 2.0, 3.0])
radius = 3.0
enc = CoordinateEncoder()
enc.encode((coords, radius))
Produces this error:
Traceback (most recent call last):
  File "coord.py", line 7, in <module>
    enc.encode((coords, radius))
  File "/Users/amarshall/nta/nupic/src/nupic/encoders/base.py", line 133, in encode
    self.encodeIntoArray(inputData, output)
  File "/Users/amarshall/nta/nupic/src/nupic/encoders/coordinate.py", line 109, in encodeIntoArray
    neighbors = self._neighbors(coordinate, radius)
  File "/Users/amarshall/nta/nupic/src/nupic/encoders/coordinate.py", line 130, in _neighbors
    ranges = [range(n-radius, n+radius+1) for n in coordinate.tolist()]
TypeError: range() integer end argument expected, got float.
Casting radius to int doesn’t help either, because of this detail in the implementation of _neighbors():
ranges = [range(n-radius, n+radius+1) for n in coordinate.tolist()]
i.e. coordinates are floats, so n+radius+1 results in a float, which is an invalid value for the second argument to range(). Passing int coordinates works (see below), but that requirement is inconsistent with the documentation and (I think) most people’s expectations.
These [1-3]-dimensional one-liner examples work fine:
print CoordinateEncoder().encode((numpy.array([1]), 3))
print CoordinateEncoder().encode((numpy.array([1, 2]), 3))
print CoordinateEncoder().encode((numpy.array([1, 2, 3]), 3))
Expect a proper github issue/bug report to follow.
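The neighborhood computation that fails in the traceback can be sketched in plain Python. This is a stand-alone illustration, not the nupic source: the function name neighbors and the choice to round float inputs to integers are assumptions made here to show why integer coordinates and an integer radius avoid the TypeError.

```python
import itertools


def neighbors(coordinate, radius):
    """Return every integer point within `radius` of `coordinate` per axis.

    Hypothetical re-creation of the idea behind `_neighbors`: floats are
    rounded to ints first (an assumption), because range() only accepts
    integer bounds.
    """
    center = [int(round(n)) for n in coordinate]
    r = int(radius)
    ranges = [range(n - r, n + r + 1) for n in center]
    return list(itertools.product(*ranges))


if __name__ == "__main__":
    points = neighbors([1.0, 2.0], 1.0)
    print(len(points))  # 3 values per axis over 2 axes gives 9 points
```

The same function works for any number of dimensions, since itertools.product simply takes one range per axis.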
Sam, this looks like a bug. Particularly, that CoordinateEncoder has an incomplete implementation in that it does not implement getBucketIndices() as required by the base Encoder superclass. This renders CoordinateEncoder unusable in the OPF, it seems. I’ll write up a bug report.
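The failure mode described here can be reproduced with a toy class hierarchy. The classes below are hypothetical stand-ins written for this illustration and are not the nupic classes; only the error message and the fact that the check depends on the encoder having sub-encoders are taken from this thread.

```python
class ToyEncoder(object):
    """Stand-in for a base encoder (NOT the nupic Encoder class)."""

    encoders = None  # a container encoder would hold its sub-encoders here

    def getBucketIndices(self, inputData):
        # A leaf encoder that inherits this method has no sub-encoders,
        # so it hits the error reported earlier in the thread.
        if self.encoders is None:
            raise AssertionError(
                "Should be implemented in base classes that are not "
                "containers for other encoders")
        indices = []
        for encoder in self.encoders:
            indices.extend(encoder.getBucketIndices(inputData))
        return indices


class ToyCoordinateEncoder(ToyEncoder):
    """Leaf encoder that does not override getBucketIndices()."""


class ToyScalarEncoder(ToyEncoder):
    """Leaf encoder that does override it (the bucket rule is made up)."""

    def getBucketIndices(self, inputData):
        return [int(inputData)]


def raises_assertion(encoder, value):
    """Report whether asking `encoder` for bucket indices fails."""
    try:
        encoder.getBucketIndices(value)
    except AssertionError:
        return True
    return False
```

In this toy setup, raises_assertion(ToyCoordinateEncoder(), 1) is True while the scalar stand-in returns a bucket list, which mirrors why an encoder lacking its own getBucketIndices() breaks any caller that asks for buckets.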
So would the code I showed work if it wasn’t for the bug? What would you recommend I do to get past it in the meantime? Could I change something in the coordinate_encoder.py file I already have to solve it?
Just looking for any reasonable course of action here lol. Thanks again!
So I’m wondering if there’s a fix I can do on my end to get past this error. Here’s the getBucketIndices() from the base.py file that generates the error. The error gets raised when self.encoders == None, so it seems I need it not to lol.
You mentioned that coordinate.py doesn’t implement the getBucketIndices() function, do you think it might help if I just add a getBucketIndices() function definition to the file such as the one in scalar.py?:
Or this one from the RDSE?:
I’d love to understand this well enough to fix it! And have the coordinate encoder to use of course lol. Thanks again!!
I wanted to ask, what source code are you using for the coordinate encoder in this use case? I ask because Austin pointed out later in this thread that there’s a bug in the opf implementation of the coordinate encoder, that it:
You are using it without this bug though, so I’m wondering how? I’d love to try my data with the coordinate encoder since my data is a series of x-y coordinates. It seems that the version of coordinate.py that you’re using must be free of the error I’m encountering. Do you think I could replace the buggy one in the opf with the one you’re using? I’m curious for any viable path to getting the coordinate encoder working. Thanks!
I’m using it only for anomaly detection, which does not require a classifier. Without a complete encoder implementation, we cannot do prediction. The codebase I used is at https://github.com/htm-community/mine-hack.
Me too! I’m doing pure anomaly detection and using average anomaly score as my metric, so I don’t need the classifier either! Do you think I could get it working by bypassing the classifier in the opf since I only need the anomaly score for each time step and not the actual prediction?
I’m currently running a TemporalAnomaly model which outputs the anomaly score along with the prediction. Is there another model type that doesn’t use the classifier? Thanks again!!
I think you should be fine using a classifier in the OPF, but you won’t be able to get predictions for the CoordinateEncoder. Anomalies should be created. You can see an example usage of the GeospatialCoordinateEncoder in the OPF here. Since you seem to be using cartesian coordinates, you might want to use this CartesianCoordinateEncoder I made this week. It is just a prototype, but handles resolution a bit better than manually calculating the radius.
Awesome, thanks!! I’ll give it a shot. Very quickly, will I need to have a ‘speed’ value as part of my input? I see that your cartesian_coordinate code looks for ‘speed’:
My data is simply x,y pairs:
Since my data are sub-second I don’t have a timestamp column either, just the raw x,y pairs. Since each of the rows are equally spaced in time, if there’s a big jump between one x or y-value and the next (x/y at time t+1) that means the movements had to be faster than usual. Should I maybe create a 3rd column for speed by taking the difference between the current x and the prior? (x at time t-1). Or might there be a more apt solution that comes to your mind?
Thanks again for your time Matt!!
I think speed might be important, so I would add it. You will need to play around to get the right scale for your data. Sub-second timing should be okay when calculating speed as long as it is in units/second.
So yes, I would get a time value for each reading and use it to calculate speed relative to the last coordinate.
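A speed column computed relative to the previous coordinate could look like the sketch below. It assumes evenly spaced samples, so a constant time_delta stands in for real timestamps; the function name speeds and the 0.33 second default are illustrative assumptions, not part of any encoder API.

```python
import math


def speeds(points, time_delta=0.33):
    """Speed in units/second between consecutive (x, y) samples.

    Hypothetical helper: the first sample has no predecessor, so its
    speed is reported as 0.0.
    """
    if not points:
        return []
    result = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        distance = math.hypot(x1 - x0, y1 - y0)
        result.append(distance / time_delta)
    return result


if __name__ == "__main__":
    print(speeds([(0, 0), (3, 4), (3, 4)], time_delta=0.5))
```

If real timestamps become available, the constant can be replaced by the difference between consecutive readings.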
So I’m working to implement the coordinate encoder from your mine-hack example and wanted to check: since each value comes at the same time interval (about 1/3 of a second), would it make a difference to calculate a time delta rather than set it as a constant in getting the radius?
As I mentioned in the last post, I don’t have timestamps for my data, which is just x,y values at this point. I thought I’d use a modified version of your ‘calculate_radius’ function, with a fixed time delta of 0.33. How does this sound to you? The data are largely small positive and negative values between -0.5 and 0.5 (small snippet shown above). Thanks once again,
It looks like my example did not encode time at all, which was probably the right thing to do (real world time would not have meant anything, anyway). I might have done better if I’d encoded game time, and had daily routines in the minecraft world.
What this means is that “speed” differences will not be noticed.
Ok, I see. Though isn’t a faster speed accounted for by a larger radius? For instance, in the code below from your ‘calculate_radius’ function, if the speed is greater, then ‘p1’ and ‘p2’ will be farther apart, since there was more movement over the same time, correct? If they’re farther apart, then the ‘dist’ variable will be larger, and ‘blocks_per_second’ will also be larger even with a constant ‘time_delta’ value. This ‘blocks_per_second’ is returned as the output of ‘calculate_radius’. Am I missing something in my logic here?
p1 = np.array(point1["location"])
p2 = np.array(point2["location"])
dist = np.linalg.norm(p1 - p2)
time_delta = point2["time"] - point1["time"]
blocks_per_second = (dist / time_delta) * 1000
Oh yeah I see what you are saying. You just fixed the time delta so the radius will change. That seems good. Sorry I didn’t read clearly.
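A fixed-interval variant of the radius calculation discussed above might be sketched like this. It is an adaptation under stated assumptions, not the mine-hack code: the points are plain coordinate sequences, time_delta is a constant in seconds (so the factor of 1000 from the original snippet is dropped), and min_radius is an assumed floor on the guess that a zero radius is undesirable.

```python
import math


def calculate_radius(point1, point2, time_delta=0.33, min_radius=1):
    """Integer radius that grows with distance moved per fixed time step.

    Hypothetical adaptation: a larger jump between consecutive points
    yields a larger radius even though time_delta never changes.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(point1, point2)))
    units_per_second = dist / time_delta
    return max(min_radius, int(round(units_per_second)))


if __name__ == "__main__":
    print(calculate_radius((0, 0), (3, 4), time_delta=0.5))
    print(calculate_radius((5, 5), (5, 5)))
```

The radius is returned as an integer because, as seen earlier in the thread, the encoder builds integer ranges from it.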
Ok great, I’m glad I wasn’t way off there. Thanks again!