DateTime encoder - setup

#1

My data has the timestamp as epoch time, but I can arrange it in whatever fashion works. I have tried YYYY-MM-DD HH:MM:SS and YYYY/MM/DD HH:MM:SS, but neither seems to work.

The first gives this exception:

Exception in thread "Sensor Layer [r1:1] Thread" java.lang.IllegalStateException: 
java.lang.ClassCastException: org.joda.time.DateTime cannot be cast to java.lang.Double

The second gives:

Caused by: java.lang.NullPointerException
    at org.numenta.nupic.encoders.DateEncoder.parse(DateEncoder.java:446)    
    at org.numenta.nupic.FieldMetaType.decodeType(FieldMetaType.java:98)
    at org.numenta.nupic.network.sensor.HTMSensor$InputMap.get(HTMSensor.java:306)
    at org.numenta.nupic.encoders.Encoder.getInputValue(Encoder.java:699)
    at org.numenta.nupic.encoders.MultiEncoder.encodeIntoArray(MultiEncoder.java:102)

I tried both with and without the appropriate form of:

    fieldEncodings.get("stamp").put(Parameters.KEY.DATEFIELD_PATTERN.getFieldName(), "YYYY-MM-dd HH:mm:ss");

In both cases the encoder is set up as:

n:0, w:0, min:0, max:0, radius:0, resolution:0
periodic, clip and forced are unset
fieldName: stamp
fieldType: datetime
encoderType: DateEncoder
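
Since the source data is epoch time, the conversion to a formatted timestamp column can be sketched as follows. This is only an illustration under stated assumptions: the values are taken to be epoch seconds in UTC, the class name `EpochToTimestamp` is hypothetical, and it uses the JDK's `java.time` rather than the Joda-Time classes that appear in the stack traces. Note that pattern letters differ between the two libraries: in `java.time`, uppercase `YYYY` means week-based year and `DD` means day of year, so the sketch uses lowercase `yyyy` and `dd`.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

class EpochToTimestamp {
    // Assumption: the target column format is "yyyy-MM-dd HH:mm:ss" in UTC.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

    // Converts epoch seconds to a formatted timestamp string.
    static String format(long epochSeconds) {
        return FORMAT.format(Instant.ofEpochSecond(epochSeconds));
    }

    public static void main(String[] args) {
        System.out.println(format(1494485181L)); // prints 2017-05-11 06:46:21
    }
}
```

Whatever format is chosen, the date pattern given to the encoder has to describe exactly the same layout as the strings in the data.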

#2

Near as I can tell, I'm supposed to be setting a DateTimeFormatter somehow, but I'm failing to see where or how.

As a test I changed my (time) data to look (somewhat) more like the HotGym timestamp column. I removed the seconds column, ran it, and got this error:

Exception in thread "Sensor Layer [r1:1] Thread" java.lang.IllegalStateException: 
java.lang.IllegalArgumentException: Invalid format: "2017-05-09 11:05" is too short

So it clearly has a timestamp format of some sort. I fixed the data length issue and ended up with the original NPE.
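
The "too short" error is what a formatter reports when the pattern expects more fields than the input string contains. A minimal sketch of that mismatch, using the JDK's `java.time` as a stand-in for Joda-Time (an assumption; the helper name `parses` is hypothetical and the exact exception text differs between the two libraries):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

class PatternCheck {
    // Returns true when the text can be parsed with the given pattern.
    static boolean parses(String pattern, String text) {
        try {
            LocalDateTime.parse(text, DateTimeFormatter.ofPattern(pattern));
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The pattern expects seconds, but the data has none.
        System.out.println(parses("yyyy-MM-dd HH:mm:ss", "2017-05-09 11:05")); // false
        // Pattern and data agree.
        System.out.println(parses("yyyy-MM-dd HH:mm", "2017-05-09 11:05"));    // true
    }
}
```

Running a few sample rows through a check like this before feeding them to the network separates formatting problems from configuration problems.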

#3

@phil_d_cat,

The tests are the most complete reference for API usage. Have a look at NetworkTestHarness, which is used to collect variations in parameters for the tests. If you trace the parameter-building methods, you’ll see a reference to a fieldEncodingMap here.

Encoders which are used are typically:
Number fields = ScalarEncoder (if you know the range of your numbers beforehand) or RandomDistributedScalarEncoder (if you don’t know)
Date Fields = DateEncoder
String/Category fields = SDRCategoryEncoder
Long/Lat = GeospatialCoordinateEncoder

For the Full List of Encoders see here

The String used in the field encoding map is just the class name of the given encoder (i.e. “RandomDistributedScalarEncoder”)

This map is basically a mapping from the “fieldName” to a Map containing encoder parameters for that field. If the field is a DateTime field, then you will need to look at the parameters for the DateEncoder configuration. One of the DateEncoder parameters is the format of the date string:

fieldEncodings.get("timestamp").put(KEY.DATEFIELD_PATTERN.getFieldName(), "MM/dd/YY HH:mm");

…where the pattern “MM/dd/YY HH:mm” is a conventional Java DateTime pattern as seen here.

The DateEncoder requires some guidance as to the “resolution” of time it should look at. Have a look at the DateEncoder.java file and the DateEncoderTest.java file to see how to work with different time resolutions, or ask here in the NuPIC forum…
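
The shape of the field encoding map described above can be sketched with plain `java.util` maps. This is not the HTM.java API: the `consumption` field and the `datePattern` key are hypothetical placeholders (the real key comes from `KEY.DATEFIELD_PATTERN.getFieldName()`), and the sketch only shows the nesting of field name to encoder parameters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FieldEncodingSketch {
    // Builds a map from field name to that field's encoder parameters.
    static Map<String, Map<String, Object>> build() {
        Map<String, Map<String, Object>> fieldEncodings = new LinkedHashMap<>();

        Map<String, Object> timestamp = new LinkedHashMap<>();
        timestamp.put("fieldName", "timestamp");
        timestamp.put("fieldType", "datetime");
        timestamp.put("encoderType", "DateEncoder"); // class name of the encoder
        timestamp.put("datePattern", "MM/dd/YY HH:mm"); // placeholder key, see lead-in
        fieldEncodings.put("timestamp", timestamp);

        // Hypothetical numeric field whose range is not known beforehand.
        Map<String, Object> consumption = new LinkedHashMap<>();
        consumption.put("fieldName", "consumption");
        consumption.put("fieldType", "float");
        consumption.put("encoderType", "RandomDistributedScalarEncoder");
        fieldEncodings.put("consumption", consumption);

        return fieldEncodings;
    }

    public static void main(String[] args) {
        build().forEach((field, params) -> System.out.println(field + " -> " + params));
    }
}
```

The outer key and the inner `fieldName` must both match the name used in the sensor header, otherwise no encoder is found for that column.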

#4

Still fighting this one:

java.lang.IllegalStateException: java.lang.ClassCastException: org.joda.time.DateTime cannot be cast to java.lang.Double
    at org.numenta.nupic.encoders.MultiEncoder.encodeIntoArray(MultiEncoder.java:105)
    at org.numenta.nupic.encoders.Encoder.encode(Encoder.java:625)

Here’s what I have for the encoder setup:

    Map<String, Map<String, Object>> fieldEncodings = setupMap(
            null,
            0, // n
            0, // w
            0, 0, 0, 0, null, null, null,
            "stamp", "datetime", "DateEncoder");

    fieldEncodings.get("stamp").put(Parameters.KEY.DATEFIELD_PATTERN.getFieldName(), "YYYY-MM-dd HH:mm:ss");
    fieldEncodings.get("stamp").put(Parameters.KEY.DATEFIELD_DOFW.getFieldName(), new Tuple(1, 1.0)); // Day of week
    fieldEncodings.get("stamp").put(Parameters.KEY.DATEFIELD_TOFD.getFieldName(), new Tuple(5, 4.0)); // Time of day

Sensors like so:

Sensor.create(ObservableSensor::create, SensorParams.create(SensorParams.Keys::obs, new Object[] {"name",
              PublisherSupplier.builder()
                 .addHeader("stamp, asn, probeType, responseCode, result")
                 .addHeader("datetime, int, int, int, int")
                 .addHeader("T, C, C, C, C")  
                 .build() }));

#5

What does the “timestamp” column of your data look like?

#6

yyyy-MM-dd HH:mm:ss

2017-05-11 06:46:21

#7

Ok, I’ve seen this before (done this myself!), and your setup is not right somewhere - without all your code it’s hard to tell where… But I’ll take a look at what you’ve posted above and see why the MultiEncoder doesn’t have a DateEncoder installed for that field location (because that’s what it is - or it’s not being recognized - something is not there to tell the code what the setup is).

I’ll look at it at some point today… While I’m doing that, take a look at one of the tests and see what you would “normally” get at the line pointed to by the exception:
at org.numenta.nupic.encoders.MultiEncoder.encodeIntoArray(MultiEncoder.java:105)

…actually, debug-trace the first line of input (make sure the date field is the first field in the test data, just like the CSV data you’re tracing) and see what would “normally” happen at that line if the configuration were correct. Compare that to what you’ve got, configuration-wise. I can’t do this because I don’t have all your code, so my debugging is limited to what I think the typical configuration problems might be for this exception…

#8

Gist here.

#9

I updated the gist to reflect a few changes based on reading samples and such.

Stepping through the process, there appear to be several things happening:

  • the field names and datatypes are getting out of order. The fields are getting parsed by alphanum. The first field that gets parsed is ‘asn’ and not ‘stamp’. If I change the name to ‘aastamp’ it comes first and the proper encoder is used
  • the first four fields appear in order with the appropriate data - but field 4 (should be ‘response’) is getting the timestamp (field 0) value again
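
The ordering effect in the first bullet can be reproduced with a small sketch. It does not claim to show how HTM.java sorts fields internally; it only shows that natural string ordering puts ‘asn’ before ‘stamp’, and that renaming the field to ‘aastamp’ moves it to the front. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class FieldOrderSketch {
    // Returns the field names in natural (alphanumeric) String order.
    static List<String> sorted(List<String> names) {
        return new ArrayList<>(new TreeSet<>(names));
    }

    public static void main(String[] args) {
        List<String> header =
                Arrays.asList("stamp", "asn", "probeType", "responseCode", "result");
        System.out.println(sorted(header));
        // prints [asn, probeType, responseCode, result, stamp]

        List<String> renamed =
                Arrays.asList("aastamp", "asn", "probeType", "responseCode", "result");
        System.out.println(sorted(renamed));
        // prints [aastamp, asn, probeType, responseCode, result]
    }
}
```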

I checked my data and I'm sending 5 fields, comma delimited, and in the appropriate order.

Anyway - still poking at this.

#10

Hi @phil_d_cat,

Ok, the problem seemed to be two things:
1.) You have type “result” and “results” with an “s” in your setup. I changed all of them to read “result”.
2.) The type “bool” was not being decoded properly (fault in the code), due to the ScalarEncoder only being able to receive doubles and the FieldMetaType for “bool” always parsing into Integer (naturally because true=1 and false=0). I submitted the following pull request to fix this. When it is merged I will do a new release which should handle this.
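
A naming mismatch like the one in point 1 can be caught before the network runs by comparing the header field names against the keys of the field encoding map. The following is a hedged sketch of such a check in plain Java; the class `FieldNameCheck` and its method are hypothetical and not part of HTM.java.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class FieldNameCheck {
    // Lists header fields with no encoder entry, and encoder entries with no header field.
    static List<String> unmatched(String headerLine, Set<String> encoderFields) {
        Set<String> header = new LinkedHashSet<>();
        for (String name : headerLine.split(",")) {
            header.add(name.trim());
        }
        List<String> problems = new ArrayList<>();
        for (String name : header) {
            if (!encoderFields.contains(name)) {
                problems.add("header field without encoder: " + name);
            }
        }
        for (String name : encoderFields) {
            if (!header.contains(name)) {
                problems.add("encoder without header field: " + name);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Set<String> encoders = new LinkedHashSet<>(
                Arrays.asList("stamp", "asn", "probeType", "responseCode", "results"));
        // Reports the "result" / "results" mismatch in both directions.
        unmatched("stamp, asn, probeType, responseCode, result", encoders)
                .forEach(System.out::println);
    }
}
```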

In the meantime, I made some changes to your Gist and added a new one here, and made it executable so it should work for you. The parameters and inference quality you’ll have to experiment with yourself… :wink:

Cheers,
David

#11

Wow - all that time I spent on that and it was an extra ‘s’ (plus some errata).

Banging. Head. On. Desk.

Thank you.

#12

You’re Welcome.
