Ideas needed for encoding World Airport Codes and Airlines

Dear HTM Community, I’m hoping you can help me with an experiment for my dissertation. I need ideas on how to encode World Airport Codes so that they carry semantic meaning. I’m thinking I need to create some type of location relationship for each airport so that inferences can be made for predicting late flights (a common ML problem). The same goes for airlines. The dataset I’m encoding can be found here: https://sourceforge.net/projects/moa-datastream/files/Datasets/Classification/airlines.arff.zip/download

There are 17 different airlines and quite a few different WACs (maybe 293?).
I was wondering whether, if I normalize the data into a time-series format, I could get the inference I’m looking for… ideas?
Idea 1: use the lat/long for each WAC? I believe a NuPIC encoder for points may already exist?
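For what it’s worth, NuPIC does ship a coordinate-style encoder; the toy function below is *not* its real API, just a plain-Python sketch of the idea: snap a lat/long to a grid, hash the surrounding cells, and keep the top-scoring ones as active bits, so nearby airports share bits.

```python
import hashlib


def encode_latlon(lat, lon, n=1024, w=21, resolution=1.0):
    """Toy coordinate encoder: nearby lat/lon pairs share active bits.

    Snap the point to a grid of `resolution` degrees, then hash every
    cell in the surrounding neighborhood and keep the w highest-scoring
    cells, mapping each to a bit index in [0, n). Loosely inspired by
    NuPIC's CoordinateEncoder, but not its actual interface.
    """
    gx, gy = int(lat // resolution), int(lon // resolution)
    r = w // 2
    # Deterministically score every cell in the (2r+1) x (2r+1) neighborhood.
    cells = []
    for dx in range(-r, r + 1):
        for dy in range(-r, r + 1):
            key = ("%d,%d" % (gx + dx, gy + dy)).encode()
            score = int(hashlib.md5(key).hexdigest(), 16)
            cells.append((score, key))
    # Keep the w winning cells and hash each one to an output bit.
    top = sorted(cells, reverse=True)[:w]
    return sorted({int(hashlib.md5(b"bit" + key).hexdigest(), 16) % n
                   for _, key in top})
```

Two points in the same grid cell get identical encodings, and points a few cells apart share most of their winning cells, which is the overlap semantics an SDR classifier needs.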


What are your data constraints? How many bits do you have to store one value?

There really isn’t any data constraint, as there are exactly 293 World Airport Codes in the file (e.g. “LAX”). As far as the number of bits, I suppose we can use as many as we need.

Oh, do you mean the constraints for lat/long? The lat/long for “LAX” is 33.9416° N, 118.4085° W; I suppose I could convert this to an x,y coordinate system. Is this what you mean?
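Converting the degree-plus-hemisphere notation to signed decimals is all that’s needed for an x,y system (the usual convention: North/East positive, South/West negative). A minimal helper, just as a sketch:

```python
def to_signed(degrees, hemisphere):
    """Convert a '33.9416 N' style component to signed decimal degrees.

    North/East are positive, South/West negative -- the usual convention
    for treating longitude as x and latitude as y.
    """
    sign = -1.0 if hemisphere.upper() in ("S", "W") else 1.0
    return sign * float(degrees)
```

So LAX at 33.9416° N, 118.4085° W becomes (x, y) = (-118.4085, 33.9416).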

In what dimensions do you want to compare airports? An xy position like lat/lon is one. What about altitude? Size of the airport? I’m sure there are a lot of semantics you might want to encode.


I see. Well, normally I would say yes to x,y coordinates, altitude, and size of the airport, but the other classifiers I’m comparing against only get the WAC code, which is a single feature. Do you think it would invalidate the comparison if we encode additional features?

While I have many disagreements with the work of Chomsky, there was a bit he said in an interview that sums up my feeling about predicting late flights:

Chomsky: It became … well, which is understandable, but would, of course, direct people away from the original questions. I have to say, myself, that I was very skeptical about the original work. I thought it was first of all way too optimistic, it was assuming you could achieve things that required real understanding of systems that were barely understood, and you just can’t get to that understanding by throwing a complicated machine at it. If you try to do that you are led to a conception of success, which is self-reinforcing, because you do get success in terms of this conception, but it’s very different from what’s done in the sciences. So for example, take an extreme case, suppose that somebody says he wants to eliminate the physics department and do it the right way. The “right” way is to take endless numbers of videotapes of what’s happening outside the video, and feed them into the biggest and fastest computer, gigabytes of data, and do complex statistical analysis—you know, Bayesian this and that [ Editor’s note : A modern approach to analysis of data which makes heavy use of probability theory.]—and you’ll get some kind of prediction about what’s gonna happen outside the window next. In fact, you get a much better prediction than the physics department will ever give. Well, if success is defined as getting a fair approximation to a mass of chaotic unanalyzed data, then it’s way better to do it this way than to do it the way the physicists do, you know, no thought experiments about frictionless planes and so on and so forth. But you won’t get the kind of understanding that the sciences have always been aimed at—what you’ll get at is an approximation to what’s happening.

And that’s done all over the place. Suppose you want to predict tomorrow’s weather. One way to do it is okay I’ll get my statistical priors, if you like, there’s a high probability that tomorrow’s weather here will be the same as it was yesterday in Cleveland, so I’ll stick that in, and where the sun is will have some effect, so I’ll stick that in, and you get a bunch of assumptions like that, you run the experiment, you look at it over and over again, you correct it by Bayesian methods, you get better priors. You get a pretty good approximation of what tomorrow’s weather is going to be. That’s not what meteorologists do—they want to understand how it’s working. And these are just two different concepts of what success means, of what achievement is. In my own field, language fields, it’s all over the place. Like computational cognitive science applied to language, the concept of success that’s used is virtually always this. So if you get more and more data, and better and better statistics, you can get a better and better approximation to some immense corpus of text, like everything in The Wall Street Journal archives—but you learn nothing about the language.

Source:


Thank you for your well-informed response. Ultimately I’m just replicating the experiments from published work, which compares different classifiers against 9 different data sets (one of which is the airline file in my earlier post). Because all of the classifiers are given the same data files to process, I can’t change the data file itself, or I risk the results of my experiment being rejected by my dissertation committee. However, if I can devise a way to encode the 3-letter World Airport Codes into an SDR, then the HTM solution might show itself superior in some regards to the other classifiers (ARF, DWM, LevBag, OAUE, OSBoost, and VFDT).


Since there are “only” 293 WACs, maybe I should just build a lookup table inside my htmClassifier.py file that takes the 3-letter string and builds an encoding by concatenating x,y coords, altitude, and size? Thoughts? ¯\\\_(ツ)\_/¯
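The lookup-table idea might look something like this sketch. The table values and the `size_rank` feature are placeholders I made up for illustration; each scalar feature gets a simple sliding-block encoding, and the per-feature encodings are concatenated into one SDR:

```python
def scalar_sdr(value, vmin, vmax, n=64, w=7):
    """Simple scalar encoder: a block of w active bits out of n,
    positioned by where `value` falls in [vmin, vmax]."""
    frac = (min(max(value, vmin), vmax) - vmin) / float(vmax - vmin)
    start = int(round(frac * (n - w)))
    return [1 if start <= i < start + w else 0 for i in range(n)]


# Hypothetical lookup table: WAC -> (lat, lon, altitude_ft, size_rank).
# Values here are illustrative, not authoritative.
AIRPORTS = {
    "LAX": (33.9416, -118.4085, 125, 10),
    "DEN": (39.8561, -104.6737, 5430, 9),
}


def encode_wac(code):
    """Concatenate per-feature encodings into one SDR for a 3-letter WAC."""
    lat, lon, alt, size = AIRPORTS[code]
    return (scalar_sdr(lat, -90, 90)
            + scalar_sdr(lon, -180, 180)
            + scalar_sdr(alt, 0, 15000)
            + scalar_sdr(size, 0, 10))
```

With these defaults each airport maps to a 256-bit vector with 28 active bits (7 per feature), and airports that are close in any feature overlap in that feature’s segment.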

A post was split to a new topic: Geospatial Coordinate Encoder for htm.core?