Hi,
I am new to Numenta. I was able to install NuPIC and get the hot gym example up and running, and I used it for both prediction and anomaly score detection. Now I have some artificially generated medical data. Each row has a timestamp plus many other attributes, such as heartbeat readings and some sensor readings. Before moving on to multivariate analysis, I want to use just the timestamp and the heartbeat as my attributes. There are 3.85 million records, currently stored in a Cassandra cluster on my server machine. I want to simulate a streaming scenario where, let's say, I extract 1 row per second from this Cassandra table and treat it as real-time streaming data. I then want to run NuPIC to get anomaly scores on this data. These are my questions:
Where do I stage the data that is read from the Cassandra table? In ~/nupic/examples/opf/clients/hotgym/prediction/one_gym there is the rec-center-hourly.csv file. Do I change the table name and put my data in that file, or is there some other, smoother way?
Let's say the heartbeat readings fluctuate a lot. I don't want to be in a scenario where I swarmed just once over the data up to that point. I would rather have a system where, as new data flows in, the model automatically keeps updating and gives me the best anomaly score. What code changes are needed for this? If there is no way of doing this automatically, do I feed data into the rec-center-hourly.csv file, swarm over it, let's say, every 1 minute, and then compute the anomaly score? In that case there would be a lag, equal to the time it takes to run the model, before I get the anomaly scores for the corresponding data points. I want to visualize my data, so I want to avoid this lag.
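To make the second question concrete, this is roughly the loop I am picturing. It is only a sketch under assumptions: replay_rows and score_stream are helper names I made up, the record keys "timestamp" and "heartbeat" are placeholders for my columns, and I am assuming the model can be called as model.run(record) with the score read from result.inferences["anomalyScore"], the way the hot gym anomaly example appears to do it. FakeModel is just a stand-in so the sketch runs without NuPIC or Cassandra; it is not how NuPIC computes anomaly scores.

```python
import time
from collections import namedtuple


def replay_rows(rows, interval_seconds=1.0, sleep=time.sleep):
    """Yield rows one at a time, pausing between them to mimic a live feed.

    `rows` can be any iterable, e.g. a cursor over the Cassandra table.
    """
    for index, row in enumerate(rows):
        if index > 0:
            sleep(interval_seconds)
        yield row


def score_stream(model, rows):
    """Feed each row to the model as it arrives and yield (row, score).

    Assumption: model.run() both scores the record and keeps learning,
    so no re-swarming is needed between records. This is the behaviour
    I would like to confirm.
    """
    for row in rows:
        result = model.run(row)
        yield row, result.inferences["anomalyScore"]


# Stand-in objects so the sketch runs on its own (hypothetical, not NuPIC).
FakeResult = namedtuple("FakeResult", ["inferences"])


class FakeModel(object):
    """Scores a record by how far the heartbeat jumped since the last one."""

    def __init__(self):
        self.previous = None

    def run(self, record):
        value = record["heartbeat"]
        jump = 0.0 if self.previous is None else abs(value - self.previous)
        self.previous = value
        return FakeResult({"anomalyScore": min(1.0, jump / 100.0)})


if __name__ == "__main__":
    sample = [
        {"timestamp": "2015-01-01 00:00:00", "heartbeat": 70},
        {"timestamp": "2015-01-01 00:00:01", "heartbeat": 72},
        {"timestamp": "2015-01-01 00:00:02", "heartbeat": 172},
    ]
    for row, score in score_stream(FakeModel(), replay_rows(sample, 1.0)):
        print(row["timestamp"], row["heartbeat"], score)
```

With something like this, each row would be scored as soon as it is read, so there would be no CSV staging step and no per-minute swarm, if the assumption about model.run() holds.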
Please suggest the best way to do this exercise.
Thanks