Introduction
So, you want to build an application with NuPIC? This page provides practical guidance and advice for the budding NuPIC application developer.
Requirements
This reference is meant to support NuPIC application development, so you will need to have NuPIC installed. For some sections below, you should have some Python experience. However, if you don’t have Python experience, you can do a lot by using the HTM Engine, which allows you to create your own NuPIC client application in any programming environment you like.
For some apps, you may need to run a Swarm, which requires that you have MySQL installed and running. In general, individual tools listed below may have additional requirements, so be sure to check their README pages for installation instructions.
Data
The nature of the brain and HTM dictate the best form of data for NuPIC: streaming temporal data. HTM works best with data having a strong temporal component that changes continuously. Some examples:
- a path of coordinates an object takes over time
- a sensor reading of water temperature taken every hour on a beach
- stock volume
- energy consumption of a building or appliance over time
- machine temperature readings over time
Note that every example above has changing values of data over time. The patterns in this data are temporal patterns. For example, the energy consumption of a building will have obvious daily and weekly patterns, depending on the purpose of the building. These patterns can be learned by NuPIC models and acted upon given NuPIC’s predictions of future data and anomaly indications.
Getting Data
It can be difficult to find high quality streaming temporal data in a format that is easy to consume. The NuPIC community has been collecting possible data sources for a long time. Below are some resources we’ve put together to help provide data for NuPIC applications.
River View
River View collects publicly accessible temporal data streams over time and caches them in an easy-to-query interface. Many public data streams only expose real-time data, neglecting to allow the querying of historical data. River View caches data from the source, making near-real-time data available as well as up to 3 months of historical data. The data.numenta.org River View instance is currently caching nearly 7,000 data streams, including:
- Chicago beaches weather and water quality data
- New York City traffic
- Minnesota traffic (this is just a sample; there are over 6,000 other traffic sensor data streams)
- stock prices and other stock metrics
- global earthquake activity from USGS
- crimes in San Francisco, CA
- 911 calls in Portland, OR
- energy demand and load for the state of Texas
These are just some examples of existing “Rivers” available for anyone to consume. Anyone can add a new River to River View by following these instructions. As soon as the new river is merged into the codebase and redeployed to http://data.numenta.org, the River will start collecting data. Give it a few weeks and see what kinds of temporal patterns you can uncover!
If you think you might need to set up a River to collect data for your NuPIC application, don’t delay! We are accepting pull requests at https://github.com/nupic-community/river-view. The sooner you get your River started, the more accumulated data you’ll have to use.
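As a rough illustration of consuming a River from Python, here is a minimal sketch. The URL layout (`/<river>/<stream>/data.json`) and the response shape (a `headers` list plus `data` rows) are assumptions made for this example, and the function names and sample payload are hypothetical. Check the River View README for the actual query interface before relying on any of it.

```python
import json

BASE_URL = "http://data.numenta.org"  # River View instance named above


def stream_url(river, stream, base_url=BASE_URL):
    """Build a data URL. The path layout is an assumption, not a documented fact."""
    return "{}/{}/{}/data.json".format(base_url, river, stream)


def rows_from_payload(payload_text):
    """Turn an assumed {"headers": [...], "data": [[...], ...]} payload into dicts."""
    payload = json.loads(payload_text)
    headers = payload["headers"]
    return [dict(zip(headers, row)) for row in payload["data"]]


# Hypothetical payload standing in for a real HTTP response body.
sample = (
    '{"headers": ["datetime", "speed"], '
    '"data": [["2015/06/01 00:00:00", 61.5], ["2015/06/01 00:05:00", 58.0]]}'
)
rows = rows_from_payload(sample)
```

Keeping the parsing separate from the HTTP request makes it easy to test against saved responses and to swap in whatever client library your application already uses.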
Other Public Data Sets
There are a lot of potential public data sources that have not been exploited. We’ve been keeping a list of them at Temporal Data Sets for HTM.
Portable Device Sensors
There are a plethora of hackable sensor devices today with APIs for getting streams of live data. Here are a few examples:
Spark also offers a data service much like River View that allows Spark device users to upload their own data sets for anyone to use: https://data.sparkfun.com/streams/
NuPIC APIs
See the API Docs.
Tools
These tools exist outside of the core NuPIC codebase. They are not HTM implementations, but they can help users to create HTM applications.
HTM Engine
HTM Engine is a framework used for creating and running hundreds of NuPIC anomaly detection models simultaneously. It manages memory and CPU usage by serializing models to disk when they are inactive. This means models only utilize system resources when they are learning new data and returning anomaly indications.
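To make the idea of serializing inactive models concrete, here is a minimal sketch of a least-recently-used swapper built on `pickle`. It is not HTM Engine's implementation: `ModelSwapper`, its methods, and the plain dictionaries standing in for models are all hypothetical, chosen only to show the general technique of keeping a bounded number of models in memory.

```python
import os
import pickle
import tempfile


class ModelSwapper(object):
    """Keep at most `max_active` models in memory and pickle the rest to disk."""

    def __init__(self, storage_dir, max_active=2):
        self.storage_dir = storage_dir
        self.max_active = max_active
        self.active = {}  # model_id -> in-memory model object
        self.order = []   # model ids, least recently used first

    def _path(self, model_id):
        return os.path.join(self.storage_dir, "{}.pkl".format(model_id))

    def _touch(self, model_id):
        # Move the id to the "most recently used" end of the list.
        if model_id in self.order:
            self.order.remove(model_id)
        self.order.append(model_id)

    def _evict(self):
        # Write the least recently used models to disk until we fit the limit.
        while len(self.active) > self.max_active:
            victim = self.order.pop(0)
            with open(self._path(victim), "wb") as f:
                pickle.dump(self.active.pop(victim), f)

    def put(self, model_id, model):
        self.active[model_id] = model
        self._touch(model_id)
        self._evict()

    def get(self, model_id):
        # Reload from disk if the model was swapped out earlier.
        if model_id not in self.active:
            with open(self._path(model_id), "rb") as f:
                self.active[model_id] = pickle.load(f)
        self._touch(model_id)
        self._evict()
        return self.active[model_id]


# Usage: three stand-in "models" with room for only two in memory.
storage = tempfile.mkdtemp()
swapper = ModelSwapper(storage, max_active=2)
for name in ("a", "b", "c"):
    swapper.put(name, {"name": name})
```

After the loop, model `a` lives only on disk; calling `swapper.get("a")` reloads it and swaps out whichever model was used least recently.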
First, some disclaimers. HTM Engine:
- only runs anomaly detection models
- does not generate predictions, only anomaly scores and likelihoods
- provides a model interface that can only monitor one field of data per model
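The difference between a raw anomaly score and an anomaly likelihood can be sketched as follows. This is a simplified illustration, not the NuPIC or HTM Engine code: it assumes the history of raw scores is roughly normally distributed and asks how unusual the recent average is against that history. The function names, window size, and sample scores are hypothetical.

```python
import math


def tail_probability(x, mean, std):
    """Return P(X >= x) for X ~ Normal(mean, std)."""
    z = (x - mean) / (std * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


def anomaly_likelihood(scores, short_window=3, min_std=1e-3):
    """Return a value in [0, 1]; higher means the recent scores are unusually high."""
    mean = sum(scores) / float(len(scores))
    variance = sum((s - mean) ** 2 for s in scores) / float(len(scores))
    std = max(math.sqrt(variance), min_std)  # guard against a zero spread
    recent = scores[-short_window:]
    recent_mean = sum(recent) / float(len(recent))
    return 1.0 - tail_probability(recent_mean, mean, std)


# Hypothetical raw anomaly scores: a quiet stream, then the same stream with a spike.
quiet = [0.1, 0.12, 0.08, 0.1, 0.11, 0.09, 0.1, 0.1, 0.1, 0.1]
spike = quiet[:-3] + [0.9, 0.95, 1.0]

quiet_likelihood = anomaly_likelihood(quiet)
spike_likelihood = anomaly_likelihood(spike)
```

The point of the second number is that a noisy metric can produce high raw scores routinely, so comparing recent scores against the metric's own history gives a more useful alerting signal than a fixed threshold on the raw score.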
That said, HTM Engine is a very useful tool for anomaly detection problems when hundreds of potential scalar metrics are involved because it allows you to easily stand up a server that handles scaling models automatically.
See the README for complete installation instructions, including a tutorial screencast (mentioned below).
Cortical.IO
Cortical.IO is a partner of Numenta. Their basic API provides SDRs (called fingerprints) for words, sentences, and paragraphs. They have additional services for more advanced processing. The demos on their website are comprehensive. You can sign up for a free API key for experimentation.
There are two Python clients for the Cortical.IO API:
- Official Cortical.IO Python Client
- cortipy (Numenta’s client)
The SDRs returned from Cortical.IO’s APIs can be used directly by NuPIC if passed into the temporal memory module or temporal pooler. You can see an example of this in the Fluent library.
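As a small illustration of working with SDRs, the sketch below assumes a fingerprint arrives as a list of active bit positions. It expands that list into a dense 0/1 array (the form an algorithm expecting a fixed-width binary input might want) and compares two fingerprints by counting shared active bits. The helper names, the 32-bit width, and the sample fingerprints are hypothetical; real fingerprints are far larger and come from the Cortical.IO API.

```python
def to_dense(positions, size):
    """Expand a list of active bit indices into a 0/1 list of length `size`."""
    active = set(positions)
    return [1 if i in active else 0 for i in range(size)]


def overlap(positions_a, positions_b):
    """Count the active bits that two sparse representations share."""
    return len(set(positions_a) & set(positions_b))


# Hypothetical fingerprints, kept tiny so they are easy to read.
SIZE = 32
cat = [1, 5, 9, 20, 31]
dog = [1, 5, 10, 20, 30]
car = [2, 7, 14, 22, 27]

cat_dense = to_dense(cat, SIZE)
```

In this toy data, `cat` and `dog` share three active bits while `cat` and `car` share none, which is the sense in which overlap between SDRs stands in for semantic similarity.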
Fluent
Fluent is a platform for building language / NLP-based applications using NuPIC and Cortical.io’s API. This project is currently in research-mode, so it is not stable enough to promote to NuPIC application developers yet. We have future plans to release a version of Fluent that does not depend on nupic.research.
Details coming soon…
Example Apps
The NuPIC community keeps several experiments and example applications on the HTM Community GitHub organization. Anyone interested in sharing their example application is welcome to chat with us on HTM Forum or our Gitter chat room.
Reference Apps
These are sample applications that don’t include detailed tutorials. They are more for reference. You might want to use them as examples for how to do certain things with NuPIC.
Taurus
“Taurus” is the development code word for Grok for Stocks.
Grok for Stocks is an example HTM application that continually monitors hundreds of publicly traded companies and alerts you if something unusual is happening to any of them. Grok for Stocks uses HTM machine intelligence algorithms to model stock price, stock volume, and Twitter data related to 200 of the largest publicly traded companies. Companies monitored include Apple, Google, Amazon, and Starbucks. Grok for Stocks is a mobile application that runs on Android-based phones.
This complete application is available for free download on the Google Play Store for Android devices.
The complete source code for Grok for Stocks includes several components:
taurus
A server application that implements HTM Engine for the purpose of collecting and reporting on company metrics. Custom metrics are used for Stock Price, Stock Volume, and Twitter handle tweet volume. A RESTful API is provided to support the Taurus Mobile application.
taurus.monitoring
Implements several monitors of the Taurus infrastructure and a supporting database.
taurus.metric_collectors
Implements metric collection agents for Twitter and Xignite data sources, which forward data to a running Taurus instance.
taurus-mobile
Application-specific Android source code for the mobile app.
mobile-core
The Mobile App is composed of reusable components that are used by all Numenta mobile client applications.
HTM for IT
“Grok” is the development code word for HTM for IT.
The complete source code for HTM for IT includes several components:
grok
Grok is an application for monitoring IT infrastructure and notifying on anomalous behavior. This is the server running HTM Engine.
grok-cli
This repository contains the Grok command-line interface (CLI). grokcli lets you interact with a Grok server from the command line, for tasks such as creating instances.
grok-mobile
Application-specific Android source code for the mobile app.
mobile-core
The Mobile App is composed of reusable components that are used by all Numenta mobile client applications.
Rogue Behavior
A metrics collection agent for the Numenta Rogue showcase application. It consists of two primary components: a long-running metric collection agent, which periodically polls various metrics and records the results to a local database, and a separate process that forwards metrics to a Grok server for analysis.
Audio Signal Analysis
A couple of projects may be of interest in this area:
Skeleton HTMEngine App
Tutorial Apps
Hot Gym
The “hot gym” sample application has been around for a long time, and was one of the first real-world applications of NuPIC that proved the value of cortically-inspired learning algorithms. The data used is real energy consumption data from a gym in Australia, which simply contains a timestamp and float value for energy consumption.
This collection of tutorials uses the “Hot Gym” premise to illustrate many ways users can set up and run a NuPIC application against real-world data.
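Since the data is just a timestamp and a float per row, reading it into a form a model can consume one record at a time takes very little code. The sketch below uses only the standard library. The column names, the timestamp format, and the sample rows are assumptions made for illustration, so adjust them to match the actual file used in the tutorials.

```python
import csv
import io
from datetime import datetime

DATE_FORMAT = "%m/%d/%y %H:%M"  # assumed timestamp layout

# Hypothetical rows in the assumed layout, not real gym readings.
SAMPLE_CSV = u"""timestamp,kw_energy_consumption
7/2/10 0:00,21.2
7/2/10 1:00,16.4
7/2/10 2:00,4.7
"""


def read_records(text):
    """Yield (datetime, float) pairs, one per CSV row, in file order."""
    reader = csv.DictReader(io.StringIO(text))
    for row in reader:
        yield (datetime.strptime(row["timestamp"], DATE_FORMAT),
               float(row["kw_energy_consumption"]))


records = list(read_records(SAMPLE_CSV))
```

Yielding records in order matters here: a temporal model learns from the sequence, so rows should reach it in the same order they were recorded.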
HTM Engine Traffic Anomalies
This tutorial application comes complete with an instructional video, fully commented example codebase, and a runtime that pulls live data from River View.
Geospatial Tracking
Shows examples of geospatial anomaly detection using canned and manually recorded GPS paths.
Predicting a Sine Wave
Not the most practical example, but an example nonetheless.
Ideas
Looking for inspiration? Check out these videos of demos from our previous hackathons:
- 2014 Spring NuPIC Hackathon Demos
- 2014 Fall NuPIC Hackathon
- 2015 Spring NuPIC Hackathon
- 2015 HTM Challenge
Additionally, here are some topical suggestions for projects:
Biometric
- Look for anomalies in human heartbeats. You can get this data from sound recordings (there are more online if you poke around), electrocardiograms (EKG), or other heart monitoring devices. Try to identify the onset of an irregular heartbeat.
- Electroencephalogram (EEG) data from devices like OpenBCI or Muse.
- Collect accelerometer data from human movements. Attempt to classify the movements.
Geospatial
- Attach a location-tracking device to a pet. See if you can correlate high anomaly indications with abnormal behaviors.
- Attach a GPS device to your car to identify anomalies in your routine.
- Track satellites through space and identify anomalies in their movements.
Earth Science
- Identify weather anomalies in one or more of several weather feeds in River View:
Human Behavior
- Try to identify traffic accidents before the news media or DOT reports them:
- NYC traffic
- Twin Cities metro area traffic (high quality data feed!)
- Houston TX freeway traffic
- Find patterns in 911 calls
- Find patterns in crime reports
- Find patterns in stock prices or volumes