Honing NuPIC for Supermarket Application



Hi again smart folks,

So I was contacted to do HTM consulting by someone from a company that visited Numenta (bMind). They’re very excited by HTM and NuPIC and presented me with an application regarding sales in a supermarket. I had a couple of basic ideas for how they might fit their data into NuPIC, and I’d like to bounce them off anyone on here. Below is part of the message he sent me describing the application, and below that is what I said to him.

Hi Samuel,

At this moment we need to build a POC to prove and show HTM’s potential in a specific task. Consider a supermarket dataset containing customer profiles (types, social class, credit card classification, etc.) combined with purchasing transactions (history of products/SKUs acquired by each customer, with timestamps) and product characteristics (group, type, etc.). The goal is to predict in real time, for a given customer, which types of products a marketing campaign would succeed with at specific times or periods (season, festivities, etc.). Example: if I want to push sales of a specific meat product on a Friday, which group of customers should I focus on?

_Our idea is to create an environment in Google Cloud, install NuPIC there, properly set up all the available HTM School demos so we can use their visualizations in demos when needed, and then create this POC to present to a specific potential early-adopter company._

Does that make sense? Would like to hear your thoughts! :wink:

Warm regards,

Here was my reply:

Hi Moacyr,

This is certainly an interesting application area, and I’ll be glad to offer my intuitions. One initial idea that comes to me is to learn a Nupic model for each group of customers. Each model would learn the temporal buying habits of its group, and then on a given day (like that Friday in the example) you could check the predictions from each model and see which group is most prone toward buying that product.
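To make the "check the predictions from each model" step concrete, here is a minimal sketch. It does not use NuPIC at all: `WeekdayAverageModel` is a hypothetical stand-in that just averages past purchases per weekday, and the group names and numbers are made up. The point is only the shape of the workflow, one model per group, then a ranking of groups for the target day.

```python
from collections import defaultdict


class WeekdayAverageModel:
    """Hypothetical stand-in for a trained per-group model (NOT NuPIC).

    It predicts the mean number of units bought on a weekday,
    based on the observations it has seen so far.
    """

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    def learn(self, weekday, units):
        self.totals[weekday] += units
        self.counts[weekday] += 1

    def predict(self, weekday):
        if self.counts[weekday] == 0:
            return 0.0
        return self.totals[weekday] / self.counts[weekday]


# Made-up history for one meat product: group -> [(weekday, units), ...],
# with Monday = 0 and Friday = 4.
history = {
    "students": [(4, 2), (4, 4), (5, 1)],
    "families": [(4, 10), (4, 14), (5, 20)],
    "retirees": [(4, 6), (1, 9)],
}

# Train one model per customer group.
models = {}
for group, observations in history.items():
    model = WeekdayAverageModel()
    for weekday, units in observations:
        model.learn(weekday, units)
    models[group] = model

# Ask every model about Friday and rank groups by predicted demand.
FRIDAY = 4
ranking = sorted(models, key=lambda g: models[g].predict(FRIDAY), reverse=True)
print(ranking)
```

With a real sequence model in place of the stand-in, the ranking step would stay the same; only `learn` and `predict` would change.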

You could also create a Nupic model for each product or group of products (that specific meat, or meats in general) and learn the sequences of leading buyer groups: which group bought the most on Monday, then which on Tuesday, Wednesday, and so on. Perhaps there are temporal patterns in which group follows which in buying that product the most.

Alternatively if you weren’t focused on customer sub-groups but all customers, you could create separate Nupic models for each product or group of products and learn the sequences of how much is sold day by day (or week by week, or whatever the time scale is). Maybe on one day (/week/time step) the store sells more vegetables, then the next more meat and the next more dairy. If there are solid patterns in this way you could predict that Friday will be a high meat-selling day in the store and offer discounts to maximize sales.

As I said, these are just some basic initial intuitions, though if you have some test data I could run it through Nupic to see which approaches might be most promising. Looking forward to it,

– Sam

As I said, if anyone has any criticisms of my basic ideas or any alternatives, I’d be extremely curious to hear them! Thanks once again,

– Sam


Is this only for one store or many stores? How many transactions a day are we talking about?


Good question! I just emailed him asking that.

Additionally I just wanted to add one other idea that occurred to me, which is to work on an individual shopper level. I’m not sure what the computational demands of this would be but in theory I think they could learn a Nupic model for each shopper. It seems likely to me that we as individuals have our own patterns of what we buy and in what sequence, so on a given day that you want to push a certain product you could ping everyone’s model and push the product to those who are predicted to buy meat that day.

I’ll relay his response to your question as soon as I get it. Thanks for your thoughts about this!!

– Sam


I doubt you’ll have rich enough data to focus on individual shoppers. I like the idea of grouping shoppers somehow and creating a model for each group.


Alright great, I’m pumped to hear that at least one of those concepts sounds good to you! If it’s alright I’ll ping you for quick thoughts as I find out more (as if I don’t already do that all the time lol). You da man!

– Sam


I think you could do some unsupervised clustering as a preprocessing step (K-medoids or DBSCAN, perhaps) to group individual baskets by basket size, time/location, and whatever other features you have, since the shopping habits of individuals change by day (e.g. a single loaf of bread on Wednesday, a full shop on the weekend). Then you might be able to model and make predictions within a cluster for a given time/location.
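As a rough illustration of the basket clustering idea, here is a small pure-Python k-medoids sketch (the simple alternating variant, not full PAM, and not DBSCAN). The features, `(item_count, hour_of_day)`, and the sample baskets are assumptions made up for the example; real data would need feature scaling and a more careful distance measure.

```python
import random


def distance(a, b):
    """Manhattan distance between two feature tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def assign(points, medoids):
    """Map each medoid index to the indices of the points nearest to it."""
    clusters = {m: [] for m in medoids}
    for i, p in enumerate(points):
        nearest = min(medoids, key=lambda m: distance(p, points[m]))
        clusters[nearest].append(i)
    return clusters


def total_distance(points, candidate, members):
    """Sum of distances from one candidate medoid to all cluster members."""
    return sum(distance(points[candidate], points[j]) for j in members)


def k_medoids(points, k, max_iter=50, seed=0):
    """Alternate between assigning points and re-picking medoids."""
    rng = random.Random(seed)
    medoids = sorted(rng.sample(range(len(points)), k))
    for _ in range(max_iter):
        clusters = assign(points, medoids)
        # New medoid = the member with the lowest total distance to its cluster.
        new_medoids = sorted(
            min(members, key=lambda c: total_distance(points, c, members))
            if members else m
            for m, members in clusters.items()
        )
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(points, medoids)


# Hypothetical baskets as (item_count, hour_of_day).
baskets = [
    (1, 17), (2, 18), (3, 17), (2, 19),      # small evening top-up shops
    (30, 10), (35, 11), (28, 10), (40, 12),  # large morning shops
]

clusters = k_medoids(baskets, k=2)
groups = sorted(sorted(members) for members in clusters.values())
print(groups)
```

Each resulting cluster could then get its own model, fed only with the baskets assigned to it.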

Keep in mind that nupic as it stands cannot do multiple outputs, so you need to think carefully about what it is you want the model to predict. I would guess you want to predict the contents of their shopping basket, so you can suggest something they DON’T normally buy but that is similar enough to the things they do buy, without selecting competitors (i.e. don’t offer Pepsi to a Coca-Cola drinker).


It is hard to speculate without seeing the data available, but I would not try to frame this as a scalar prediction issue. I would attempt to create an encoder that semantically stores purchase or product information in a way that keeps the relation between certain departments in the grocery store. For example, the bakery is near the deli. The condiments are near the spices, etc. You might be able to calculate semantic overlap of product features simply by looking at where they are racked in the store.
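Here is a minimal sketch of what such an encoder could look like, under some loud assumptions: the store is flattened to a single made-up rack order (`DEPARTMENTS`), and each department gets a sliding window of active bits so that neighbours share half their bits. This is plain Python, not a NuPIC encoder class, and the sizes `W` and `STEP` are arbitrary.

```python
# Hypothetical rack order: neighbours in this list are neighbours in the store.
DEPARTMENTS = ["bakery", "deli", "meat", "dairy", "produce", "spices", "condiments"]

W = 16     # active bits per department (assumed)
STEP = 8   # shift between neighbouring departments, so neighbours share W - STEP bits
N_BITS = (len(DEPARTMENTS) - 1) * STEP + W  # total width of the representation


def encode_department(name):
    """Return the set of active bit indices for one department."""
    start = DEPARTMENTS.index(name) * STEP
    return frozenset(range(start, start + W))


def encode_basket(departments):
    """Union the department encodings for everything in one basket."""
    active = set()
    for name in departments:
        active |= encode_department(name)
    return frozenset(active)


def overlap(a, b):
    """Number of shared active bits, a rough semantic similarity score."""
    return len(a & b)


print(N_BITS)
print(overlap(encode_department("bakery"), encode_department("deli")))
print(overlap(encode_department("bakery"), encode_department("meat")))
print(overlap(encode_department("spices"), encode_department("condiments")))
```

A real store layout is two-dimensional, so a one-dimensional ordering like this loses information; it is only meant to show how physical proximity can turn into bit overlap.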


Hey @rhyolight,

Do you mean creating an encoding vector for each total purchase? So a basket full of dairy, meat, veggies, etc. would be encoded into a vector that contained the semantic similarities between the types of food they bought, and then the sequences are the shopping trips themselves? I’m not sure if I have it totally right, though it’s certainly interesting and I’d like to get straight how you’re imagining that.

Also I thought I’d relay the last message from Moacyr, in response to what I’d said and relayed from you. He gives a bit more info on what he has in mind. The level of detail is limited, but do you have any initial reactions? I’ll be speaking with him on the phone soon and trying to flesh out what he means in full.

Hi Sam,

Some comments about your points:

- Actually I am thinking of creating spatial poolers for 1) a general layer with “all” customers, groups of customers, products, groups of products; 2) only groups of customers/groups of products; 3) each “individual” customer’s data. To handle that we will certainly need a robust, parallel Docker environment for Nupic (some concerns about performance here), and for storage I am considering Google BigQuery (no concern at all about reaching petabyte level).

- “I doubt you’ll have rich enough data to focus on individual shoppers” - yes, we have that level of detail! :slight_smile: … that’s the good thing, we have a transaction-level table detailing each SKU (product) per customer, which is a huge advantage.

Thanks so much,

– Sam


There are so many ways you might do this that it is truly hard to speculate without having more information about the data itself. I would still like to find out how many transactions per day and how many locations we are talking about.


There is one thing that you would need to watch out for with the strategy of encoding semantics around foods of similar types and using the encoded shopping cart contents as the input to sequence memory. You’d need to be careful how much information you encode, because sequence memory does best with around 4 or fewer variables. You’d not only be encoding the several different food types in the cart at once, but you might also want to encode other important variables like holiday seasons, time of day, and weekends. My sense is that it might be a difficult task to do with sequence memory. You could probably tackle this with multiple layers, for example, which focus on smaller elements of the data.

I know the question was specifically about honing NuPIC for the application, but my sense is this problem is actually better suited to the newer object recognition component of HTM than it is to sequence memory. You would still encode semantics around foods of similar types, but use these food SDRs as features of an “object”. The object represented in the pooling layer would depict the full contents of a shopping cart when a customer checks out. Distal input could come from encoded time-related information (keeping it simple – something like time of day, weekends, and seasons). The system would learn semantically similar objects (shopping carts) in the context of when the shopping is occurring.

One way to use this information would be in an online shopping scenario: as a customer adds things to their cart, the system could predict what else the user might need to buy based on similar shopping carts and when the shopping is occurring. These predictions could be used to target ads to the user while they are shopping (similar to eBay’s “People who viewed this product also viewed…” feature).

Another way to use this information would be to view the objects (shopping carts) themselves as the input features of a higher-order object which represents individual customers. Distal input for this layer could also come from encoded time-related information, to build up profiles of what each customer tends to buy at different times of day, seasons, etc. This could be used to target ads to individual customers before they go shopping, based on patterns of other semantically similar customers. Metrics from this could also be used to inform targeted advertisement campaigns. Another use would be associating these “objects” with external survey data, etc to build up even better customer profiles (again to assist in targeted advertisements).


Good point, but unfortunately there is no implementation of that theory to use yet.

In the meantime, you might be able to encode each transaction with enough semantic info to at least predict what types of things will be purchased in the future. Perhaps you might even be able to do this with scalar encoding if you just want to guess what types of items will be purchased. Each row of data might have a count of how many different types of things were purchased within the area, aggregated by some time interval. These would just be scalar counts.
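A minimal sketch of that aggregation step, assuming a made-up transaction layout of `(timestamp, category, quantity)` and a daily interval. The field names and sample rows are hypothetical; the output is one row per day with one scalar count per category, which is the kind of row you could then feed to a model.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical transaction rows: (timestamp, category, quantity).
transactions = [
    ("2017-03-03 09:15", "dairy", 2),
    ("2017-03-03 17:40", "meat", 1),
    ("2017-03-03 18:05", "dairy", 1),
    ("2017-03-04 10:30", "meat", 3),
    ("2017-03-04 11:00", "produce", 4),
]


def daily_counts(rows):
    """Aggregate quantities per category for each calendar day."""
    counts = defaultdict(Counter)
    for timestamp, category, quantity in rows:
        day = datetime.strptime(timestamp, "%Y-%m-%d %H:%M").date()
        counts[day][category] += quantity
    return counts


counts = daily_counts(transactions)
categories = sorted({category for _, category, _ in transactions})

# One row per day: (date, count for each category in alphabetical order).
# A Counter returns 0 for categories that were not bought that day.
rows = [
    (day.isoformat(),) + tuple(per_day[c] for c in categories)
    for day, per_day in sorted(counts.items())
]

print(categories)
for row in rows:
    print(row)
```

Changing the interval (hourly, weekly) only means changing how `day` is derived from the timestamp.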

Perhaps you could focus on one product category first, like dairy products. You could create one model that takes a timestamp and the number of dairy products purchased within a time period across all stores in the area. Then you could predict how much dairy will be purchased at some point in the future. This approach might need a lot of models running, however (num towns * num item categories).

This doesn’t help you identify what group of customers you should market to, however. Introducing groups would add another dimension to this, and increase the number of models you’ll need by a factor of the number of groups.
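To get a feel for how quickly that multiplies, here is a tiny sketch of the arithmetic. The numbers of towns, categories, and groups are made up for illustration, and `model_count` is just a hypothetical helper.

```python
def model_count(num_towns, num_categories, num_groups=1):
    """One model per (town, item category, customer group) combination."""
    return num_towns * num_categories * num_groups


# Made-up sizes for illustration only.
without_groups = model_count(num_towns=5, num_categories=12)
with_groups = model_count(num_towns=5, num_categories=12, num_groups=4)

print(without_groups, with_groups)
```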