At some point in that last stream, Matt, you talked about a potential future idea to record all chat information into a database and then the possibility to develop a tool to analyse this data.
I think this is an interesting problem. I also think this is an important and even necessary problem. However, this will have moral and safety implications. Before you seriously think about this, I think you need to have an internal debate at Numenta about it. And maybe one with legal counsel.
After the Cambridge Analytica scandal, it’s clear that this is not a small matter. But there is also the practice called sousveillance, which carries implications of its own.
I don’t want to scare anyone reading this. It was only a side-track comment. But I also want to repeat that I think this is an important and necessary problem to be addressed. Where better than in an open-source community?
It’s not just privacy concerns. And it’s not just anomaly detection either.
Of course it depends on the kind of data you will use. If it’s just the times when users pipe in, I guess it’s only broad behavior. But once the data is stored, someone will be able to mine which words users are using; what topics they are talking about; how they feel about certain questions; what they react to and what they don’t.
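To make the point concrete, here is a minimal sketch of how trivially stored chat logs can be mined into per-user profiles. The data and field layout are invented for illustration; any real logging tool would involve far more than this:

```python
from collections import Counter

# Hypothetical stored chat log: (user, message) pairs.
# Users and messages are made up for this example.
messages = [
    ("alice", "I think grid cells explain spatial pooling"),
    ("alice", "grid cells again, plus temporal memory"),
    ("bob", "anyone tried the anomaly detection benchmark?"),
]

# Per-user word frequencies: even this trivial pass reveals
# which topics each user keeps returning to.
profiles = {}
for user, text in messages:
    words = [w.lower().strip("?,.") for w in text.split()]
    profiles.setdefault(user, Counter()).update(words)

# "alice" is clearly preoccupied with grid cells.
print(profiles["alice"].most_common(3))
```

That is a dozen lines of standard-library code; anything more sophisticated (sentiment, topic models) is only a pip install away, which is why the decision to store the data at all is the one that matters.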
A behavior analyser will be able to profile users. Cambridge Analytica did exactly that based on Facebook likes. And using custom-tailored messages based on those profiles, they targeted swing voters to sway more than 200 elections worldwide, including Brexit and the 2016 US presidential election.
In my opinion, this is worth thinking about a little bit. There’s huge potential for abuse. And considering lawsuits are so common in Cowboyland, I’d rather be careful. Not to mention what impact it could have on Numenta’s reputation if some day the media starts a witch hunt on AI builders.
Maybe I’m overly anxious. All I’m saying is… talk it over with your colleagues.
**TL;DR: Whatever data is made/kept, keep it transparent and publicly available.**
Everything you describe is already taking place. In fact, that is what the hyped term “Big Data” refers to: data warehoused and analyzed to draw conclusions, primarily for business purposes (and, at other times, for more nefarious uses). It has helped prove discrimination in housing, employment, criminal sentencing, and advertising. It has also allowed scientists to sort the large quantities of data that pour out of CERN, so that they can isolate anomalies and advance our understanding of the physical universe.
It has also been used for linguistic research to build up language tree models between different regions of the planet to hint at historic migrations and trade connections between separated peoples.
To me, it’s a tool for telling stories, and introspecting ourselves on a personal and societal level (such as recommendation systems).
For the past year and a half, I’ve been earning my income by using Machine Learning (including natural language processing) and Computer Vision. Helpful in some of those explorations have been large datasets, such as Reddit’s archives of all its posts for the past few years. Such data, combined with various AI techniques, has led to systems that in many instances improve folks’ quality of life. But, like many tools, it has the potential to be abused.
Unlike weapons, AI is a general-purpose tool that can be applied however its users decide to employ it. We’ve gone past the point where data is already collected. I exist in the databases of several countries now, at least, where my face and fingerprints have already been taken. I didn’t really have any control over that, as it was the price required for admission to those nations.
No, what bothers me about AI, is when it is hoarded by organizations whose sole purpose is to:
1.) make profit without a conscience (e.g., someone once asked me to build a system that dynamically changes prices of goods based on what cookies are in somebody’s browser)
2.) control the flow of information or people for commercial/political purposes (discriminatory ads <–Facebook)
3.) social credit scoring (à la ‘Black Mirror’ or some experiments in China).
It’s not the technology itself that worries me. We’re past that point. Save for an asteroid impact knocking us back to pre-industrial levels, the AI cat is out of the bag. What worries me is who has access to and controls AI and our information. I’d rather it all be freely available, democratized, and accessible to all, like what NASA does with all its data, or what Facebook/Google/OpenAI sometimes do with their AI research. I’d prefer a world with full transparency, where menial things are automated away, so that humans can focus on being good humans, not bureaucracy, demagoguery, or tribalism (including extreme nationalism).
I for one want to expand out into space, to be multi-planetary, and the technologies under the umbrella of AI, combined with blockchain principles, seem like a decent way to administer a society that spans, and is mobile on, an interplanetary scale.
For me, here is where HTM fits into all this: if it can help lead to systems that bring that dream about, while being democratically available to all, that would be awesome.
I don’t see Numenta or its research engaging in any of the above. They’re OVERLY open with their research when compared to any of the other organizations I’ve mentioned. My thought is: Log away. Sounds like it’ll be fun!