Question on MNIST data set

Bitking · September 21, 2018, 8:49pm

Is there a version that reports the percentage error when the system correctly identifies the poorly formed or ambiguous characters and sets them aside?
For that matter, is there any work where identifying these bad characters is itself the goal?

MaxLee · September 21, 2018, 9:12pm

When you say “system” here, may we know what you are referring to?

At least in DL, while the aim is usually to take the highest-certainty result and return that, there is nothing to stop anyone from selecting a medium or low-certainty result instead, or doing some analysis on the output certainties.

In fact, in one of my projects (a chatbot), I’m specifically selecting and logging those “uncertain” or ambiguous responses as an important function that keeps humans in the loop, and helps with ongoing training and feedback.

The basic algorithm would be:

1. Feed input data.
2. Get the result.
3. Check the probability/certainty score, while noting the known answer.
   a. If it falls within a chosen range, such as 30-80% certainty, log it.
4. Post-process to find which numbers appear most frequently within the “ambiguous” data.
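The steps above could be sketched roughly as follows. This is only an illustration, not code from the chatbot project: the function names `log_ambiguous` and `most_ambiguous_digits` are made up, the 30-80% band is the example range from the list, and the toy probabilities stand in for the per-class outputs of whatever trained classifier is used.

```python
from collections import Counter


def log_ambiguous(probabilities, labels, low=0.30, high=0.80):
    """Collect samples whose top-class confidence falls inside [low, high].

    probabilities: list of per-class probability lists, one per sample
    labels: the known (true) class for each sample
    Returns a list of (index, predicted, true_label, confidence) tuples.
    """
    ambiguous = []
    for i, (probs, label) in enumerate(zip(probabilities, labels)):
        confidence = max(probs)              # certainty of the top result
        predicted = probs.index(confidence)  # class with that certainty
        if low <= confidence <= high:        # step 3a: log the uncertain ones
            ambiguous.append((i, predicted, label, confidence))
    return ambiguous


def most_ambiguous_digits(ambiguous):
    """Step 4: count how often each true class shows up in the ambiguous log."""
    return Counter(true for _, _, true, _ in ambiguous).most_common()


# Toy 3-class data (hypothetical values, not real MNIST outputs).
probs = [
    [0.95, 0.03, 0.02],  # confident
    [0.40, 0.35, 0.25],  # ambiguous
    [0.10, 0.60, 0.30],  # ambiguous and wrong
    [0.05, 0.05, 0.90],  # confident
]
labels = [0, 0, 2, 2]

flagged = log_ambiguous(probs, labels)
print(flagged)
print(most_ambiguous_digits(flagged))
```

With real MNIST output, `probs` would be the softmax rows for the test set, and the flagged indices could be pulled up as images for a human to inspect.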

Bitking · September 21, 2018, 11:49pm

By system I mean the entire process of preparing the data set, training and testing.

dmac · September 22, 2018, 1:27pm

I have read image recognition papers which reported two classification accuracies:

1. Whole dataset accuracy, which uses all inputs.
2. High confidence accuracy, which rejects uncertain classifications. I think they also report the fraction of samples which were rejected.

I honestly can’t remember any college statistics any more so I can’t elaborate or help calculate.
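For what it's worth, the two figures plus the rejected fraction can be computed with plain counting. The sketch below is an assumption about how such numbers might be produced, not the method of any particular paper: the function name `accuracy_with_rejection`, the 0.80 confidence threshold, and the toy data are all hypothetical.

```python
def accuracy_with_rejection(probabilities, labels, threshold=0.80):
    """Report whole-dataset accuracy, high-confidence accuracy, and rejection rate.

    A sample is "rejected" when its top-class confidence is below threshold.
    """
    total = len(labels)
    correct_all = 0   # correct predictions over every sample
    kept = 0          # samples at or above the confidence threshold
    correct_kept = 0  # correct predictions among the kept samples

    for probs, label in zip(probabilities, labels):
        confidence = max(probs)
        predicted = probs.index(confidence)
        hit = predicted == label
        correct_all += hit
        if confidence >= threshold:
            kept += 1
            correct_kept += hit

    return {
        "whole_accuracy": correct_all / total,
        # Undefined if every sample was rejected, so report None in that case.
        "high_conf_accuracy": correct_kept / kept if kept else None,
        "rejected_fraction": (total - kept) / total,
    }


# Toy 3-class data (hypothetical values, not real MNIST outputs).
probs = [
    [0.95, 0.03, 0.02],
    [0.40, 0.35, 0.25],
    [0.10, 0.60, 0.30],
    [0.05, 0.05, 0.90],
]
labels = [0, 0, 2, 2]

report = accuracy_with_rejection(probs, labels)
print(report)
```

Raising the threshold usually trades a larger rejected fraction for higher accuracy on what remains, which is why the rejected fraction has to be reported alongside the high-confidence number.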
