Does the brain have a negative 'weight' mechanism of sorts?

Let’s say that an object ‘X’ has the following 5 properties [‘A’, ‘B’, ‘C’, ‘D’, ‘E’].

Let’s say that an object ‘Y’ has the following 6 properties [‘A’, ‘B’, ‘C’, ‘D’, ‘E’, ‘F’].

(e.g. these properties could be something like color, shape, size, etc.)

Let’s also say that, despite the overlap in properties, ‘X’ and ‘Y’ as objects have little or nothing in common.

It seems more efficient for the brain if observing ‘F’ could strongly inhibit/negate all the activations relating to ‘X’, even when the brain also observes properties ‘A’ to ‘E’.

What mechanism in the brain helps it learn something like the above case, where a single property ‘F’ changes the whole meaning of an object from ‘X’ to ‘Y’, even though ‘X’ and ‘Y’ could be extremely different from each other?

If I’m not mistaken, HTM uses only positive ‘weights’, with global inhibition through a winner-takes-all (WTA) mechanism providing the only sort of inhibition/negation. Is that consistent with biology, or does the brain have an equivalent of the negative ‘weight’ mechanism used in artificial neural networks to handle the type of learning I described above more easily? Or would the brain have difficulty learning the above case?

2 Likes

Well, it can turn both ways, but not at the same time. A higher level (common sense?) avoids a self-contradictory reality: The Spinning Dancer Illusion - YouTube

1 Like

Thanks for referencing that higher-level phenomenon. I guess what I’m trying to ask about is lower-level stuff, like whether the brain could benefit from having a biological equivalent of the negative weights used in ANNs.

From the example above, if we only use WTA and that last problematic property ‘F’ shows up, then ‘F’ must somehow overwhelm all those previous properties ‘A’ to ‘E’. But there’s a limit to how much it can overwhelm them using only ‘positive weights’ (whether in firing rate, voltage, synapse strength, etc.). It would be helpful if the brain had another way of handling this, such as a negative weighting mechanism that kicks in when ‘F’ shows up, in addition to overwhelming properties ‘A’ to ‘E’ with positive weights. And maybe having a ‘negative weight’ might also help the brain learn NOT logic more directly and easily.
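
To make the contrast concrete, here is a minimal sketch in plain Python (all weights, activations and thresholds are made-up illustrative numbers, not anything from HTM or biology): a detector for ‘X’ built from positive weights only barely notices ‘F’, while a single strong negative weight on ‘F’ lets it veto ‘X’ outright.

# Hypothetical feature activations: 1.0 if the property is observed, else 0.0.
observed = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0, "E": 1.0, "F": 1.0}

# Positive-weights-only detector for X: every shared property votes for X,
# so observing F barely changes the total evidence for X.
w_positive = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0, "E": 1.0}
score_pos = sum(w * observed.get(p, 0.0) for p, w in w_positive.items())

# Detector with one strong negative weight on F: F alone can veto X.
w_with_neg = dict(w_positive, F=-10.0)
score_neg = sum(w * observed.get(p, 0.0) for p, w in w_with_neg.items())

threshold = 4.0
print("positive weights only:", score_pos, score_pos > threshold)   # 5.0 True
print("with negative F weight:", score_neg, score_neg > threshold)  # -5.0 False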

1 Like

Nice example of contradictory constructions of the same perceptual elements, @cezar_t. I couldn’t get it to flip! Until I found this crib:

dancing ballerina / spinning dancer optical illusion made easy.

1 Like

There are inhibitory inputs from neighboring interneurons; I guess that’s what you are thinking about. An ANN node’s output can be either excitatory or inhibitory, whereas in a BNN the output has to pass through an interneuron to become inhibitory.
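
A minimal sketch of that arrangement with made-up numbers (plain Python, not a biophysical model): the ‘F’ cell only makes excitatory, positive-weight connections, but one of them targets an inhibitory interneuron, and it is the interneuron’s output that gets subtracted from the ‘X’ representation.

def relu(x):
    return max(0.0, x)

# Evidence for X from properties A..E, all via excitatory (positive) weights.
x_drive = sum([1.0, 1.0, 1.0, 1.0, 1.0])

for f_activity in (0.0, 1.0):
    # The F-detecting cell is itself excitatory, but it synapses onto an
    # inhibitory interneuron; only the interneuron's output is subtracted from X.
    interneuron = relu(10.0 * f_activity)
    x_activity = relu(x_drive - interneuron)
    print("F =", f_activity, "-> X activity =", x_activity)  # 5.0 without F, 0.0 with F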

2 Likes

Can I assume the inhibitory outputs from neighboring interneurons are mainly there to implement HTM’s WTA mechanism, roughly speaking? If so, is WTA the only main mechanism for inhibiting other potential outputs and acting as a “negative weight”? And if so, then in the context of my example above, must the brain represent the different possible outputs really close to each other, so that any slight change in input produces an output that could be entirely different, almost as if it were operating near a critical point / phase transition? (I don’t really understand anything about phase transitions, so I may be talking nonsense.)

1 Like

There are different types of interneurons; they work at different ranges, etc. I think WTA (competitive learning) is only one of their functions; for example, in the retina they implement contrast detection.
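
For the contrast-detection point, here is a crude sketch of lateral inhibition in plain Python (a toy, not a retina model): each unit is inhibited by the average of its neighbours, so uniform regions cancel out and only the edge between them stands out.

# A 1-D "image": a dim region next to a bright region.
signal = [1, 1, 1, 1, 5, 5, 5, 5]

def lateral_inhibition(x):
    # Each unit keeps its own input minus the average of its immediate neighbours,
    # as if an interneuron pooled the neighbours and fed back inhibition.
    out = []
    for i, v in enumerate(x):
        neighbours = x[max(0, i - 1):i] + x[i + 1:i + 2]
        out.append(v - sum(neighbours) / len(neighbours))
    return out

print(lateral_inhibition(signal))
# Uniform regions cancel to ~0; only the edge stands out:
# [0.0, 0.0, 0.0, -2.0, 2.0, 0.0, 0.0, 0.0]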

2 Likes

Any relative-weight mechanism is indeed the opposite pathway to neural negative feedback(s).

1 Like

Sometimes the Henny Youngman solution is best:

“The patient says, ‘Doctor, it hurts when I do this.’
The doctor says, ‘Then don’t do that!’”

Rather than solving the problem within the neural network, solve it externally by adding the property “not F”. Then X has the additional property “not F”, and X and Y can be easily resolved.

jes

2 Likes

Could you expand/clarify a bit more on what you meant?

1 Like

How do we represent the ‘Not F’ property? If the property ‘F’ is just some raw/low-level input, then the ‘Not’ or negation is implemented at the circuit/algorithmic level, while the higher-level concept of ‘Not’ (as our consciousness understands it) is implemented in an indirect/roundabout manner, the way we ‘understand’ other concepts. Probably.

1 Like

At the simplest level, instead of a single bit taking values 0 for absence/negative and 1 for presence/positive one can use two bits. (1,0) to encode positive and (0,1) for negative values.

How well does that work in practice? In many cases it should be useful. At least it expands the representation’s computing space (number of columns). And the “illogical” (1,1) or (0,0) cases could be useful too, especially when they are predicted.
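
A minimal sketch of that two-bit scheme in Python (the helper and names are mine, just for illustration): “not F” becomes an active bit like any other, so a detector for X can match on it with purely positive weights.

# Two bits per property: (present, absent).
POSITIVE = (1, 0)   # property observed
NEGATIVE = (0, 1)   # property explicitly absent
UNKNOWN  = (0, 0)   # nothing asserted; (1, 1) would be the "illogical" case

def encode(properties, present, absent):
    # Concatenate two bits per property; "not F" becomes an ON bit like any other.
    bits = []
    for p in properties:
        bits.extend(POSITIVE if p in present else NEGATIVE if p in absent else UNKNOWN)
    return bits

props = ["A", "B", "C", "D", "E", "F"]
x = encode(props, present={"A", "B", "C", "D", "E"}, absent={"F"})
y = encode(props, present={"A", "B", "C", "D", "E", "F"}, absent=set())
print(x)  # [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1] -- ends with "not F" ON
print(y)  # [1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0] -- ends with "F" ON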

2 Likes

“At the simplest level, instead of a single bit taking values 0 for absence/negative and 1 for presence/positive one can use two bits. (1,0) to encode positive and (0,1) for negative values.”

Agreed. ON/OFF coding in the vision system does something along these lines (although not exactly the same).

2 Likes

Well, I could give a small example of how negatively weighted coeffs in the semantic db correspond to inhibition. The semantic db is a toy of mine that is something of an associative memory and a graph database. First, a little background:

|s> is called a ket, where “s” can be any arbitrary string. Quite frequently kets correspond to concepts, like say “Fred Smith” or “apple”.

|s1> + |s2> + |s3> + |s4> is how we represent addition of kets, and is called a superposition. Quite frequently superpositions correspond to lists, vectors, or sets.

age |Fred Smith> => |42>
is called a learn rule. In this case we associate the age “42” with “Fred Smith”.

Now on to the inhibition example:

-- define some integers:
some |integers> => |1> .. |15>

-- define some primes:
some |primes> => |2> + |3> + |5> + |7> + |11> + |13> + |17> + |19>

-- inhibit primes from the list of integers:
some |integers without primes> => drop (some |integers> - some |primes>)

-- see what we know:
sprint["some integers: "] some |integers>
sprint["some primes: "] some |primes>
sprint["integers without primes: "] some |integers without primes>

where drop is an operator that removes from a superposition all kets that have a coeff less than or equal to zero. I.e., it implements the idea of inhibition for us.

Here is the output:

some integers: |1> + |2> + |3> + |4> + |5> + |6> + |7> + |8> + |9> + |10> + |11> + |12> + |13> + |14> + |15>
some primes: |2> + |3> + |5> + |7> + |11> + |13> + |17> + |19>
integers without primes: |1> + |4> + |6> + |8> + |9> + |10> + |12> + |14> + |15>

SDB website
SDB github
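
For readers unfamiliar with the SDB syntax, here is a rough Python paraphrase of the same idea (dicts of coefficients standing in for superpositions; this is not SDB code): subtraction drives the unwanted kets to zero or below, and drop discards them.

integers = {str(n): 1 for n in range(1, 16)}
primes   = {str(n): 1 for n in [2, 3, 5, 7, 11, 13, 17, 19]}

# Subtract the prime coefficients from the integer superposition.
diff = {k: integers.get(k, 0) - primes.get(k, 0) for k in set(integers) | set(primes)}

# drop: keep only kets whose coefficient is strictly positive.
integers_without_primes = {k: c for k, c in diff.items() if c > 0}

print(sorted(integers_without_primes, key=int))
# ['1', '4', '6', '8', '9', '10', '12', '14', '15']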

3 Likes

Following on from my post yesterday, here is another example of negation in the semantic db.

We start by defining an example/toy superposition, and then we define a linear operator that inverts the sign of the coefficients of all kets it is applied to. So we have:

-- define an example superposition:
the |sp> => split[", "] |alpha, beta, gamma>

-- define a linear operator that inverts the sign of ket coefficients:
not |*> #=> - |_self>

-- a demonstration of this not operator:
sprint["the superposition: "] the |sp>
sprint["not the superposition: "] not the |sp>
sprint["not not the superposition: "] not not the |sp>

Then we apply this not operator to some integers, and again use the drop operator:

-- now apply it to some integers:
some |integers> => |1> .. |15>
some |Fibonacci> => split[" "] |0 1 1 2 3 5 8 13 21 34>
some |integers not some Fibonacci> => drop (some |integers> + not some |Fibonacci>)

-- now print out the results:
print | >
sprint["some integers: "] some |integers>
sprint["some Fibonacci: "] some |Fibonacci>
sprint["some integers not some Fibonacci: "] some |integers not some Fibonacci>

Here is the output of this code:

the superposition: |alpha> + |beta> + |gamma>
not the superposition: - |alpha> - |beta> - |gamma>
not not the superposition: |alpha> + |beta> + |gamma>
 
some integers: |1> + |2> + |3> + |4> + |5> + |6> + |7> + |8> + |9> + |10> + |11> + |12> + |13> + |14> + |15>
some Fibonacci: |0> + 2|1> + |2> + |3> + |5> + |8> + |13> + |21> + |34>
some integers not some Fibonacci: |4> + |6> + |7> + |9> + |10> + |11> + |12> + |14> + |15>

SDB website
SDB github

2 Likes

cezar_t: Thanks, that’s an interesting trick to ponder. I need to figure out the proper learning rule for it first.

garry.morrison: Interesting project. By any chance, does it have a built-in clustering algorithm to use once it gets a bunch of input data?

There’s also an accounting for ambiguity: as long as the string “boa” could be taken as a whole word or continued as “boar”, “board” or “boat”, our… “network” recognizes the ambiguity of “boa” and delays making a choice until the context suggests one single choice.

To express that in HTM jargon: as long as too many predictions are simultaneously active, “winner takes all” is postponed in favor of “nobody takes anything”.

2 Likes

Thanks for the interest in the project. No, sorry, it does not currently have a built-in clustering algo. It does, though, have a built-in similarity measure that works with arbitrary kets, superpositions and sequences, which might be useful in implementing one.

For interest’s sake, here is (roughly) our similarity measure:

similarity-function {our|object 1>, our|object 2>} #=> measure-currency intersection( normalize our |object 1>, normalize our |object 2>)

sprint["similarity of A,B,C vs B: "] similarity-function( |A> + |B> + |C>, |B>)
sprint["similarity of A,B,C vs B,C: "] similarity-function( |A> + |B> + |C>, |B> + |C>)

where:

  • measure-currency sums all the coefficients in the given superposition
  • intersection is a slight generalization of set intersection: it takes the ket-wise minimum of the coefficients of each ket (if a ket is not in a superposition, its coeff is assumed to be 0)
  • union, by contrast, takes the ket-wise maximum
  • normalize reweights the given superposition so that it has “currency” == 1

Here is the output of the above code snippet:

similarity of A,B,C vs B: |number: 0.33333>
similarity of A,B,C vs B,C: |number: 0.66667>

Note that our similarity measure always takes values in the range [0, 1]: 0 for completely distinct, 1 for identical, and values in between otherwise.
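
In plain Python, a rough reading of that measure (coefficient dicts standing in for superpositions; a paraphrase, not the SDB implementation) looks like this:

def normalize(sp):
    # Rescale a superposition (dict of ket -> coeff) so its coefficients sum to 1.
    total = sum(sp.values())
    return {k: v / total for k, v in sp.items()}

def intersection(a, b):
    # Ket-wise minimum of coefficients; a missing ket counts as coeff 0.
    return {k: min(a.get(k, 0.0), b.get(k, 0.0)) for k in set(a) | set(b)}

def similarity(a, b):
    # measure-currency (sum of coefficients) of the intersection of the normalized inputs.
    return sum(intersection(normalize(a), normalize(b)).values())

print(similarity({"A": 1, "B": 1, "C": 1}, {"B": 1}))          # ~0.33333
print(similarity({"A": 1, "B": 1, "C": 1}, {"B": 1, "C": 1}))  # ~0.66667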

BTW, the available releases (sorry, Windows exe only) are here. It is still alpha, so it has some rough edges, but it is a big improvement over the previous command-line version of the SDB. I do hope to resume work on it sometime soon.

If anyone wants help using it my email is: garry -at- semantic-db.org

2 Likes

Just a note, superpositions cleanly handle ambiguity because their coeffs can take values other than {0,1}. A quick example:
We define some patterns (in this case sequences of letters) and their matching words, then feed in various strings to find their similarity to the defined patterns:

pattern |node: 1> => |b> . |o> . |a>
matching-word |node: 1> => |boa>

pattern |node: 2> => |b> . |o> . |a> . |r>
matching-word |node: 2> => |boar>

pattern |node: 3> => |b> . |o> . |a> . |r> . |d>
matching-word |node: 3> => |board>

pattern |node: 4> => |b> . |o> . |a> . |t>
matching-word |node: 4> => |boat>

sprint["input b: "] matching-word similar-input[pattern] ssplit |b>
sprint["input bo: "] matching-word similar-input[pattern] ssplit |bo>
sprint["input boa: "] matching-word similar-input[pattern] ssplit |boa>
sprint["input boar: "] matching-word similar-input[pattern] ssplit |boar>
sprint["input board: "] matching-word similar-input[pattern] ssplit |board>
sprint["input boat: "] matching-word similar-input[pattern] ssplit |boat>

We get this output:

input b: 0.333333|boa> + 0.250000|boar> + 0.250000|boat> + 0.200000|board>
input bo: 0.666667|boa> + 0.500000|boar> + 0.500000|boat> + 0.400000|board>
input boa: |boa> + 0.750000|boar> + 0.750000|boat> + 0.600000|board>
input boar: |boar> + 0.800000|board> + 0.750000|boa> + 0.750000|boat>
input board: |board> + 0.800000|boar> + 0.600000|boa> + 0.600000|boat>
input boat: |boat> + 0.750000|boa> + 0.750000|boar> + 0.600000|board>
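
Judging from those numbers, similar-input appears to score a letter sequence against each stored pattern as the number of matching positions divided by the length of the longer sequence. Here is a rough Python check of that guess (my own code, not part of the SDB):

patterns = ["boa", "boar", "board", "boat"]

def seq_similarity(a, b):
    # Positions where the two sequences agree, divided by the longer length (my guess).
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))

for query in ["b", "bo", "boa", "boar", "board", "boat"]:
    ranked = sorted(((seq_similarity(query, p), p) for p in patterns), reverse=True)
    print(query, ranked)
# e.g. "boar" -> [(1.0, 'boar'), (0.8, 'board'), (0.75, 'boat'), (0.75, 'boa')]
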
3 Likes

Yeah, but this solution still leaves the ambiguity unsolved, at least apparently.
What I meant was that there’s a threshold for the “probability winner” below which we delay the WTA resolution until one of the choices detaches from the others beyond doubt.

E.g., NNs “solve” ambiguity with softmax, which is lame: it doesn’t signal contradiction, it just transforms an output-layer result into a “probability” distribution.

I use quotes here since the only thing that resembles a probability is that the predicted chances sum to 1.00; it doesn’t guarantee that the values match the actual (measured) probability distribution.
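
A toy sketch of that deferral idea in Python (the margin value is arbitrary, and the scores are borrowed from the example above): only declare a winner when the best candidate detaches from the runner-up by some margin; otherwise report the result as still ambiguous.

def deferred_wta(scores, margin=0.2):
    # Winner-takes-all that refuses to choose while the top two candidates are too close.
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    best, runner_up = ranked[0], ranked[1]
    if best[1] - runner_up[1] < margin:
        return None  # "nobody takes anything" yet; wait for more context
    return best[0]

print(deferred_wta({"boa": 1.0, "boar": 0.75, "boat": 0.75}))   # 'boa'
print(deferred_wta({"boar": 0.8, "board": 0.75, "boat": 0.6}))  # None: still ambiguous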

1 Like