EnDe neural network

This model is built around encoding and decoding: one network encodes the information, and a second network decodes the reduced information. The model consists of three kinds of layers: input, intermediate, and output. The input layer receives the information, the intermediate layer stores the reduced information, and the output layer extracts information from the intermediate layer.

  1. Input layer

The input layer receives the information as a set of values. Random neurons then connect randomly to units of the input layer, and each adds a fixed value (e.g., 10, 20, etc.) to the unit it connects to. This is done to differentiate specific input units and accelerate their accumulation within the model. The number of random neurons is determined by the number of neurons in the layer and the sparsity of information units in the input. Other factors also need to be considered, but for now, let’s keep it at this level. The first network contains the random neurons that add a fixed value; there is also a second network.

  2. The function

As models progress to process more complex systems, the properties of their functions may or may not change. As more input is introduced, it can lead to chaos, so an optimal function has to be identified that brings order to this induced chaos. The fundamental purpose of any function in a neural network model is to establish order amidst chaos.
In the first network, the function sums each input value with the value from any random-moving neuron connected to it, and only the neurons with the highest sums are allowed into the model. This is a lossy function: the information is reduced by discarding everything else, with the values of the random neurons helping to decide what is kept. Only these highest-valued neurons reach the intermediate layers.

In the second network, the function exclusively allows the lowest-valued neurons to reach the output layer. This is because random neurons in the second network offset the fixed value added by the random-moving neurons in the first network. These random-moving neurons encode and decode information by randomly connecting to appropriate neurons in the network layers.
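The two selection functions described above can be sketched in a few lines of Python. This is only one possible reading of the description: the names `first_network`, `second_network`, and `pick_random_units`, the `keep` parameter, and the offset of 0.10 are assumptions made for illustration, not a definitive implementation.

```python
import random

OFFSET = 0.10  # assumed fixed value contributed by a random neuron


def pick_random_units(n_units, n_random, rng):
    """Choose which units the random neurons connect to (hypothetical helper)."""
    return set(rng.sample(range(n_units), n_random))


def first_network(inputs, boosted, keep):
    """Add OFFSET to the boosted units, then pass only the `keep` highest sums."""
    summed = [v + OFFSET if i in boosted else v for i, v in enumerate(inputs)]
    ranked = sorted(range(len(summed)), key=lambda i: summed[i], reverse=True)
    return {i: summed[i] for i in ranked[:keep]}


def second_network(intermediate, boosted, keep):
    """Remove OFFSET from the boosted units, then pass only the `keep` lowest values."""
    restored = {i: v - OFFSET if i in boosted else v for i, v in intermediate.items()}
    ranked = sorted(restored, key=restored.get)
    return {i: restored[i] for i in ranked[:keep]}


inputs = [0.20, 0.65, 0.39, 0.30]
boosted = {1, 2}  # fixed here for a reproducible trace; pick_random_units would draw these
intermediate = first_network(inputs, boosted, keep=2)
outputs = second_network(intermediate, boosted, keep=2)
print({i: round(v, 2) for i, v in outputs.items()})
```

In this trace the units that received the offset are the ones that survive the first network, and removing the offset in the second network returns their original values. Whether that holds for arbitrary random connections is exactly what would need testing.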

It’s important to note that the range of functions for processing neurons should expand layer by layer, but not by too much, since a large range reduces the information heavily and can lose valuable data. In my opinion, a range of 4 or 5 is optimal, although this may change depending on other factors. When a function reaches its limit, it should continue to the next layer with the same value range to process other neurons. The function is a dependent variable, while the input is the independent variable.
To reduce complexity and enhance the model’s efficiency, only one random neuron is allowed to participate in the decoding function of the second network. The number of random-moving neurons in the first network is determined by other factors, such as the sparsity of unique units in the input information and the number of neurons in the layers.

  3. The output layer

Once the model processes the input for the first time, the output layer compares it with the input layers. If the values are the same, the random-moving neurons in the first and second networks are fixed in their appropriate positions with their connections to other neurons in the layers. This process is similar to fixing via backpropagation in a conventional neural network.

One of the underlying assumptions is that these models will create order from chaos, essentially extracting patterns from the input information. By only allowing unique units of input information, common information is removed, because this model focuses on distinct values in the input. Predictability occurs because patterns are extracted and absorbed by the model due to lower entropy in the input information. Since humans tend to predict less entropic information, this model can replicate properties of the brain.

After the model has been trained for some time, it becomes continuous with the input information. When the model becomes continuous, the intermediate step acts as a temporary holder of reduced information. The frequently reduced input information moves to the intermediate step and then quickly to the output neurons. In this way, not all attributes of the inputs are stored; only a few specific features essential for forming novel ideas, similar to how our brain operates.

So, when we move the model to more complex tasks, I believe that only permitting the highest- and lowest-valued neurons to pass through the model is insufficient. More functions should be added that encode and decode information differently. To make this model work well, creating order from chaos is key, and as the model’s complexity increases, different functions are required to handle the chaos.

Schematic diagram of first network -

Schematic diagram of second network -

Share your thoughts and feedback on this model in the comments.

Apologies for not giving full details. My model may seem like an autoencoder, but the working mechanism is different. I will state it as points:

  1. In a conventional model, when the input layer gets the values from the sample, they go to the next layer and are processed by the biases, weights, and activation functions of the nearby layers. My model doesn’t include biases, weights, or activation functions. Instead, it includes only the values of the random-moving neurons that randomly connect with adjacent neurons (you can see this in the schematic diagram, where the random neuron has the value 0.10).
  2. As the random-moving neurons make their connections randomly, at some point they will make the connections needed to get the correct output. By using backpropagation, we then fix the positions of these random-moving neurons to keep the correct output.
  3. The first network (see the schematic diagram) adds 0.10 to the input information, and the second network removes 0.10 from the reduced information stored in the intermediate layer. To filter or reduce information, we introduce two functions: in the first network, one that lets the neurons with the highest values pass their values to the nearby layer (because the random neuron adds 0.10 to the inputs), and in the second network, one that lets the neurons with the lowest values pass their values to the nearby layer (because the random neuron removes 0.10 from the inputs).
  4. When decoding happens, the information is extracted by these functions. The output layer receives these values and compares them with the input layer, since the two should match (much like backpropagation). You can see this in the schematic diagram, where 0.65 and 0.39 in the output layer match the same 0.65 and 0.39 in the input layer.
  5. So, by using values from random neurons that make connections randomly (to replicate plasticity in the brain) and functions that search for higher and lower values, it is possible to get a neural model with reduced complexity.
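One way to make the points above testable is a small Python sketch of the "connect randomly, then fix what works" loop, using the 0.65, 0.39, and 0.10 values from the schematic. Note the assumptions: the sketch uses plain accept-on-match random search rather than backpropagation (there are no differentiable parameters here to take gradients of), and the function names, `max_tries`, and `tol` are hypothetical.

```python
import random

OFFSET = 0.10  # value of the random neuron in the schematic


def forward(inputs, enc_unit, dec_unit):
    """First network adds OFFSET at enc_unit; second network removes it at dec_unit."""
    hidden = list(inputs)
    hidden[enc_unit] += OFFSET
    outputs = list(hidden)
    outputs[dec_unit] -= OFFSET
    return outputs


def search_connections(inputs, rng, max_tries=1000, tol=1e-9):
    """Re-draw both random connections until the output matches the input."""
    n = len(inputs)
    for attempt in range(1, max_tries + 1):
        enc_unit, dec_unit = rng.randrange(n), rng.randrange(n)
        outputs = forward(inputs, enc_unit, dec_unit)
        if all(abs(o - x) <= tol for o, x in zip(outputs, inputs)):
            return enc_unit, dec_unit, attempt  # connections are now "fixed"
    return None  # no matching wiring found within max_tries


result = search_connections([0.65, 0.39], random.Random(0))
print(result)
```

With two units the search succeeds quickly, because any wiring where both random neurons touch the same unit reconstructs the input. The number of possible wirings grows with layer size, so how this scales is an open question for the model.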

(Forget about my inadequate knowledge and poor presentation; now please tell me if my model is wrong)

Without testing it is difficult to tell whether it is wrong or not.
If you can’t figure out how to, or have no resources to implement and test it, then something in your approach might be wrong.


Well… yeah… currently thinking about how to test it… thanks for your feedback

What you say here vaguely reminds me of Dynamic Sparse Training. I can’t say exactly why; you can search for papers on it.


Yes… it is similar. I was wondering: what are the ways to create plasticity without using weights, biases, or activation functions? Can you think of any?

Not really. I think the more fundamental issue is the one of differentiability, which is required by back propagation. Otherwise,

  • weights can be thought of as multiple binary connections between two nodes (that’s why models can be quantized to low bits)
  • activations & bias can be thought of as tuning the threshold above which a neuron “fires”.
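The two bullets can be illustrated with a short Python sketch. It assumes integer weights and a step activation, and the function names are hypothetical; it only shows that counting active binary connections against a threshold gives the same decision as a weighted sum plus bias.

```python
def fires_binary(active, connection_counts, threshold):
    """Count active binary connections; the unit fires at or above the threshold."""
    total = sum(count for is_on, count in zip(active, connection_counts) if is_on)
    return total >= threshold


def fires_weighted(active, weights, bias):
    """Conventional form: weighted sum plus bias passed through a step activation."""
    return sum(w * x for w, x in zip(weights, active)) + bias >= 0


active = [1, 0, 1]
counts = [2, 5, 1]  # e.g. two parallel binary links act like an integer weight of 2
print(fires_binary(active, counts, threshold=3))
print(fires_weighted(active, counts, bias=-3))
```

Here the bias is just the negated threshold, so both forms agree on every input.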

Even back propagation is “bad” because it works too well for many architectures, diminishing research into algorithms that are constrained to learn without it - the ones we assume brains are using.


Back propagation & brains thingy:

Here is one of the very few cheers from a member of the system, a university prof:

“This blows my mind that you made this happen without gradients. Incredible work - I really hope to see the final form! The community would be blown away if you managed to release the codebase someday.”

Not kidding, not bragging. The gent works on exactly that, BP modification, and after the first excitement he went dark: he cannot just cannibalize his own work and well-being.

I’m pretty pessimistic generally. To stay in the system one has to follow dogmas, to be out of the system is…

The codebase did not help, btw :)


Maybe he realized it doesn’t matter that you did it without backprop, because backprop will still beat you at generalization? Face it, Hebbian is the oldest method in NNs.


Maybe he realized something, maybe he really blew his mind. Mystery forever. Who cares.


Idk, but this idea resembles the NEAT (NeuroEvolution of Augmenting Topologies) algorithm a bit.


that’s just a worse autoencoder…


Yeah dropped this idea. Thanks for the feedback

How do you think the model finds the important parts or elements in any information? What is important in a piece of information? E.g., an image should not be saved as an image, words should not be saved as mere words, etc.

The real reason I came up with this model is to find and process only the important parts of the information. That is, value differences in the case of pixels, or frequency differences in the case of sound waves. Information is only worth something if it has some difference or uniqueness in it. My model captures only the higher values in the input information while neglecting other, similar values. This is my motive: “to capture only the essential parts of the information”, and I tried to come up with a model for that. The model was bad, but I believe my motive is correct even though the model is not. I am looking for objective reasons that would prove my motive wrong.

I think the most important aspect of a system is for it to be able to “rotate” the data in a way that the most important components stick out in a linearly separable way.

The problem is that unless it is MNIST, the data is not static and you will find it rotated differently every time you look (2D translation and temporal evolution can also be thought of as rotation).

So how do you find out in which way the data is rotated and produce the inverse transformation matrix? That’s the big question.
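A toy Python sketch of the point: a fixed linear rule works on data in its original orientation, fails once the data is rotated, and works again after the inverse rotation is applied. The data, the 90° angle, and the function names are made up for illustration, and the sketch assumes the angle is already known, which is precisely the part the question above leaves open.

```python
import math


def rotate(point, theta):
    """Rotate a 2D point counter-clockwise by theta radians."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


def classify_by_x(point):
    """A fixed linear rule: the sign of the first coordinate decides the class."""
    return 1 if point[0] > 0 else 0


# Two classes separated along the x axis in the "canonical" orientation.
data = [((1.0, 0.5), 1), ((1.0, -0.5), 1), ((-1.0, 0.5), 0), ((-1.0, -0.5), 0)]
theta = math.pi / 2  # assumed known here; estimating it is the hard part

seen = [(rotate(p, theta), label) for p, label in data]
undone = [(rotate(p, -theta), label) for p, label in seen]  # inverse rotation

acc_seen = sum(classify_by_x(p) == label for p, label in seen) / len(seen)
acc_undone = sum(classify_by_x(p) == label for p, label in undone) / len(undone)
print(acc_seen, acc_undone)
```

For a pure rotation the inverse matrix is simply the rotation by the negated angle, so the only unknown is the angle itself.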

Why do you think it is important?

Well, I don’t see any other means to generalize to real world scenarios. Without managing to model the topological transformations in the input, it becomes a game of memorization, and a game that becomes increasingly hard to win due to the curse of dimensionality.


Isn’t it enough to just get the essential parts of the information and combine them with other essential information stored in the model? We get some false output plus true output. Our brain also produces false output when the input is transformed into a different dimension (an optical illusion, for example). It’s not about true output; it’s always about how the information is combined in the model.

I don’t think it’s about combining as much as it is about learning to look at it the right way.

You need to learn how to represent arbitrary new inputs meaningfully, and it’s hard to do that if the meaning changes completely when you shift the input by one pixel to the left.
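The one-pixel problem is easy to see in a small Python sketch. The binary pattern and the function names are made up for illustration; `best_shift_overlap` is only a crude brute-force stand-in for "looking at it the right way", not a proposal for how a model should learn it.

```python
def overlap(a, b):
    """Number of positions where both binary patterns are active."""
    return sum(x & y for x, y in zip(a, b))


def shift(pattern, k):
    """Circularly shift a pattern k positions to the right (negative k shifts left)."""
    k %= len(pattern)
    return pattern[-k:] + pattern[:-k]


def best_shift_overlap(a, b):
    """Overlap after trying every circular shift of b (a crude shift-tolerant match)."""
    return max(overlap(a, shift(b, k)) for k in range(len(b)))


pattern = [1, 0, 1, 0, 1, 0, 1, 0]
moved = shift(pattern, -1)  # the same pattern, one position to the left

print(overlap(pattern, pattern))
print(overlap(pattern, moved))
print(best_shift_overlap(pattern, moved))
```

Compared position by position, the shifted pattern shares nothing with the original, even though it is the same pattern; only after the shift is undone do the two match again.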
