It’s been a while since I’ve provided an update about what I’m working on.
I’ve been making some major improvements to the internal architecture of this program.
It now does more, runs faster, and uses far fewer lines of code.
However, I still have not finished putting it all back together again.
Here are some details about the overhaul.
A problem I was having is that writing programs in a structure-of-arrays (SoA) style is tedious and error-prone. Programmers normally use an array-of-structures (AoS) style (aka object-oriented programming) because it's a lot nicer to work with. But for many applications (this one included) SoA is technically superior to AoS. So I created a piece of software to deal with this problem, which I named the "database".
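To illustrate the trade-off, here is a minimal sketch of the two styles (the `Neuron` class and its fields are hypothetical examples, not the program's real data):

```python
import numpy as np

# Array-of-Structures (AoS): each neuron is a separate Python object.
class Neuron:
    def __init__(self):
        self.voltage = -70.0
        self.capacitance = 1.0

neurons_aos = [Neuron() for _ in range(1000)]

# Structure-of-Arrays (SoA): one contiguous array per attribute.
neurons_soa = {
    "voltage":     np.full(1000, -70.0),
    "capacitance": np.full(1000, 1.0),
}

# Updating every voltage: the AoS version is a Python-level loop over
# scattered objects; the SoA version is one vectorized operation over
# contiguous memory.
for n in neurons_aos:
    n.voltage += 1.0
neurons_soa["voltage"] += 1.0
```

The SoA version is much faster at scale, but writing `neurons_soa["voltage"][i]` everywhere by hand is exactly the tedium the database is meant to hide.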
The database stores data in an optimized format (SoA), and provides an object oriented user interface to the data. It gets the best of both worlds: performant storage and easy access.
Data is stored in large contiguous vectors of homogeneous type, intended for fast vectorized computation. The database can also move these vectors to & from a graphics card (using CUDA).
The user is presented with proxy objects which behave as though they were regularly defined python objects, even though their internal data is not stored in the usual way. Python allows for such customization.
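One way such a proxy can be implemented in Python is with `__getattr__` and `__setattr__` hooks that redirect attribute access into the backing arrays. This is a sketch of the general technique, not the database's actual code:

```python
import numpy as np

class NeuronProxy:
    """Behaves like a plain object, but its data lives in shared arrays."""
    __slots__ = ("_arrays", "_idx")

    def __init__(self, arrays, idx):
        # Bypass our own __setattr__ to initialize the slots.
        object.__setattr__(self, "_arrays", arrays)
        object.__setattr__(self, "_idx", idx)

    def __getattr__(self, name):
        # Called only for attributes not found normally:
        # look them up in the backing array instead.
        return self._arrays[name][self._idx]

    def __setattr__(self, name, value):
        # Writes go into the shared array, not an instance __dict__.
        self._arrays[name][self._idx] = value

# Backing storage: one contiguous array per attribute.
arrays = {"voltage": np.full(3, -70.0)}

n = NeuronProxy(arrays, idx=1)
n.voltage = -55.0      # writes into arrays["voltage"][1]
print(n.voltage)       # reads back from the shared array
```

The proxy itself stores only a reference to the storage and an index, so it stays tiny no matter how many attributes the class has.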
What's more, this has allowed me to consolidate all of the data-related logic into one centralized place. New features and improvements can be added to the database and easily applied to all of the contained data. In no particular order, here are some of those features:
- All pointers are represented as indexes into arrays (as opposed to raw memory addresses), and can be stored in 32 bits instead of 64.
- Data can be sorted. Surprisingly, this is a challenge!
- Sorting things necessarily involves moving them to new locations. Any pointers to the old location need to be updated to point to the things’ new location.
- Some data arrays need to be sorted based on a pointer’s value, and if the target of the pointer is also being sorted then there is a dependency in the order that you sort the arrays. For example: you might want to sort all of the neurons in a simulation, and then sort all of the synapses according to the index of their postsynaptic neuron. In this example, you must sort the neurons before the synapses.
- All data can be (optionally) tagged with meta-data. Currently I have fields for:
  - A documentation string
  - The physical units
  - The range of valid values
- Error checking for NaN, NULL, and values which are outside of their valid range.
- Tools for recording and measuring data.
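The sorting bullets above can be sketched with NumPy. The data here (neuron sort keys, synapses pointing at postsynaptic neurons by index) is hypothetical, but the remapping steps are the general technique:

```python
import numpy as np

# Neurons have some sort key; synapses point at neurons by 32-bit index.
neuron_key   = np.array([3.0, 1.0, 2.0])
synapse_post = np.array([0, 2, 1, 0], dtype=np.int32)  # postsynaptic neuron

# 1) Sort the neurons. argsort gives `order[new_slot] = old_index`.
order = np.argsort(neuron_key)
neuron_key = neuron_key[order]

# 2) Build the inverse permutation: `remap[old_index] = new_index`.
remap = np.empty_like(order)
remap[order] = np.arange(len(order))

# 3) Every pointer to a moved neuron must be updated to its new location.
synapse_post = remap[synapse_post].astype(np.int32)

# 4) Only now can the synapses be sorted by their (updated) postsynaptic
#    index -- hence the dependency: neurons must be sorted first.
synapse_post = np.sort(synapse_post)
```

Step 3 is the crux: because pointers are array indexes rather than raw addresses, fixing them up after a sort is a single gather through the `remap` table.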
Example of using the database:
First, here is what we are going to make, written as a plain object-oriented Python class:

class Neuron:
    def __init__(self):
        self.voltage = -70 # millivolts

my_neuron = Neuron()
print(my_neuron.voltage) # prints: "-70"
And now let’s rewrite it using the database.
from neuwon.database import Database
db = Database()
neuron_data = db.add_class("Neuron")
voltage_data = neuron_data.add_attribute("voltage",
        initial_value = -70,
        units = "millivolts")
Neuron = neuron_data.get_instance_type()
my_neuron = Neuron()
print(my_neuron.voltage) # prints: "-70.0"
Both implementations of the Neuron class behave identically. However, behind the scenes, my_neuron does not actually contain the voltage data. Instead, my_neuron contains a pointer to the database, and the index of where my_neuron is located inside of the database's arrays.
The database provides an API for accessing the raw data, although this should be reserved for advanced users/programmers:
my_neuron.get_unstable_index() # The index of this neuron.
neuron_data.get_data("voltage") # The voltages for every neuron.
One of my goals in making these tools is for them to be useful beyond my current project, so that they can serve other people's simulations too.