Research in Random Matrix Theory [1, 2] and Computational Neuroscience [3, 4, 5] suggests the narrative that it is challenging to find ensembles of random matrices with desirable "brain-like" dynamical properties and biologically plausible connectivity structures. It has been found, rather, that careful post-hoc fine-tuning of such a randomly-generated matrix is required. This work intends to experimentally investigate and meet this challenge through the evolutionary optimisation of a novel model of block-structured random matrices with flexible expressivity. The model fits sufficiently outside the somewhat restrictive assumptions made by the recently developed theory, yet it retains biological plausibility through its a hierarchical structure. As such, this poses genuine angle to explore through experimental means to challenge this existing narrative. The outcome may find that there is more nuance than the existing work suggests, or it might support it.
The following "big" scientific question provides me with an interest in a project like this. I include it as personal context. I do not claim that this project would make any advance on it if successful, but I think it sets an appropriate background theme.
Thesis Sketch
The introduction of the thesis will commence with a comprehensive overview of the desirable properties of "brain-like" dynamical systems, contextualising abstract mathematical notions that are to follow, such as:
"Randomness" and "Entropy"
"Sparsity"
"Dynamical stability"
"Transient behaviour" and "Non-normality"
A short and high-level description of our current scientific understanding of the evolution of the human brain will be included to motivate the central use of Evolutionary Computation in the work. [6]
This will transition into a foundational overview of Genetic Programming (GP), a subfield of Evolutionary Computation characterised by algorithms which evolve genotypes that form a tree data structure
Relevant foundational mathematical concepts will be strategically recapped. This will start with important concepts in Linear Algebra covering matrix spectra (eigenvalues) and the Schur Decomposition [7] which will then be related to the behaviour of Linear Dynamical Systems (LDSs).
A high-level departure into some well-known results from Random Matrix Theory (RMT) will illustrate and lend credence to the claim that tension exists between the degree of randomness and the desired properties of random matrix ensembles.
Progressing towards the experimental method, measurable mathematical quantities will be defined, derived and argued to concretize the main research objective. This will advance into more specific mathematical territory and will result in the following definitions:
(a) Entropy of random matrices defined by the model
(b) Spectral Abscissa and Smoothed Spectral Abscissa [3, 8]
(d) Henrici's "Departure from normality" quantity:
(e) ...
The main component of the method will comprise the careful construction of the model, motivated-by key relevant literature in the fields of RMT, Computational Neuroscience, GP and Neuroevolution [9]. This will conclude with a demonstration of the non-deterministic mapping from the model (genotype) to a sample of generated matrices (phenotype).
As part of the method, the full GP algorithm intended to optimise the models will be laid out. This will include defining the genetic operators of mutation, crossover and selection. Mutation will require some minor customisation to account for the metadata stored at nodes in the tree of the model, and the arbitrarily large arity of the internal nodes, but will otherwise be routine, following A Field Guide to Genetic Programming[10]. Crossover will consist of subtree crossover, and will also be routine, but some tweaking may need to be instated to handle the arity here also (as the branching factor is not clean - note p. 15-16 of [10]). The selection strategy will contact a key motivation stemming from a gap in the literature, since the task can be described as multi-objective optimisation of quantities which are not smooth with respect to the underlying parameters (i.e. a carefully chosen subset of those listed above). This will make reference to a recent successful publication I was involved in where multi-objective evolutionary optimisation via a "Pareto Tournament" selection strategy lead to a strong advance on benchmark datasets [11].
A clear hypothesis for the experiments will be outlined, making clear the motivation of the proposed modelling of block-structured random matrices. The hypothesis will employ a Mean Value Theorem-type argument [12], reasoning that the model accounts for two ends of a spectrum ranging from maximally random matrices to completely contrived matrices:
At the former end of the spectrum (completely random), the theory tells us that these matrices do not have the desired properties [1, 2, 13].
At the latter end, we can use basic linear algebra to construct matrices with the desired properties [7].
We expect to find a trade-off somewhere in-between.
The nature of the tradeoff revealed through evolutionary optimisation could establish a new perspective on the organisation of complex modular networks, such as the brain. This returns to the theme of the "big question" posed earlier.
Experimental results of the GP algorithm trials will be gathered and analysed. The quantities calculated from the phenotypes of candidates from final populations will validate or invalidate the hypothesis. These quantities will also be compared to estimations relevant to existing works (do we improve on anything?).
The writing will conclude with proposed future work. Possible applications could include:
Applications where the Curse of Dimensionality prohibits the direct optimisation of matrix entry values. This is particularly the case for Evolutionary Optimisation strategies, which fall over quickly as the size of the matrix increases. This encoding could help if one can allow for randomness.
Detection of community structure in adjacency matrices.
Reservoir computing, where initialisation of the feedback matrix is a central issue [14, 15, 16]
Extensions to the model intended to incorporate elements of self-organisation (rewiring, Hebbian-like plasticity, SORN) [17] will be outlined.
The results will be placed in the context of the theory laid out earlier. Conclusions will be made and restated from the abstract.
Demonstration of Model
The following demonstrates how a random matrix with a rich random submatrix assembly can be hierarchically represented using a tree datastructure. This is the proposed mapping from genotype to phenotype, where we employ standard genetic programming (GP) methods to the evolution of the tree, and evaluate fitness based on the generated random matrices.
You can gain a sense for the mapping by hovering over different nodes in the tree and different blocks in the matrix.
The "Generate" button will generate a random matrix based on the tree (but the tree always stays the same!).
This will also plot below the spectrum of eigenvalues of the generated matrix in the complex plane (using numpy through pyodide). This may take a moment and will download a ~10mb runtime. See "Spectra and Pseudospectra by Trefethen & Embree" ch. 35 p. 334-335 for some illustrations of the spectra for matrices with uniformly distributed random values (real and complex). I have also included an example of this maximally random case further below
An interesting example genotype-to-phenotype mapping
Please interact with the demonstration below:
601520251281015654Hover over tree or matrix
Generate a matrix to see eigenvalues in the complex plane
Example of a random matrix with entries drawn from
6060Hover over tree or matrix
Generate a matrix to see eigenvalues in the complex plane
Completion Criteria: What does failure and success look like?
The clearest signal of success would be if the evolutionary algorithm returns an individual which represents a very random (high entropy) matrix ensemble which generates matrices with all of the properties we want: stability, balance, and high non-normal amplification (without special tuning)
If this does not happen, and the algorithm produces individuals representing highly contrived matrices then this would suggest that the fine tuning mechanism really is needed. Something like anti-Hebbian tuning of inhibition [3]. Future work could incorporate such elements into the model.
If the algorithm can't converge well, then this would suggest that the objective functions are so poorly behaved that we really can't do much with an evolutionary algorithm. This is not much of a finding. The feasibility of this work rests on this not being the case.
In any of these cases, analysis of final results and training runs could provide insight into "why" the eventual outcome occurs and if there is a theoretical explanation this could be determined
Critique
Below are some justifiable push-backs to this proposal that I have considered
Implementation roadmap
Defining the genetic representation and the mapping to the phenotype
This is the most critical part of this project. There are some choices which have to be made when it comes to defining the representation of the genotype. I will explain these in due course
Implementing the genetic operators (mutation, crossover)
These will be mostly standard implementations once the representation is locked down
The only thing which will be bespoke is an "add child" and "remove child" operator, which is necessary for ensuring that the "contrived" matrices are possible
Some decisions to be made regarding how to mutate the node data () - but this has all been done before
Implementing selection and the mathematical quantities used to measure fitness
Entropy as a fitness measure to optimise for needs to be pinned down (but I have a good idea of this in mind)
Spectral abscissa are tricky to compute but it has been done before many times and the runtimes from Trefethen's book [8] from 2005 look very promising for 20 years later.
Existing works provide good ways to measure non-normality. We can start with the simple "deviation from normality" - but I need to dig into this more because I am not aware of the justification for why people use other quantities than this...
Have implemented multi-objective selection operators before in very recent work [11] (Pareto tournament selection)
Initialisation of the population
I intend to base this on Koza's ramped half-and-half strategy - but some allowances may need to be made for the branching factor being arbitrarily large. I think a small limit will be fine because mutations will take care of the rest.
Choosing hyperparameters
We are not plagued by too many here. I can think of:
Population size
Mutation rate
Point mutation magnitudes for
Crossover rate
Number of samples to generate from individuals to estimate fitness
Notes on Relevant Literature
Guillaume Hennequin's PhD Dissertation "Stability and amplification in plastic cortical circuits": (download). This is a major inspiration (which draws heavily upon Trefethen & Embree itself) which studied similar kinds of matrix properties and developed a method of finely tuning the inhibitory (negative) connections towards stability. The work made an effort towards exploring biologically plausible explanations which I heavily admire.