Networks of Integrate-and-Fire Neurons using Rank Order Coding: How to Implement Hebbian Learning
Perrinet L. (1), Delorme A. (2), Samuelides M. (1) and Thorpe S.J. (2)
(1) ONERA-DTIM, 2 Av. E. Belin, BP 4025, F-31055 TOULOUSE CEDEX 4
(2) CERCO/CNRS, 133 Route de Narbonne, F-31062 TOULOUSE CEDEX
Neuropsychological experiments have recently shown that the classical rate code, introduced in 1926 by Adrian and Zotterman, is incompatible with the speed of processing observed in the visual system. Indeed, visual processing can take as little as 150 ms for a complex task involving at least 10 processing layers (Thorpe et al, 1996, Nature, 381, 520-522), leaving less than 15 ms for each neuron to process and transmit the information. A solution is to use the asynchronous characteristics of the neural code, as in the analog-to-delay conversion characteristic of retinal ganglion cells, by using integrate-and-fire like models of neurons: generally, the stronger the input intensity, the sooner the neuron fires. We propose that each afferent neuron emits only one spike and that integration depends not on time but on the number of spikes received by the efferent neuron. Under this assumption, the information carried by the neural code is the order of spike arrivals rather than the exact latency (Gautrais and Thorpe, 1998, Biosystems, 48, 57-65). This model was implemented in an artificial neural network called SpikeNET (Delorme et al, 1999, Neurocomputing, 26-27, 989-996), which proved to be as effective as classical models but with faster processing.
Mathematically, the nth spike is dynamically integrated in the soma into the potential V by adding the synaptic weight w_{ordo(n)} of the dendrite that has just spiked, of index ordo(n), modulated by a decreasing function Mod(.) of the rank order of the spike:
V(0) = 0, Mod(0) = 1, and for n ≥ 1, V(n) = V(n-1) + w_{ordo(n)} · Mod(n)
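The recursion above can be sketched in a few lines of Python. This is a minimal illustration, not the SpikeNET implementation: it assumes that the recursion applies from the first spike (n = 1), that the strongest input fires first, and that Mod is geometric (the function `geometric_mod` and its ratio `q` are hypothetical choices that merely satisfy Mod(0) = 1 and are decreasing).

```python
def geometric_mod(n, q=0.5):
    # Hypothetical modulation function: decreasing in n, with Mod(0) = 1.
    return q ** n


def rank_order_potential(intensities, weights, mod=geometric_mod):
    """Return the list [V(1), ..., V(N)] of potentials after each spike.

    intensities: analog input values; the strongest input spikes first.
    weights:     synaptic weight of each afferent, indexed like intensities.
    mod:         decreasing modulation function of the spike rank n.
    """
    # ordo(n): index of the afferent that emits the n-th spike.
    ordo = sorted(range(len(intensities)), key=lambda i: -intensities[i])
    v = 0.0  # V(0) = 0
    trace = []
    for n, i in enumerate(ordo, start=1):
        v += weights[i] * mod(n)  # V(n) = V(n-1) + w_ordo(n) * Mod(n)
        trace.append(v)
    return trace


# Three afferents; afferent 1 is the most strongly driven and fires first.
trace = rank_order_potential([0.2, 0.9, 0.5], [1.0, 4.0, 2.0])
print(trace)
```

Only the firing order of the inputs enters the computation; their exact values are used here solely to derive that order.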
The neuron emits a spike when its potential reaches the threshold s. The network then propagates this spike to the next layer and, where thresholds are reached in turn, to the subsequent layers of the network. Several theoretical results follow. First, we showed that rank order coding is invariant under any monotonically increasing transformation of the pixel intensities (the order is conserved), which is analogous to the relative contrast invariance of natural vision. Regarding propagation, the spike wave is emitted in a feed-forward fashion to the connected neurons of the efferent layer, so we consider emitting fields in contrast to the classical view of receptive fields. In this view, we proved under some assumptions that receptive and emitting fields have homologous geometry (centrally symmetric for translation invariance, for instance). To analyze integration in the soma, we used linear rank statistics, linking the potential value to the Spearman correlation coefficient (Spearman, 1904): the neurons dynamically measure the correlation between the input and the weight pattern. We then proved that, under certain conditions, the distribution of potentials is Gaussian with known moments (permutational central limit theorem). This enables us to set the threshold s for a precise selectivity. Another important aspect of this coding is the equalization of the signal coded by rank order, which corresponds to an information-theoretic optimization of information transmission. On the computational side, this type of coding is highly parallel both within a layer and between layers (the spikes are propagated until all neurons are below threshold, as opposed to the classical sequential propagation from the first to the last layer). Lastly, it is dynamically interruptible (i.e. once a neuron has spiked, the calculation can be stopped for this neuron) and generates a sparse representation of the signal.
The process is therefore well suited to visual processing.
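Two of the properties above lend themselves to a short numerical illustration. The sketch below first checks that the firing order is unchanged by increasing transformations of the pixel values, then computes the mean and variance of the final potential when the firing order is a uniformly random permutation, using the standard moment formulas for a linear permutation statistic. The helper `gaussian_threshold`, which places the threshold `z` standard deviations above chance level, is a hypothetical rule given for illustration only; the abstract does not specify how s is derived from the moments.

```python
import math


def firing_order(intensities):
    """Indices of the inputs sorted from strongest (first spike) to weakest."""
    return sorted(range(len(intensities)), key=lambda i: -intensities[i])


def permutation_moments(weights, mods):
    """Mean and variance of V = sum_n weights[ordo(n)] * mods[n-1]
    when ordo is a uniformly random permutation of the afferents."""
    n = len(weights)
    w_bar = sum(weights) / n
    m_bar = sum(mods) / n
    mean = n * w_bar * m_bar
    var = (sum((w - w_bar) ** 2 for w in weights)
           * sum((m - m_bar) ** 2 for m in mods)) / (n - 1)
    return mean, var


def gaussian_threshold(weights, mods, z=3.0):
    # Hypothetical rule: z standard deviations above the chance-level mean,
    # relying on the Gaussian approximation of the potential distribution.
    mean, var = permutation_moments(weights, mods)
    return mean + z * math.sqrt(var)


pixels = [0.1, 0.7, 0.4, 0.9]
brighter = [3.0 * p + 2.0 for p in pixels]    # affine contrast change
compressed = [math.sqrt(p) for p in pixels]   # nonlinear but increasing
same_order = (firing_order(pixels) == firing_order(brighter)
              == firing_order(compressed))
print(same_order, firing_order(pixels))
```

A larger `z` makes the neuron respond only to firing orders that correlate strongly with its weight pattern, which is one way to read "selectivity" here.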
One problem was to implement learning in this asynchronous artificial neural network. For its simplicity and biological plausibility, we chose to implement Hebbian-like learning: each synaptic weight is moved towards the value of the modulation function at the rank order of its spike. For this learning model, we proved that, under certain assumptions, the learning algorithm converges in the quadratic function space. We also used this type of rule to update the modulation function, and proved that for natural images Mod(.) converges towards the cumulative Gaussian distribution. For the first layers of the network, the learning process is unsupervised and follows a "winner takes all" scheme: only the first spiking neuron of the layer is updated. To this end, we used lateral inhibition and an adaptive rule for the threshold, so that each neuron becomes more selective as it learns (that is, with its "maturity"), allowing the different neurons to compete during this process. We implemented these algorithms in MATLAB for modeling purposes and then with the SpikeNET technology, which allows computation on a larger learning database. This showed the emergence of V1-like direction-selective receptive fields in the first layer of our network for those images.
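The learning scheme described above can be sketched as follows. This is one possible reading, not the authors' MATLAB or SpikeNET code: it assumes the update w <- w + rate * (Mod(rank) - w), a geometric Mod, and a fixed threshold shared by all neurons. The adaptive threshold is omitted, and lateral inhibition is reduced to picking the single neuron that reaches threshold with the fewest spikes. All function and parameter names are hypothetical.

```python
def mod(n, q=0.5):
    # Hypothetical decreasing modulation function with Mod(0) = 1.
    return q ** n


def spikes_to_threshold(weights, ordo, threshold):
    """Number of input spikes needed to reach threshold, or None if never."""
    v = 0.0
    for n, i in enumerate(ordo, start=1):
        v += weights[i] * mod(n)
        if v >= threshold:
            return n
    return None


def train_step(layer, intensities, threshold, rate=0.1):
    """One winner-takes-all learning step; returns the winner index or None.

    layer: list of weight vectors, one per neuron (modified in place).
    """
    ordo = sorted(range(len(intensities)), key=lambda i: -intensities[i])
    rank = [0] * len(ordo)
    for n, i in enumerate(ordo, start=1):
        rank[i] = n  # rank of the spike emitted by afferent i
    latencies = [spikes_to_threshold(w, ordo, threshold) for w in layer]
    fired = [(t, k) for k, t in enumerate(latencies) if t is not None]
    if not fired:
        return None
    _, winner = min(fired)  # first neuron to spike; ties go to the lowest index
    # Hebbian-like update: move each weight toward Mod(rank of its afferent).
    layer[winner] = [w + rate * (mod(rank[i]) - w)
                     for i, w in enumerate(layer[winner])]
    return winner


layer = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
winner = train_step(layer, [0.9, 0.5, 0.1], threshold=0.4, rate=0.5)
print(winner, layer)
```

Repeating `train_step` on the same input drives the winner's weights toward the profile Mod(rank), so the neuron becomes tuned to that particular firing order.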
In fact, biological learning mechanisms are similar to the one used in the present model and have been shown to depend on very small differences in latency between the discharge times of the input and output cells (Markram et al, 1997, Science, 275, 213-215). Based on this observation, we implemented an adapted rule that takes into account the time of emission of the spike. In particular, we studied the convergence of the algorithm with synthesized images (gratings), textures (from the Brodatz database) and natural images, and obtained results similar to those of the first Hebbian rule. Analysis of the results showed, as expected, that because learning uses only the spikes actually emitted, the learned patterns have a sparser representation and a more selective response. Using information theory, we then compared the performance of the algorithms, highlighting their similar information transmission properties. Indeed, these receptive fields are optimal for generating a sparse and non-redundant representation of the image, allowing compression of the spatial correlations at different levels. For instance, at the first level, neighboring pixels are more likely to have the same value than distant pixels; at the second level, neighboring edges are more likely to have the same orientation. Classically, those filters are decorrelating kernels that maximize information transmission (Atick, 1992). In our model of asynchronous propagation of spikes with Hebbian learning, this information-theoretic optimization still holds and is combined with the advantages of fast processing with the rank order scheme. Lastly, the latter scheme is biologically more plausible and computationally more efficient.