Event clustering (spike sorting etc)

It is frequently desirable to partition events within a single channel into clusters based on similarities in event parameters, and then to write each cluster to a separate channel. This is called "cluster cutting". A typical example is when extracellular spikes are detected by template recognition. In this case, if the acceptance criterion is set quite broadly, spikes originating from several different axons (and therefore with somewhat different waveforms) will be incorporated into a single event channel. These events can then be partitioned into clusters which have similar characteristics. Users can cluster cut in DataView using either a 2D scatter graph (see previous page) or a rotatable 3D scatter graph.

Clustering can be either manual or automatic. In manual clustering, the user selects clusters by drawing around them in the 2D or 3D graphs, and then applies a colour to each one that distinguishes it from the other clusters. Eventually, when all the events have been satisfactorily clustered, the user instructs the program to write the different clusters to different event channels.
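
The final step, writing each cluster to its own event channel, amounts to grouping events by the colour they were given. The following is a minimal sketch of that idea only; the function name, the use of plain event times, and the colour strings are all hypothetical and do not reflect DataView's internal data structures.

```python
from collections import defaultdict


def split_into_channels(event_times, colours):
    """Group event times by cluster colour, giving one list per cluster."""
    if len(event_times) != len(colours):
        raise ValueError("each event needs exactly one colour")
    channels = defaultdict(list)
    for time, colour in zip(event_times, colours):
        channels[colour].append(time)
    return dict(channels)


# Hypothetical events (onset times in seconds) and the colours assigned to them.
event_times = [0.012, 0.047, 0.051, 0.090, 0.133]
colours = ["green", "blue", "green", "red", "blue"]

channels = split_into_channels(event_times, colours)
for colour, times in channels.items():
    print(colour, times)
```

Each key of the result stands in for one destination event channel, and the order of events within a channel follows their order in the source channel.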

Automatic clustering uses an expectation-maximization (EM) algorithm developed by Charles A. Bouman. You can either pre-select the number of clusters, in which case the system partitions events into that number of clusters, or you can allow the algorithm itself to choose how many clusters are present in the data.
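
To give a feel for the two modes, here is a small NumPy sketch of expectation maximization for a Gaussian mixture. It is an illustration under stated assumptions, not the code used by DataView or by Bouman: the diagonal covariances, the farthest-point initialisation, the variance floor and the use of the Bayesian information criterion (BIC) to pick the number of clusters are all choices made for this example. `fit_gmm` corresponds to pre-selecting the number of clusters, and `choose_k` to letting the data decide.

```python
import numpy as np


def farthest_point_init(X, k):
    """Deterministic start: pick k rows of X that are mutually far apart."""
    chosen = [0]
    dist = np.linalg.norm(X - X[0], axis=1)
    for _ in range(k - 1):
        nxt = int(np.argmax(dist))
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(X - X[nxt], axis=1))
    return X[chosen].copy()


def log_joint(X, weights, means, variances):
    """log(weight_j * N(x_i | mean_j, diag(variance_j))) for every i, j."""
    sq = (X[:, None, :] - means) ** 2 / variances
    return (np.log(weights)
            - 0.5 * np.sum(np.log(2 * np.pi * variances), axis=1)
            - 0.5 * sq.sum(axis=2))


def fit_gmm(X, k, n_iter=100, var_floor=1e-3):
    """Fit k diagonal Gaussians by EM; return hard labels and the BIC."""
    n, d = X.shape
    means = farthest_point_init(X, k)
    variances = np.tile(X.var(axis=0) + var_floor, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        lj = log_joint(X, weights, means, variances)
        resp = np.exp(lj - np.logaddexp.reduce(lj, axis=1, keepdims=True))
        # M-step: re-estimate weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        variances = np.maximum((resp.T @ X**2) / nk[:, None] - means**2,
                               var_floor)
    lj = log_joint(X, weights, means, variances)
    log_lik = float(np.logaddexp.reduce(lj, axis=1).sum())
    n_params = (k - 1) + 2 * k * d  # weights + means + variances
    bic = -2.0 * log_lik + n_params * np.log(n)
    return lj.argmax(axis=1), bic


def choose_k(X, max_k=5):
    """Fit 1..max_k clusters and return the k with the lowest BIC."""
    bics = {k: fit_gmm(X, k)[1] for k in range(1, max_k + 1)}
    return min(bics, key=bics.get), bics


# Synthetic 2D "event parameters": three well-separated groups of 100 points.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [20.0, 0.0], [0.0, 20.0]])
X = np.vstack([c + rng.normal(size=(100, 2)) for c in centres])

labels3, _ = fit_gmm(X, 3)   # number of clusters chosen by the user
best_k, bics = choose_k(X)   # number of clusters chosen from the data
print("clusters selected by BIC:", best_k)
```

A model-selection penalty such as BIC is needed because adding clusters never lowers the likelihood, so likelihood alone would always favour the largest permitted number of clusters.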

The screenshot below shows the rotatable 3D scatter graph (hopefully rotating). The first 3 principal components of the waveforms of a set of extracellularly-recorded spikes are displayed. These have been partitioned into 4 well-defined clusters using KlustaWin, each of which represents the spike from a different axon. Two additional clusters have been found: a loose purple cluster, which may well be associated with the green cluster, and a red "noise" cluster, which seems to be randomly distributed in the parameter space. The events in the noise cluster probably represent distorted waveforms resulting from spike collisions. The user could examine the waveform associated with any of these outliers by clicking on it, which would centre the associated event in the main display. A user who did not like the exact cluster partitions could manually edit them by drawing around particular data items and changing their colour to that of another cluster, or by changing the colour of a whole set of items simultaneously (e.g. purple to green).
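
For readers unfamiliar with how waveforms become points in such a graph, the sketch below computes the first 3 principal component scores of a set of waveforms with NumPy. It is a generic illustration, not DataView's implementation: the function name, the synthetic sine-based "spike" templates and the noise level are all assumptions made for the example.

```python
import numpy as np


def first_principal_components(waveforms, n_components=3):
    """Return the leading principal component scores, one row per waveform."""
    centred = waveforms - waveforms.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T


# Synthetic data: 20 noisy copies each of two hypothetical spike shapes,
# sampled at 32 points per waveform.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 32)
template_a = np.sin(2 * np.pi * t)
template_b = 0.5 * np.sin(4 * np.pi * t)
waveforms = np.vstack([
    template_a + 0.05 * rng.normal(size=(20, 32)),
    template_b + 0.05 * rng.normal(size=(20, 32)),
])

scores = first_principal_components(waveforms)
print(scores.shape)  # one (PC1, PC2, PC3) point per waveform
```

Each row of `scores` is one point in the 3D scatter graph. Waveforms with similar shapes receive similar scores, which is why spikes from the same axon fall into the same cluster.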

Examination of the raw data underlying the events reveals the basis of the clustering.

The screenshot above shows a section of raw waveform with spikes identified as events. The event colours match the colours in the scatter graph of principal components (PCs) and the spike colours in the screenshot below. The red event (id 54) is clearly a result of spike collision. Its principal components do not fit into a well-defined category, and hence it has been assigned to the "noise" cluster.

The screenshot on the left shows a superimposed stack of the waveforms of the first 200 of the 514 events whose principal components are shown in the scatter graph. The 4 main categories of waveform are obvious. Some of the outliers that do not fit cleanly into any of these categories are also apparent (red and magenta traces).