Welcome to NumbaSOM
A fast Self-Organizing Map Python library implemented in Numba.
This is a fast, simple-to-use SOM library. It uses online training (one data point at a time) rather than batch training. Two topologies are implemented: a simple 2D lattice and a torus.
Installation
With pip
With conda
Quick Start
To import the library:
A Self-Organizing Map is often used to show the underlying structure in data. To demonstrate the library, we'll train it on 200 random 3-dimensional vectors (which we can render as colors).
Create Sample Data
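One way to generate such data is sketched below. It assumes uniform sampling with NumPy; the seeded generator and the variable name data are illustrative choices, not requirements of the library.

```python
import numpy as np

# Assumption: a seeded generator, used only to make this sketch reproducible.
rng = np.random.default_rng(0)

# 200 random vectors with 3 components each, drawn uniformly from [0, 1),
# so every row can be read as an RGB color.
data = rng.random((200, 3))
print(data.shape)
```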
Initialize the SOM
We initialize a map with 50 rows and 100 columns. The default topology is a 2D lattice. We can also train it on a torus by setting is_torus=True.
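To see what the torus topology changes, here is a small NumPy sketch of the distance between two cells on the grid. The helper grid_distance is hypothetical and is not part of NumbaSOM; it only illustrates the wrap-around idea, assuming Euclidean distance between (row, column) positions.

```python
import numpy as np

def grid_distance(a, b, shape, is_torus=False):
    """Euclidean distance between two (row, col) cells on a grid of the given shape."""
    diff = np.abs(np.asarray(a) - np.asarray(b)).astype(float)
    if is_torus:
        # On a torus every axis wraps around, so the shorter way may cross the edge.
        diff = np.minimum(diff, np.asarray(shape) - diff)
    return float(np.sqrt((diff ** 2).sum()))

# Opposite corners of a 50 x 100 map are far apart on a flat lattice ...
flat = grid_distance((0, 0), (49, 99), (50, 100))
# ... but direct diagonal neighbors once the edges are glued together.
wrapped = grid_distance((0, 0), (49, 99), (50, 100), is_torus=True)
print(flat, wrapped)
```

On a torus no cell sits on a border, so every cell has the same number of neighbors.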
Train the SOM
We adapt the lattice by iterating 15,000 times through our data points. If we set normalize=True, data will be normalized before training.
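For intuition, a generic online SOM training loop can be sketched in plain NumPy as follows. This is not the library's Numba implementation: the function name train_online, the linear decay of the learning rate and radius, and the Gaussian neighborhood are all assumptions made for illustration.

```python
import numpy as np

def train_online(data, rows, cols, num_iterations, seed=0):
    """Generic online SOM training: one randomly chosen data point per iteration."""
    rng = np.random.default_rng(seed)
    lattice = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every cell, shape (rows, cols, 2).
    coords = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    lr0, sigma0 = 0.5, max(rows, cols) / 2.0  # assumed starting values
    for t in range(num_iterations):
        frac = t / num_iterations
        lr = lr0 * (1.0 - frac)                  # learning rate decays to 0
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks
        x = data[rng.integers(len(data))]
        # Best matching unit: the cell whose vector is closest to x.
        dists = np.linalg.norm(lattice - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighborhood around the BMU, measured on the grid.
        grid_d2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        # Pull every cell toward x, weighted by its closeness to the BMU.
        lattice += lr * h[..., None] * (x - lattice)
    return lattice

data = np.random.default_rng(1).random((200, 3))
small_lattice = train_online(data, rows=10, cols=20, num_iterations=1000)
print(small_lattice.shape)
```

Because each step touches the whole lattice for a single data point, the loop is a natural candidate for JIT compilation, which is where Numba helps.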
Access Lattice Cells
To access an individual cell:
Slicing also works:
The shape of the lattice is (50, 100, 3):
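The indexing patterns above can be tried on a stand-in array. The sketch assumes the trained lattice behaves like an ordinary NumPy array of shape (50, 100, 3); the random array below is only a placeholder for a real trained lattice.

```python
import numpy as np

# Placeholder for a trained lattice: 50 rows, 100 columns, 3 values per cell.
lattice = np.random.default_rng(0).random((50, 100, 3))

cell = lattice[5, 3]          # one cell: a vector with 3 components
block = lattice[0:2, 0:4]     # a 2 x 4 patch of cells
column = lattice[:, 0]        # the first column of the map

print(cell.shape, block.shape, column.shape, lattice.shape)
```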
Visualize the Lattice
Since our lattice is made of 3-dimensional vectors, we can represent it as a lattice of colors:

U-Matrix Visualization
Since most data will not be 3-dimensional, we can use the U-matrix (unified distance matrix by Alfred Ultsch) to visualize the map and the clusters emerging on it.
Each cell of the U-matrix is a single value representing the average distance to neighbors:
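The idea can be sketched in plain NumPy as below. The function u_matrix_sketch is hypothetical, and the choice of the four direct neighbors (up, down, left, right) is an assumption; the library may use a different neighborhood.

```python
import numpy as np

def u_matrix_sketch(lattice, is_torus=False):
    """Average distance from each cell's vector to its direct neighbors' vectors."""
    rows, cols, _ = lattice.shape
    um = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if is_torus:
                    nr, nc = nr % rows, nc % cols  # wrap around the edges
                elif not (0 <= nr < rows and 0 <= nc < cols):
                    continue  # flat lattice: border cells have fewer neighbors
                dists.append(np.linalg.norm(lattice[r, c] - lattice[nr, nc]))
            um[r, c] = np.mean(dists)
    return um

# Toy lattice with two uniform regions: black on the left, white on the right.
demo = np.zeros((4, 6, 3))
demo[:, 3:] = 1.0
um = u_matrix_sketch(demo)
print(um.round(2))
```

High values appear only along the boundary between the two regions, which is how cluster borders show up on a U-matrix.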
Plot U-Matrix
The library contains a plot_u_matrix function:

Projecting Data
Project onto the Lattice
To project data onto the lattice, use the project_on_lattice function:
colors = np.array([
[1., 0., 0.], [0., 1., 0.], [0., 0., 1.], [1., 1., 0.],
[0., 1., 1.], [1., 0., 1.], [0., 0., 0.], [1., 1., 1.]
])
color_labels = ['red', 'green', 'blue', 'yellow', 'cyan', 'purple', 'black', 'white']
projection = project_on_lattice(colors, lattice, additional_list=color_labels)
for p in projection:
    if projection[p]:
        print(p, projection[p][0])
Projecting on SOM took: 0.158945 seconds.
(0, 85) blue
(2, 39) white
(5, 1) purple
(10, 60) cyan
(41, 59) green
(49, 12) red
(49, 40) yellow
(49, 96) black
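Conceptually, projecting a vector means finding its best matching cell. The NumPy sketch below illustrates that idea under stated assumptions: project_sketch is a hypothetical helper, not the library function, and it returns only the occupied cells, whereas the loop above suggests the library also returns entries for empty cells.

```python
import numpy as np

def project_sketch(data, lattice, labels):
    """Map each data vector to the (row, col) of its closest lattice cell."""
    rows, cols, dim = lattice.shape
    flat = lattice.reshape(-1, dim)
    # Distances between every data vector and every cell: (n_data, n_cells).
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    best = d.argmin(axis=1)
    projection = {}
    for label, idx in zip(labels, best):
        cell = (int(idx // cols), int(idx % cols))
        projection.setdefault(cell, []).append(label)
    return projection

# Toy 2 x 2 lattice whose cells are red, green, blue and white.
toy_lattice = np.array([
    [[1., 0., 0.], [0., 1., 0.]],
    [[0., 0., 1.], [1., 1., 1.]],
])
toy_data = np.array([[0.9, 0.1, 0.0], [0.1, 0.0, 0.8]])
toy_projection = project_sketch(toy_data, toy_lattice, ['red', 'blue'])
print(toy_projection)
```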
Find Closest Vectors
To find every cell's closest vector in the provided data, use lattice_closest_vectors:
You can also get the actual vectors (without labels):
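import matplotlib.pyplot as plt

closest_vec = lattice_closest_vectors(colors, lattice)
# Stack the per-cell vectors back into the (50, 100, 3) lattice shape
values = np.array(list(closest_vec.values())).reshape(50, 100, -1)
plt.imshow(values)
plt.show()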
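This is the reverse of projection: instead of asking which cell is closest to a data vector, we ask which data vector is closest to each cell. A minimal NumPy sketch of that lookup follows; closest_vectors_sketch is a hypothetical helper that returns an array rather than the dictionary used above.

```python
import numpy as np

def closest_vectors_sketch(data, lattice):
    """For every lattice cell, return the data vector nearest to it."""
    rows, cols, dim = lattice.shape
    flat = lattice.reshape(-1, dim)
    # Distances between every cell and every data vector: (n_cells, n_data).
    d = np.linalg.norm(flat[:, None, :] - data[None, :, :], axis=-1)
    nearest = d.argmin(axis=1)
    return data[nearest].reshape(rows, cols, dim)

toy_data = np.array([[1., 0., 0.], [0., 0., 1.]])   # red, blue
toy_lattice = np.array([
    [[0.9, 0.1, 0.0], [0.1, 0.0, 0.8]],
    [[0.8, 0.0, 0.1], [0.0, 0.2, 0.9]],
])
snapped = closest_vectors_sketch(toy_data, toy_lattice)
print(snapped)
```

Every cell is replaced by one of the original data vectors, so the resulting image shows the regions of the map that each vector "owns".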

Lattice Activations
Compute how each data vector "activates" the lattice:
Show how the blue vector [0., 0., 1.] activates the lattice:

To scale up higher values and scale down lower values, use the exponent argument:
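The effect of an exponent can be illustrated with a plain NumPy sketch. The activation formula here is an assumption (distance rescaled to the range 0 to 1 and inverted, so the closest cell scores 1), and activation_sketch is a hypothetical helper; the library's own definition may differ.

```python
import numpy as np

def activation_sketch(vector, lattice, exponent=1):
    """Score each cell in [0, 1]: 1 for the closest cell, 0 for the farthest."""
    d = np.linalg.norm(lattice - vector, axis=-1)
    span = d.max() - d.min()
    if span == 0:
        return np.ones_like(d)  # all cells are equally close
    act = 1.0 - (d - d.min()) / span
    # Raising values in [0, 1] to a power > 1 suppresses weak activations
    # much more strongly than strong ones, which sharpens the picture.
    return act ** exponent

toy_lattice = np.random.default_rng(0).random((5, 8, 3))
blue = np.array([0., 0., 1.])
base = activation_sketch(blue, toy_lattice)
sharp = activation_sketch(blue, toy_lattice, exponent=8)
print(base.round(2))
print(sharp.round(2))
```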
