Core Module

API reference for the core NumbaSOM functionality.

SOM Class

numbasom.SOM

A class representing a Self-Organizing Map (SOM).

Methods:

Name | Description
train | Trains the algorithm

__init__(som_size, is_torus=False)

Parameters:

Name | Type | Description | Default
som_size | tuple | The size of the lattice, e.g. (20, 30) for 20 rows and 30 columns. | required
is_torus | bool | If True, the lattice topology is changed to a torus. | False

Returns:

Type Description
The SOM object that can be trained.
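
To illustrate what a torus topology changes, here is a minimal sketch of the distance between two lattice cells with and without wrap-around. It is not taken from NumbaSOM's implementation; the helper name `cell_distance` is hypothetical and only NumPy is assumed.

```python
import numpy as np

# Hypothetical helper for illustration; not part of the numbasom API.
def cell_distance(a, b, som_size, is_torus=False):
    """Euclidean distance between lattice cells a and b, given as (row, col)."""
    diff = np.abs(np.array(a) - np.array(b))
    if is_torus:
        # On a torus each axis wraps around, so take the shorter way round.
        diff = np.minimum(diff, np.array(som_size) - diff)
    return float(np.sqrt((diff ** 2).sum()))

som_size = (20, 30)
# Opposite corners are far apart on a flat lattice...
flat = cell_distance((0, 0), (19, 29), som_size)
# ...but diagonal neighbours once the edges wrap around.
wrapped = cell_distance((0, 0), (19, 29), som_size, is_torus=True)
print(flat > wrapped)  # True
```

On a torus no cell sits on a border, so every cell has the same number of neighbours.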

train(data, num_iterations, normalize=False)

Trains the algorithm and returns the lattice.

If normalize is False (the default), the input data is used without normalization.

Parameters:

Name | Type | Description | Default
data | numpy array | The input data tensor of shape N x D, where N is the number of instances and D is the number of features. | required
num_iterations | int | The number of iterations the algorithm will run. | required
normalize | bool | If True, the data will be normalized. | False

Returns:

Type Description
The lattice of shape (R, C, D), where R is the number of rows, C is the number of columns, and D is the number of features.
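
To make the input and output shapes concrete, the following is a rough sketch of a generic textbook SOM training loop in plain NumPy. It is not NumbaSOM's implementation: the function name `train_som_sketch`, the linear decay schedules, and the Gaussian neighbourhood are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical sketch of online SOM training; not the numbasom implementation.
def train_som_sketch(data, som_size, num_iterations, seed=0):
    """Return a lattice of shape (R, C, D) fitted to data of shape (N, D)."""
    rng = np.random.default_rng(seed)
    rows, cols = som_size
    n, d = data.shape
    lattice = rng.random((rows, cols, d))
    # Grid coordinates of every cell, shape (R, C, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    sigma0 = max(rows, cols) / 2.0
    for t in range(num_iterations):
        frac = t / num_iterations
        lr = 0.5 * (1.0 - frac)                  # learning rate decays linearly (assumed)
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks (assumed)
        x = data[rng.integers(n)]                # pick one random data point
        # Best matching unit: the cell whose weight vector is closest to x.
        dists = np.linalg.norm(lattice - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighbourhood around the BMU, measured on the grid.
        grid_d2 = ((grid - np.array(bmu)) ** 2).sum(axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        # Pull every cell towards x, weighted by its closeness to the BMU.
        lattice += lr * h[..., None] * (x - lattice)
    return lattice

data = np.random.default_rng(1).random((10, 3))
sketch_lattice = train_som_sketch(data, (5, 8), 1000)
print(sketch_lattice.shape)  # (5, 8, 3)
```

The result has one D-dimensional weight vector per lattice cell, which matches the (R, C, D) shape described above.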

Example: Creating and Training a SOM

Let's create a SOM with 5 rows and 8 columns:

from numbasom import SOM
import numpy as np

som = SOM(som_size=(5, 8))

Let's create 10 random 3-dimensional data points and train the SOM:

my_data = np.random.random([10, 3])
lattice = som.train(my_data, 1000)
# SOM training took: 0.882079 seconds.

Let's see what is in the lattice's cell (1,1):

lattice[1, 1]
# array([0.33745451, 0.78054122, 0.91251117])

U-Matrix

numbasom.u_matrix(lattice)

Builds a U-matrix on top of the trained lattice.

Parameters:

Name | Type | Description | Default
lattice | numpy array | The lattice generated by the SOM. | required

Returns:

Type Description
The U-matrix of shape (R, C), where R is the number of rows and C is the number of columns.
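
As a rough illustration of what a U-matrix represents, the sketch below averages the Euclidean distance from each cell's weight vector to those of its up/down/left/right neighbours. The function name `u_matrix_sketch` and the 4-neighbourhood are assumptions; NumbaSOM's own `u_matrix` may use a different neighbourhood or handle torus lattices differently.

```python
import numpy as np

# Hypothetical sketch; numbasom.u_matrix may compute this differently.
def u_matrix_sketch(lattice):
    """Mean distance from each cell to its 4 direct neighbours; shape (R, C)."""
    rows, cols, _ = lattice.shape
    um = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                # Planar lattice assumed: neighbours outside the grid are skipped.
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(np.linalg.norm(lattice[r, c] - lattice[nr, nc]))
            um[r, c] = np.mean(dists)
    return um

# Toy lattice: left half all zeros, right half all ones.
toy = np.zeros((4, 6, 3))
toy[:, 3:, :] = 1.0
um_toy = u_matrix_sketch(toy)
print(um_toy.shape)  # (4, 6)
```

In the toy lattice only the two columns along the boundary between the halves get non-zero values, which is how a U-matrix makes cluster borders visible.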

Example: Computing U-Matrix

Let's create a U-matrix of the lattice and check its shape:

from numbasom import u_matrix

um = u_matrix(lattice)
um.shape
# (5, 8)

Projection Functions

project_on_lattice

numbasom.project_on_lattice(data, lattice, additional_list=None, normalize=False)

Projects the data set to the trained lattice.

Parameters:

Name | Type | Description | Default
data | numpy array | The input data tensor of shape N x D, where N is the number of instances and D is the number of features. | required
lattice | numpy array | The lattice generated by the SOM. | required
additional_list | list | An optional vector of the same length as data, with a label describing each data point. These labels are then associated with the function's output. | None
normalize | bool | If True, the data will be normalized. | False

Returns:

Type Description
A dictionary whose keys are indexes of the lattice's cells,
and whose values are data points belonging to each cell
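
The idea behind such a projection can be sketched in plain NumPy: each data point is assigned to the cell whose weight vector is closest to it. The function name `project_sketch` is hypothetical, and storing the label in place of the data point when labels are given is an assumption about how the labels are associated with the output.

```python
import numpy as np

# Hypothetical sketch; not the numbasom implementation.
def project_sketch(data, lattice, labels=None):
    """Map each cell index (row, col) to the data points whose closest cell it is."""
    rows, cols, _ = lattice.shape
    projection = {(r, c): [] for r in range(rows) for c in range(cols)}
    for i, x in enumerate(data):
        # Distance from x to every cell's weight vector, shape (R, C).
        dists = np.linalg.norm(lattice - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        key = (int(bmu[0]), int(bmu[1]))
        # Assumption: when labels are given, they are stored instead of the points.
        projection[key].append(x if labels is None else labels[i])
    return projection

toy_lattice = np.zeros((2, 2, 3))
toy_lattice[1, 1] = 1.0
points = np.array([[0.9, 1.0, 0.8], [0.1, 0.0, 0.2]])
proj = project_sketch(points, toy_lattice, labels=["high", "low"])
print(proj[(1, 1)])  # ['high']
```

Cells that attract no data point keep an empty list, which is why a loop over the result typically checks for emptiness first.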

Example: Projecting Data onto the Lattice

from numbasom import project_on_lattice

projection = project_on_lattice(my_data, lattice)
for p in projection:
    if projection[p]:
        print(p, projection[p][0])
# Projecting on SOM took: 0.159090 seconds.
# (0, 0) [0.45463541 0.86601644 0.91961619]
# (0, 4) [0.67644781 0.56173361 0.57405225]
# ...

lattice_activations

numbasom.lattice_activations(data, lattice, normalize=False, exponent=1)

Projects the data on the lattice, and computes the vector of activations for each data point.

Parameters:

Name | Type | Description | Default
data | numpy array | The input data tensor of shape N x D, where N is the number of instances and D is the number of features. | required
lattice | numpy array | The lattice generated by the SOM. | required
normalize | bool | If True, the data will be normalized. | False
exponent | float | If different from 1, activations are raised to the power of the exponent and then normalized to the range 0 to 1. | 1

Returns:

Type Description
A tensor of lattice activations
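
One plausible way to turn distances into activations is sketched below. The exact formula NumbaSOM uses is not stated here, so the similarity definition (one minus the distance scaled by the per-point maximum) and the name `activations_sketch` are assumptions; only the role of the exponent follows the parameter description above.

```python
import numpy as np

# Hypothetical sketch; the real lattice_activations may use a different formula.
def activations_sketch(data, lattice, exponent=1):
    """Return an (N, R, C) array; values near 1 mark cells close to the data point."""
    # Distance from every data point to every cell, shape (N, R, C).
    dists = np.linalg.norm(data[:, None, None, :] - lattice[None, ...], axis=-1)
    # Assumed similarity: 1 at zero distance, 0 at the farthest cell.
    max_d = dists.max(axis=(1, 2), keepdims=True)
    act = 1.0 - dists / max_d
    if exponent != 1:
        # Raising to a power sharpens the peaks; then rescale to [0, 1] per point.
        act = act ** exponent
        lo = act.min(axis=(1, 2), keepdims=True)
        hi = act.max(axis=(1, 2), keepdims=True)
        act = (act - lo) / (hi - lo)
    return act

rng = np.random.default_rng(0)
toy_data = rng.random((10, 3))
toy_lattice = rng.random((5, 8, 3))
act = activations_sketch(toy_data, toy_lattice, exponent=8)
print(act.shape)  # (10, 5, 8)
```

A larger exponent suppresses weakly activated cells, so only the cells closest to each data point stand out.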

Example: Computing Lattice Activations

Let's compute how strongly each data vector activates each lattice cell, based on its Euclidean distance from the cell:

from numbasom import lattice_activations

scaled = lattice_activations(my_data, lattice, exponent=8)
# Computing SOM activations took: 0.330122 seconds.

Activations for data point 0:

scaled[0].round(2)
# array([[0.  , 0.  , 0.01, 0.03, 0.03, 0.03, 0.  , 0.  ],
#        [0.  , 0.  , 0.01, 0.03, 0.03, 0.02, 0.  , 0.  ],
#        [0.11, 0.06, 0.02, 0.03, 0.02, 0.02, 0.02, 0.02],
#        [0.99, 0.84, 0.66, 0.05, 0.01, 0.02, 0.19, 0.16],
#        [1.  , 0.89, 0.68, 0.57, 0.01, 0.02, 0.39, 0.39]])

lattice_closest_vectors

numbasom.lattice_closest_vectors(data, lattice, additional_list=None, normalized=False)

Finds the closest data vector to each cell in the lattice.

Parameters:

Name | Type | Description | Default
data | numpy array | The input data tensor of shape N x D, where N is the number of instances and D is the number of features. | required
lattice | numpy array | The lattice generated by the SOM. | required
additional_list | list | An optional vector of the same length as data, with a label describing each data point. These labels are then associated with the function's output. | None
normalized | bool | If True, the data will be normalized. | False

Returns:

Type Description
A dictionary whose keys are indexes of the lattice's cells,
and whose values are the data points closest to each cell
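
This is the reverse of projecting data onto the lattice: every cell gets an entry, holding whichever data vector lies nearest to it. A minimal NumPy sketch follows; the name `closest_vectors_sketch` is hypothetical, and returning the label instead of the vector when labels are passed is an assumption.

```python
import numpy as np

# Hypothetical sketch; not the numbasom implementation.
def closest_vectors_sketch(data, lattice, labels=None):
    """Map each cell index (row, col) to the data vector nearest to that cell."""
    rows, cols, _ = lattice.shape
    # Distance from every cell to every data point, shape (R, C, N).
    dists = np.linalg.norm(
        lattice[:, :, None, :] - data[None, None, :, :], axis=-1
    )
    nearest = dists.argmin(axis=-1)  # index of the closest point per cell, (R, C)
    closest = {}
    for r in range(rows):
        for c in range(cols):
            i = int(nearest[r, c])
            # Assumption: when labels are given, the label is returned instead.
            closest[(r, c)] = data[i] if labels is None else labels[i]
    return closest

toy_lattice = np.zeros((2, 2, 3))
toy_lattice[1, 1] = 1.0
points = np.array([[0.9, 1.0, 0.8], [0.1, 0.0, 0.2]])
closest_toy = closest_vectors_sketch(points, toy_lattice, labels=["high", "low"])
print(closest_toy[(1, 1)])  # high
```

Unlike a projection, no cell is left empty, and the same data vector can be the closest one for several cells.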

Example: Finding Closest Vectors

Let's find the closest data vectors to each lattice cell:

from numbasom import lattice_closest_vectors

closest = lattice_closest_vectors(my_data, lattice)
for c in closest:
    print(c, closest[c])
# Finding closest data points took: 0.067116 seconds.
# (0, 0) [0.45463541 0.86601644 0.91961619]
# (0, 1) [0.45463541 0.86601644 0.91961619]
# ...

Persistence Functions

save_lattice

numbasom.save_lattice(lattice, filename)

Saves the lattice to a file as a NumPy array.

load_lattice

numbasom.load_lattice(filename)

Loads the lattice from a file as a NumPy array.

Example: Saving and Loading

from numbasom import save_lattice, load_lattice

save_lattice(lattice, "my_som.npy")
# SOM lattice saved at my_som.npy

loaded = load_lattice("my_som.npy")
# SOM lattice loaded from my_som.npy
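
The .npy extension suggests NumPy's binary array format. Whether save_lattice and load_lattice wrap np.save and np.load is an assumption, but an equivalent round trip in plain NumPy would look roughly like this (the file name toy_som.npy is made up for the example):

```python
import os
import tempfile

import numpy as np

# Stand-in for a trained lattice of shape (R, C, D).
toy_lattice = np.random.default_rng(0).random((5, 8, 3))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_som.npy")
    np.save(path, toy_lattice)  # writes the array in NumPy's binary .npy format
    restored = np.load(path)    # reads it back with the same shape and dtype

print(restored.shape)                         # (5, 8, 3)
print(np.array_equal(restored, toy_lattice))  # True
```

A lattice restored this way can be passed to the projection functions without retraining.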