API details.

class SOM[source]

SOM(som_size, is_torus=False)

A class representing the Self-Organizing Map

Methods

train(data, num_iterations, normalize=False) Trains the algorithm and returns the lattice

SOM.__init__[source]

SOM.__init__(som_size, is_torus=False)

Parameters

som_size : tuple

    The size of the lattice, e.g. (20,30) for 20 rows and 30 columns

is_torus : bool

    If True, the topology of the lattice is changed to a torus, so that opposite edges of the grid wrap around

Returns

The SOM object that can be trained.

Let's create a SOM with 5 rows and 8 columns:

som = SOM(som_size=(5,8))
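To illustrate what the torus topology changes, here is a minimal sketch of a grid distance between two cells. The function name `grid_distance` is hypothetical and not part of this module; the sketch only assumes that on a torus each axis wraps around, so the shorter way round is used.

```python
import numpy as np

def grid_distance(a, b, som_size, is_torus=False):
    """Euclidean distance between cells a=(row, col) and b=(row, col) on the grid."""
    diff = np.abs(np.array(a) - np.array(b))
    if is_torus:
        # On a torus each axis wraps around, so take the shorter way round.
        diff = np.minimum(diff, np.array(som_size) - diff)
    return float(np.sqrt((diff ** 2).sum()))

flat = grid_distance((0, 0), (4, 7), (5, 8))                   # opposite corners, flat grid
wrapped = grid_distance((0, 0), (4, 7), (5, 8), is_torus=True)  # same cells, wrapped grid
print(flat, wrapped)
```

On the 5x8 lattice above, opposite corners are far apart on the flat grid but become diagonal neighbours once the edges wrap.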

Class Methods

SOM.train[source]

SOM.train(data, num_iterations, normalize=False)

Trains the algorithm and returns the lattice.

If normalize is False, there will be no normalization of the input data.

Parameters

data : numpy array

The input data tensor of the shape NxD, where:
N - instances axis
D - features axis

num_iterations : int

The number of iterations the algorithm will run.

normalize : boolean, optional

If True, the data will be normalized

Returns

The lattice of the shape (R,C,D):

R - number of rows; C - number of columns; D - features axis

Let's create 10 random 3-dimensional data points:

import numpy as np
my_data = np.random.random([10,3])
lattice = som.train(my_data, 1000)
SOM training took: 0.882079 seconds.

Let's see what is in the lattice's cell (1,1):

lattice[1,1]
array([0.33745451, 0.78054122, 0.91251117])
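For readers who want to see what a training loop of this kind can look like, below is a minimal sketch of the classic online SOM update in plain NumPy. It is not this library's implementation: the function `train_sketch`, the exponential decay schedule, the initial learning rate, and the Gaussian neighbourhood are all assumptions made for illustration.

```python
import numpy as np

def train_sketch(data, som_size, num_iterations, seed=0):
    """Hypothetical online SOM training; returns a lattice of shape (R, C, D)."""
    rng = np.random.default_rng(seed)
    rows, cols = som_size
    lattice = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every cell, shape (R, C, 2).
    coords = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    sigma0, lr0 = max(rows, cols) / 2.0, 0.5  # assumed starting radius and rate
    for t in range(num_iterations):
        decay = np.exp(-t / num_iterations)
        sigma, lr = sigma0 * decay, lr0 * decay
        x = data[rng.integers(len(data))]  # pick one random sample
        # Best matching unit: the cell whose weights are closest to x.
        dists = np.linalg.norm(lattice - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighbourhood around the BMU on the (flat) grid.
        grid_d2 = ((coords - np.array(bmu)) ** 2).sum(axis=-1)
        h = np.exp(-grid_d2 / (2 * sigma ** 2))
        # Pull every cell towards x, weighted by the neighbourhood.
        lattice += lr * h[..., None] * (x - lattice)
    return lattice

demo_data = np.random.default_rng(1).random((10, 3))
lattice_sketch = train_sketch(demo_data, (5, 8), 1000)
print(lattice_sketch.shape)
```

The returned array has the same (R,C,D) layout as the lattice described above, so a cell can be inspected with `lattice_sketch[1, 1]`.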

Module Methods

If we have a non-normalized data vector p:

p = np.random.randn(10,1)*3 + 2
p
array([[-1.04293465],
       [ 1.66105068],
       [ 4.99902485],
       [ 4.03622355],
       [ 6.1025092 ],
       [-3.4373023 ],
       [-0.48707551],
       [ 0.19297129],
       [ 2.17613642],
       [ 3.69202082]])

We can use the normalize_data function to rescale the values to the range between min_val and max_val:

normalize_data(p, min_val=0, max_val=1)
array([[0.25098689],
       [0.53442911],
       [0.8843285 ],
       [0.78340393],
       [1.        ],
       [0.        ],
       [0.3092542 ],
       [0.38053934],
       [0.58842239],
       [0.74732327]])
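The output above is consistent with plain min-max scaling. As a sketch, assuming the minimum and maximum are taken over the whole array (the helper name `min_max_scale` is hypothetical, and the actual `normalize_data` may differ in details such as per-column handling):

```python
import numpy as np

def min_max_scale(x, min_val=0.0, max_val=1.0):
    """Linearly rescale x so its smallest value maps to min_val and its largest to max_val."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    # Assumes x is not constant; otherwise hi - lo is zero.
    return (x - lo) / (hi - lo) * (max_val - min_val) + min_val

# A few of the values of p shown above, including its minimum and maximum.
p_part = np.array([[-1.04293465], [1.66105068], [4.99902485], [6.1025092], [-3.4373023]])
scaled_part = min_max_scale(p_part, min_val=0, max_val=1)
print(scaled_part.round(4))
```

Applied to these values, the sketch reproduces the corresponding entries of the output shown above.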

u_matrix[source]

u_matrix(lattice)

Builds a U-matrix on top of the trained lattice.

Parameters

lattice : numpy array

The lattice generated by the SOM, of the shape (R,C,D)

Returns

The U-matrix of the shape (R,C):

R - number of rows; C - number of columns

Let's create a U-matrix of the lattice, and check its shape:

um = u_matrix(lattice)
um.shape
(5, 8)
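A U-matrix is commonly computed as the average distance between each cell's weight vector and those of its grid neighbours. The sketch below follows that common definition with a 4-neighbourhood on a flat grid; the name `u_matrix_sketch` and the choice of neighbourhood are assumptions, and the module's `u_matrix` may be computed differently.

```python
import numpy as np

def u_matrix_sketch(lattice):
    """Mean distance from each cell to its up/down/left/right neighbours; shape (R, C)."""
    rows, cols, _ = lattice.shape
    um = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                # Skip neighbours that fall outside the flat grid.
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(np.linalg.norm(lattice[r, c] - lattice[nr, nc]))
            um[r, c] = np.mean(dists)
    return um

demo_lattice = np.random.default_rng(0).random((5, 8, 3))
um_sketch = u_matrix_sketch(demo_lattice)
print(um_sketch.shape)
```

High values mark cells whose weights differ strongly from their neighbours, which is why the U-matrix is often used to spot cluster boundaries.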

project_on_lattice[source]

project_on_lattice(data, lattice, additional_list=None, normalize=False)

Projects the data set to the trained lattice.

Parameters

data : numpy array

The input data tensor of the shape NxD, where:
N - instances axis
D - features axis

additional_list : list, optional

You can additionally pass a vector of the same length as `data`
with labels describing each data point in any way.
These labels will then be associated with the function's output.

normalize : boolean, optional

If True, the data will be normalized

Returns

A dictionary whose keys are indexes of the lattice's cells, and whose values are data points belonging to each cell

Let's project onto the lattice:

projection = project_on_lattice(my_data, lattice)
for p in projection:
    if projection[p]:
        print(p, projection[p][0])
Projecting on SOM took: 0.159090 seconds.
(0, 0) [0.45463541 0.86601644 0.91961619]
(0, 4) [0.67644781 0.56173361 0.57405225]
(0, 7) [0.38039748 0.92845473 0.09514875]
(1, 2) [0.28595804 0.75255383 0.91054277]
(1, 4) [0.69071505 0.57031127 0.57262534]
(2, 7) [0.45633722 0.3801364  0.16401741]
(3, 4) [0.90483774 0.45491878 0.66536136]
(4, 0) [0.35111524 0.00187137 0.71132841]
(4, 2) [0.28809659 0.20528437 0.84526417]
(4, 7) [0.54998596 0.02396913 0.40396049]
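The projection can be pictured as assigning every data point to its best matching cell. The following is a minimal sketch under that assumption; `project_sketch` is a hypothetical name, and label handling (`additional_list`) and normalization are left out.

```python
import numpy as np

def project_sketch(data, lattice):
    """Map each cell index (row, col) to the list of data points closest to that cell."""
    rows, cols, _ = lattice.shape
    # Every cell gets a key, so cells without data map to an empty list.
    projection = {(r, c): [] for r in range(rows) for c in range(cols)}
    for x in data:
        dists = np.linalg.norm(lattice - x, axis=-1)  # distance to every cell
        r, c = np.unravel_index(np.argmin(dists), dists.shape)
        projection[(int(r), int(c))].append(x)
    return projection

demo_lattice = np.zeros((2, 2, 2))
demo_lattice[0, 1] = [1.0, 1.0]
demo_projection = project_sketch(np.array([[0.9, 0.9], [0.1, 0.0]]), demo_lattice)
print({k: len(v) for k, v in demo_projection.items()})
```

Keeping empty lists for unused cells matches the `if projection[p]:` check in the example above, which skips cells that received no data.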

lattice_activations[source]

lattice_activations(data, lattice, normalize=False, exponent=1)

Projects the data on the lattice, and computes the vector of activations for each data point.

Parameters

data : numpy array

The input data tensor of the shape NxD, where:
N - instances axis
D - features axis

normalize : boolean, optional

If True, the data will be normalized

exponent : float, optional

If different from 1, the activations will be raised
to the power of the exponent and then normalized between 0 and 1

Returns

A tensor of lattice activations, with one RxC activation map per data point

Let's compute how each data point activates the lattice (based on its Euclidean distance from each cell):

scaled = lattice_activations(my_data, lattice, exponent=8)
Computing SOM activations took: 0.330122 seconds.

Activations of the data point 0:

scaled[0].round(2)
array([[0.  , 0.  , 0.01, 0.03, 0.03, 0.03, 0.  , 0.  ],
       [0.  , 0.  , 0.01, 0.03, 0.03, 0.02, 0.  , 0.  ],
       [0.11, 0.06, 0.02, 0.03, 0.02, 0.02, 0.02, 0.02],
       [0.99, 0.84, 0.66, 0.05, 0.01, 0.02, 0.19, 0.16],
       [1.  , 0.89, 0.68, 0.57, 0.01, 0.02, 0.39, 0.39]])

Activations of the data point 5:

scaled[5].round(2)
array([[0.02, 0.03, 0.11, 0.07, 0.06, 0.06, 0.  , 0.  ],
       [0.03, 0.08, 0.11, 0.14, 0.06, 0.06, 0.  , 0.  ],
       [0.5 , 0.38, 0.24, 0.07, 0.03, 0.03, 0.01, 0.01],
       [0.73, 0.95, 0.99, 0.08, 0.02, 0.02, 0.07, 0.06],
       [0.7 , 0.91, 1.  , 0.84, 0.02, 0.03, 0.11, 0.11]])
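One way to obtain maps like the ones above is to turn distances into similarities and sharpen them with the exponent. The sketch below is only an assumption about how such activations could be computed: the name `activations_sketch` and the exact distance-to-activation mapping are hypothetical and not taken from this module.

```python
import numpy as np

def activations_sketch(data, lattice, exponent=1):
    """Per-point activation maps of shape (N, R, C), scaled to the range [0, 1]."""
    # Euclidean distance from every data point to every cell: shape (N, R, C).
    dists = np.linalg.norm(lattice[None, :, :, :] - data[:, None, None, :], axis=-1)
    flat = dists.reshape(len(data), -1)
    d_min = flat.min(axis=1)[:, None, None]
    d_max = flat.max(axis=1)[:, None, None]
    # Closest cell gets 1, farthest cell gets 0 (assumes d_max > d_min).
    act = 1.0 - (dists - d_min) / (d_max - d_min)
    # Raising to a power keeps 0 and 1 fixed, so no extra rescaling is needed here.
    return act ** exponent

rng = np.random.default_rng(0)
demo_data, demo_lattice = rng.random((10, 3)), rng.random((5, 8, 3))
demo_act = activations_sketch(demo_data, demo_lattice, exponent=8)
print(demo_act.shape)
```

A larger exponent pushes mid-range activations towards 0, which leaves only the cells nearest to each data point visibly active.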

lattice_closest_vectors[source]

lattice_closest_vectors(data, lattice, additional_list=None, normalized=False)

Finds the closest data vector to each cell in the lattice.

Parameters

data : numpy array

The input data tensor of the shape NxD, where:
N - instances axis
D - features axis

additional_list : list, optional

You can additionally pass a vector of the same length as `data`
with labels describing each data point in any way.
These labels will then be associated with the function's output.

normalized : boolean, optional

If True, the data will be normalized

Returns

A dictionary whose keys are indexes of the lattice's cells, and whose values are the data points closest to each cell

Let's find the closest vectors to the lattice:

closest = lattice_closest_vectors(my_data, lattice)
for c in closest:
    print(c, closest[c])
Finding closest data points took: 0.067116 seconds.
(0, 0) [0.45463541 0.86601644 0.91961619]
(0, 1) [0.45463541 0.86601644 0.91961619]
(0, 2) [0.28595804 0.75255383 0.91054277]
(0, 3) [0.67644781 0.56173361 0.57405225]
(0, 4) [0.67644781 0.56173361 0.57405225]
(0, 5) [0.67644781 0.56173361 0.57405225]
(0, 6) [0.38039748 0.92845473 0.09514875]
(0, 7) [0.38039748 0.92845473 0.09514875]
(1, 0) [0.45463541 0.86601644 0.91961619]
(1, 1) [0.28595804 0.75255383 0.91054277]
(1, 2) [0.28595804 0.75255383 0.91054277]
(1, 3) [0.67644781 0.56173361 0.57405225]
(1, 4) [0.69071505 0.57031127 0.57262534]
(1, 5) [0.69071505 0.57031127 0.57262534]
(1, 6) [0.45633722 0.3801364  0.16401741]
(1, 7) [0.45633722 0.3801364  0.16401741]
(2, 0) [0.28595804 0.75255383 0.91054277]
(2, 1) [0.28595804 0.75255383 0.91054277]
(2, 2) [0.28595804 0.75255383 0.91054277]
(2, 3) [0.90483774 0.45491878 0.66536136]
(2, 4) [0.90483774 0.45491878 0.66536136]
(2, 5) [0.90483774 0.45491878 0.66536136]
(2, 6) [0.45633722 0.3801364  0.16401741]
(2, 7) [0.45633722 0.3801364  0.16401741]
(3, 0) [0.35111524 0.00187137 0.71132841]
(3, 1) [0.28809659 0.20528437 0.84526417]
(3, 2) [0.28809659 0.20528437 0.84526417]
(3, 3) [0.90483774 0.45491878 0.66536136]
(3, 4) [0.90483774 0.45491878 0.66536136]
(3, 5) [0.90483774 0.45491878 0.66536136]
(3, 6) [0.54998596 0.02396913 0.40396049]
(3, 7) [0.45633722 0.3801364  0.16401741]
(4, 0) [0.35111524 0.00187137 0.71132841]
(4, 1) [0.28809659 0.20528437 0.84526417]
(4, 2) [0.28809659 0.20528437 0.84526417]
(4, 3) [0.28809659 0.20528437 0.84526417]
(4, 4) [0.90483774 0.45491878 0.66536136]
(4, 5) [0.90483774 0.45491878 0.66536136]
(4, 6) [0.54998596 0.02396913 0.40396049]
(4, 7) [0.54998596 0.02396913 0.40396049]
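This is the reverse of the projection: instead of finding a cell for each data point, it finds a data point for each cell, so every cell gets a value. A minimal sketch, assuming Euclidean distance and ignoring labels and normalization (the name `closest_vectors_sketch` is hypothetical):

```python
import numpy as np

def closest_vectors_sketch(data, lattice):
    """Map each cell index (row, col) to the single data point nearest to its weights."""
    rows, cols, _ = lattice.shape
    closest = {}
    for r in range(rows):
        for c in range(cols):
            dists = np.linalg.norm(data - lattice[r, c], axis=1)  # distance to every point
            closest[(r, c)] = data[np.argmin(dists)]
    return closest

demo_points = np.array([[0.0, 0.0], [1.0, 1.0]])
demo_lattice = np.array([[[0.1, 0.2], [0.9, 0.8]]])  # a 1x2 lattice
demo_closest = closest_vectors_sketch(demo_points, demo_lattice)
print(demo_closest)
```

Because each cell picks independently, the same data point can appear under several cells, as in the output above.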

save_lattice[source]

save_lattice(lattice, filename)

Saves the lattice as a numpy array to the file `filename`

load_lattice[source]

load_lattice(filename)

Loads the lattice as a numpy array from the file `filename`
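No example is given for these two functions, so here is a sketch of an equivalent round trip using NumPy's own `np.save` and `np.load`. Whether `save_lattice` and `load_lattice` use these calls or this file format internally is an assumption; the file name below is only illustrative.

```python
import os
import tempfile

import numpy as np

demo_lattice = np.random.default_rng(0).random((5, 8, 3))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "lattice.npy")  # illustrative file name
    np.save(path, demo_lattice)              # write the array to disk
    restored = np.load(path)                 # read it back into memory

print(restored.shape)
```

The restored array has the same shape and values as the original, so it can be passed to functions such as u_matrix or project_on_lattice without retraining.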