
11. Separating Classes with Dividing Lines

By Bernd Klein. Last modified: 20 Mar 2024.

In this chapter of our tutorial, we will develop a simple neural network: a network capable of separating two classes that are separable by a straight line in a 2-dimensional feature space.

Line Separation

Fence as a dividing line

Before we start programming a simple neural network, we are going to develop a different concept. We want to search for straight lines that separate two points or two classes in a plane. For now we will restrict ourselves to straight lines going through the origin; we will look at general straight lines later in the tutorial.

You could imagine that you have two attributes describing an edible object like a fruit, for example "sweetness" and "sourness".

We could describe this by points in a two-dimensional space. The x axis is used for the sweetness values and the y axis correspondingly for the sourness values. Imagine now that we have two fruits as points in this space, e.g. an orange at position (3.5, 1.8) and a lemon at (1.1, 3.9).

We could draw dividing lines to determine which points are more lemon-like and which are more orange-like.

In the following diagram, we depict one lemon and one orange. The green line separates the two points. We assume that all other lemons are above this line and all oranges below it.

Division boundary

The green line is defined by

$$y = mx$$

where:

m is the slope or gradient of the line and x is the independent variable of the function.

If the line passes through a point $P = (p_1, p_2)$, the slope is

$$m = \frac{p_2}{p_1}$$

This means that a point $P'=(p'_1, p'_2)$ is on this line, if the following condition is fulfilled:

$$mp'_1 - p'_2 = 0$$
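
For example, with slope $m = 0.5$ the point $(2, 1)$ fulfils the condition, because $0.5 \cdot 2 - 1 = 0$, whereas the point $(2, 2)$ yields $0.5 \cdot 2 - 2 = -1$ and therefore does not lie on the line.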

The following Python program plots a graph depicting the previously described situation:

import matplotlib.pyplot as plt
import numpy as np

X = np.arange(0, 7)
fig, ax = plt.subplots()

ax.plot(3.5, 1.8, "o", 
        color="darkorange", 
        markersize=15)
ax.plot(1.1, 3.9, "oy", 
        markersize=15)

point_on_line = (4, 4.5)
# calculate gradient:
m = point_on_line[1] / point_on_line[0]  
ax.plot(X, m * X, "g-", linewidth=3)
plt.show()

[Plot: the orange and the lemon with the green line separating them]

It is clear that a point $A = (a_1, a_2)$ is not on the line if $m \cdot a_1 - a_2$ is not equal to 0. But we want to know more: is a given point above or below the line?

Division boundary

If a point $B=(b_1, b_2)$ is below this line, there must be a $\delta_B > 0$ so that the point $(b_1, b_2 + \delta_B)$ will be on the line.

This means that

$$m\cdot b_1 - (b_2 + \delta_B) = 0$$

which can be rearranged to

$$m\cdot b_1 - b_2 = {\delta_B}$$

Finally, we have a criterion for a point being below the line: $m \cdot b_1 - b_2$ is positive, because $\delta_B$ is positive.

The reasoning for "a point is above the line" is analogous: if a point $A=(a_1, a_2)$ is above the line, there must be a $\delta_A > 0$ so that the point $(a_1, a_2 - \delta_A)$ is on the line.

This means that

$$m \cdot a_1 - (a_2 - \delta_A) = 0$$

which can be rearranged to

$$m \cdot a_1 - a_2 = -\delta_A$$

In summary, we can say: A point $P = (p_1, p_2)$ lies
below the line, if $m \cdot p_1 - p_2 > 0$,
on the line, if $m \cdot p_1 - p_2 = 0$,
above the line, if $m \cdot p_1 - p_2 < 0$.
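
We can also express this criterion as a small Python helper function. The following is a minimal sketch of our own (the name position_to_line does not occur in the original program):

def position_to_line(m, p):
    """ returns 'above', 'on' or 'below' for the position
        of the point p relative to the line y = m * x """
    value = m * p[0] - p[1]
    if value > 0:
        return 'below'
    elif value < 0:
        return 'above'
    else:
        return 'on'

print(position_to_line(4.5 / 4, (3.5, 1.8)))   # 'below', the orange side
print(position_to_line(4.5 / 4, (1.1, 3.9)))   # 'above', the lemon side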

We can now verify this on our fruits. The lemon has the coordinates (1.1, 3.9) and the orange the coordinates (3.5, 1.8). The point we used to define our separating line has the values (4, 4.5), so m is the quotient of 4.5 and 4.

lemon = (1.1, 3.9)
orange = (3.5, 1.8)
m = 4.5 / 4

# check if the orange is below the line,
# a positive value is expected:
print(orange[0] * m - orange[1])

# check if the lemon is above the line,
# a negative value is expected:
print(lemon[0] * m - lemon[1])

OUTPUT:

2.1375
-2.6624999999999996

We did not calculate the green line using mathematical formulas or methods, but arbitrarily determined it by visual judgement. We could have chosen other lines as well.

The following Python program calculates and renders a bunch of lines, all going through the origin, i.e. the point (0, 0). The red ones are completely unusable for separating the two fruits, because in these cases both the lemon and the orange are on the same side of the line. However, it is obvious that even the green ones might not be too useful if we have more than these two fruits: some lemons might be sweeter and some oranges can be quite sour.

import numpy as np
import matplotlib.pyplot as plt 

def create_distance_function(a, b, c):
    """ 0 = ax + by + c """
    def distance(x, y):
        """ 
        returns tuple (d, pos)
        d is the distance
        If pos == -1 point is below the line, 
        0 on the line and +1 if above the line
        """
        nom = a * x + b * y + c
        if nom == 0:
            pos = 0
        elif (nom < 0 and b < 0) or (nom > 0 and b > 0):
            pos = 1    # the point is above the line
        else:
            pos = -1   # the point is below the line
        # distance of (x, y) from the line ax + by + c = 0:
        return (np.absolute(nom) / np.sqrt(a ** 2 + b ** 2), pos)
    return distance
    
orange = (3.5, 1.8)
lemon = (1.1, 3.9)
fruits_coords = [orange, lemon]

fig, ax = plt.subplots()
ax.set_xlabel("sweetness")
ax.set_ylabel("sourness")
x_min, x_max = -1, 7
y_min, y_max = -1, 8
ax.set_xlim([x_min, x_max])
ax.set_ylim([y_min, y_max])
X = np.arange(x_min, x_max, 0.1)

step = 0.05
for x in np.arange(0, 1+step, step):
    slope = np.tan(np.arccos(x))
    dist4line1 = create_distance_function(slope, -1, 0)
    Y = slope * X
    results = []
    for point in fruits_coords:
        results.append(dist4line1(*point))
    if (results[0][1] != results[1][1]):
        ax.plot(X, Y, "g-", linewidth=0.8, alpha=0.9)
    else:
        ax.plot(X, Y, "r-", linewidth=0.8, alpha=0.9)

size = 10
for (index, (x, y)) in enumerate(fruits_coords):
    if index== 0:
        ax.plot(x, y, "o", 
                color="darkorange", 
                markersize=size)
    else:
        ax.plot(x, y, "oy", 
                markersize=size)


plt.show()

[Plot: a bundle of lines through the origin; the green ones separate the orange from the lemon, the red ones do not]

Basically, we have carried out a classification based on our dividing line, even if hardly anyone would describe it as such.

It is easy to imagine that we have more lemons and oranges with slightly different sourness and sweetness values. This means we have a class of lemons (class 1) and a class of oranges (class 2). This is depicted in the following diagram.

Two clusters of 2-dimensional points

We are going to "grow" oranges and lemons with a Python program. We will create these two classes by randomly creating points within a circle with a defined center point and radius. The following Python code will create the classes:

import numpy as np
import matplotlib.pyplot as plt

def points_within_circle(radius, 
                         center=(0, 0),
                         number_of_points=100):
    center_x, center_y = center
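    # taking the square root of the uniform sample spreads the points
    # uniformly over the area of the disk instead of clustering them
    # around the center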
    r = radius * np.sqrt(np.random.random((number_of_points,)))
    theta = np.random.random((number_of_points,)) * 2 * np.pi
    x = center_x + r * np.cos(theta)
    y = center_y + r * np.sin(theta)
    return x, y

X = np.arange(0, 8)
fig, ax = plt.subplots()
oranges_x, oranges_y = points_within_circle(1.6, (5, 2), 100)
lemons_x, lemons_y = points_within_circle(1.9, (2, 5), 100)

ax.scatter(oranges_x, 
           oranges_y, 
           c="orange", 
           label="oranges")
ax.scatter(lemons_x, 
           lemons_y, 
           c="y", 
           label="lemons")

ax.plot(X, 0.9 * X, "g-", linewidth=2)

ax.legend()
ax.grid()
plt.show()

[Plot: the orange and lemon clusters with a green dividing line]


Automatically Finding the Dividing Line

The dividing line was once more established by visual estimation. This raises the question: how can we make this process systematic? Currently, our analysis is limited to straight lines passing through the origin, each uniquely characterized by its slope.

To initiate our systematic approach, let's revisit the simplest scenario involving just one lemon and one orange. We commence with an arbitrary line, which evidently does not effectively separate our orange from the lemon.

import matplotlib.pyplot as plt
import numpy as np

def plot_fruits(p1, p2, point_on_line=(5,1)):
    X = np.arange(0, 7)
    fig, ax = plt.subplots()
    ax.plot(p1[0], p1[1], "o", 
            color="darkorange", 
            markersize=15)
    ax.annotate("Orange", 
                xy=(p1[0], p1[1]), 
                xytext=(p1[0]+0.5, p1[1]+0.5),
                arrowprops=dict(facecolor='orange', shrink=0.05))
    ax.plot(p2[0], p2[1], "o", 
            color="yellow", 
            markersize=15)
    ax.annotate("Lemon", 
                xy=(p2[0], p2[1]), 
                xytext=(p2[0]-0.5, p2[1]-0.5),
                arrowprops=dict(facecolor='orange', shrink=0.05))
    ax.plot(*point_on_line, "x", 
            color="darkorange", 
            markersize=15)
    # calculate gradient:
    m = point_on_line[1] / point_on_line[0]  
    ax.plot(X, m * X, "g-", linewidth=3)
    plt.show()


orange = (4, 2)
lemon = (1, 3)
point = (5, 1)
plot_fruits(p1=orange, p2=lemon, point_on_line=point)

[Plot: the orange, the lemon and a line that fails to separate them]

We can see that the line is not suitable as a dividing line, because both the lemon and the orange are above it. We can determine whether the orange is above or below the line by checking whether m * orange[0] - orange[1] is greater than zero ('below the line') or smaller than zero ('above the line'):

m = point[1] / point[0]
m * orange[0] - orange[1]

OUTPUT:

-1.2

This indicates that the orange lies above the line, although it should actually be below it. In this case, an optimal dividing line would be positioned just above the orange. We identify such a line once more by manual inspection of the plot, although it is essential to automate this process later. A line going through the point p3 = (4, 2 + delta) with delta equal to 0.3 satisfies the condition:

delta = 0.3
plot_fruits(p1=(4, 2), p2=(1, 3), point_on_line=(4, 2+delta))

new_slope = (2 + delta) / 4
# position of orange:
new_slope * orange[0] - orange[1]

OUTPUT:

[Plot: the dividing line now passing just above the orange]

0.2999999999999998

This means that the orange is now below the line.

We can say that the error between our initial slope and the targeted slope is:

targeted_slope = new_slope
initial_slope = point[1] / point[0]
error = targeted_slope - initial_slope

# the targeted_slope can be seen as the following sum:
initial_slope + error

OUTPUT:

0.575

We will now examine what happens if we apply this correction mechanism to more fruits. To create the lemon and orange clusters, we will this time use the make_blobs function from sklearn.datasets. The center of the lemons is set to (1, 1.5) and the center of the oranges to (1.5, 1).

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs

number_of_samples = 9
centers = [(1, 1.5), (1.5, 1)]
data, labels = make_blobs(n_samples=number_of_samples, 
                          cluster_std=0.2,
                          centers=np.array(centers),
                          random_state=42)

fruits = [(data[i], labels[i]) for i in range(len(data))]

fig, ax = plt.subplots()
colours = ["yellow", "orange"]
label_name = ["Lemons", "Oranges"]
for label in range(0, 2):
    ax.scatter(data[labels==label, 0], 
               data[labels==label, 1], 
               c=colours[label], 
               s=60, 
               label=label_name[label])

ax.set(xlabel='X', ylabel='Y', title='fruits');

[Plot: the lemon and orange clusters created by make_blobs]

We will apply the previously developed idea of correcting the error to this data. We will iterate over the fruits and correct the error in the way we have just demonstrated:

Iterating through the Data: The function loops over the points, denoted as (x, y), along with their corresponding labels obtained from the data and labels arrays. The zip function pairs each data point with its label. A counter, starting at zero, numbers the points in the order of their appearance.

Adjusting Slope: Based on the label and the position of the point relative to the line, the slope is adjusted. If a point with label 1 (orange) is above the line but should be below, the slope is increased. Conversely, if a point with label 0 (lemon) is below the line but should be above, the slope is decreased. In other words: every time a fruit is on the wrong side of the straight line, we adjust the slope accordingly:

The decreasing and increasing is done in the following way: we calculate a new target slope with the code line target_slope = (y + delta) / x. It adds a small delta value to the y-coordinate of the point and divides by the x-coordinate, which yields the slope of a line passing just above the point. This adjustment introduces a slight change in slope to potentially reclassify the point correctly.

slope = 0.3

def adjust(slope=0.3, delta=0.1):
    # Initialize variables
    counter = -1
    
    # Iterate through data points and labels
    for point, label in zip(data, labels):
        counter += 1
        x, y = point
        
        # Plot data points
        ax.scatter(x, y, color="yellow" if label == 0 else "orange")
        ax.annotate(str(counter), (x, y))
        
        # Calculate position of the point relative to the line
        pos2line = slope * x - y
        # Calculate target slope for adjusting the line
        target_slope = (y + delta) / x
        # Calculate error in current slope
        error = target_slope - slope
        
        # Adjust slope based on point's label and position relative to the line
        if (label == 1 and pos2line < 0) or (label == 0 and pos2line > 0):
            # Update slope based on error
            slope += error 
        
        # Plot the adjusted line
        ax.plot(X, slope * X, linewidth=2, label=str(counter))
    
    # Return the adjusted slope
    return slope


X = np.arange(0, 3)
fig, ax = plt.subplots()
colours = ["orange", "yellow"]
label_name = ["Oranges", "Lemons"]

ax.set(xlabel='X', ylabel='Y', title='fruits')
slope_count = 1
ax.plot(X, 
        slope * X,  
        linewidth=2,
        label="initial")
slope = adjust(slope, delta=0.1)

ax.legend()
ax.grid()

print(f'The final value for the slope: {slope}')

plt.show()

OUTPUT:

The final value for the slope: 0.8962688409146223

[Plot: the numbered fruits and the successively adjusted dividing lines]

This has worked out fine, but the approach is not robust yet. To show what can go wrong, we will add an orange in the area where the lemons are supposed to be:

data = np.concatenate((data, np.array([[1.1, 1.6]])))
labels = np.concatenate((labels, np.array([1])))
fig, ax = plt.subplots()
colours = ["yellow", "orange"]
label_name = ["Lemons", "Oranges"]
for label in range(0, 2):
    ax.scatter(data[labels==label, 0], 
               data[labels==label, 1], 
               c=colours[label], 
               s=40, 
               label=label_name[label])

ax.set(xlabel='X', ylabel='Y', title='fruits')

OUTPUT:

[Text(0.5, 0, 'X'), Text(0, 0.5, 'Y'), Text(0.5, 1.0, 'fruits')]

[Plot: the fruit clusters with the additional outlier orange]

We will apply our adaptive algorithm to this extended dataset:

start_slope = 0.3

# we reuse the adjust function defined in the previous example


fig, ax = plt.subplots()
colours = ["orange", "yellow"]
label_name = ["Oranges", "Lemons"]


ax.set(xlabel='X', ylabel='Y', title='fruits')
slope_count = 1
ax.plot(X, 
        start_slope * X,  
        linewidth=2,
        label="initial")
slope = adjust(start_slope, delta=0.1)

ax.legend()
ax.grid()
print(f'The final value for the slope: {slope}')
plt.show()

OUTPUT:

The final value for the slope: 1.5454545454545454

[Plot: the adjusted lines for the extended data set; the outlier spoils the result]

We can see that the newly added orange "destroys" the previously obtained result. Line number 3 (green) was perfect, but the orange positioned inside the lemon cluster caused the last (red) line to be created.

Instead of correcting the error completely, we should only correct it a little bit in the necessary direction. This way outliers will not be capable of completely changing the result. For this purpose we will introduce a learning rate, which we use to dampen the corrections, i.e. make them smaller.

The following Python program calculates a dividing line by going through all the fruits and dynamically adjusting the slope. If a point is above the line but should be below it, the slope will be incremented by the value of learning_rate multiplied by the error between the target slope and the current slope. If the point is below the line but should be above it, the slope will correspondingly be decremented.

learning_rate, start_slope, delta = 0.1, 0.3, 0.1

def adjust(slope=0.3, learning_rate=0.3, delta=0.3):
    counter = -1
    for (x, y), label in zip(data, labels):
        counter += 1 

        # Plot data points
        ax.scatter(x, y, color="yellow" if label == 0 else "orange")
        ax.annotate(str(counter), (x, y))
 
        # Calculate position of the point relative to the line
        pos2line = slope * x - y
        # Calculate target slope for adjusting the line
        target_slope = (y + delta) / x
        # Calculate error in current slope
        error = target_slope - slope
        
        # Adjust slope if the point is on the wrong side of the line
        if (label == 1 and pos2line < 0) or (label == 0 and pos2line > 0):
            slope += error * learning_rate
            # Plot the adjusted line
            ax.plot(X, slope * X, linewidth=2, label=str(counter))
            
    return slope


fig, ax = plt.subplots()
colours = ["orange", "yellow"]
label_name = ["Oranges", "Lemons"]


ax.set(xlabel='X', ylabel='Y', title='fruits')
slope_count = 1
ax.plot(X, 
        start_slope * X,  
        linewidth=2,
        label="initial")
slope = adjust(start_slope, learning_rate, delta)

ax.legend()
ax.grid()
plt.show()

print(slope)

OUTPUT:

[Plot: the lines adjusted with the learning rate; no proper dividing line yet]

0.5530792593978056

We can see in the previous plot that we haven't found a proper dividing line yet. The reason is that our learning rate was too small for this data set, i.e. we don't have enough fruits for such a small learning rate. We could either get more fruits or call adjust multiple times, i.e. repeat the learning with the same data. We do the latter in the following code:

fig, ax = plt.subplots()
colours = ["orange", "yellow"]
label_name = ["Oranges", "Lemons"]


ax.set(xlabel='X', ylabel='Y', title='fruits')
slope_count = 1
ax.plot(X, 
        start_slope * X,  
        linewidth=2,
        label="initial")
slope = adjust(start_slope, learning_rate, delta)
# redo the learning, we use the current slope as the start slope:
slope = adjust(slope, learning_rate, delta)
# and again once more:
slope = adjust(slope, learning_rate, delta)

ax.legend()
ax.grid()
plt.show()

print(slope)

OUTPUT:

[Plot: the dividing line after three passes over the data]

0.8283353043065619

We can be satisfied now!
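
Instead of writing the repetitions out one by one, we could wrap the calls to adjust in a loop. The following is a minimal sketch, assuming that data, labels, the adjust function and a prepared fig, ax from the previous cells are still available; the threshold of 0.001 for detecting convergence is an arbitrary choice of ours:

slope = start_slope
for epoch in range(10):
    new_slope = adjust(slope, learning_rate, delta)
    # stop as soon as a complete pass over the fruits hardly
    # changes the slope anymore:
    if abs(new_slope - slope) < 0.001:
        break
    slope = new_slope
print(slope)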

In the following chapter we will see how this idea reappears in simple neural networks with just one neuron.

A Simple Neural Network

We were capable of separating the two classes with a straight line. One might wonder what this has to do with neural networks. We will work out this connection below.

We are going to define a neural network to classify the previous data sets. Our neural network will consist of only one neuron: a neuron with two input values, one for 'sourness' and one for 'sweetness'.

A Neural Network with just one perceptron

The two input values - called in_data in our Python program below - have to be weighted by weight values. To solve our problem, we define a Perceptron class. An instance of the class is a perceptron (or neuron). It is initialized with the weights, which can be given as a list, tuple or array; the number of input values is implicitly defined by the length of the weights.

In the following example we choose -0.45 and 0.5 as the values for the weights. This is not the normal way to do it: a neural network calculates the weights automatically during its training phase, as we will learn later.

import numpy as np

class Perceptron:
    
    def __init__(self, weights):
        """
        'weights' can be a numpy array, list or a tuple with the
        actual values of the weights. The number of input values
        is indirectly defined by the length of 'weights'
        """
        self.weights = np.array(weights)
    
    def __call__(self, in_data):
        weighted_input = self.weights * in_data
        weighted_sum = weighted_input.sum()
        return weighted_sum
    

Please keep in mind: This is a simple Perceptron class. While it's a good start, a complete neural network usually includes additional components such as activation functions, training algorithms, and methods for updating weights.

Each instance of this Perceptron is callable like a function with a list (or array) of two elements.

p = Perceptron(weights=[-0.45, 0.5])
p([2.9, 4])

OUTPUT:

0.6950000000000001

We can call it with the data of our lemons and oranges:

for point in zip(oranges_x[:10], oranges_y[:10]):
    res = p(point)
    print(res, end=", ")

for point in zip(lemons_x[:10], lemons_y[:10]):
    res = p(point)
    print(res, end=", ")

OUTPUT:

-2.2902838040478413, -0.9296040826505769, -0.44288629422334447, -0.921141217382226, -1.1989570023785483, -2.1292119126574325, -1.9162051907454645, -2.034971699564993, -1.8904999620080214, -1.3808309390816484, 0.42354834533620345, 2.523998893154855, 0.6078323661276248, 0.621437065763484, 1.380734933409236, 0.6598427057645357, 1.778580807678696, 1.4961377308232242, 0.9583911443437048, 1.5974686976642136, 

We can see that we get a negative value if we input an orange and a positive value if we input a lemon. With this knowledge, we can calculate the accuracy of our neural network on this data set:

from collections import Counter
evaluation = Counter()
for point in zip(oranges_x, oranges_y):
    res = p(point)
    if res < 0:
        evaluation['corrects'] += 1
    else:
        evaluation['wrongs'] += 1


for point in zip(lemons_x, lemons_y):
    res = p(point)
    if res >= 0:
        evaluation['corrects'] += 1
    else:
        evaluation['wrongs'] += 1

print(evaluation)

OUTPUT:

Counter({'corrects': 200})

How does the calculation work? We multiply the input values with the weights and get negative and positive values. Let us examine what we get, if the calculation results in 0:

$$w_1 \cdot x_1 + w_2 \cdot x_2 = 0$$

We can change this equation into

$$ x_2 = -\frac{w_1}{w_2} \cdot x_1$$

We can compare this with the general form of a straight line

$$ y = m \cdot x + c$$

where $m$ is the slope of the line and $c$ is the intercept with the y axis.

We can easily see that our equation corresponds to the definition of a line: the slope (aka gradient) $m$ is $-\frac{w_1}{w_2}$ and $c$ is equal to 0.

This is a straight line separating the oranges and lemons, which is called the decision boundary.

We visualize this with the following Python program:

import matplotlib.pyplot as plt

X = np.arange(0, 8)
fig, ax = plt.subplots()
ax.scatter(oranges_x, 
           oranges_y, 
           c="orange", 
           label="oranges")
ax.scatter(lemons_x, 
           lemons_y, 
           c="y", 
           label="lemons")

slope = 0.45 / 0.5
ax.plot(X, slope * X,  linewidth=2)


ax.grid()
plt.show()

print(slope)

OUTPUT:

[Plot: the decision boundary defined by the weights separating oranges and lemons]

0.9


Training a Neural Network

As we mentioned in the previous section, we didn't train our network: we adjusted the weights to values that we knew would form a dividing line. Now we want to demonstrate what is necessary to train our simple neural network.

Before we start with this task, we will separate our data into training and test data in the following Python program. By seeding the shuffle and setting random_state to the value 42, we will get the same output for every run, which can be beneficial for debugging purposes.

from sklearn.model_selection import train_test_split
import random

oranges = list(zip(oranges_x, oranges_y))
lemons = list(zip(lemons_x, lemons_y))

# labelling oranges with 0 and lemons with 1:
labelled_data = list(zip(oranges + lemons, 
                         [0] * len(oranges) + [1] * len(lemons)))
random.seed(42)  # seed the shuffle so that it is reproducible too
random.shuffle(labelled_data)

data, labels = zip(*labelled_data)

res = train_test_split(data, labels, 
                       train_size=0.8,
                       test_size=0.2,
                       random_state=42)
train_data, test_data, train_labels, test_labels = res    
print(train_data[:10], train_labels[:10])

OUTPUT:

[(3.633142293153684, 1.4104800824800643), (0.31701872308040424, 5.360489066616793), (2.917287629048313, 5.119837897273688), (6.159533837251676, 0.963012845430826), (6.061342184001881, 2.5773021115459667), (1.0376949212229944, 4.075714466674502), (3.8169075789446927, 4.956621468123399), (6.230773329060932, 1.1058034663738123), (0.8974783146897662, 3.5085056938163204), (5.084298975282835, 1.5553601207399503)] [0, 1, 1, 0, 0, 1, 1, 0, 1, 0]

As we start with two arbitrary weights, we cannot expect the result to be correct. For some points (fruits) the network may return the proper value, i.e. 1 for a lemon and 0 for an orange. In case we get a wrong result, we have to correct the weight values. First we have to calculate the error: the difference between the target or expected value (target_result) and the calculated value (calculated_result). With this error we adjust the weight values by an incremental value, i.e. $w_1 = w_1 + \Delta w_1$ and $w_2 = w_2 + \Delta w_2$

Adjusting the weights in a neural network

If the error e is 0, i.e. the target result is equal to the calculated result, we don't have to do anything; the network is perfect for these input values. If the error is not equal to 0, we have to change the weights by adding small values to them. These values may be positive or negative.

The amount by which we change a weight value depends on the error and on the input value. Let us assume $x_1 = 0$ and $x_2 > 0$. In this case the result depends solely on the input $x_2$, which in turn means that we can minimize the error by changing solely $w_2$: if the error is negative, we have to add a negative value to $w_2$, and if the error is positive, a positive value. From this we can understand that, whatever the input values are, we can multiply them with the error and obtain values we can add to the weights.

One thing is still missing: doing this, we would learn too fast. We have many samples, and each sample should only change the weights a little bit. Therefore we have to multiply this correction with a learning rate (self.learning_rate). The learning rate is used to control how fast the weights are updated. Small values for the learning rate result in a long training process; larger values bear the risk of ending up in sub-optimal weight values. We will have a closer look at this in our chapter on backpropagation.
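
To make this update rule concrete, here is a small numeric sketch of a single correction step; the input values are made up for illustration:

import numpy as np

learning_rate = 0.1
weights = np.array([0.1, 0.1])
in_data = np.array([3.6, 1.4])    # a hypothetical orange
target_result = 0                 # oranges are labelled with 0
calculated_result = 1             # the network wrongly said 'lemon'

error = target_result - calculated_result      # -1
correction = error * in_data * learning_rate   # array([-0.36, -0.14])
weights = weights + correction
print(weights)                                 # [-0.26 -0.04]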

We are now ready to write the code for adapting the weights, which means training the network. For this purpose, we add a method adjust to our Perceptron class. The task of this method is to correct the error.

import numpy as np
from collections import Counter

class Perceptron:
    
    def __init__(self, weights, learning_rate=0.1):
        """
        Initialize the Perceptron with weights and learning rate.
        'weights' can be a numpy array, list, or tuple with the actual values of the weights.
        """
        self.weights = np.array(weights)
        self.learning_rate = learning_rate
     
    @staticmethod
    def unit_step_function(x):
        """
        Activation function: returns 0 if x < 0, otherwise returns 1.
        """
        return 0 if x < 0 else 1
        
    def __call__(self, in_data):
        """
        Perform a forward pass through the Perceptron and return the output.
        """
        weighted_sum = np.dot(self.weights, in_data)
        return Perceptron.unit_step_function(weighted_sum)
    
    def adjust(self, target_result, calculated_result, in_data):
        """
        Adjust the weights based on the error between the target and calculated results.
        """
        in_data = np.array(in_data)
        error = target_result - calculated_result
        if error != 0:
            correction = error * in_data * self.learning_rate
            self.weights += correction 
            
    def evaluate(self, data, labels):
        """
        Evaluate the Perceptron on the given data and labels and return a Counter object with evaluation results.
        """
        evaluation = Counter()
        for index in range(len(data)):
            label = int(round(self(data[index]), 0))
            if label == labels[index]:
                evaluation["correct"] += 1
            else:
                evaluation["wrong"] += 1
        return evaluation
                

p = Perceptron(weights=[0.1, 0.1],
               learning_rate=0.3)

for index in range(len(train_data)):
    p.adjust(train_labels[index], 
             p(train_data[index]), 
             train_data[index])
    
evaluation = p.evaluate(train_data, train_labels)
print(evaluation.most_common())
evaluation = p.evaluate(test_data, test_labels)
print(evaluation.most_common())

print(p.weights)

OUTPUT:

[('correct', 154), ('wrong', 6)]
[('correct', 39), ('wrong', 1)]
[-4.07592476  3.08532999]

On both the training and the test data we get only a few wrong classifications, i.e. our network was capable of learning automatically and quite successfully!

We visualize the decision boundary with the following program:

import matplotlib.pyplot as plt
import numpy as np

X = np.arange(0, 7)
fig, ax = plt.subplots()

lemons = [train_data[i] for i in range(len(train_data)) if train_labels[i] == 1]
lemons_x, lemons_y = zip(*lemons)
oranges = [train_data[i] for i in range(len(train_data)) if train_labels[i] == 0]
oranges_x, oranges_y = zip(*oranges)

ax.scatter(oranges_x, oranges_y, c="orange")
ax.scatter(lemons_x, lemons_y, c="y")

w1 = p.weights[0]
w2 = p.weights[1]
m = -w1 / w2
ax.plot(X, m * X, label="decision boundary")
ax.legend()
plt.show()
print(p.weights)

OUTPUT:

[Plot: the decision boundary separating the training data]

[-4.07592476  3.08532999]

Let us have a look at the algorithm "in motion".

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

p = Perceptron(weights=[0.1, 0.1],
               learning_rate=0.1)
number_of_colors = 10
colors = cm.rainbow(np.linspace(0, 1, number_of_colors))

fig, ax = plt.subplots()
ax.set_xticks(range(8))
ax.set_ylim([-2, 8])

counter = 0
for index in range(len(train_data)):
    old_weights = p.weights.copy()
    p.adjust(train_labels[index], 
             p(train_data[index]), 
             train_data[index])
    if not np.array_equal(old_weights, p.weights):
        color = "orange" if train_labels[index] == 0 else "y"        
        ax.scatter(train_data[index][0], 
                   train_data[index][1],
                   color=color)
        ax.annotate(str(counter), 
                    (train_data[index][0], train_data[index][1]))
        m = -p.weights[0] / p.weights[1]
        print(index, m, p.weights, train_data[index])
        ax.plot(X, m * X, label=str(counter), color=colors[counter])
        counter += 1
ax.legend()
plt.show()

OUTPUT:

0 -6.414786990990161 [-0.26331423 -0.04104801] (3.633142293153684, 1.4104800824800643)
1 0.46790290229689485 [-0.23161236  0.4950009 ] (0.31701872308040424, 5.360489066616793)
17 4.196905933001318 [-0.78963697  0.18814741] (5.580246158717744, 3.068534904219927)
18 1.002329056481867 [-0.66109105  0.6595549 ] (1.285459277506332, 4.714074965113091)

[Plot: the numbered misclassified points and the successively learned decision boundaries]

Each of the points in the diagram above causes a change in the weights. We see them numbered in the order of their appearance, together with the corresponding straight line. This way we can see how the network "learns".
