Gianluca Turcatel

Particle Swarm Optimization



Introduction

Particle swarm optimization (PSO) is a well-established computational method that optimizes a problem by iteratively improving a candidate solution with respect to an objective function. PSO is a meta-heuristic optimization algorithm, meaning it is a problem-independent technique: it does not exploit any specificity of the problem and can therefore be used as a black box.

PSO is a stochastic, population-based algorithm: it maintains a population of candidate solutions, called particles, that move around the search space looking for the global optimum.

The movement of each particle is dictated by three components (Figure 1):

Figure 1. Particle's movement in PSO algorithm - Part 1


The particle's movement is dictated by its previous move (purple arrow), its own best known position (red arrow), and the best known position among all particles in the swarm (green arrow). The sum of these three forces dictates where the particle will land in the next iteration (Figure 2):

Figure 2. Particle's movement in PSO algorithm - Part 2


In the next iteration, let's assume the group's best position improved a little:

Figure 3. Particle's movement in PSO algorithm - Part 3


In Figure 3, note that the current direction is given by the previous move. In the next step, the particle will land at X_i+2, again according to the sum of the three vectors (Figure 4):

Figure 4. Particle's movement in PSO algorithm - Part 4


Figures 1-4 show the moves of a single particle. Now imagine this happening for several particles: after a certain number of iterations, the swarm will concentrate around the global minimum. Why does the algorithm find the optimum? Because the particles keep track of the best personal and global positions and search around them, the chance of finding a better position at each step is high, and the swarm is gradually pulled toward the target.


PSO - The Math

Assuming P particles, the position of particle i at time t is denoted by a vector of length d:

X_i(t) = (x_i1, x_i2, ..., x_id)    (1)

At every iteration, each particle's position is updated according to the equation:

X_i(t+1) = X_i(t) + V_i(t+1)    (2)

The position of the particle at the next iteration is thus equal to its current position plus the velocity vector V_i(t+1), which is given by:

V_i(t+1) = w·V_i(t) + c1·r1·(P_i − X_i(t)) + c2·r2·(G − X_i(t))    (3)

V_i(t) is the current direction of the particle, P_i is the particle's best known position, and G is the swarm's best known position. r1 and r2 are random numbers between 0 and 1; because they are resampled at every iteration, PSO is a stochastic algorithm. w (the inertia weight), c1, and c2 are hyperparameters: w balances exploration and exploitation, while c1 and c2 control how strongly the individual and social components affect the particle's velocity.

Each term in (3) has its own name:

The inertia component carries forward the direction of the particle's move from the previous iteration; the individual (or cognitive) component measures the distance between the particle's current position and its own best known position; the social component measures the distance between the current position and the best position found by the whole swarm. The sketch below shows a single update step in NumPy.
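
To make equations (2) and (3) concrete, here is a minimal NumPy sketch of one velocity-and-position update for a small swarm. The array names (positions, velocities, personal_best, group_best) are illustrative and are not part of the full implementation shown later in this post.

import numpy as np

rng = np.random.default_rng(0)
n_particles, d = 5, 2
w, c1, c2 = 0.5, 0.1, 0.1

# Illustrative state: positions, velocities, and best known positions
positions = rng.uniform(-5, 5, size=(n_particles, d))
velocities = rng.normal(size=(n_particles, d)) * 0.5
personal_best = positions.copy()   # P_i in equation (3)
group_best = positions[0]          # G in equation (3)

# One iteration: equation (3) for the velocity, equation (2) for the position
r1, r2 = rng.random(2)
velocities = (w * velocities
              + c1 * r1 * (personal_best - positions)
              + c2 * r2 * (group_best - positions))
positions = positions + velocities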


Implementation

The figure below depicts the landscape the particles will explore. The global minimum is located near the center of the figure (white cross).

Figure 5. Objective function landscape


Below you will find the Python code for the PSO algorithm. The Jupyter Notebook can be found HERE.

import numpy as np
import matplotlib.pyplot as plt

def particle_swarm(n_iterations, w, c1, c2,
                   n_particles, x, y, loss_function):
    
    # Calculate Z   
    X, Y = np.meshgrid(x, y)
    Z = loss_function(X, Y)
    
    # Find the location of the global minimum
    x_min = X.ravel()[np.argmin(Z)]
    y_min = Y.ravel()[np.argmin(Z)]
    min_tuples = (x_min, y_min)
    
    # Empty dictionary to store particle information
    particle_positions = {}
    
    # Get the range of x and y
    x_lim_min, x_lim_max = np.min(x), np.max(x)
    y_lim_min, y_lim_max = np.min(y), np.max(y)
    
    # Random initialization of particles' starting position
    particles_x = np.random.rand(n_particles) * (x_lim_max - x_lim_min) + \
        x_lim_min
    particles_y = np.random.rand(n_particles) * (y_lim_max - y_lim_min) + \
        y_lim_min

    start_position_particles = np.concatenate(
        [particles_x.reshape(-1, 1), particles_y.reshape(-1, 1)], axis=1)

    # Initial random velocity of the particles
    start_velocity = np.random.normal(size = (n_particles, 2)) * 0.5

    # Initially, the best individual position is the starting position
    best_personal_position = start_position_particles
    
    # Calculate best group position 
    loss = loss_function(start_position_particles[:, 0],
                         start_position_particles[:, 1])
    
    best_group_location = start_position_particles[np.argmin(loss),:]
    
    # Storing start position, best_personal_position, initial velocity,      
    # best_group_location
    particle_positions[0] = [start_position_particles, 
                             best_personal_position, 
                             start_velocity, 
                             best_group_location]
    
    # Main loop 
    for iteration in range(n_iterations):
        
        # Give random values to r1 and r2
        r1, r2 = np.random.rand(2)
        
        # Extract previous iteration values
        current_position = particle_positions[iteration][0]
        current_velocity = particle_positions[iteration][2]
        current_best_personal_position = particle_positions[iteration][1]
        current_best_group_position = particle_positions[iteration][3]
        
        # Calculate velocity at next iteration
        V_next = (w * current_velocity
                  + c1 * r1 * (current_best_personal_position - current_position)
                  + c2 * r2 * (current_best_group_position - current_position))
        
        # Calculate position at next iteration
        x_next = current_position + V_next

        # Calculate the loss at the new positions
        next_loss = loss_function(x_next[:, 0], x_next[:, 1])

        # Update best personal position: keep the stored personal best
        # unless the new position improves on it
        personal_best_loss = loss_function(
            current_best_personal_position[:, 0],
            current_best_personal_position[:, 1])
        improved = next_loss < personal_best_loss
        next_best_personal_position = np.where(
            improved.reshape(-1, 1), x_next, current_best_personal_position)
        
        # Calculate the loss of the current and candidate group best
        current_best_group_location_loss = loss_function(
            current_best_group_position[0],
            current_best_group_position[1])

        next_best_group_location = x_next[np.argmin(next_loss), :]
        next_best_group_location_loss = np.min(next_loss)

        # Update best group position: keep the old one if it is still better
        if current_best_group_location_loss < next_best_group_location_loss:
            next_best_group_location = current_best_group_position
            
        # Storing new values in the dictionary  
        particle_positions[iteration + 1] = [x_next, 
                                             next_best_personal_position,
                                             V_next,
                                             next_best_group_location]
    
    # Plot location and velocity of particles at 6 iterations
    delta = round((n_iterations-1) / 5 )
    indexes_to_plot = list(range(0, n_iterations, delta ))
    indexes_to_plot.append(n_iterations)
    cols = 2
    rows = 3

    # Create the figure
    plt.figure(figsize = (8*cols,6*rows))
    
    # For loop for the subplots 
    for num, ind in enumerate(indexes_to_plot):
        current_position = particle_positions[ind][0]
        current_position_velocity = particle_positions[ind][2]
        plt.subplot(rows, cols, num + 1)
        plt.title(f"Iteration: {ind}")
        # Plot the contours
        plt.contourf(X, Y, Z, 50, cmap='jet')
        # Plot the particles' locations
        plt.scatter(current_position[:, 0], current_position[:, 1],
                    color='red')
        # Plot the particles' velocity
        plt.quiver(current_position[:,0].ravel(), 
                   current_position[:,1].ravel(), 
                   current_position_velocity[:,0].ravel(), 
                   current_position_velocity[:,1].ravel(), 
                   color='blue', width=0.005, angles='xy', 
                   scale_units='xy', scale=1)
        # Display the location of the minimum
        plt.plot(x_min, y_min, marker='x', color='white',
                 markersize=10, label='minimum')
        plt.xlim([x_lim_min, x_lim_max])
        plt.ylim([y_lim_min, y_lim_max])
        plt.colorbar()
       
    plt.tight_layout()
    plt.show()
    
    # Return the dictionary with the particles' history
    return particle_positions

Let's test the PSO algorithm with 30 particles, 30 iterations, w = 0.5, and c1 = c2 = 0.1:
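
For reference, a call along these lines reproduces the experiment. The loss function below is only a stand-in, since the post does not spell out the exact landscape of Figure 5; any vectorized 2D function would work.

# Hypothetical multimodal loss function; the exact landscape of Figure 5
# is not given in the post.
def loss_function(x, y):
    return ((x - 3.14) ** 2 + (y - 2.72) ** 2
            + np.sin(3 * x + 1.41) + np.sin(4 * y - 1.73))

x = np.linspace(0, 5, 100)
y = np.linspace(0, 5, 100)

history = particle_swarm(n_iterations=30, w=0.5, c1=0.1, c2=0.1,
                         n_particles=30, x=x, y=y,
                         loss_function=loss_function)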

Figure 6. PSO algorithm result


After 30 iterations, the particles concentrated tightly around the optimum's location, as desired. The animation below displays the movement of the particles at each iteration (* indicates the group's best location):

Figure 7. PSO algorithm result - animation

Concluding remarks

In this article the inertia weight w was kept constant; however, it is not uncommon to decrease its value at each iteration (for instance, from an initial value of 0.9 to a final value of 0.4). Lowering w makes the particles explore less and focus more on the individual and collective best locations.
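
As a minimal sketch, a linearly decaying inertia weight could be added to the main loop as follows; w_start and w_end are illustrative names, not variables from the implementation above:

n_iterations = 30
w_start, w_end = 0.9, 0.4

for iteration in range(n_iterations):
    # Linearly decay the inertia weight from w_start to w_end
    w = w_start - (w_start - w_end) * iteration / (n_iterations - 1)
    # ... velocity and position updates from the main loop go here,
    # using this iteration's value of w ...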

Unlike gradient descent, PSO does not require computing the derivative of the objective function, which makes it a preferable optimization algorithm when that derivative is hard to derive or unavailable. Furthermore, because it has relatively few hyperparameters, PSO is easy to tune and implement, and it can be applied to a wide range of problems.


Follow me on Twitter and Facebook to stay updated.







