Understanding Neural Networks: How They Work, Layer Calculation, and Practical Example

Neural networks are the backbone of modern artificial intelligence and machine learning. They mimic the human brain to process data, recognize patterns, and make decisions. From self-driving cars to recommendation systems, neural networks power a wide range of applications.

In this comprehensive guide, we will:

  • Understand how neural networks work.
  • Learn how to calculate neurons in each layer.
  • Determine the number of hidden layers and neurons.
  • Explore an example with a step-by-step breakdown.
  • Illustrate weight calculations with animations (conceptually explained).

Let’s dive into the fascinating world of neural networks!


What is a Neural Network?

A neural network is a computational model inspired by biological neurons. It consists of layers:

  1. Input Layer: Takes the input features.
  2. Hidden Layers: Process the inputs using weights and biases.
  3. Output Layer: Provides the final prediction or classification.

Each layer consists of neurons (or nodes) connected by weights. Activation functions introduce non-linearity, enabling the network to solve complex problems.
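To make the role of activation functions concrete, here is a minimal sketch of two common ones, ReLU and sigmoid (using NumPy, an assumption, since the guide's later implementation is in Python):

```python
import numpy as np

def relu(z):
    # ReLU passes positive values through and zeroes out negatives
    return np.maximum(0, z)

def sigmoid(z):
    # Sigmoid squashes any real number into the range (0, 1)
    return 1 / (1 + np.exp(-z))

print(relu(np.array([-2.0, 0.0, 3.0])))  # negatives become 0
print(sigmoid(0.0))                       # exactly 0.5 at z = 0
```

Without such non-linear functions, stacking layers would collapse into a single linear transformation, which is why they are essential for complex problems.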


How Neural Networks Work

  1. Forward Propagation:

    • Input features are multiplied by weights and summed.
    • A bias is added, and the result is passed through an activation function.
    • This output becomes the input for the next layer.

    Mathematically:

    z = \sum_{i=1}^{n} (w_i \cdot x_i) + b
    a = \text{ActivationFunction}(z)
  2. Error Calculation:
    The predicted output is compared with the actual output using a loss function (e.g., Mean Squared Error).

  3. Backpropagation:

    • The error is propagated backward to adjust weights using gradient descent.
    • Weights are updated iteratively to minimize the error.
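The forward-propagation formula above can be sketched in a few lines of Python (NumPy assumed; the weights, inputs, and ReLU choice below are illustrative):

```python
import numpy as np

def forward(x, w, b):
    # z = sum(w_i * x_i) + b, then a = ReLU(z)
    z = np.dot(w, x) + b
    a = np.maximum(0, z)
    return a

x = np.array([1.0, 2.0])    # input features
w = np.array([0.5, -0.25])  # weights
b = 0.1                     # bias
print(forward(x, w, b))     # z = 0.5 - 0.5 + 0.1 = 0.1, ReLU keeps it positive
```

In a multi-layer network this output `a` simply becomes the input `x` of the next layer.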

How to Calculate Neurons in Each Layer

Input Layer

The number of neurons equals the number of input features. For example:

  • If the input data has 10 features, the input layer has 10 neurons.

Hidden Layers

The choice of hidden layers and neurons depends on:

  • Complexity of the problem.
  • Size of the dataset.

A general heuristic for neurons in hidden layers:

n_h = \frac{n_i + n_o}{2}

Where:

  • n_h: Number of neurons in the hidden layer.
  • n_i: Number of neurons in the input layer.
  • n_o: Number of neurons in the output layer.
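The heuristic is easy to evaluate directly. The sketch below assumes a hypothetical network with 10 inputs and 2 outputs:

```python
def hidden_neurons(n_i, n_o):
    # n_h = (n_i + n_o) / 2, rounded to a whole neuron count
    return round((n_i + n_o) / 2)

print(hidden_neurons(10, 2))  # → 6
```

Remember this is only a starting point; the final count should be validated experimentally.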

Output Layer

  • For binary classification: 1 neuron with a sigmoid activation function.
  • For multi-class classification: Neurons equal to the number of classes, with softmax activation.
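As an illustration of the multi-class case, softmax turns a layer's raw scores into class probabilities that sum to 1 (a NumPy sketch; the scores are made up):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability, exponentiate, then normalize
    e = np.exp(z - np.max(z))
    return e / e.sum()

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs.sum())  # probabilities sum to 1.0
```

The highest raw score always maps to the highest probability, so the predicted class is simply the argmax of the output.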

How Many Hidden Layers Should a Neural Network Have?

  1. Shallow Networks: 1-2 hidden layers suffice for simple tasks.
  2. Deep Networks: Complex tasks (e.g., image recognition) may require multiple hidden layers.

The optimal number of hidden layers and neurons is often determined through hyperparameter tuning.


Practical Example: Predicting House Prices

Dataset

We have a dataset with the following features:

  • Input Features: Square footage, number of bedrooms, location score.
  • Output: Price (continuous variable).

Neural Network Design

  1. Input Layer: 3 neurons (for 3 features).
  2. Hidden Layer 1: 5 neurons (chosen empirically; slightly more than the heuristic suggests, to give the network extra capacity).
  3. Hidden Layer 2: 3 neurons (reduced for simplicity).
  4. Output Layer: 1 neuron (predicting price).

Step-by-Step Explanation

Step 1: Initialize Weights and Biases

Randomly initialize weights (w) and biases (b) for each layer.

Step 2: Forward Propagation

For each neuron in the hidden layer:

  1. Multiply inputs by weights.
  2. Sum the results and add a bias.
  3. Pass through an activation function (e.g., ReLU or Sigmoid).

Step 3: Backpropagation

  1. Calculate the error using a loss function: \text{Loss} = \frac{1}{N} \sum_{i=1}^{N} (y_{\text{true}} - y_{\text{pred}})^2
  2. Compute gradients and adjust weights using gradient descent: w = w - \eta \cdot \frac{\partial \text{Loss}}{\partial w}, where \eta is the learning rate.
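A minimal sketch of repeated gradient-descent updates for a single linear neuron (the data, learning rate, and iteration count are illustrative assumptions, not the house-price model itself):

```python
# Illustrative data: one feature, one sample, target y_true = 8.0
x, y_true = 2.0, 8.0
w, b = 1.0, 0.0   # initial parameters
eta = 0.01        # learning rate

for _ in range(200):
    y_pred = w * x + b
    err = y_true - y_pred
    # MSE on a single sample: (y_true - y_pred)^2
    # Gradients: dLoss/dw = -2 * err * x,  dLoss/db = -2 * err
    w -= eta * (-2 * err * x)
    b -= eta * (-2 * err)

print(round(w * x + b, 3))  # prediction converges toward 8.0
```

Each pass shrinks the error by a constant factor here, which is exactly the iterative minimization that backpropagation performs across all layers at once.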

Weight Calculation Example

Imagine a simplified network:

  • Inputs: [1500 (sq ft), 3 (bedrooms), 8 (location score)].
  • Weights: Randomly initialized as: W = \begin{bmatrix} 0.2 & 0.4 & 0.6 \\ 0.1 & 0.3 & 0.5 \end{bmatrix}
  • Bias: b = [0.1, 0.2].

Hidden Layer Calculations:

  1. Compute weighted sum:

    z_1 = (1500 \cdot 0.2) + (3 \cdot 0.4) + (8 \cdot 0.6) + 0.1 = 306.1
    z_2 = (1500 \cdot 0.1) + (3 \cdot 0.3) + (8 \cdot 0.5) + 0.2 = 155.1
  2. Apply activation function (e.g., ReLU):

    a = \max(0, z)

Repeat for the Output Layer:

Combine hidden layer outputs to compute the final prediction.
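The hidden-layer calculations above can be checked numerically with a short NumPy sketch (NumPy is an assumption; the values match the worked example):

```python
import numpy as np

x = np.array([1500, 3, 8])        # sq ft, bedrooms, location score
W = np.array([[0.2, 0.4, 0.6],
              [0.1, 0.3, 0.5]])   # 2 hidden neurons x 3 inputs
b = np.array([0.1, 0.2])

z = W @ x + b                     # weighted sums z_1, z_2
a = np.maximum(0, z)              # ReLU (both values are positive, so a = z)
print(np.round(z, 1))             # → [306.1 155.1]
```

Note how the square-footage term dominates both sums; in practice inputs on such different scales are normalized before training.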


Python Implementation

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Create the model
model = Sequential()
model.add(Dense(5, input_dim=3, activation='relu'))  # Hidden Layer 1
model.add(Dense(3, activation='relu'))               # Hidden Layer 2
model.add(Dense(1, activation='linear'))             # Output Layer

# Compile the model
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model (dummy data example; Keras expects NumPy arrays)
X = np.array([[1500, 3, 8], [2000, 4, 9], [1800, 3, 7]])
y = np.array([300000, 400000, 350000])
model.fit(X, y, epochs=50, verbose=1)

Conclusion

Understanding the inner workings of neural networks is crucial for building efficient machine learning models. By following the steps outlined in this guide, you can design networks tailored to your specific tasks.
