What are artificial neural networks?

Artificial neural networks (ANNs) are computational models inspired by the structure and function of the human brain. They are a class of machine learning models used for various tasks, including classification, regression, pattern recognition, and decision-making.

Components of an ANN

An ANN consists of the following elements:

  • An input layer: The first layer, which accepts input data and passes it on to the subsequent layers.
  • One or more hidden layers: Layers that process data via interconnected neurons between the input and output layers.
  • An output layer: The layer at the end of the network that generates predictions or outputs.
  • Neurons: The fundamental processing units, which take in inputs, apply weights and biases, and then produce an output through an activation function.
  • Weights: Parameters that control the strength of the connections between neurons and thus how much each input influences a neuron’s output.
  • Biases: Additional learnable parameters that are added to the weighted sum of a neuron’s inputs, shifting the point at which the activation function responds.
  • Activation function: A non-linear function applied to the weighted sum of inputs in each neuron, introducing non-linearity to the network.
  • Forward propagation: The method of creating predictions by transferring input data from the input layer to the output layer through the network.
  • Backpropagation: The process of computing gradients of the error with respect to weights and biases to adjust them during training.
  • Loss function: A function that gauges the performance of the network by calculating the difference between predicted results and actual labels.
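
To make the components above concrete, here is a minimal NumPy sketch of forward propagation through a tiny network with two inputs, three hidden neurons, and one output. The layer sizes, the parameter values, and the helper names (relu, sigmoid, forward) are assumptions chosen for illustration; they are not taken from any library or from the Keras example later in this article.

```python
import numpy as np

def relu(z):
    # ReLU activation: keeps positive values and zeroes out negatives.
    return np.maximum(0.0, z)

def sigmoid(z):
    # Sigmoid activation: squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    # Hidden layer: weighted sum of the inputs plus bias, then activation.
    hidden = relu(x @ W1 + b1)
    # Output layer: a single neuron producing a value between 0 and 1.
    return sigmoid(hidden @ W2 + b2)

# Hypothetical, hand-picked parameters (a trained network would learn these).
W1 = np.array([[1.0, -1.0, 0.5],
               [1.0, 1.0, -0.5]])     # shape: (2 inputs, 3 hidden neurons)
b1 = np.array([-1.0, 0.0, 0.0])       # one bias per hidden neuron
W2 = np.array([[2.0], [0.0], [0.0]])  # shape: (3 hidden neurons, 1 output)
b2 = np.array([0.0])

x = np.array([[0.9, 0.8],
              [0.1, 0.2]])            # two samples with two features each
output = forward(x, W1, b1, W2, b2)
print(output)
```

Each row of output corresponds to one input sample. Changing a weight or bias changes how strongly the corresponding input contributes to the result, which is exactly what training adjusts.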

Figure: Architecture of ANN

Training the model

Training an ANN entails adjusting the weights and biases using a learning algorithm and a labeled dataset. The goal is to minimize the error, i.e., the gap between the network’s predicted output and the actual (target) output. Gradient descent or one of its variants (such as stochastic gradient descent or Adam) is commonly used for this.
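
As a rough illustration of this training loop, the sketch below fits a single sigmoid neuron with plain batch gradient descent in NumPy. The dataset size, the learning rate, the number of steps, and the helper names (predict_proba, binary_cross_entropy) are assumptions made for this example; Keras performs the equivalent updates internally, typically with a more sophisticated optimizer.

```python
import numpy as np

# Synthetic data: label is 1 when the two features sum to more than 1.
rng = np.random.default_rng(0)
X = rng.random((200, 2))
y = (X[:, 0] + X[:, 1] > 1).astype(float)

w = np.zeros(2)      # weights, one per input feature
b = 0.0              # bias
learning_rate = 1.0  # assumed step size for this toy problem

def predict_proba(X, w, b):
    # Forward pass of a single sigmoid neuron.
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

def binary_cross_entropy(y, p):
    # Loss: average negative log-likelihood of the true labels.
    p = np.clip(p, 1e-12, 1 - 1e-12)  # avoid log(0)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

initial_loss = binary_cross_entropy(y, predict_proba(X, w, b))

for step in range(500):
    p = predict_proba(X, w, b)
    error = p - y                     # gradient of the loss w.r.t. the pre-activation
    grad_w = X.T @ error / len(y)     # gradient w.r.t. the weights
    grad_b = error.mean()             # gradient w.r.t. the bias
    w -= learning_rate * grad_w       # step against the gradient
    b -= learning_rate * grad_b

final_loss = binary_cross_entropy(y, predict_proba(X, w, b))
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

The loss after training should be lower than the loss before it, since every step moves the parameters in the direction that reduces the error on the training data.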

Python with TensorFlow/Keras ANN example

In this example, we’ll build a basic feedforward ANN using the popular deep-learning package TensorFlow and its high-level API, Keras. In a feedforward ANN, there are no feedback connections, so information flows in one direction from the input layer to the output layer. We will construct a simple neural network and train it on a synthetic dataset to tackle a binary classification problem.

  • First, import the required libraries.

    import numpy as np
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers
    
  • Let’s generate a synthetic dataset X that consists of two features (X1 and X2) and the labels y for binary classification.

    np.random.seed(55)
    X = np.random.rand(1200, 2)
    y = (X[:, 0] + X[:, 1] > 1).astype(float)
    
  • Create training and testing sets from the dataset.

    from sklearn.model_selection import train_test_split
    
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=55)
    
  • Create the neural network model.

    model = keras.Sequential([
      layers.Dense(64, activation='relu', input_shape=(2,)),
      layers.Dense(1, activation='sigmoid')
    ])
    
  • Compile the model.

    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    
  • Use the training data to train the model.

    history = model.fit(X_train, y_train, epochs=25, batch_size=32, validation_split=0.2)
    
  • Using the test data, evaluate the model.

    loss, accuracy = model.evaluate(X_test, y_test)
    print(f'Test loss: {loss:.4f}, Test accuracy: {accuracy:.4f}')
    
  • Now we can make predictions with the help of the trained model.

    predictions = model.predict(X_test)
    

We’ve built a simple ANN for a binary classification problem using Python and TensorFlow/Keras. Note that this is a simplified example; in real-world situations, we may need to perform data preprocessing, tune hyperparameters, and use more complex architectures for better performance.

Try it out

The complete program, combining the snippets above, is shown below:

    import numpy as np
    import tensorflow as tf
    from tensorflow import keras
    from tensorflow.keras import layers

    # Generate synthetic data
    np.random.seed(55)
    X = np.random.rand(1200, 2)
    y = (X[:, 0] + X[:, 1] > 1).astype(float)

    from sklearn.model_selection import train_test_split

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=55)

    model = keras.Sequential([
      layers.Dense(64, activation='relu', input_shape=(2,)),
      layers.Dense(1, activation='sigmoid')
    ])

    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

    history = model.fit(X_train, y_train, epochs=25, batch_size=32, validation_split=0.2)

    loss, accuracy = model.evaluate(X_test, y_test)
    print(f'Test loss: {loss:.4f}, Test accuracy: {accuracy:.4f}')

    predictions = model.predict(X_test)

Code explanation

  • Lines 7–9: The code seeds NumPy’s random number generator, then generates a synthetic dataset X with 1200 samples and two features, and a target array y of binary labels that is 1 when the sum of the two features in X is greater than 1 and 0 otherwise.

  • Line 13: It then splits the dataset X and the corresponding labels y into training and testing sets using the train_test_split() function from the scikit-learn library, with 25% of the data (X_test, y_test) reserved for testing and 75% (X_train, y_train) reserved for training. The random_state is an optional parameter that sets a seed for the random number generator. Providing a specific random_state ensures that the data split will be the same each time you run the code, which is helpful for reproducibility. In this case, random_state=55 sets the random seed to 55.

  • Lines 15–18: The code defines a simple neural network model using Keras with two dense layers. The first layer takes the two input features, has 64 neurons, and uses relu activation, while the second layer has one neuron with a sigmoid activation function, which outputs a value between 0 and 1 and is therefore suitable for binary classification.

  • Line 20: The model is compiled using the compile() function, with adam as the optimizer, binary_crossentropy as the loss function, and accuracy as the evaluation metric.

  • Line 22: The model is trained using the fit() function on the training data (X_train, y_train) for 25 epochs with a batch_size of 32, and 20% of the training data is used for validation during training.

  • Lines 24–25: After training, the model’s performance is evaluated using the evaluate() function on the test set (X_test, y_test), and the test loss and accuracy are printed.

  • Line 27: Finally, the model is used to make predictions on the test set X_test using the predict() function, and the predictions are stored in the predictions variable.
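
Because the output layer uses a sigmoid activation, each prediction is a value between 0 and 1 rather than a class label. The sketch below shows one common way to convert such values into labels. The probabilities and labels are made-up stand-ins for the output of predict() and for y_test, and the 0.5 threshold is a conventional choice assumed here, not something fixed by the model.

```python
import numpy as np

# Hypothetical sigmoid outputs, shaped (n_samples, 1) like the array
# returned by predict() for a single-neuron output layer.
probabilities = np.array([[0.91], [0.08], [0.55], [0.49]])

# Hypothetical true labels for the same four samples.
y_true = np.array([1.0, 0.0, 0.0, 0.0])

# Values above the threshold become class 1, the rest class 0;
# ravel() flattens the (n_samples, 1) column into a 1-D array.
predicted_labels = (probabilities > 0.5).astype(int).ravel()

# Fraction of samples whose predicted label matches the true label.
accuracy = np.mean(predicted_labels == y_true)

print(predicted_labels)  # [1 0 1 0]
print(accuracy)          # 0.75
```

Raising or lowering the threshold trades false positives against false negatives, so a value other than 0.5 may be preferable when one kind of error is more costly.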

Copyright ©2024 Educative, Inc. All rights reserved