Artificial neural networks (ANNs) are computational models inspired by the structure and function of the human brain. They are a class of machine learning models used for various tasks, including classification, regression, pattern recognition, and decision-making.
An ANN consists of the following elements:
- Neurons (nodes) arranged in an input layer, one or more hidden layers, and an output layer
- Weights and biases that control the strength of the connections between neurons
- Activation functions (such as ReLU or sigmoid) that introduce non-linearity into the network
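To make these elements concrete, here is a minimal NumPy sketch (with illustrative values only, separate from the example that follows) of how a single neuron combines weights, a bias, and a sigmoid activation:
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1 / (1 + np.exp(-z))

# Illustrative values: 2 input features feeding 1 neuron
x = np.array([0.5, 0.8])   # input features
W = np.array([0.4, 0.6])   # one weight per input
b = -0.3                   # bias term

# A neuron computes activation(weights . inputs + bias)
output = sigmoid(np.dot(W, x) + b)
print(output)  # a value between 0 and 1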
Training an ANN entails adjusting its weights and biases using a learning algorithm and a labeled dataset. The goal is to minimize the difference between the network's predicted output and the true labels (i.e., the error). Gradient descent, or one of its variants, is commonly used for this.
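As a rough illustration of one such update (with made-up numbers, not how Keras implements it internally), a single gradient-descent step for one weight and one bias could look like this:
# One gradient-descent step for a single linear neuron with squared-error loss.
# All values here are illustrative.
x, y_true = 0.5, 1.0   # one training sample and its label
w, b = 0.2, 0.0        # current weight and bias
lr = 0.1               # learning rate

y_pred = w * x + b         # forward pass
error = y_pred - y_true    # prediction error
# Gradients of the squared error 0.5 * error**2 with respect to w and b
grad_w = error * x
grad_b = error

# Move the parameters in the direction that reduces the error
w -= lr * grad_w
b -= lr * grad_b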
In this example, we’ll build a basic ANN for binary classification using Python and TensorFlow/Keras.
First, we’ll import the required libraries.
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
Let’s generate a synthetic dataset X that consists of two features (X1 and X2), along with labels y for binary classification.
np.random.seed(55)  # make the random data reproducible
X = np.random.rand(1200, 2)  # 1200 samples, 2 features in [0, 1)
y = (X[:, 0] + X[:, 1] > 1).astype(float)  # label 1 when the two features sum to more than 1
Create training and testing sets from the dataset.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=55)
Create the neural network model.
model = keras.Sequential([
layers.Dense(64, activation='relu', input_shape=(2,)),
layers.Dense(1, activation='sigmoid')
])
Now the model will be compiled.
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
Train the model on the training data.
history = model.fit(X_train, y_train, epochs=25, batch_size=32, validation_split=0.2)
Using the test data, evaluate the model.
loss, accuracy = model.evaluate(X_test, y_test)
print(f'Test loss: {loss:.4f}, Test accuracy: {accuracy:.4f}')
Now we can make predictions with the help of the trained model.
predictions = model.predict(X_test)
We’ve built a simple ANN for a binary classification problem using Python and TensorFlow/Keras. Note that this is a simplistic example and that we may need to perform data preprocessing, adjust hyperparameters, and manage more complex architectures for better performance in real-world situations.
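For instance, if the features were on very different scales, a standardization step could be added before training; here is a sketch using scikit-learn's StandardScaler, fit on the training data only:
from sklearn.preprocessing import StandardScaler

# Fit the scaler on the training data only, then apply it to both splits,
# so that no information from the test set leaks into training.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)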
The complete code for this example is shown below:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Generate synthetic data
np.random.seed(55)
X = np.random.rand(1200, 2)
y = (X[:, 0] + X[:, 1] > 1).astype(float)

# Split the data into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=55)

model = keras.Sequential([
    layers.Dense(64, activation='relu', input_shape=(2,)),
    layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

history = model.fit(X_train, y_train, epochs=25, batch_size=32, validation_split=0.2)

loss, accuracy = model.evaluate(X_test, y_test)
print(f'Test loss: {loss:.4f}, Test accuracy: {accuracy:.4f}')

predictions = model.predict(X_test)
Lines 7–9: The code generates a synthetic dataset X with 1200 samples and two features, and a target array y containing binary labels based on whether the sum of the two features in X is greater than 1.
Line 13: It then splits the dataset X and the corresponding labels y into training and testing sets using the train_test_split() function from the scikit-learn library, with 25% of the data reserved for testing (X_test, y_test) and 75% reserved for training (X_train, y_train). The random_state parameter is optional and sets a seed for the random number generator. Providing a specific random_state ensures that the data split will be the same each time you run the code, which is helpful for reproducibility. In this case, random_state=55 sets the random seed to 55.
Lines 15–18: The code defines a simple neural network model using Keras with two dense layers. The first layer has 64 neurons and uses the relu activation, while the second layer has one neuron with a sigmoid activation function, suitable for binary classification with two input features.
Line 20: The model is compiled using the compile() function. It takes several parameters: adam as the optimizer, binary_crossentropy as the loss function, and accuracy as the evaluation metric.
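If more control is needed than the 'adam' string shortcut offers, compile() also accepts an optimizer object; here is a sketch with Adam's default learning rate spelled out explicitly:
# Equivalent to optimizer='adam', but with the learning rate made explicit
optimizer = keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=optimizer, loss='binary_crossentropy', metrics=['accuracy'])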
Line 22: The model is trained using the fit() function on the training data (X_train, y_train) for 25 epochs with a batch_size of 32, and 20% of the training data is used for validation during training.
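The fit() function returns a History object whose history attribute maps each metric name to a list of per-epoch values, so the training and validation loss can be plotted afterwards (a sketch assuming matplotlib is available):
import matplotlib.pyplot as plt

# history.history is a dict like {'loss': [...], 'val_loss': [...], ...}
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()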
Lines 24–25: After training, the model’s performance is evaluated using the evaluate() function on the test set (X_test, y_test), and the test loss and accuracy are printed.
Line 27: Finally, the model is used to make predictions on the test set X_test using the predict() function, and the predictions are stored in the predictions variable.
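Because the sigmoid output layer produces probabilities between 0 and 1 rather than hard class labels, a common follow-up step (sketched here with the conventional 0.5 threshold) is to convert the predictions and cross-check the accuracy:
# Convert predicted probabilities to hard 0/1 labels at the 0.5 threshold
predicted_labels = (predictions > 0.5).astype(int).ravel()

# Manually computed accuracy should match the result of model.evaluate()
manual_accuracy = (predicted_labels == y_test).mean()
print(f'Manual accuracy: {manual_accuracy:.4f}')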