We start by importing everything we will be using to make the CNN.
import torch
from PIL import Image
from torchvision import transforms, models
Download the ImageNet class list.
!wget -O imagenet_classes.txt "https://raw.githubusercontent.com/Lasagne/Recipes/master/examples/resnet50/imagenet_classes.txt"
Start by loading the classes from the file. I print the first 5 of them just for confirmation.
with open('imagenet_classes.txt') as f: #read the categories from file
    classes = [line.strip() for line in f.readlines()]
print(classes[0:5])
We load the image using PIL.
image = Image.open('./WelshCorgi.jpeg') #load image
Next we define a helper function that transforms an image into the input format AlexNet expects. It resizes and crops the image to 224x224, converts it to a tensor, and normalizes each channel with the mean and standard deviation of the ImageNet dataset. Finally it adds a leading dimension so that our single image forms a "batch" of one. Without the normalization the results wouldn't be very useful, because the pretrained weights were learned on inputs normalized this way.
def preprocess(image):
    transform = transforms.Compose([
        transforms.Resize(256), #resize so the shorter side is 256 pixels
        transforms.CenterCrop(224), #crop the center 224x224 region
        transforms.ToTensor(), #convert to a tensor with values in [0, 1]
        transforms.Normalize( #normalize data with imagenet mean and std dev
            mean=[0.485, 0.456, 0.406],
            std=[0.229, 0.224, 0.225]
        )
    ])
    img_n = transform(image) #apply the transform
    return torch.unsqueeze(img_n, 0) #add a batch dimension at the start
Now we apply this preprocessing to the image we will be using.
data = preprocess(image)
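To make the preprocessing steps concrete, here is a small pure-Python sketch of the arithmetic involved. It assumes that `Resize(256)` with a single integer scales the shorter side to 256 while keeping the aspect ratio, and that `ToTensor` maps 8-bit values to the range [0, 1]. The helper names and the 640x480 example size are hypothetical, and the exact rounding inside torchvision may differ by a pixel.

```python
# Illustrative only: mirrors the preprocessing arithmetic without torchvision.
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def resized_size(width, height, target=256):
    """Scale so the shorter side equals `target`, keeping the aspect ratio."""
    if width <= height:
        return target, int(target * height / width)
    return int(target * width / height), target


def center_crop_box(width, height, crop=224):
    """Return (left, top, right, bottom) of a centered crop x crop window."""
    left = (width - crop) // 2
    top = (height - crop) // 2
    return left, top, left + crop, top + crop


def normalize_pixel(rgb):
    """Map 8-bit RGB to [0, 1], then apply (x - mean) / std per channel."""
    scaled = [v / 255.0 for v in rgb]
    return [(x - m) / s for x, m, s in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)]


w, h = resized_size(640, 480)  # hypothetical 640x480 input image
box = center_crop_box(w, h)  # region kept by the 224x224 center crop
pixel = normalize_pixel((124, 116, 104))  # a pixel close to the ImageNet mean
print((w, h), box, [round(c, 3) for c in pixel])
```

A pixel close to the dataset mean ends up near zero in every channel, which is the point of the normalization: the network sees inputs centered and scaled the same way as during training.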
First we load AlexNet without any pretrained weights. Its weights are initialized to random values.
alex_net = models.alexnet() #random weights
alex_net.eval() #set to evaluation mode
Let's output the number of input and output features of the last layer.
last_layer = alex_net.classifier[-1] #final fully connected layer
print("Last layer has", last_layer.in_features, "input features and", last_layer.out_features, "output features")
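As a sanity check on the layer sizes, recall that a fully connected layer with `n_in` inputs and `n_out` outputs stores an `n_out x n_in` weight matrix plus `n_out` biases. The sketch below works this out for AlexNet's final classifier layer, assuming the standard torchvision architecture with 4096 input features and 1000 ImageNet classes; `linear_param_counts` is a hypothetical helper.

```python
def linear_param_counts(n_in, n_out):
    """Return (weight_count, bias_count) for a fully connected layer."""
    return n_in * n_out, n_out


# Assumed sizes of AlexNet's last layer: 4096 features -> 1000 classes.
weights, biases = linear_param_counts(4096, 1000)
print("weights:", weights, "biases:", biases, "total:", weights + biases)
```

Note that calling `numel()` on the layer's weight tensor returns the first of these numbers (the total count of weight entries), not the number of input features.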
Now we are ready to get a prediction from the CNN.
image_pred = alex_net(data) #get a prediction
We define a quick function that summarizes the top 5 predictions and prints them out for us.
def summarize(out): #helper util to summarize the top 5 predictions
    indices = torch.argsort(out, dim=1, descending=True)
    prob = torch.nn.functional.softmax(out, dim=1)[0] * 100
    top = [(classes[index], prob[index].item()) for index in indices[0][:5]]
    for val in top:
        print("Prediction:", val[0], "Probability", val[1])
So what is our picture?
summarize(image_pred)
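The ranking step in `summarize` can be illustrated without PyTorch. The following sketch applies a numerically stable softmax to a short list of made-up logits and picks the top entries. The label names and logit values are hypothetical and chosen only for illustration; they are not real model output.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(labels, logits, k=5):
    """Return the k (label, percent) pairs with the highest probability."""
    probs = softmax(logits)
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    return [(labels[i], probs[i] * 100) for i in order[:k]]


# Hypothetical labels and logits, not real model output.
labels = ["corgi", "tabby", "teapot", "canoe"]
confident = top_k(labels, [6.0, 1.0, 0.5, 0.0], k=2)
uncertain = top_k(labels, [0.01, 0.0, -0.01, 0.0], k=2)
print(confident)
print(uncertain)
```

When the logits are nearly equal, as one would expect from a network with untrained weights, every class receives roughly the same probability, so the top 5 entries carry little information.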
Now we reuse the functions above with an AlexNet that was pretrained on ImageNet data.
net = models.alexnet(pretrained=True) #load pretrained ImageNet weights (torchvision >= 0.13 prefers weights=models.AlexNet_Weights.DEFAULT)
net.eval() #set to evaluation mode
image_pred2 = net(data) #get a prediction
summarize(image_pred2) #get summary
These are much better predictions!