Due to the size of our training dataset and the limitations of our computing resources, the CNNs we have constructed may not reach high validation accuracy during the training and validation process. For example, the recognition of 20 bird species only reaches 66.28% validation accuracy.
In the diagram, the validation loss increases even as the training accuracy continues to improve, which is a sign of overfitting. Through data augmentation, we can dramatically increase the effective size of the dataset. For example, the following code uses the classificationAugmentationPipeline function (defined below) to perform data augmentation:
imdsTrainAugmented = transform(imdsTrain,@classificationAugmentationPipeline,'IncludeInfo',true);
imds_cat = imageDatastore(cat(1, imdsTrain.Files, imdsTrainAugmented.UnderlyingDatastore.Files));
imds_cat.Labels = cat(1, imdsTrain.Labels, imdsTrainAugmented.UnderlyingDatastore.Labels);
function [dataOut,info] = classificationAugmentationPipeline(dataIn,info)
dataOut = cell([size(dataIn,1),2]);
for idx = 1:size(dataIn,1)
    temp = dataIn{idx};   % image for the current observation
    % Randomized Gaussian blur
    temp = imgaussfilt(temp,1.5*rand);
    % Optionally add salt and pepper noise
    %temp = imnoise(temp,'salt & pepper');
    % Add randomized rotation, scale, translation, reflection, and shear
    tform = randomAffine2d('Scale',[0.95,1.05],'Rotation',[-15 15], ...
        'XTranslation',[-15 15],'YTranslation',[-15 15], ...
        'XReflection',true,'XShear',[-30 30]);
    outputView = affineOutputView(size(temp),tform);
    temp = imwarp(temp,tform,'OutputView',outputView);
    % Optionally jitter the hue
    %temp = jitterColorHSV(temp,'Hue',[0.05 0.15]);
    % Form the second column expected by trainNetwork, which is the
    % expected response: the categorical label in this case
    dataOut(idx,:) = {temp,info.Label(idx)};
end
end
However, it will also dramatically slow down the training process!
The more practical solution is transfer learning: utilize and modify pre-trained image recognition networks, and retrain them with our bird images. This boosts the validation accuracy dramatically!
There are two ways to utilize an existing pre-trained network: (1) using a Matlab script to replace the last learnable layer (a fully connected layer in both Alexnet and ResNet-18) and the classification output layer; (2) using the Deep Network Designer provided by Matlab to conduct all the network modifications, data input configuration, and training through a visual programming interface.
The reason to replace the last learnable layer is to adapt the pre-trained network to recognize a new image dataset without changing the underlying lower-level feature extraction. Most of the lower-level feature extraction for different kinds of images (e.g. birds, cars, faces, etc.) is very similar, so we can take advantage of the existing networks.
The reason to replace the classification and output layers is to match the number of classes we are classifying. For example, both Alexnet and ResNet-18 were trained on over one million images across 1000 categories. However, in our bird recognition project, we only have 20 categories.
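As a sketch of approach (1), the last learnable layer and the classification output layer of Alexnet can be replaced in a script. This assumes the Deep Learning Toolbox Model for AlexNet Network support package is installed; the learn-rate factors shown are illustrative:

```matlab
% Load the pre-trained network and keep everything except the final
% fully connected layer, softmax layer, and classification output layer.
net = alexnet;
layersTransfer = net.Layers(1:end-3);

numClasses = 20;   % 20 bird species in our dataset
layers = [
    layersTransfer
    fullyConnectedLayer(numClasses, ...
        'WeightLearnRateFactor',20, ...   % learn faster in the new layer
        'BiasLearnRateFactor',20)
    softmaxLayer
    classificationLayer];
```

Setting larger learn-rate factors on the new fully connected layer lets it train quickly while the transferred feature-extraction layers change slowly.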
In addition, we have to resize the input images to match the pre-trained network. For example, Alexnet requires 227 x 227 input images, while our bird dataset images are 224 x 224.
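Rather than resizing the image files on disk, an augmentedImageDatastore can rescale each image to the network's input size on the fly. A minimal sketch, assuming imdsTrain and a hypothetical imdsValidation datastore hold our bird images:

```matlab
% AlexNet expects 227-by-227 input; in practice this can be read from
% net.Layers(1).InputSize instead of hard-coding it.
inputSize = [227 227];
augimdsTrain = augmentedImageDatastore(inputSize, imdsTrain);
augimdsValidation = augmentedImageDatastore(inputSize, imdsValidation);
```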
In the following diagram, by utilizing the pre-trained Alexnet, the validation accuracy reaches over 93%!
In the following diagram, by utilizing the pre-trained ResNet-18, the validation accuracy now reaches over 96%!
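Putting the pieces together, retraining the modified network might look like the following sketch. It assumes layers holds the modified layer array and that augimdsTrain and augimdsValidation are datastores resized to the network's input size; the option values are illustrative, not tuned:

```matlab
options = trainingOptions('sgdm', ...
    'MiniBatchSize',32, ...
    'MaxEpochs',6, ...
    'InitialLearnRate',1e-4, ...          % small rate so transferred weights change slowly
    'ValidationData',augimdsValidation, ...
    'ValidationFrequency',30, ...
    'Shuffle','every-epoch', ...
    'Plots','training-progress');

netTransfer = trainNetwork(augimdsTrain, layers, options);
```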
Unit 4 Project: Transfer Learning Using Alexnet & ResNet-18 for Bird Recognition
The goal of this project is to improve the bird recognition performance of the Unit 3 project to over 90% by retraining existing CNNs such as Alexnet and ResNet-18 provided in the Matlab Deep Learning Toolbox.
Project Deadline: 09/03/2020