Natural Images Classification
- maheshkamineni35
- May 3, 2022
- 4 min read
Updated: May 4, 2022
A Convolutional Neural Network (ConvNet/CNN) is a Deep Learning algorithm which can take in an input image, assign importance (learnable weights and biases) to various aspects/objects in the image and be able to differentiate one from the other. The pre-processing required in a ConvNet is much lower as compared to other classification algorithms. While in primitive methods filters are hand-engineered, with enough training, ConvNets have the ability to learn these filters/characteristics.
The architecture of a ConvNet is analogous to that of the connectivity pattern of Neurons in the Human Brain and was inspired by the organization of the Visual Cortex. Individual neurons respond to stimuli only in a restricted region of the visual field known as the Receptive Field. A collection of such fields overlap to cover the entire visual area.

About Dataset
This dataset contains 6,899 images from 8 distinct classes compiled from various sources (see Acknowledgements). The classes include airplane, car, cat, dog, flower, fruit, motorbike and person.
Through manual inspection we can see that the images are not uniformly sized,
so we have to resize them.
Preprocessing
Resizing Images
Here we resize the images in order to maintain consistency. Since our network receives inputs of a fixed size, every image needs to be resized to the same dimensions; the larger the fixed size, the less shrinking is required. We are keeping the image size at 50x50.
In recent years neural networks have become much deeper, going from a handful of layers (e.g. LeNet, AlexNet, VGGNet) to over a hundred layers (ResNet).
Residual learning was introduced by He et al. from Microsoft Research in the paper titled “Deep Residual Learning for Image Recognition”. From observing the existing neural networks of the time, they found that accuracy does not improve simply by adding layers to the network.
AlexNet → 8 Layers
VGGNet→ 16 Layers
GoogLeNet→ 22 Layers
If we add another layer to a model that is already deep enough, the new layer should ideally learn to be a direct copy of the previous block, known as an identity mapping, because the model has already extracted the useful features. However, the paper suggests that plain stacked layers find it difficult to learn this identity mapping, so simply adding layers is ineffective and a new solution is needed. This is where the residual learning of “Deep Residual Learning for Image Recognition” comes into play.
In the next step I converted the images into arrays.

The next step is to load the data.

Next, I imported the random module and shuffled the training data.
Visualize the data
For this I imported scikit-learn's preprocessing module and Keras's to_categorical utility.
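A sketch of the label encoding step, assuming scikit-learn's LabelEncoder and Keras's to_categorical (the post only names the packages):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical

labels = ["cat", "dog", "cat", "airplane"]  # sample class labels

encoder = LabelEncoder()
int_labels = encoder.fit_transform(labels)           # classes mapped to 0..n-1 alphabetically
one_hot = to_categorical(int_labels, num_classes=8)  # one row per sample, 8 columns
```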

Weight Initialisation
Weight initialisation defines the initial values of a model's parameters prior to training on the dataset. Since we are using transfer learning, we load pre-trained ImageNet weights into the ResNet model.
Correct for data imbalance
Here we correct for class imbalance by computing a weight for each class, so under-represented classes count more in the loss.
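The formula itself is not reproduced in the post; a common way to compute such weights is scikit-learn's "balanced" scheme, n_samples / (n_classes * count(c)), sketched here as an assumption:

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# imbalanced toy labels: 100 of class 0, 20 of class 1
y = np.array([0] * 100 + [1] * 20)

weights = compute_class_weight(class_weight="balanced",
                               classes=np.unique(y), y=y)
class_weight = dict(enumerate(weights))  # can be passed to model.fit(class_weight=...)
print(class_weight)  # {0: 0.6, 1: 3.0}
```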

Splitting of Training data
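A sketch of the split, assuming scikit-learn's train_test_split with an 80/20 ratio (the actual ratio is not stated in the post):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# stand-in data with the shapes used in this post
X = np.random.rand(100, 50, 50, 3)
y = np.random.randint(0, 8, size=100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape)  # (80, 50, 50, 3) (20, 50, 50, 3)
```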

Batch Normalisation
This is a layer that is already included in the ResNet50 architecture, so it was not something we had to add ourselves. Batch normalisation allows every layer of the network to learn more independently by normalizing the output of the previous layer.[12] Basically, it is a technique that standardizes the inputs to each layer for each mini-batch.
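To illustrate what the layer computes, here is a minimal NumPy sketch of the per-mini-batch standardization (the learnable scale and shift parameters gamma and beta are omitted for brevity):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Standardize each feature over the mini-batch (axis 0)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

batch = np.array([[1.0, 10.0],
                  [3.0, 30.0],
                  [5.0, 50.0]])
normed = batch_norm(batch)
# each column now has approximately zero mean and unit variance
```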
Transfer Learning
Transfer learning allows us to reuse elements of a pre-trained model in a new machine learning model. Since the ResNet architecture comes with weights pre-trained on ImageNet, we can load those convolutional weights and train our model on top of them. The Sequential model is the easiest way to build a model in Keras: it lets us build the model layer by layer using the .add() function.
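A sketch of the model construction, assuming a ResNet50 base with include_top=False and a softmax head for the 8 classes (the exact head layers used in the original post are not shown):

```python
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D

def build_model(num_classes=8, weights="imagenet"):
    # convolutional base; weights="imagenet" loads the pre-trained filters
    base = ResNet50(weights=weights, include_top=False,
                    input_shape=(50, 50, 3))
    model = Sequential()
    model.add(base)                       # reused pre-trained layers
    model.add(GlobalAveragePooling2D())   # collapse spatial dimensions
    model.add(Dense(num_classes, activation="softmax"))
    return model
```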

Train the model

Train and test the model
After this we trained the model on the dataset with a batch size of 32.
We used categorical cross entropy as the loss function.
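Putting compilation and training together, here is a runnable sketch. A tiny stand-in model is used so the example runs quickly; the real post trains the ResNet-based model, and the epoch count is not stated:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense

# tiny stand-in model so the sketch executes fast
model = Sequential([Flatten(input_shape=(50, 50, 3)),
                    Dense(8, activation="softmax")])

model.compile(optimizer="adam",
              loss="categorical_crossentropy",  # loss function used in the post
              metrics=["accuracy"])

# dummy data shaped like the resized dataset
X = np.random.rand(64, 50, 50, 3).astype("float32")
y = np.eye(8)[np.random.randint(0, 8, 64)]

history = model.fit(X, y, batch_size=32, epochs=1, verbose=0)
```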

Our model achieved 90% accuracy on the test set with a loss of 0.32.

To Host Web Application
To host the web application we install the pyngrok and streamlit packages,
and we load our model in the app.py file.

We use ngrok for authentication and tunneling: to host the application we have to get a static token from the ngrok website and pass that token to ngrok, after which our application becomes reachable at a public web URL.
!ngrok authtoken 27mOZLQPhbhaKoMWKAEJ1uL14aS_52Nyg4u82cfZGo72sBKFd
Here 27mOZLQPhbhaKoMWKAEJ1uL14aS_52Nyg4u82cfZGo72sBKFd is the static auth token.
The application runs on port 8505.
Finally, we run the Streamlit application where our image classifier model (the .model file) will be hosted.
Below is the code
!streamlit run streamlit_host.py

The app asks for an image to upload; we provide the image for prediction.

Difficulty and Challenges faced
· I did not know what ImageDataGenerator's flow_from_directory returns, so I had difficulty passing new images to the model for prediction. It is basically batches of 4D NumPy arrays.
· The training was done on Colaboratory. To create and test the app, I had to download the trained model and run it locally, but due to low specs it couldn't run on my laptop. I then tried using ngrok on Colab, which didn't work, but an alternative, remote.it, worked perfectly.
· To deploy the app on Streamlit, I used tensorflow 2.5.0, which has a huge size, so the slug was exceeding the limit (500 MB). I then used tensorflow-cpu 2.5 and it worked.
References
Project References
https://gist.github.com/mikesmales/5b7f527b23ce686653202d9ae1d0e485 ->for categorical cross entropy


