
Natural Images Classification

  • maheshkamineni35
  • May 3, 2022
  • 4 min read

Updated: May 4, 2022

A Convolutional Neural Network (ConvNet/CNN) is a Deep Learning algorithm which can take in an input image, assign importance (learnable weights and biases) to various aspects/objects in the image and be able to differentiate one from the other. The pre-processing required in a ConvNet is much lower as compared to other classification algorithms. While in primitive methods filters are hand-engineered, with enough training, ConvNets have the ability to learn these filters/characteristics.

The architecture of a ConvNet is analogous to that of the connectivity pattern of Neurons in the Human Brain and was inspired by the organization of the Visual Cortex. Individual neurons respond to stimuli only in a restricted region of the visual field known as the Receptive Field. A collection of such fields overlap to cover the entire visual area.


About Dataset

This dataset contains 6,899 images from 8 distinct classes compiled from various sources (see Acknowledgements). The classes include airplane, car, cat, dog, flower, fruit, motorbike and person.

Through manual inspection we can see that the images are not of a consistent size,

so we have to resize them.

Preprocessing

Resizing Images

Here we will resize the images in order to maintain consistency. Since our network receives inputs of a fixed size, every image needs to be resized to the same dimensions; the larger the fixed size, the less shrinking is required. We keep the image size at 50x50.

In recent times neural networks have become much deeper with networks going from a minimal number of layers (e.g. AlexNet, LeNet, VGGNet) to over a hundred layers (ResNet).

Residual Learning was introduced by He et al. from Microsoft Research in the paper titled “Deep Residual Learning for Image Recognition”. From observing the existing neural networks at the time, they found that accuracy does not improve simply by adding layers to the network.

AlexNet → 8 Layers

VGGNet → 16 Layers

GoogLeNet → 22 Layers

If we were to add another layer to a model that is already deep enough, the new layer should ideally learn to be a direct copy of the previous block's output, known as the identity map, because the earlier layers have already extracted the useful features. However, [1] suggests that plain stacked layers find it difficult to learn this identity map. Simply adding layers is therefore ineffective, and we need a different approach. This is where "Deep Residual Learning for Image Recognition" comes into play.
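The core idea can be sketched in a few lines of NumPy (a toy illustration, not the actual ResNet code): a residual block computes y = ReLU(F(x) + x), so when extra layers are unnecessary the network only has to drive the residual F toward zero to recover an identity mapping, which is much easier than learning the identity from scratch.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, residual_fn):
    """y = ReLU(F(x) + x): the block learns the residual F, not the full mapping."""
    return relu(residual_fn(x) + x)

x = np.array([1.0, -2.0, 3.0])
# if the learned residual F collapses to zero, the block reduces to an identity
# map (up to the final ReLU)
out = residual_block(x, lambda v: np.zeros_like(v))
print(out)  # [1. 0. 3.]
```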


In the next step I converted the images into arrays.



The next step is to load the data.



Next, I imported the random module and shuffled the training data.
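A minimal sketch of that step (the stand-in entries and the seed are assumptions for illustration):

```python
import random

# training_data is a list of (image_array, label) pairs; shuffling prevents the
# network from seeing all examples of one class in a row during training
training_data = [("img_a", 0), ("img_b", 1), ("img_c", 2)]  # stand-in entries
random.seed(42)  # hypothetical seed, only for reproducibility
random.shuffle(training_data)
print(training_data)
```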

Visualize the data

For this I imported the scikit-learn preprocessing and Keras categorical utilities.
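A sketch of how those utilities fit together, assuming string labels: `LabelEncoder` turns class names into integers, and the one-hot step below is equivalent to Keras's `to_categorical` (written with NumPy here so the snippet stands alone):

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

labels = ["cat", "dog", "cat", "airplane"]  # hypothetical sample labels

encoder = LabelEncoder()
y = encoder.fit_transform(labels)            # classes sorted: airplane=0, cat=1, dog=2
y_onehot = np.eye(len(encoder.classes_))[y]  # same result as keras to_categorical
print(y_onehot)
```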


Weight Initialisation

Weight initialisation defines the initial values for the parameters in a model prior to training on the dataset. Since we are using transfer learning, we add pre-trained ImageNet weights to aid the ResNet model.

Correct for data imbalance

Here class imbalance can be corrected by weighting each class using a formula based on its frequency.
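The formula itself was not shown; a common choice (an assumption on my part, and what scikit-learn calls the "balanced" heuristic) weights each class inversely to its frequency. The per-class counts below are hypothetical:

```python
import numpy as np

# hypothetical per-class image counts (the real dataset totals 6,899 images)
counts = np.array([100, 300, 200, 400])
n_samples, n_classes = counts.sum(), len(counts)

# "balanced" heuristic: weight_i = n_samples / (n_classes * count_i)
class_weights = n_samples / (n_classes * counts)
print(dict(enumerate(class_weights)))  # rarer classes get larger weights
```

The resulting dictionary can be passed to Keras's `model.fit(..., class_weight=...)`.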



Splitting of Training data
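The split code is not shown; a minimal sketch with scikit-learn, using dummy data and an assumed 80/20 ratio:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# X: image tensor, y: one-hot labels (random stand-ins so the snippet runs alone)
X = np.random.rand(100, 50, 50, 3)
y = np.eye(8)[np.random.randint(0, 8, size=100)]

# hold out 20% for testing; the exact ratio and seed are assumptions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape)  # (80, 50, 50, 3) (20, 50, 50, 3)
```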


Batch Normalisation

This is a layer that is already included in the ResNet50 architecture, so it was not something we had to add ourselves. Batch normalisation allows every layer of the network to learn more independently and is used to normalise the output of the previous layers. [12] Basically, it is a technique that standardises the inputs to each layer for each mini-batch.
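The standardisation step can be illustrated in NumPy (a toy sketch: real batch norm also applies a learnable scale and shift, omitted here):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Standardise a mini-batch per feature: zero mean, unit variance."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

# a mini-batch of 3 samples with 2 features on very different scales
batch = np.array([[1.0, 200.0], [3.0, 400.0], [5.0, 600.0]])
normed = batch_norm(batch)
print(normed.mean(axis=0))  # ~[0, 0]
print(normed.std(axis=0))   # ~[1, 1]
```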

Transfer Learning

Transfer learning allows us to utilize the elements of a pre-trained model by reusing them in a new machine learning model. Since the ResNet architecture is trained on ImageNet, we can load pre-trained convolutional weights and train our model on top of them. The Sequential API is the easiest way to build a model in Keras, as it lets us build the model layer by layer using the .add() function.
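The model-building code appeared as an image; here is a plausible reconstruction (the function name, pooling layer, and head are assumptions about how the ResNet50 base was wrapped):

```python
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Sequential

def build_model(num_classes=8, weights="imagenet"):
    # pre-trained convolutional base; include_top=False drops ImageNet's classifier
    base = ResNet50(weights=weights, include_top=False, input_shape=(50, 50, 3))
    model = Sequential()
    model.add(base)
    model.add(GlobalAveragePooling2D())
    model.add(Dense(num_classes, activation="softmax"))
    return model

# weights=None here only to sketch the architecture without downloading;
# pass weights="imagenet" for the actual transfer-learning setup
model = build_model(weights=None)
model.summary()
```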


Train the model


Train and test the model

After this we train the model on the dataset, using a batch size of 32.

We used categorical cross entropy as the loss function.
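A self-contained sketch of the compile/fit step; the tiny dense model and random arrays are stand-ins so the snippet runs on its own (in the post, the model is the ResNet50-based one and the data is the real train split):

```python
import numpy as np
from tensorflow.keras.layers import Dense, Flatten, Input
from tensorflow.keras.models import Sequential

# stand-in model and data, only so this snippet is runnable in isolation
model = Sequential([Input(shape=(50, 50, 3)),
                    Flatten(),
                    Dense(8, activation="softmax")])
X_train = np.random.rand(64, 50, 50, 3)
y_train = np.eye(8)[np.random.randint(0, 8, size=64)]

model.compile(optimizer="adam",
              loss="categorical_crossentropy",  # the loss used in the post
              metrics=["accuracy"])
history = model.fit(X_train, y_train, batch_size=32, epochs=1, verbose=0)
print(history.history["loss"])
```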


Our model was tested with 90% accuracy and a loss of 0.32.


To Host Web Application

To host the web application we install the pyngrok and streamlit packages,

and we load our trained model in the app.py file.
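The app.py shown in the post is an image; below is a minimal sketch of what such a file could look like (the model filename `model.h5`, the `preprocess` helper, and the UI strings are all assumptions, not the post's exact code):

```python
# app.py — minimal Streamlit front-end sketch
import numpy as np

IMG_SIZE = 50
CLASSES = ["airplane", "car", "cat", "dog",
           "flower", "fruit", "motorbike", "person"]

def preprocess(img: np.ndarray) -> np.ndarray:
    """Scale pixels to [0, 1] and add the batch dimension the model expects."""
    return np.expand_dims(img.astype("float32") / 255.0, axis=0)

def main():
    import cv2
    import streamlit as st
    from tensorflow.keras.models import load_model

    model = load_model("model.h5")  # hypothetical filename for the trained model
    uploaded = st.file_uploader("Upload an image", type=["jpg", "jpeg", "png"])
    if uploaded is not None:
        data = np.frombuffer(uploaded.read(), dtype=np.uint8)
        img = cv2.resize(cv2.imdecode(data, cv2.IMREAD_COLOR),
                         (IMG_SIZE, IMG_SIZE))
        st.image(img, channels="BGR")
        pred = model.predict(preprocess(img))
        st.write(f"Prediction: {CLASSES[int(np.argmax(pred))]}")

# Streamlit executes this file top to bottom, so when deploying,
# end the file with a bare call to main().
```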


The reason for using ngrok is authentication and tunnelling. To host the application we obtain a static token from the ngrok website and pass it on the command line; our application then opens at the generated web URL.

!ngrok authtoken 27mOZLQPhbhaKoMWKAEJ1uL14aS_52Nyg4u82cfZGo72sBKFd

27mOZLQPhbhaKoMWKAEJ1uL14aS_52Nyg4u82cfZGo72sBKFd is the static auth token.

It runs on port 8505.

Finally, we run the Streamlit application where our image classifier model will be hosted.

Below is the code

!streamlit run streamlit_host.py



Here it asks for an image to upload; we provide the image for prediction.



Difficulty and Challenges faced

· Did not know what ImageDataGenerator and flow_from_directory return, so I had difficulty passing new images to the models for prediction. It turns out each batch is basically a 4D NumPy array.

· The training was done on Colaboratory. To create and test the app, I had to download the trained model and run it locally, but due to low specs it couldn't run on my laptop. So I tried using ngrok on Colab, which didn't work, but an alternative, remote.it, worked perfectly.

· To deploy the app on Streamlit, I used tensorflow 2.5.0, which has a huge size, so the slug was exceeding the 500 MB limit. Then I used tensorflow-cpu 2.5 and it worked.


References

Image Reference:

Project References




Video Link













 
 
 
