My experience with participating in a Machine Learning Competition by IndabaX Cameroon: Detect passion fruit disease

This was my first image classification competition. I found out about the challenge on LinkedIn about two weeks before it ended. As someone who is passionate about solving problems with AI, I decided to give it a go! This post documents my training process. I hope it inspires you.

About the problem:

Passion fruit pests and diseases in Uganda lead to reduced yields and decreased investment in farming over time. Most Ugandan farmers (including passion fruit farmers) are smallholder farmers from low-income households, and do not have sufficient information and means to combat these challenges. Without the required knowledge about the health of their crops, farmers cannot intervene promptly to avoid devastating losses.

The Goal: The aim of this challenge was to classify the disease status of a plant given an image of a passion fruit.

Dataset Structure

The dataset came with train and test CSV files. Each CSV file contains the image paths, and the train CSV also has the label for each image. The train set contained 1548 images and the test set contained 512 images.

structure of train data

sample of dataset

There were 3 classes for all the fruits: Healthy, woodiness, and brownspot. The classes were almost balanced in size.
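A quick way to check a claim like "almost balanced" is to count the labels in the train CSV. The snippet below is a minimal sketch using only the Python standard library. The column names (image_path, label) and the tiny in-memory sample are made up for illustration and are not the real competition file.

```python
import csv
import io
from collections import Counter


def class_counts(csv_file, label_col="label"):
    """Count how many rows carry each label in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[label_col] for row in reader)


def imbalance_ratio(counts):
    """Largest class size divided by smallest; 1.0 means perfectly balanced."""
    return max(counts.values()) / min(counts.values())


# Hypothetical sample standing in for the real train CSV.
sample = io.StringIO(
    "image_path,label\n"
    "img_001.jpg,healthy\n"
    "img_002.jpg,woodiness\n"
    "img_003.jpg,brownspot\n"
    "img_004.jpg,healthy\n"
    "img_005.jpg,woodiness\n"
)

counts = class_counts(sample)
print(counts)
print(imbalance_ratio(counts))
```

For the real data, open the train CSV with open(...) and pass the file object to class_counts. A ratio close to 1.0 supports using plain accuracy as the metric.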

Data Processing:

The images were loaded using fastai's ImageDataLoaders, a class that loads the images and labels for the model. It read the data from the CSV, and the images were resized to 224 pixels. The dataloader also applied the following augmentation parameters to the images to improve the performance of the model:

max_zoom: The maximum zoom on an image set to 1.2
max_warp: The maximum warp on an image set to 0.2
max_lighting: The maximum lighting on an image set to 0.5
vertical_flip: Allows vertical flipping of an image set to True
max_rotate: The maximum rotation of an image set to 30
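The settings above could be passed to fastai roughly as follows. This is a minimal sketch assuming fastai v2, where the vertical flip option is called flip_vert in aug_transforms. The file name Train.csv, the folder name images, the column positions, and the batch size are assumptions, not details from the competition. Note that from_csv holds out a random 20% validation split by default (valid_pct=0.2), which the sketch leaves unchanged.

```python
from pathlib import Path

# Augmentation settings from the list above, using fastai v2 parameter names.
AUG_PARAMS = dict(
    max_zoom=1.2,
    max_warp=0.2,
    max_lighting=0.5,
    flip_vert=True,   # fastai's name for the vertical flip option
    max_rotate=30.0,  # degrees
)


def build_dataloaders(data_dir, csv_name="Train.csv", image_folder="images", bs=32):
    """Build dataloaders from a CSV of image paths and labels.

    csv_name, image_folder and bs are hypothetical defaults for illustration.
    """
    # Imported here so the settings above can be inspected without fastai installed.
    from fastai.vision.all import ImageDataLoaders, Resize, aug_transforms

    return ImageDataLoaders.from_csv(
        Path(data_dir),
        csv_fname=csv_name,
        folder=image_folder,
        fn_col=0,      # assumed: first column holds the image path
        label_col=1,   # assumed: second column holds the label
        item_tfms=Resize(224),                    # resize each image to 224 pixels
        batch_tfms=aug_transforms(**AUG_PARAMS),  # augmentations applied per batch
        bs=bs,
    )
```

Calling build_dataloaders("path/to/data") followed by dls.show_batch() is a handy way to eyeball the augmented images before training.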

I did not split the data into train/validation sets, as the dataset was very small and I was going to use transfer learning.

Model Preparation and training:

I used transfer learning for this task, since the dataset was quite small.

I used a ResNet model, a neural network that has been pretrained on ImageNet.

I used fastai's cnn_learner, which combines a pretrained model, a dataloader, and a loss function to handle training. I used the accuracy metric as the fruit classes were almost balanced.

Training ran for 15 epochs. Total training time was 4 minutes on a T4 GPU in my Google Colab notebook.
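A minimal sketch of the model setup and training step, assuming fastai v2. The post does not say which ResNet depth or which training call was used, so resnet34 and fine_tune are assumptions made for illustration. Newer fastai releases call cnn_learner by the name vision_learner.

```python
def train_classifier(dls, epochs=15):
    """Create a pretrained ResNet learner and train it on the given dataloaders.

    resnet34 and fine_tune are assumptions; the original run may have used
    a different ResNet depth or a different training call.
    """
    # Imported here so this function can be defined without fastai installed.
    from fastai.vision.all import accuracy, cnn_learner, resnet34

    # Pretrained ImageNet weights are loaded by default; the loss function is
    # inferred from the dataloaders.
    learn = cnn_learner(dls, resnet34, metrics=accuracy)

    # fine_tune first trains the new head with the body frozen (one epoch by
    # default), then unfreezes and trains the whole network for `epochs` epochs.
    learn.fine_tune(epochs)
    return learn
```

Here dls is a fastai DataLoaders object, such as the one returned by ImageDataLoaders.from_csv.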

Result:

The image below shows the 9 worst predictions of the model on the training dataset.

The image shows that the model accurately classified most of the images and wrongly classified only two.
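A grid like this can be produced with fastai's ClassificationInterpretation. The sketch below assumes fastai v2 and a trained learner named learn. By default from_learner evaluates the learner's validation set, so showing the training set instead would need its ds_idx argument changed.

```python
def show_worst_predictions(learn, k=9):
    """Plot the k images with the highest loss, with predicted and actual labels."""
    # Imported here so this function can be defined without fastai installed.
    from fastai.vision.all import ClassificationInterpretation

    interp = ClassificationInterpretation.from_learner(learn)
    interp.plot_top_losses(k)  # each image is titled with prediction, actual, loss, probability
    return interp
```

The returned object also offers plot_confusion_matrix(), which shows which of the three classes get confused with each other.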

The best training run gave 98% accuracy on part of the test dataset.

Conclusion

I was able to fine-tune the ResNet model to predict passion fruit disease with high accuracy. The augmentation of the images greatly improved the accuracy of the model.

Next Steps: I am thinking about deploying this model as a TinyML model so it can be used on a smartphone.

The challenge was a very good one for beginners, and I learned a lot during the process.

Resources

Side Note: One thing I like about Zindi competitions is that they provide a starter notebook which you can modify. I had never used fastai before. One thing I love about fastai is that its dataloader makes it easier to focus on improving the performance of your model rather than spending lots of time preparing your dataset.