Applying Transfer Learning to Hotel Image Classification with the fast.ai Library

In the fast-paced world of software development, Artificial Intelligence (AI) and Machine Learning (ML) have become buzzwords for good reason. They are powerful tools that can transform data into valuable insights and competitive advantages. Building on recent advances in deep learning, we set out to create an application that applies these technologies to a practical problem: hotel image classification using transfer learning and the fast.ai library.


Why Image Classification?

Image classification is a pivotal task in the field of deep learning. It's the process of categorizing and labeling an image, or groups of pixels within it, based on specific rules. For the travel and leisure industry, which often handles vast amounts of visual data, effective image classification can significantly enhance user experience and operational efficiency. By automating the classification of hotel images, companies can manage visual content more easily and make it more searchable.


Getting Started with Fast.ai

To bring our project to life, we chose the fast.ai library. This library sits on top of PyTorch, adding a layer of abstraction that simplifies the process of building and training deep learning models. Developed by Jeremy Howard and his collaborators at fast.ai, it is backed by an excellent accompanying course, making it an ideal choice for quickly prototyping neural networks.


Data Collection and Preparation

The cornerstone of any machine learning project is the dataset. For image classification, we needed a dataset of hotel-related images. Fortunately, there are several resources available:


  • Kaggle Datasets: A great source of free datasets, including the indoor scene recognition dataset, which we used for our project.
  • Google Images: Using Google's open API and a handy GitHub library, you can collect high-quality images by scraping the first hundred results for each search term.
  • Amazon Mechanical Turk: A platform where you can pay individuals to categorize images, although we didn't use this option for our project.


It's essential to prepare your categories and datasets carefully. The categories should be distinct to avoid confusion during training. We split our dataset into training and validation sets using a 70-30 split, setting a random seed so that the same images land in each set on every training run.
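
As a rough, library-free illustration of a seeded 70-30 split, the sketch below shuffles a list of file names with a fixed seed and holds out 30% for validation. The helper name split_train_valid, the seed value 42, and the file names are assumptions made for this example; this is not how fast.ai performs the split internally.

```python
import random


def split_train_valid(items, valid_pct=0.3, seed=42):
    """Shuffle items with a fixed seed and hold out valid_pct of them."""
    rng = random.Random(seed)          # private generator: same seed, same split
    shuffled = list(items)             # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_valid = round(len(shuffled) * valid_pct)
    return shuffled[n_valid:], shuffled[:n_valid]


# Hypothetical file names standing in for the real image paths.
files = [f"img_{i:03d}.jpg" for i in range(10)]
train, valid = split_train_valid(files)
print(len(train), "training images,", len(valid), "validation images")
```

Calling the helper again with the same seed returns the same two lists, which is what keeps validation scores comparable between training sessions.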


Building the Model: ResNet and Transfer Learning

We chose the ResNet neural network architecture, specifically ResNet50, pretrained on ImageNet. ResNet's residual (skip) connections let very deep networks train without suffering from vanishing gradients. The key technique here is transfer learning: we start from a pretrained model, whose early layers already capture general visual features such as edges and textures, and fine-tune it for our specific task. Because only part of the learning has to happen on our own data, this approach significantly reduces training time and resource requirements.


Key Parameters

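One of the most critical hyperparameters in any deep learning model is the learning rate. The fast.ai library provides a learning rate finder, but it runs on an existing learner, so we first create the data and the learner:


from fastai.vision import *  # fastai v1 API

# Create DataBunch; path points to the image folder (one subfolder per category)
np.random.seed(42)  # example seed, so the 70-30 split is the same on every run
data = ImageDataBunch.from_folder(path, valid_pct=0.3, ds_tfms=get_transforms(), size=224)  # size=224 is an assumed input size
data.normalize(imagenet_stats)

# Create learner with ResNet50 pretrained on ImageNet
# (create_cnn was renamed cnn_learner in later fastai v1 releases)
learn = create_cnn(data, resnet50, metrics=accuracy)

# Find learning rate
learn.lr_find()
learn.recorder.plot()

Once we have read a learning rate off the plot, we can train our model:


# Train the model
lr = 1e-2  # the rate used in the tuning runs later in this article
learn.fit_one_cycle(10, lr)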
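
To give a feel for what a learning rate finder does, here is a minimal, library-free sketch of a learning rate range test on a toy one-dimensional loss. It is not fast.ai's implementation: the function names lr_range_test and suggest_lr, the toy loss, the divergence threshold, and the "one tenth of the rate at the lowest loss" rule of thumb are all assumptions made for illustration.

```python
def lr_range_test(loss_fn, grad_fn, w0, lr_start=1e-5, lr_end=10.0, steps=100):
    """Run gradient descent while growing the learning rate exponentially.

    Returns a list of (learning_rate, loss) pairs and stops early once the
    loss has clearly diverged.
    """
    factor = (lr_end / lr_start) ** (1.0 / (steps - 1))
    w, lr = w0, lr_start
    best_loss = float("inf")
    history = []
    for _ in range(steps):
        loss = loss_fn(w)
        history.append((lr, loss))
        if loss > 4 * best_loss:    # loss blew up: stop the sweep
            break
        best_loss = min(best_loss, loss)
        w -= lr * grad_fn(w)        # one plain gradient-descent step
        lr *= factor                # the next step uses a larger rate
    return history


def suggest_lr(history):
    """Assumed rule of thumb: a tenth of the rate at the lowest recorded loss."""
    lr_at_min, _ = min(history, key=lambda pair: pair[1])
    return lr_at_min / 10.0


# Toy problem: loss(w) = w^2 / 2, whose gradient is simply w.
history = lr_range_test(lambda w: 0.5 * w * w, lambda w: w, w0=1.0)
print(f"suggested learning rate: {suggest_lr(history):.3f}")
```

On a real network the curve is noisier, which is why the rate is usually chosen by looking at the plot rather than by a fixed rule.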

Dealing with Overfitting

Overfitting is a common issue where the model performs well on the training data but poorly on the validation data. We implemented several techniques to mitigate this:


  • Dropout: Randomly disables neurons during training, forcing the network to generalize better.
  • Weight Decay: Adds a penalty for large weights, encouraging simpler models.
  • Data Augmentation: Applies random transformations like rotations and zooms to the training data.
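
The first two techniques can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not fast.ai's internals: the function names are hypothetical, dropout is shown in its common "inverted" form, and weight decay is shown as the classic L2-penalty term added to the gradient (a library may apply the decay differently). Data augmentation is already covered by get_transforms() in the training code.

```python
import random


def dropout(activations, p, rng):
    """Inverted dropout: zero each activation with probability p and
    rescale the survivors so the expected value stays the same."""
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def sgd_step_with_weight_decay(weights, grads, lr, wd):
    """One SGD step in which weight decay adds wd * w to each gradient,
    pulling every weight slightly towards zero."""
    return [w - lr * (g + wd * w) for w, g in zip(weights, grads)]


rng = random.Random(0)
dropped = dropout([1.0] * 8, p=0.5, rng=rng)   # each entry is either 0.0 or 2.0

# With zero gradients, only the decay term changes the weights.
decayed = sgd_step_with_weight_decay([1.0, -2.0], [0.0, 0.0], lr=0.1, wd=0.1)
print(dropped, decayed)
```

Dropout is only applied during training; at inference time every neuron stays active.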


Empirical Tuning

After applying these techniques, we saw some improvements but continued to fine-tune the hyperparameters to maximize performance:


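# 12 epochs at learning rate 1e-2 with stronger weight decay
learn.fit_one_cycle(12, 1e-2, wd=0.1)
# 15 more epochs at the same rate with weaker weight decay
learn.fit_one_cycle(15, 1e-2, wd=0.01)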

Eventually, we achieved a validation accuracy of around 70.4%; further tuning yielded only minimal gains.


Analyzing Results and Future Work

One helpful tool for understanding a model's performance is the confusion matrix, which shows how often different categories are misclassified. By examining the confusion matrix, we noticed common misclassifications, such as "bars" being labeled as "restaurants." These insights can guide future improvements, such as refining category labels or further augmenting the dataset.
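
The idea behind a confusion matrix can be sketched without any library: count every (actual, predicted) pair, then list the off-diagonal pairs by frequency. The helper names and the tiny label lists below are made up for illustration and are not results from our model.

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


def most_confused(counts, min_count=1):
    """Return misclassified (actual, predicted, count) triples, most frequent first."""
    errors = [(a, p, n) for (a, p), n in counts.items() if a != p and n >= min_count]
    return sorted(errors, key=lambda triple: triple[2], reverse=True)


# Toy labels, invented for this example only.
actual = ["bar", "bar", "bar", "restaurant", "restaurant", "lobby"]
predicted = ["restaurant", "restaurant", "bar", "restaurant", "bar", "lobby"]

counts = confusion_counts(actual, predicted)
for a, p, n in most_confused(counts):
    print(f"{a} predicted as {p}: {n} time(s)")
```

Pairs that dominate this list are the first candidates for merging categories or collecting more distinctive training images.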


Despite the challenges, our model reached a validation accuracy of roughly 70% and demonstrated the potential of transfer learning in hotel image classification. Future work could explore more sophisticated architectures, additional data augmentation techniques, or even semi-supervised learning methods.


Conclusion

Transfer learning with the fast.ai library provides a powerful framework for solving practical image classification tasks. Through careful data preparation, architecture selection, and hyperparameter tuning, we successfully classified hotel-related images, setting the stage for further improvements and applications in the travel industry. Whether you're a seasoned data scientist or a beginner, tools like fast.ai make advanced deep learning techniques more accessible and impactful.