This project implements an end-to-end deep learning system for automatic image tagging and object recognition. It uses the CIFAR-100 dataset to train and compare two distinct architectures: a custom Convolutional Neural Network (CNN) and a ResNet-18 model using Transfer Learning.
The system is deployed as an interactive Marimo App, featuring real-time inference, model comparison, and Grad-CAM explainability to visualise model decision-making.
- Dual Model Architecture:
  - Custom CNN: A lightweight baseline model trained from scratch.
  - ResNet-18 (Transfer Learning): A deep residual network pretrained on ImageNet and fine-tuned for CIFAR-100, achieving significantly higher accuracy and robustness than the custom CNN.
- Interactive Inference UI: A user-friendly interface built with Marimo that allows users to upload images and generate semantic tags instantly.
- Explainable AI (XAI): Integrated Grad-CAM (Gradient-weighted Class Activation Mapping) to visualise which regions of an image the model focuses on when making a prediction.
- Performance Analysis: Comprehensive evaluation including training curves, confusion matrices, and misclassification analysis.
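The Grad-CAM step listed above reduces to a short computation once the activations and gradients of a convolutional layer have been captured (for example with PyTorch hooks). The following is a minimal NumPy sketch of that computation only, not the project's own implementation; the function name `grad_cam` and the `(K, H, W)` array layout are assumptions made for illustration.

```python
import numpy as np


def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Compute a Grad-CAM heatmap from one layer's activations and gradients.

    activations: (K, H, W) feature maps of the chosen convolutional layer.
    gradients:   (K, H, W) gradient of the target class score w.r.t. those maps.
    Returns an (H, W) heatmap scaled to [0, 1].
    """
    if activations.shape != gradients.shape:
        raise ValueError("activations and gradients must have the same shape")
    # Channel weights: global-average-pool the gradients over the spatial axes.
    weights = gradients.mean(axis=(1, 2))  # shape (K,)
    # Weighted sum of the feature maps, then ReLU to keep positive evidence only.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)  # (H, W)
    # Normalise to [0, 1] for display; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam
```

In practice the heatmap would be upsampled to the input resolution and overlaid on the image, for instance with Matplotlib.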
- Language: Python 3.x
- Frameworks: PyTorch, Torchvision
- Interface: Marimo (Reactive Python Notebooks)
- Data Processing: NumPy, Pillow (PIL)
- Visualisation: Matplotlib, Scikit-learn (Confusion Matrix)
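As an illustration of how Scikit-learn's confusion matrix can feed a misclassification analysis, the sketch below builds the matrix and lists the most frequent wrong (true, predicted) pairs. The helper name `top_confusions` and its parameters are hypothetical; only `sklearn.metrics.confusion_matrix` is a real library call.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def top_confusions(y_true, y_pred, num_classes, k=3):
    """Return the confusion matrix and the k most frequent errors.

    Each error is a (true_class, predicted_class, count) tuple.
    """
    # Rows are true classes, columns are predicted classes.
    cm = confusion_matrix(y_true, y_pred, labels=list(range(num_classes)))
    errors = cm.copy()
    np.fill_diagonal(errors, 0)  # ignore correct predictions
    # Indices of the largest off-diagonal counts, in descending order.
    flat_order = np.argsort(errors, axis=None)[::-1][:k]
    pairs = [divmod(int(i), num_classes) for i in flat_order]
    worst = [(t, p, int(errors[t, p])) for t, p in pairs if errors[t, p] > 0]
    return cm, worst
```

For CIFAR-100, `num_classes` would be 100, and the class indices could be mapped back to label names before display.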
.
├── models/ # Stores trained models and logs
│ ├── model.pth # Custom CNN weights
│ ├── resnet18.pth # ResNet-18 weights
│ ├── losses.npy # Training loss history
│ └── accuracies.npy # Training accuracy history
├── src/ # Python modules
│ ├── dataset.py # Data loading & transforms
│ ├── model.py # Class definitions for CNN & ResNet
│ └── utils.py # Helper functions
├── SE-assignment-Marimo.py # Main Application (Marimo Notebook)
└── README.md # Project Documentation
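The saved training histories in models/ can be inspected outside the app. This is a small sketch that assumes each .npy file holds one value per epoch; the actual array layout is not documented here, and the helper names `load_history` and `summarise` are hypothetical.

```python
from pathlib import Path

import numpy as np


def load_history(models_dir="models"):
    """Load the saved loss and accuracy curves as flat float arrays."""
    models_dir = Path(models_dir)
    losses = np.load(models_dir / "losses.npy")
    accuracies = np.load(models_dir / "accuracies.npy")
    # Assumption: one value per epoch; flatten in case an extra axis was saved.
    return np.ravel(losses).astype(float), np.ravel(accuracies).astype(float)


def summarise(losses, accuracies):
    """Report the epoch count, the best accuracy and the final loss."""
    best = int(np.argmax(accuracies))
    return {
        "epochs": len(accuracies),
        "best_epoch": best + 1,  # 1-based for readability
        "best_accuracy": float(accuracies[best]),
        "final_loss": float(losses[-1]),
    }
```

Calling `summarise(*load_history())` from the repository root would print a quick overview; the same arrays can be passed to Matplotlib to redraw the training curves.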
- Installation: Ensure you have Python installed. Clone the repository and install the dependencies:
pip install -r requirements.txt
- Running the App: This project uses Marimo, not Jupyter. To launch the interactive dashboard, run the following command in your terminal:
marimo edit SE-assignment-Marimo.py
This opens the notebook in edit mode in your default web browser. To serve it as a read-only app instead, use marimo run SE-assignment-Marimo.py.
Dataset: CIFAR-100 (Canadian Institute for Advanced Research).
Description: 60,000 32x32 colour images across 100 classes (600 per class), split into 50,000 training and 10,000 test images.
Ethical Considerations: This project uses a standard, public academic dataset. No personal or sensitive data was collected. The dataset has been reviewed for Personally Identifiable Information (PII) and complies with academic integrity standards.
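Limitations: CIFAR-100 has no class for some common domestic animals (e.g., "Dog"). A user-uploaded photo of a dog therefore falls outside the label set, and the model can only assign it the closest class it knows (e.g., "Tiger"). Such domain-shift errors should be expected during inference on user-uploaded photos.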
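Part of the gap between training data and user uploads is simply format: uploads are usually far larger than 32x32 and may not be RGB. The sketch below shows one way an upload might be brought into CIFAR-100's input format with Pillow and NumPy. The function name `preprocess_upload`, the 32x32 target size and the normalisation constants (commonly quoted CIFAR-100 channel statistics) are assumptions; the transforms actually used in src/dataset.py may differ.

```python
import io

import numpy as np
from PIL import Image

# Commonly quoted CIFAR-100 per-channel statistics (assumed, not taken from this project).
CIFAR100_MEAN = np.array([0.5071, 0.4865, 0.4409], dtype=np.float32)
CIFAR100_STD = np.array([0.2673, 0.2564, 0.2762], dtype=np.float32)


def preprocess_upload(image_bytes: bytes, size: int = 32) -> np.ndarray:
    """Turn raw uploaded image bytes into a normalised (3, size, size) float32 array."""
    # Force three channels so greyscale or RGBA uploads do not break the model input.
    image = Image.open(io.BytesIO(image_bytes)).convert("RGB")
    image = image.resize((size, size))
    pixels = np.asarray(image, dtype=np.float32) / 255.0  # (H, W, 3) in [0, 1]
    pixels = (pixels - CIFAR100_MEAN) / CIFAR100_STD      # per-channel normalisation
    return pixels.transpose(2, 0, 1)                      # channels first, as PyTorch expects
```

The result can be wrapped with `torch.from_numpy(...).unsqueeze(0)` to form a batch of one. Preprocessing does not remove the label-set limitation: a class the model has never seen will still be mapped to one of the 100 known classes.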
Custom CNN: Served as a baseline, achieving moderate accuracy (~48%) but struggling with fine-grained classification.
ResNet-18: Demonstrated superior performance due to deep feature extraction and pretraining, correctly identifying complex objects that the CNN missed.
Conclusion: In this project, transfer learning proved markedly more effective than training from scratch for image tagging on limited, low-resolution data.
Student Information
Name: Rory Mabina
Unit: Software Engineering for Media
Submission Date: January 2026