Automated algae classification using machine learning is more efficient and effective than manual classification, which is tedious and time-consuming. However, its practical application is restricted by the scarcity of labeled freshwater algae datasets, especially for rarer algae. To overcome this challenge, this study proposes generating artificial algal images with StyleGAN2-ADA and using both the real and the generated images to train machine-learning-based algae classification models. This approach significantly enhances the performance of the classification models, particularly their ability to identify rare algae. Overall, the proposed approach improves the F1-score of lightweight MobileNetV3 classification models covering all 20 freshwater algae in this research from 88.4% to 96.2%, while for models that cover only the rarer algae, the F1-score improves from 80% to 96.5%. The results show that the approach enables the trained classification systems to effectively cover algae with limited image data. Additionally, a multi-genera algae detection system is developed to detect the genera and locations of algae in microscopic images. In particular, a multi-genera algal image dataset is prepared by image composition and annotation generation, and the composed images are used as additional training data to enrich the original dataset. Trained on both the original and the artificial images, a mixed-image model significantly outperforms the baseline real-image model, showing a 16.4% improvement for all 20 algae and a 30.2% improvement for rare algae in bounding box mAP. Finally, this research improves the accessibility of the algae detection system by converting the trained model into the ONNX format, developing a user-friendly web interface with Gradio, and hosting the entire system on Hugging Face Spaces.
This research developed an automated algae classification and detection system for limited and imbalanced datasets, contributing to the early detection and management of harmful algal blooms and highlighting the potential of machine learning in environmental management.
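The image composition and annotation generation step mentioned above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the thesis's actual pipeline: images are plain 2D lists of grayscale values, crops are pasted onto a blank canvas without overlap, and boxes are recorded as `[x, y, width, height]`. The function names, the annotation layout, and the genus names in the usage note are illustrative placeholders.

```python
import random


def overlaps(a, b):
    """Return True if two [x, y, width, height] boxes intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def compose_image(crops, canvas_w, canvas_h, rng, max_tries=100):
    """Paste single-genus crops onto a blank canvas and record their boxes.

    crops: list of (genus, pixels) pairs, where pixels is a 2D list of
    grayscale values (an assumed, simplified image representation).
    Returns (canvas, annotations); each annotation holds the genus label
    and an [x, y, width, height] bounding box.
    """
    canvas = [[0] * canvas_w for _ in range(canvas_h)]
    annotations = []
    for genus, pixels in crops:
        h, w = len(pixels), len(pixels[0])
        if w > canvas_w or h > canvas_h:
            continue  # skip crops that cannot fit on the canvas
        for _ in range(max_tries):
            x = rng.randint(0, canvas_w - w)
            y = rng.randint(0, canvas_h - h)
            box = [x, y, w, h]
            if any(overlaps(box, a["bbox"]) for a in annotations):
                continue  # try another position to avoid overlapping crops
            for row in range(h):
                canvas[y + row][x:x + w] = pixels[row]
            # The annotation is generated from the paste position itself,
            # so no manual labeling is needed for composed images.
            annotations.append({"genus": genus, "bbox": box})
            break
    return canvas, annotations
```

For example, calling `compose_image([("Microcystis", crop_a), ("Anabaena", crop_b)], 32, 32, random.Random(0))` with two small crops would return a 32×32 canvas together with one labeled box per crop that was successfully placed.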
| Date of Award | 2023 |
|---|---|
| Original language | English |
| Awarding Institution | The Hong Kong University of Science and Technology |
| Supervisor | Irene Man Chi LO (Supervisor) & Danny Hin Kwok Tsang (Supervisor) |
Machine learning-based freshwater algae classification and algae detection system for limited and imbalanced datasets
CHAN, W. H. (Author). 2023
Student thesis: Master's thesis