Has it already been two months? I'm only now getting around to posting this past project, taking advantage of the holiday.
Seoul ICT Innovation Square offers many online and offline classes every year. Thanks to that, I've been able to take a wide variety of high-quality courses for free. Being able to take them online, during the commute home or after getting home, is especially attractive. I had taken AI classes there before too, so this time I applied for the advanced visual intelligence track, which includes a final project. As in most areas of development, the AI side has been changing especially fast and competition is high, so I was a bit worried, but luckily I got the chance to enroll, and it turned out to be a great learning experience.
In this course we got to attend lectures from three instructors (personally I call them professors). Starting with the first professor's classes on basic concepts of artificial intelligence, Python fundamentals, and OpenCV hands-on exercises, we worked through the theory and examples of Image Classification, Object Detection, and RNNs.
The previous course was already very helpful (in particular, it included extremely detailed reviews of AI papers from each era as they were published, what a precious time! My Notion notes from back then: "paper AI", study NOTE, www.notion.so), but this time we got to go all the way from data collection through labeling and model training to service deployment, which was so worthwhile!
The final project for this class was a traffic-light signal classifier. With image upload it showed a recognition rate above 90%, while real-time recognition was around 50%. We couldn't fully finish a complete deployment version, but given that this was a roughly ten-day project done by two people in the evenings after work, the result was quite satisfying (for me, personally, hehe).
I was in charge of data collection, labeling, data augmentation, and model training; my teammate handled applying the model, web deployment, real-time mobile-web deployment, and shooting the result video. Honestly, it felt like I just added my spoon to the table that the real developer on our team had set ^ ^/ Below is a summary of how it went, along with links to the results.
The project followed a fairly standard ML workflow.
Recognition of traffic-light signals from traffic-light images
(Technical) In the AI field, most datasets are collected, trained on, and used from a car's point of view, so very few datasets are actually collected from a pedestrian's perspective.
3. Data Collection
(Public datasets) Research: AI hub01, ETRI dataset, AI hub02
(Self-shot) At the same locations, I took photos at night (400 shots) and during the day (300 shots), one set each for green and red signals (because of light blooming at night, about 100 additional night shots were used for training)
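The post does not say how these photos were divided for training and validation. As a minimal sketch, assuming a hypothetical file naming scheme of `<condition>_<signal>_<index>.jpg`, an even green/red split within each condition, and an 80/20 ratio (all three are illustrative assumptions, not taken from the project), a split that keeps day/night and green/red proportions balanced might look like this:

```python
import random
from collections import defaultdict


def stratified_split(filenames, val_ratio=0.2, seed=0):
    """Split files into train/val while keeping each (condition, signal) group balanced."""
    groups = defaultdict(list)
    for name in filenames:
        # Hypothetical naming scheme: "<condition>_<signal>_<index>.jpg"
        condition, signal, _ = name.split("_", 2)
        groups[(condition, signal)].append(name)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val = [], []
    for key in sorted(groups):
        files = sorted(groups[key])
        rng.shuffle(files)
        n_val = round(len(files) * val_ratio)
        val.extend(files[:n_val])
        train.extend(files[n_val:])
    return train, val


# Illustrative file list matching the counts in the post: 400 night, 300 day.
# The even green/red split inside each condition is an assumption.
files = [f"night_{s}_{i:03d}.jpg" for s in ("green", "red") for i in range(200)]
files += [f"day_{s}_{i:03d}.jpg" for s in ("green", "red") for i in range(150)]

train, val = stratified_split(files)
print(len(train), len(val))
```

Splitting per group rather than over the whole list keeps the harder night shots from ending up mostly in one side of the split by chance.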
4. Data Preprocessing
For the initial labeling I used Label Studio (offline/local older version: https://github.com/HumanSignal/labelImg), which we had practiced with in class. At that point I trained YOLO v5, but the results were not satisfying, so I felt the need for data augmentation. While searching for relevant material I came across Roboflow. It was very, very nice that data already labeled in another tool could still be used, and that data augmentation was fast, easy, and free. Thanks to that, I was able to augment the 700 photos 3x and apply various additional image transformations on top.
LabelImg (now part of the Label Studio community; no longer actively developed): github.com/HumanSignal/labelImg
Label Studio (open source data labeling tool): labelstud.io
Roboflow: roboflow.com
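Roboflow did the augmentation in this project, so the following is not its implementation. It is only a minimal sketch of why augmenting a detection dataset means transforming the labels together with the image. It assumes YOLO-format label lines (`class x_center y_center width height`, normalized to 0..1) and shows a horizontal flip; the function name and the sample label are hypothetical.

```python
def hflip_yolo_labels(label_text):
    """Mirror YOLO-format boxes horizontally (image flipped left to right)."""
    flipped = []
    for line in label_text.strip().splitlines():
        cls, xc, yc, w, h = line.split()
        # Only the x centre changes; y centre, width and height stay the same.
        new_xc = 1.0 - float(xc)
        flipped.append(
            f"{cls} {new_xc:.6f} {float(yc):.6f} {float(w):.6f} {float(h):.6f}"
        )
    return "\n".join(flipped)


# Hypothetical label: class 0, box centred at 30% of the image width.
original = "0 0.300000 0.250000 0.100000 0.200000"
print(hflip_yolo_labels(original))
```

Flipping twice should give back the original label, which is a handy sanity check for any label transform.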
5. Modeling & Evaluation
First, I researched relevant references.
In the end I went with YOLO. At first I worked locally. I started with YOLO v3, which can be used with Keras, but v3 seemed to detect the traffic light itself well while recognizing the signal poorly, so I moved the model-training environment to Colab (T4 GPU) and tested YOLO v5 and YOLO v8. I ran 80 to 100 epochs.
Training code (Colab): yolov8 data train 3.ipynb (Colaboratory notebook, colab.research.google.com)
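The actual training code is in the linked notebook; the following is not copied from it. As a rough sketch of what a YOLO v8 training setup on Colab typically involves, this writes a dataset config in the layout the Ultralytics package reads and wraps the training call. The class names, folder layout, `yolov8n.pt` checkpoint, and image size are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def write_data_yaml(dataset_dir, class_names):
    """Write a minimal dataset config in the layout Ultralytics YOLO reads."""
    lines = [
        f"path: {Path(dataset_dir).as_posix()}",
        "train: images/train",  # assumed folder layout
        "val: images/val",
        "names:",
    ]
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    yaml_path = Path(dataset_dir) / "data.yaml"
    yaml_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return yaml_path


def train(yaml_path, epochs=100):
    """Fine-tune a pretrained YOLOv8 model; needs `pip install ultralytics`."""
    # Imported lazily so the config helper above runs without the package.
    from ultralytics import YOLO

    model = YOLO("yolov8n.pt")  # assumed checkpoint; the post names no model size
    return model.train(data=str(yaml_path), epochs=epochs, imgsz=640)


# Hypothetical dataset folder and class names.
dataset_dir = tempfile.mkdtemp()
yaml_path = write_data_yaml(dataset_dir, ["green", "red"])
print(yaml_path.read_text(encoding="utf-8"))
# train(yaml_path, epochs=100)  # run on a GPU runtime such as Colab
```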
6. Deployment
While I was taking the photos and labeling, my very capable teammate handled all of this!
GitHub code: github.com/BGHyeon/traffic_sign
In practice, nobody really needs to classify a traffic-light signal from a photo uploaded in a browser. With that question in mind, we tried using WebSocket to detect signals in real time on the mobile web. I'd give the result around 50 points. Recognition based on YOLO's built-in training data was very good, but as expected, the recognition rate for our self-shot signals fell short of expectations.
Final presentation slides: Traffic Light Utilization Analysis Service (crosswalk signal detection web service), Team 1: 친절한 찰쓰씨, 백규현 (docs.google.com)
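The team's real-time implementation lives in the GitHub repository linked above, and the following is not taken from it. As a minimal sketch of how real-time detection over a WebSocket could be structured, this shows one possible message format: the phone sends each camera frame as a JSON text message, and the server replies with the detected signal. The field names, the base64 encoding, and the `fake_detect` stand-in are all hypothetical; the socket itself is left out so the example runs with only the standard library.

```python
import base64
import json


def encode_frame(jpeg_bytes, frame_id):
    """Client side: wrap one camera frame as a JSON text message."""
    return json.dumps({
        "frame_id": frame_id,
        "image": base64.b64encode(jpeg_bytes).decode("ascii"),
    })


def handle_message(message, detect):
    """Server side: decode the frame, run the detector, build the reply."""
    payload = json.loads(message)
    jpeg_bytes = base64.b64decode(payload["image"])
    label, confidence = detect(jpeg_bytes)
    return json.dumps({
        "frame_id": payload["frame_id"],
        "label": label,
        "confidence": confidence,
    })


def fake_detect(jpeg_bytes):
    """Stand-in for the trained YOLO model; always answers 'red'."""
    return "red", 0.5


message = encode_frame(b"\xff\xd8fake-jpeg-bytes\xff\xd9", frame_id=1)
reply = json.loads(handle_message(message, fake_detect))
print(reply)
```

Echoing `frame_id` back lets the client drop replies for frames that are already stale, which matters when inference is slower than the camera.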
