The product, including the PillVision software, won the Design Excellence Award at the 2024 SST-NP IDP Capstone Project Showcase Day.
Overall software architecture
PillVision consists of two parts: a Flask server (the back end) and a React front end, referred to here as the React server.
The React server, running on the client, sends a POST request to the Flask server every 1.5 seconds, with an image of the pills from the camera feed attached as the payload.
The Flask server then runs inference on the image with the custom-trained YOLOv8 model, which returns the total number of pills and their coordinates. Next, the Flask server runs our damaged-pill detection algorithm, which returns the total number of damaged pills and their coordinates. Finally, the Flask server sends the total number of pills and the number of damaged pills back to the React server, which displays the results on the screen.
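As a rough illustration of the last step in that flow, the sketch below turns a list of detected bounding boxes into the kind of summary the Flask server could send back. It is not the team's code: the field names, the `summarise_detections` helper, and the `too_small` rule are all hypothetical. In particular, `too_small` is only a placeholder for the damaged-pill detection algorithm, which is not described here.

```python
from typing import Callable, Dict, List, Tuple

# A bounding box as (x1, y1, x2, y2) in pixels. This layout is an assumption.
Box = Tuple[float, float, float, float]


def summarise_detections(
    boxes: List[Box],
    is_damaged: Callable[[Box], bool],
) -> Dict[str, object]:
    """Build a JSON-serialisable summary from detected pill boxes."""
    damaged = [box for box in boxes if is_damaged(box)]
    return {
        "total_pills": len(boxes),
        "pill_boxes": [list(box) for box in boxes],
        "damaged_pills": len(damaged),
        "damaged_boxes": [list(box) for box in damaged],
    }


def too_small(box: Box, min_area: float = 400.0) -> bool:
    """Placeholder damage rule (hypothetical): flag boxes below a pixel area."""
    x1, y1, x2, y2 = box
    return (x2 - x1) * (y2 - y1) < min_area


if __name__ == "__main__":
    detections = [(0, 0, 30, 30), (50, 50, 60, 60)]
    print(summarise_detections(detections, too_small))
```

In a Flask route handler, a dictionary like this could be returned as the JSON response body, and the React server would read the two counts from it.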
Docker
Both the Flask server and the React server are containerised with Docker. For deployment, the Flask server runs as a container and the React server is deployed with Vercel. During development, both applications were containerised because we were working across different systems: our laptop (an M1 MacBook) and the NVIDIA Jetson. Both images are pushed to Docker Hub.
NVIDIA Jetson

The Jetson Nano 4GB.

The Jetson could not boot because the SD card reader holding the OS card was broken.

Flashing the NVIDIA Jetson using SDK Manager on an Ubuntu x86 laptop.
The machine had to run offline, so we used the NVIDIA Jetson Nano 4GB to run both servers. Many problems surfaced while running the application on the Jetson, partly due to faulty hardware: at one point the Jetson could not boot, dependencies were missing everywhere, and the Docker container could not access the CUDA cores. Debugging the OS and CUDA issues taught me a lot about Linux, including system privileges, package installation, and bash. In the end, I am proud to say that it worked.
YOLOv8 model
The team trained a custom YOLOv8 model to detect pills. The model was trained on a dataset of 1368 images and achieved a precision of 94.3%. Training ran on Google Colab's GPUs and took a little over 12 hours.
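For readers unfamiliar with the metric, precision is the share of predicted detections that are correct: true positives divided by all predictions. The sketch below shows that arithmetic only. The counts in the example are made up for illustration, and how the reported 94.3% was measured (confidence and overlap thresholds, validation split) is not stated here.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Return TP / (TP + FP), or 0.0 when there are no predictions."""
    predicted = true_positives + false_positives
    if predicted == 0:
        return 0.0
    return true_positives / predicted


if __name__ == "__main__":
    # Hypothetical counts: 943 correct detections and 57 false alarms.
    print(f"{precision(943, 57):.1%}")
```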
Why was it built?

Our KKH mentor, Mr Alan Chui, is on the far left; I am third from the left. Taken at our school's project showcase day.
Built in collaboration with KK Women's and Children's Hospital (KKH) in Singapore and the School of Science and Technology, Singapore, PillVision (the software) and its accompanying hardware (not covered here) together form a computer-vision-aided pill-counting machine. The team consists of Dimitros Lim, Sean Chua, Jarell Song, and Tan Yi Shen. The software, PillVision, was written by Sean Chua and Dimitros Lim.
Tip: More information on the project is on its GitHub repo. Please check it out.