Final Year Project — Nanyang Technological University
CUDA is NVIDIA's parallel computing platform, and CUDA C is the C-based language used to program NVIDIA GPUs, including those on Jetson modules, which are widely used for edge AI in autonomous vehicles. This project uses CUDA to speed up a core computer vision primitive: convolution-based image recognition.
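Traditional CPU-based convolution processes image regions sequentially, one output pixel after another. Because each output pixel depends only on a small neighborhood of the input, the computation can be restructured so that hundreds of CUDA cores compute different pixels simultaneously. Doing so, we achieved significant speedups, which matter for real-time applications where reaction latency is directly tied to safety.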
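The contrast between sequential and parallel convolution can be sketched as follows. This is a minimal illustration, not the project's actual code: the names conv_at, conv2d_cpu and conv2d_gpu, the 64x48 image, the 3x3 box filter and the 16x16 thread block are all assumptions chosen for the example. It uses zero padding at the borders and does not flip the kernel (the cross-correlation form common in image recognition), and it omits optimizations such as shared-memory tiling.

```cuda
#include <cassert>
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Value of one output pixel; shared by the CPU and GPU paths (hypothetical helper).
__host__ __device__ inline float conv_at(const float* in, const float* k,
                                         int x, int y, int w, int h, int ksize) {
    int r = ksize / 2;
    float acc = 0.0f;
    for (int ky = -r; ky <= r; ++ky)
        for (int kx = -r; kx <= r; ++kx) {
            int iy = y + ky, ix = x + kx;
            if (iy >= 0 && iy < h && ix >= 0 && ix < w)  // zero padding outside the image
                acc += in[iy * w + ix] * k[(ky + r) * ksize + (kx + r)];
        }
    return acc;
}

// CPU baseline: output pixels are visited one after another.
void conv2d_cpu(const float* in, const float* k, float* out, int w, int h, int ksize) {
    assert(ksize % 2 == 1);  // odd kernel size so the window has a center
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            out[y * w + x] = conv_at(in, k, x, y, w, h, ksize);
}

// GPU version: the two outer loops disappear; each thread computes one pixel.
__global__ void conv2d_gpu(const float* in, const float* k, float* out,
                           int w, int h, int ksize) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;  // threads past the image edge do nothing
    out[y * w + x] = conv_at(in, k, x, y, w, h, ksize);
}

#define CUDA_CHECK(call) do { cudaError_t e_ = (call); if (e_ != cudaSuccess) { \
    std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(e_)); return 1; } } while (0)

int main() {
    const int w = 64, h = 48, ksize = 3;  // illustrative sizes
    std::vector<float> in(w * h), k(ksize * ksize, 1.0f / 9.0f);  // 3x3 box filter
    std::vector<float> ref(w * h), gpu(w * h);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            in[y * w + x] = ((x * 7 + y * 13) % 256) / 255.0f;  // synthetic test image

    conv2d_cpu(in.data(), k.data(), ref.data(), w, h, ksize);

    float *d_in, *d_k, *d_out;
    CUDA_CHECK(cudaMalloc(&d_in, in.size() * sizeof(float)));
    CUDA_CHECK(cudaMalloc(&d_k, k.size() * sizeof(float)));
    CUDA_CHECK(cudaMalloc(&d_out, gpu.size() * sizeof(float)));
    CUDA_CHECK(cudaMemcpy(d_in, in.data(), in.size() * sizeof(float), cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(d_k, k.data(), k.size() * sizeof(float), cudaMemcpyHostToDevice));

    dim3 block(16, 16);
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);  // round up to cover the image
    conv2d_gpu<<<grid, block>>>(d_in, d_k, d_out, w, h, ksize);
    CUDA_CHECK(cudaGetLastError());
    CUDA_CHECK(cudaDeviceSynchronize());
    CUDA_CHECK(cudaMemcpy(gpu.data(), d_out, gpu.size() * sizeof(float), cudaMemcpyDeviceToHost));

    // The GPU may fuse multiply-adds, so compare with a tolerance rather than exactly.
    float max_diff = 0.0f;
    for (int i = 0; i < w * h; ++i)
        max_diff = std::fmax(max_diff, std::fabs(ref[i] - gpu[i]));
    std::printf("max |cpu - gpu| = %g\n", max_diff);

    cudaFree(d_in); cudaFree(d_k); cudaFree(d_out);
    return max_diff < 1e-4f ? 0 : 1;
}
```

Assuming the file is saved as conv.cu, it can be built with `nvcc -o conv conv.cu`. Both paths call the same per-pixel function, so the only difference is how output pixels are enumerated: nested loops on the CPU, one thread per pixel on the GPU.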
The implementation benchmarks and compares:
The project demonstrates both the potential and the engineering challenges of writing high-performance GPU code for embedded robotics platforms.