Degree
Doctor of Philosophy (PhD)
Department
Electrical Engineering
Document Type
Dissertation
Abstract
Intelligent transportation system operations frequently rely on vision-based sensing that requires principled methods to convert raw visual data into structured representations of traffic control, vehicle motion, and intersection-level performance. In practice, vision-based scene perception is challenging due to several factors, including long-range small-object appearance, dense object interactions, occlusions, and the frequent absence of reliable sensor calibration data. This dissertation explores hybrid frameworks for scene perception tasks that combine supervised deep-learning models, spatiotemporal tracking, motion modeling, and data-driven performance analysis. First, Convolutional Neural Network-based deep learning architectures are utilized for safety-critical traffic light detection and state estimation at road intersections from the ego-vehicle perspective. To address challenges inherent in supervised learning-based detection, including ground-truth inconsistency and class imbalance, and to improve robustness across illumination variations in real-world scenarios, extensive data refinement and augmentation techniques are investigated. The resulting validation experiments indicate large-margin improvements in accuracy and robustness over baseline results. Further experiments test the real-world feasibility of the detection models, including deployment on embedded hardware in ego-vehicle platforms. Next, the dissertation addresses the vehicle turning-movement classification and counting problem within the vision-based infrastructure perception domain by formulating turning-direction inference as a trajectory-to-movement assignment grounded in spatiotemporal vehicle tracks.
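The trajectory-to-movement assignment described above can be sketched minimally as follows: a vehicle track is reduced to the ordered sequence of virtual regions it crosses, and the entry/exit pair is looked up in a movement table. The region names, box geometry, and movement labels here are illustrative assumptions, not the dissertation's actual configuration.

```python
# Virtual regions as axis-aligned boxes (x_min, y_min, x_max, y_max) in
# image coordinates. Names and extents are hypothetical placeholders.
REGIONS = {
    "south_in":  (100, 300, 200, 400),
    "north_out": (100,   0, 200, 100),
    "east_out":  (300, 100, 400, 200),
}

# Ordered (entry, exit) crossing pairs mapped to movement labels (illustrative).
MOVEMENTS = {
    ("south_in", "north_out"): "through",
    ("south_in", "east_out"):  "right_turn",
}

def regions_hit(track):
    """Return the ordered, de-duplicated sequence of regions the track enters."""
    seq = []
    for x, y in track:
        for name, (x0, y0, x1, y1) in REGIONS.items():
            if x0 <= x <= x1 and y0 <= y <= y1 and (not seq or seq[-1] != name):
                seq.append(name)
    return seq

def classify_movement(track):
    """Map a trajectory (list of (x, y) centroids) to a movement label.

    Tracks that cross fewer than two regions are flagged "unresolved";
    in the dissertation's framework such cases would be handed to the
    secondary trajectory-similarity association stage.
    """
    seq = regions_hit(track)
    if len(seq) < 2:
        return "unresolved"
    return MOVEMENTS.get((seq[0], seq[-1]), "unresolved")
```

A track entering the southern approach box and exiting the northern one is labeled "through"; a partial track touching only one region falls through to the ambiguous-case handler, mirroring the two-stage design described in the abstract.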
A hybrid framework is introduced that leverages deep-learning detection and multi-object tracking to extract vehicle trajectories and subsequently classifies them according to the sequence of detected crossings at predesignated virtual regions in intersection scenes. Within this framework, unresolved or ambiguous cases are processed by a secondary association stage that matches partial vehicle tracks to reference movement patterns using a trajectory similarity method, enabling scalable and interpretable movement estimation under heavy traffic flow, partial occlusions, and noisy detections. To enable physically meaningful interpretation of vehicle motion from visual data in the absence of explicit camera calibration parameters, a planar homography is estimated between ground-plane feature points in the raw scene and a reference top-down view through correspondence-based matching. A finite-state motion model operating on the transformed coordinates extracts vehicle motion semantics, including stop events, aggregated delays, and speed computations, which provide microscopic vehicle motion profiles and support intersection performance analysis. Experiments on real-world intersection datasets demonstrate that the proposed frameworks achieve reliable performance under diverse traffic scenarios. Overall, by combining supervised learning-based object detection with temporal association and heuristic-based spatial reasoning in vehicle and infrastructure perception pipelines, this work advances deployable, data-driven, vision-based solutions and decision support for road transportation challenges.
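The homography-based metric recovery described above can be sketched with the standard Direct Linear Transform: given four or more correspondences between pixel locations and metric ground-plane coordinates, a 3x3 homography is solved up to scale via SVD, after which transformed positions yield speeds. This is a generic illustration under assumed correspondences, not the dissertation's calibration procedure; the frame rate and point values are made up.

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct Linear Transform: solve for a 3x3 homography H (up to scale)
    from >= 4 point correspondences (src in image pixels, dst in metres)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The null-space vector of A (last right-singular vector) is H flattened.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    return Vt[-1].reshape(3, 3)

def to_ground(H, pts):
    """Map Nx2 pixel coordinates to metric ground-plane coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]  # divide out the projective scale

def speeds(H, track_px, fps):
    """Per-frame speeds (m/s) from consecutive transformed positions
    sampled at fps frames per second."""
    g = to_ground(H, track_px)
    return np.linalg.norm(np.diff(g, axis=0), axis=1) * fps
```

In a full pipeline, the per-frame speeds (and a stop threshold on them) would feed the finite-state motion model to flag stop events and accumulate delay; that state machine is omitted here for brevity.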
Date
3-27-2026
Recommended Citation
Sarker, Tonmoy, "AI‑Enabled Computer Vision Applications in Intelligent Transportation Systems" (2026). LSU Doctoral Dissertations. 7072.
https://repository.lsu.edu/gradschool_dissertations/7072
Committee Chair
Meng, Xiangyu