Urban traffic congestion is a critical issue impacting travel time, fuel consumption, and air quality. Traditional traffic management systems rely on static rules and limited sensor feedback, which fail to adapt to dynamic and unpredictable conditions. This paper proposes an Adaptive Reinforcement Learning (ARL) approach for optimizing traffic signals within smart cities. The ARL model leverages continuous environmental feedback to adjust signal timing based on real-time vehicular flow. Simulations using a synthetic traffic network demonstrate that the proposed model reduces average waiting time by 28%, improves throughput by 21%, and decreases CO₂ emissions by 16% compared to traditional fixed-time control. These results indicate that ARL is a promising direction for sustainable urban mobility.
Traffic congestion remains a major challenge in modern cities. Static and semi-adaptive systems, though efficient under predictable patterns, cannot cope with stochastic variations in vehicle density. Reinforcement Learning (RL) provides a self-learning framework where an agent interacts with its environment, receives feedback, and learns an optimal policy.
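For concreteness, the sketch below illustrates this generic agent-environment loop in Python with a tabular value update; the environment interface (reset/step), the ε-greedy policy, and the parameter values are illustrative assumptions rather than components of the proposed system.

```python
import random

def run_episode(env, q_table, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
    """One episode of the generic RL loop: observe, act, receive feedback, update values."""
    state = env.reset()
    done = False
    while not done:
        # Epsilon-greedy selection: explore occasionally, otherwise exploit current estimates.
        if random.random() < epsilon:
            action = random.choice(actions)
        else:
            action = max(actions, key=lambda a: q_table.get((state, a), 0.0))
        next_state, reward, done = env.step(action)  # feedback from the environment
        # Temporal-difference update of the action-value estimate.
        best_next = max(q_table.get((next_state, a), 0.0) for a in actions)
        old = q_table.get((state, action), 0.0)
        q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)
        state = next_state
```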
This research introduces an Adaptive Reinforcement Learning (ARL) framework capable of dynamically tuning parameters according to real-time changes, ensuring stable and efficient control even under uncertain traffic conditions.
Recent studies have applied RL to traffic management with varying degrees of success. Van der Pol and Oliehoek (2016) demonstrated that Deep Q-Networks (DQN) outperform traditional Q-learning in non-linear traffic environments. Wei et al. (2018) introduced CoLight, a multi-agent RL approach for signal coordination. However, these methods often struggle with scalability and adaptability. Adaptive frameworks, as discussed by Genders and Razavi (2019), attempt to balance learning speed and stability.
This paper builds upon these foundations by incorporating adaptive reward functions and policy update rates that self-adjust according to congestion intensity.
Problem Formulation
Each traffic intersection is modeled as an RL agent. The state (S) includes queue lengths, waiting times, and neighboring intersection statuses. The action (A) represents the green-light duration for each lane direction. The reward (R) penalizes vehicle delays and rewards higher throughput.
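A minimal sketch of this per-intersection formulation is given below; the field layout, lane count, and reward weights are illustrative assumptions, not values taken from the experiments.

```python
import numpy as np

NUM_LANES = 4  # assumption: one approach per compass direction

def build_state(queue_lengths, waiting_times, neighbor_phases):
    """State S: per-lane queue lengths and waiting times plus neighboring intersection phases."""
    return np.concatenate([queue_lengths, waiting_times, neighbor_phases])

# Action A: candidate green-light durations (seconds) for the active lane direction.
GREEN_DURATIONS = [10, 20, 30, 40]

def reward(total_delay, vehicles_served, delay_weight=1.0, throughput_weight=0.5):
    """Reward R: penalizes accumulated vehicle delay and rewards higher throughput."""
    return throughput_weight * vehicles_served - delay_weight * total_delay
```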
Adaptive Reinforcement Learning Model
The ARL model modifies traditional Q-learning using an adaptive learning rate (α) and reward scaling.
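A minimal sketch of such an update, assuming the standard temporal-difference form of Q-learning with an adaptive learning rate α_t and a reward-scaling factor β_t, is:

$$
Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha_t \big[\, \beta_t r_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \,\big]
$$

where α_t and β_t are adjusted online according to congestion intensity. The symbol β_t and the exact adaptation schedule are introduced here for illustration; only the use of an adaptive learning rate and reward scaling is specified above.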
Simulation Setup
Results and Analysis
| Model | Avg. Waiting Time (s) | Throughput (veh/hr) | CO₂ Emission (g/km) |
|---|---|---|---|
| Fixed-Time | 72.4 | 820 | 140.6 |
| Q-Learning | 56.8 | 960 | 126.3 |
| ARL (Proposed) | 52.1 | 1160 | 118.0 |
Table 1: Performance comparison of traffic control methods.
Analysis:
As shown in Table 1, the proposed ARL method achieves a 28% reduction in waiting time compared to fixed-time control and an 8% improvement over conventional Q-Learning. Figure 4 (below) shows the cumulative reward convergence, demonstrating faster stabilization with ARL due to dynamic adaptation.
Figure 1: Average Waiting Time by Model
Figure 2: Throughput Comparison
Figure 3: CO₂ Emission by Model
Figure 4: Cumulative Reward Convergence Curve
The results confirm that adaptability in learning rate and reward scaling enhances convergence speed and performance stability. Unlike static RL, ARL maintains efficiency during unexpected traffic surges. The scalability to larger networks is promising, though further optimization is needed to reduce computational cost in multi-agent scenarios.
This study demonstrates that Adaptive Reinforcement Learning significantly improves traffic flow and reduces congestion. Future work will focus on:
- Deployment on edge-AI platforms for real-time inference.