
About
I am currently pursuing my Ph.D. in the Department of Electrical and Computer Engineering at the University of Southern California, where I work under the supervision of Professor Bhaskar Krishnamachari.

My research focuses on the practical use of large language models (LLMs) for reinforcement learning (RL) in resource-constrained environments. I’m particularly interested in how LLM guidance can support adaptive decision-making and improve sample efficiency.

Explore a brief overview of my academic and research journey here. For detailed information, please contact me.
EXPERIENCE
Research Experience
August 2022 - Present
AUTONOMOUS NETWORKS RESEARCH GROUP
- UNIVERSITY OF SOUTHERN CALIFORNIA
Graduate Research Assistant
ADAPTIVE UNIFIED REASONING AND AUTOMATION BASED ON LLMS AND MARL FOR NEXTG CELLULAR NETWORKS
Integrating Multi-Agent Reinforcement Learning with Large Language Models to develop a framework that manages the complexity and dynamism of 6G networks by combining high-level reasoning with decentralized, real-time decision-making.
GCNSCHEDULING BY LEVERAGING DEEP REINFORCEMENT LEARNING
Developing a deep actor-critic framework that integrates a multi-branch graph neural network for task prioritization, combined with a differentiable twin network that approximates heuristic scheduling decisions to enable efficient, gradient-based training in non-differentiable environments.
CORRELATED MULTI-ARMED BANDITS
Exploiting the correlation between arms to lower the regret bound through more efficient exploration, using little or no prior information.
CONTEXTUAL MULTI-ARMED BANDIT APPROACH FOR RECOMMENDER SYSTEMS
Combining clustering with contextual multi-armed bandits for recommender systems.
SEMI-COMBINATORIAL MULTI-ARMED BANDIT APPROACH FOR ANYPATH ROUTING
By coupling Deterministic Sequencing of Exploration and Exploitation (DSEE) with anypath routing, the algorithm optimizes packet routing through continuous learning and accurately estimates delivery probabilities while maintaining a near-logarithmic regret bound.
LEVERAGING REINFORCEMENT LEARNING AND PREDICTION FOR A FINANCIALLY AUTONOMOUS THERMOSTAT
By using the PPO algorithm and integrating predictions of the next-day temperature, the developed thermostat balances the cost of maintaining a desired temperature against user satisfaction.
September 2021 - June 2022
COMPUTATIONAL AUDIO-VISION LAB
UNIVERSITY OF TEHRAN
Undergraduate Research Assistant
SEISMIC SENSOR NETWORK SIGNAL PROCESSING
Developed a deep neural network-based algorithm to distinguish earthquake signals from other signals captured by seismic sensors.
Work Experience
June 2025 - August 2025
LEARNING, INCENTIVES, AND OPTIMIZATION IN NETWORKED SYSTEMS GROUP - CARNEGIE MELLON
Research Intern
LARGE LANGUAGE MODEL ENHANCED REINFORCEMENT LEARNING WITH MEMORY
- Developed LLM-guided reinforcement learning with structured feedback to address the sample inefficiency of RL
- Built a neurosymbolic framework where LLM agents collaborate via symbolic world models in embodied environments
June 2020 - August 2020
INTERNATIONAL INSTITUTE OF EARTHQUAKE ENGINEERING AND SEISMOLOGY
Research Intern
MODIFYING EARTHQUAKE SIGNALS USING SIGNAL PROCESSING ALGORITHMS
Implemented signal processing algorithms to preprocess the earthquake records in a given dataset for later use. After baseline adjustment, several filters were applied to eliminate long-period noise.
PUBLICATIONS
MIRA: Memory-Integrated Reinforcement Learning Agent with Limited LLM Guidance - Under review at ICLR 2026
DR. SWELL: Dynamic Reasoning and Learning with Symbolic World Model for Embodied LLM-Based Multi-Agent Collaboration - Language, Agent, and World Models (LAW) for Reasoning and Planning Workshop - NeurIPS 2025
CUBE: Collaborative Multi-Agent Block-Pushing Environment for Collective Planning with LLM Agents - Scaling Environments for Agents (SEA) workshop - NeurIPS 2025
AURA: Adaptive Unified Reasoning and Automation based on LLMs and MARL for NextG Cellular Networks - AI4NextG workshop - NeurIPS 2025
Game-theoretic approach presented at the Annenberg Symposium ’25
Actor-Twin Framework for Task Graph Scheduling - Adaptive and Learning Agents (ALA) workshop - AAMAS 2025
Smart Crystal Ball on a Budget: Reinforcement Learning and Prediction for Budget-Friendly Comfort - IEEE ICA 2024
Shorter version presented at the Deployable RL workshop - RLC 2024
CAREForMe: Contextual Multi-Armed Bandit Recommendation Framework for Mental Health - MOBILESoft 2024
Smart Routing with Precise Link Estimation: DSEE-Based Anypath Routing for Reliable Wireless Networking - IEEE ICMLCN 2024
EDUCATION

2024 - Present
Pursuing a PhD
Major in Electrical Engineering / Minor in Math
UNIVERSITY OF SOUTHERN CALIFORNIA
Ming Hsieh Department of Electrical and Computer Engineering
Advisor: Professor Bhaskar Krishnamachari
2022 - 2024
Master's Degree
UNIVERSITY OF SOUTHERN CALIFORNIA
Ming Hsieh Department of Electrical and Computer Engineering
Advisor: Professor Bhaskar Krishnamachari

2017 - 2022
Bachelor's Degree
UNIVERSITY OF TEHRAN
Department of Electrical and Computer Engineering
Program of Study: Telecommunication
HONORS AND AWARDS
Winner of the Annenberg Research and Creative Project Symposium 2025 across the School of Cinematic Arts, the School for Communications & Journalism, and the School of Engineering.
Awarded the Outstanding Poster Award at the 13th Annual Research Festival, University of Southern California.
Awarded the Annenberg Fellowship top-off, University of Southern California.
Offered M.Sc. admission to the Electrical Engineering Department, University of Tehran, as an exceptional-talent student.
Ranked among the top 10 students in the undergraduate class, Electrical Engineering Department, University of Tehran.
Ranked among the top 0.5% in the nationwide university entrance exam in the Mathematics and Physics fields for the B.Sc. degree, 2017.
SKILLS
PROGRAMMING LANGUAGES
Python, R, MATLAB, LaTeX, C++
LIBRARIES AND FRAMEWORKS
PyTorch, Gymnasium (Gym), NumPy, pandas, Tianshou