Research

My research interests lie at the intersection of formal methods, reinforcement learning, and robotics: specifically, how to use the logic- and graph-based reasoning tools that formal methods provide to learn complex robotic skills. My goal is to develop useful integrations of high-level symbolic reasoning with low-level motor learning. Below are selected highlights of my current and past research.

Temporal Logic Guided Safe Reinforcement Learning Using Control Barrier Functions

Propose a system that integrates temporal logic guided reinforcement learning with control barrier functions and control Lyapunov functions.
Introduce three ways to use temporal logic and its equivalent finite state automaton (FSA) in our system: providing rewards for the RL agent, selecting goals for the control Lyapunov function to guide exploration, and defining safe sets for the control barrier functions.
Demonstrate the applicability of our framework in a simulated continuous control task with safety constraints, known system dynamics, and unknown environmental dynamics, and study the use and effectiveness of each component in the system.
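
To illustrate how a control barrier function can keep a learning agent inside a safe set, here is a minimal sketch. It is not the system from the paper. It assumes a hypothetical one-dimensional single integrator (x' = u), a safe set x <= x_max encoded as h(x) = x_max - x >= 0, and a constant nominal action standing in for the RL policy output. The names `cbf_filter`, `simulate`, `x_max`, and `alpha` are illustrative only. For this scalar system the usual quadratic program reduces to a simple clamp.

```python
def cbf_filter(x, u_nominal, x_max=1.0, alpha=2.0):
    """Return the control closest to u_nominal that satisfies the CBF condition.

    Assumed dynamics: x' = u. Barrier: h(x) = x_max - x.
    CBF condition: dh/dt >= -alpha * h(x), i.e. -u >= -alpha * (x_max - x),
    which gives the upper bound u <= alpha * (x_max - x).
    """
    u_bound = alpha * (x_max - x)
    return min(u_nominal, u_bound)


def simulate(x0=0.0, u_nominal=1.5, dt=0.01, steps=500):
    """Roll out the filtered control with forward Euler integration."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = cbf_filter(x, u_nominal)  # u_nominal stands in for the RL action
        x = x + dt * u
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    traj = simulate()
    print(f"final state: {traj[-1]:.4f}, max state: {max(traj):.4f}")
```

The nominal action would drive the state past x_max on its own. The filter leaves it untouched while the state is far from the boundary and scales it back as the boundary approaches.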

paper

Hierarchical Temporal Logic Guided Reinforcement Learning

Provide a hierarchical approach for task definition.
High-level tasks and low-level controls are connected with multiple layers of task abstractions.
Complex tasks can be trained efficiently end-to-end with off-the-shelf RL algorithms, yielding a multi-level hierarchical policy.

paper

Automata Guided Reinforcement Learning With Demonstration

Integrate temporal logic and finite state automata (FSA) with reinforcement learning and demonstrations.
The proposed framework generates intrinsic rewards that align with the overall task objective.
The FSA gives rise to an interpretable and hierarchical structure to the resultant policy.
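
As a rough illustration of how an FSA can supply intrinsic rewards, the sketch below tracks a hypothetical two-step task ("visit a, then visit b") and pays a unit reward whenever the automaton state advances. The automaton, the label names, and the reward rule are assumptions made for this example. They are not the reward definition used in the paper.

```python
# Hypothetical FSA for the task "visit a, then visit b".
# Keys are (current state, observed label); values are next states.
TRANSITIONS = {
    ("q0", "a"): "q1",
    ("q1", "b"): "q_accept",
}


def fsa_step(q, labels):
    """Advance the automaton given the set of labels true at this time step."""
    for label in sorted(labels):
        next_q = TRANSITIONS.get((q, label))
        if next_q is not None:
            return next_q
    return q  # no enabled transition: stay in the current state


def run_episode(label_trace):
    """Return the final FSA state and per-step intrinsic rewards.

    Assumed reward rule: 1.0 when the FSA state changes, 0.0 otherwise.
    """
    q = "q0"
    rewards = []
    for labels in label_trace:
        next_q = fsa_step(q, labels)
        rewards.append(1.0 if next_q != q else 0.0)
        q = next_q
    return q, rewards


if __name__ == "__main__":
    trace = [set(), {"b"}, {"a"}, set(), {"b"}]
    print(run_episode(trace))
```

In this sketch, observing b before a earns nothing because the automaton is still waiting for a. The FSA state also records which sub-task is active, which is one way such a structure can make a policy easier to interpret.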

paper

Reinforcement Learning With Temporal Logic Rewards

Compare the learning performance and quality of learned policies for reinforcement learning agents using temporal logic rewards with those that use heuristic rewards.
Evaluate on temporally dependent robotic manipulation tasks. The video below shows learning of a toast placing task. The robot has to learn where and when to release the bread (task specified in temporal logic).
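
One common way to turn a temporal logic formula into a scalar reward is a quantitative (robustness) semantics, in which "eventually" becomes a maximum over time and "always" becomes a minimum. The sketch below assumes that approach and a hypothetical one-dimensional task: eventually reach within `tol` of `goal`, and always stay below `limit`. The formula, the function names, and the numbers are illustrative only and are not taken from the paper.

```python
def eventually(margins):
    """Robustness of 'eventually p': the best margin over the trajectory."""
    return max(margins)


def always(margins):
    """Robustness of 'always p': the worst margin over the trajectory."""
    return min(margins)


def robustness(trajectory, goal=1.0, tol=0.1, limit=1.5):
    """Robustness of (eventually |x - goal| < tol) and (always x < limit).

    Positive values mean the trajectory satisfies the formula, negative
    values mean it violates it, and the magnitude indicates by how much.
    """
    reach = eventually([tol - abs(x - goal) for x in trajectory])
    safe = always([limit - x for x in trajectory])
    return min(reach, safe)  # conjunction takes the minimum


if __name__ == "__main__":
    print(robustness([0.0, 0.5, 1.0]))  # reaches the goal and stays safe
    print(robustness([0.0, 0.5, 2.0]))  # overshoots the limit
```

A single number per trajectory of this kind can serve as an episode reward without hand-tuning separate heuristic terms for the reaching and safety parts of the task.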

paper

Task Frame Estimation During Model-Based Teleoperation For Satellite Servicing

Develop a model-based tool registration system to be used in conjunction with a hybrid position/force controller for teleoperation in satellite servicing tasks.
The system handles user-specified motions while maintaining the desired contact between the tool and the operating surface (automatic contact registration and error correction).
The system is evaluated on a mock setup composed of a da Vinci master station and a Barrett Whole Arm Manipulator (WAM).

paper

Parameter Estimation And Anomaly Detection While Cutting Insulation During Telerobotic Satellite Servicing

Develop an adaptive force estimation system for automatic anomaly detection in satellite servicing tasks.
The system is evaluated on a mock setup composed of a da Vinci master station and a Barrett Whole Arm Manipulator (WAM) with a custom-developed user interface.

paper