I am a third-year PhD student at the University of Toronto, supervised by Sheila McIlraith and Jimmy Ba. My current interest lies in endowing artificial agents with the ability to exploit their past experiences to accelerate their learning on new tasks. I currently work on model-based reinforcement learning.
Presented at the NeurIPS 2020 Deep RL Workshop (Oral).
Published at ICLR 2021.
Code is available on GitHub.
Presented at the NeurIPS 2018 Deep RL Workshop.
CSC413 is a course that surveys both foundational ideas and recent advances in neural network algorithms. During my time as a TA, I taught a tutorial, wrote a homework assignment on self-attention and normalizing flow models, and graded student projects.
Managed a class of 744 students, personally teaching a discussion section with 50 students. CS 189 is a proof-based class where students learn concepts like regression, classification, density estimation, dimensionality reduction, and clustering.
Worked on deep reinforcement learning algorithms applied to multiagent games, exploring techniques to make agents less exploitable by adapting quickly to new opponents.
Integrated Gmail Ads data into a MapReduce pipeline that generates terabytes of clean training data by combining data from multiple sources. Helped with the development of several internal tools, including a Gmail Ads serving stack diagnostic tool, a user model viewer and editor, and a visual user event inspector.
In my opinion, the main factor preventing RL from being as data-efficient as humans is data reuse. While a human shown a new task can draw on previous experience to infer what strategies to try, efficiently exploiting past knowledge in RL remains an open and incredibly important problem. In my research as a PhD student at the University of Toronto, I am working on ways to endow RL agents with the ability to share knowledge between tasks, learn multi-purpose models of their world, and discover high-level, reusable skills. By exploring different ways to model previous experience, how these models can be learned, and how best to exploit them, I hope to find a scalable and effective way to bridge the gap between how humans and artificial agents learn to solve problems.