Keiran Paster

PhD Student at the University of Toronto · keirp@cs.toronto.edu

I am a third-year PhD student at the University of Toronto, supervised by Sheila McIlraith and Jimmy Ba. My research focuses on endowing artificial agents with the ability to exploit their past experiences to accelerate their learning on new tasks. Currently, I am working on model-based reinforcement learning.

Publications

Planning from Pixels using Inverse Dynamics Models

Keiran Paster, Sheila McIlraith, Jimmy Ba

Presented at the NeurIPS 2020 Deep RL Workshop (Oral).

Published at ICLR 2021.

Code is available on GitHub.

Equilibrium Finding via Asymmetric Self-Play Reinforcement Learning

Jie Tang*, Keiran Paster*, Pieter Abbeel

Presented at the NeurIPS 2018 Deep RL Workshop.

Experience

TA for CSC413/2516, Neural Networks and Deep Learning

University of Toronto

CSC413 is a course that gives an overview of both foundational ideas and recent advances in neural network algorithms. During my time as a TA, I taught a tutorial section, wrote a homework assignment on self-attention and normalizing flow models, and graded student projects.

January 2021 - May 2021

UGSI for CS 189, Introduction to Machine Learning

UC Berkeley

Managed a class of 744 students and personally taught a discussion section of 50 students. CS 189 is a proof-based course where students learn concepts like regression, classification, density estimation, dimensionality reduction, and clustering.

January 2019 - May 2019

Undergraduate Researcher

Worked on deep reinforcement learning algorithms applied to multiagent games. Explored techniques to make agents less exploitable by having them adapt quickly to new opponents.

September 2017 - May 2019

Software Engineering Intern

Integrated Gmail Ads data into a MapReduce pipeline that generates terabytes of clean training data by combining data from multiple sources. Helped with the development of several internal tools, including a Gmail Ads serving stack diagnostic tool, a user model viewer and editor, and a visual user event inspector.

May 2017 - August 2017

Interests

In my opinion, the main factor preventing RL from being as efficient as human learning is the lack of data reuse. While a human shown a new task can draw on previous experience to infer which strategies to try, efficiently exploiting past knowledge in RL remains an open and incredibly important problem. In my research as a PhD student at the University of Toronto, I am working on ways to endow RL agents with the ability to share knowledge between tasks, learn multi-purpose models of their world, and discover high-level, reusable skills. By exploring different ways of modeling previous experience, how these models can be learned, and how best to exploit them, I hope to find a scalable and effective way to bridge the gap between how humans and artificial agents learn to solve problems.