Robot Vision and Learning
My work explores learning with little or no use of labeled data or handcrafted, task-specific reward functions. I have worked on pose estimation using synthetic data, and I am now investigating how to guide goal-conditioned reinforcement learning by letting agents select their own goals.
In robotics it can be very expensive to gather data, and any method that reduces that cost without increasing the burden on the engineer or operator can be very valuable.
My earlier work addresses pose estimation: determining the location and orientation of objects. This matters, for instance, when a robotic arm needs to grasp an object.
Our solution uses autoencoders and CAD models to train a pose estimator for multiple objects simultaneously using only synthetic data. It uses a multi-view approach to handle local minima in its SO(3) outputs. At the time of publication, it performed on par with other state-of-the-art solutions while using only RGB information and synthetic data, and it was both more memory-efficient and faster than many other approaches.
Exploration in Reinforcement Learning
My current work examines how to make reinforcement learning agents explore efficiently.
Many methods, such as ε-greedy, struggle to explore when rewards are sparse. I propose using goal-conditioned hindsight learning together with intrinsic selection and evaluation of goals to guide exploration. I am currently exploring the viability of selecting those goals in the embedding space of world models and autoencoder systems like Dreamer.
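To make the idea of hindsight learning under sparse rewards concrete, here is a minimal sketch of hindsight goal relabeling on a toy 1-D task. It is an illustration only, not the method under development: the function names, the "sample goals from later states in the same episode" strategy, and the 0/-1 reward convention are all assumptions chosen for brevity.

```python
import random

def sparse_reward(achieved, goal):
    """Sparse reward (assumed convention): 0 when the goal is reached, -1 otherwise."""
    return 0.0 if achieved == goal else -1.0

def relabel_with_hindsight(episode, rng, k=2):
    """Return extra transitions whose goals are states the agent actually reached.

    `episode` is a list of (state, action, next_state, goal) tuples.
    For each step, `k` goals are sampled from states achieved at that step or
    later in the same episode, and the reward is recomputed for the new goal.
    """
    relabeled = []
    for t, (state, action, next_state, _goal) in enumerate(episode):
        future_states = [step[2] for step in episode[t:]]
        for _ in range(k):
            new_goal = rng.choice(future_states)
            reward = sparse_reward(next_state, new_goal)
            relabeled.append((state, action, next_state, new_goal, reward))
    return relabeled

# Toy episode: the agent walks right from 0 to 4 but never reaches its goal, 10.
episode = [(s, +1, s + 1, 10) for s in range(4)]
original_rewards = [sparse_reward(ns, g) for (_, _, ns, g) in episode]
extra = relabel_with_hindsight(episode, random.Random(0))
```

Every original transition carries the failure reward, so the episode alone gives the agent nothing to learn from. The relabeled transitions treat states the agent did reach as if they had been the goal, which produces successful examples without any extra environment interaction.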
Dealing with uncertainty is central when working with robots. For example, when grasping an object, the object may not be in a fixed location, so the robot must adapt its trajectory to reach it. When performing vision tasks using a camera, however, uncertainty is usually not accounted for, and the camera is assumed to produce ideal images. In real-world scenarios, images may be far from ideal: a simple lighting change may be enough to produce sub-optimal images.
The long-term goal of this project is to make vision systems more robust by making them adaptable. Adaptability will be achieved by introducing parameters in the camera pipeline, and uncertainty will be accounted for by introducing a measure of confidence. With such a measure, the system will be able to identify and adapt to sub-optimal situations. The current focus of the project is to model confidence by performing out-of-distribution detection using a Normalizing Flow model. Normalizing Flows have previously been used successfully for out-of-distribution detection, so the idea is to construct an out-of-distribution detection problem that separates good images from bad ones in terms of computer vision tasks.
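The likelihood-based view of out-of-distribution detection can be sketched as follows. This is a deliberately tiny stand-in, not the project's model: it uses a single affine layer with a standard-normal base (equivalent to fitting a Gaussian), whereas a real Normalizing Flow stacks many learned invertible layers. The 1-D "mean brightness" feature, the 1% threshold, and all names are hypothetical.

```python
import math
import random
import statistics

class AffineFlow:
    """One-layer flow z = (x - shift) / scale with a standard-normal base."""

    def fit(self, xs):
        # Maximum-likelihood parameters for this single affine layer.
        self.shift = statistics.fmean(xs)
        self.scale = statistics.pstdev(xs)
        return self

    def log_prob(self, x):
        # Change of variables: log p(x) = log N(z; 0, 1) + log |dz/dx|.
        z = (x - self.shift) / self.scale
        log_base = -0.5 * (z * z + math.log(2.0 * math.pi))
        log_det = -math.log(self.scale)
        return log_base + log_det

rng = random.Random(0)
# Hypothetical 1-D image feature (e.g. mean brightness) for "good" images.
train = [rng.gauss(0.5, 0.05) for _ in range(1000)]
flow = AffineFlow().fit(train)

# Assumed rule: flag anything less likely than the lowest 1% of training samples.
train_scores = sorted(flow.log_prob(x) for x in train)
threshold = train_scores[len(train_scores) // 100]

def is_out_of_distribution(x):
    """True when the flow assigns x a log-likelihood below the threshold."""
    return flow.log_prob(x) < threshold

typical_image = 0.5       # near the training mean
overexposed_image = 0.95  # far outside the training range
```

Under these assumptions, `is_out_of_distribution(overexposed_image)` is true while `is_out_of_distribution(typical_image)` is false. The log-likelihood itself could serve as the confidence signal that drives adaptation of the camera parameters.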