Alex Bewley

Publications

RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection

Range Sparse Net (RSN) is a simple, efficient, and accurate 3D object detection framework for real-time operation on LiDAR with extensive range. Lightweight 2D convolutions on dense range images select significantly fewer foreground points, enabling the later sparse convolutions in RSN to operate efficiently. RSN runs at more than 60 frames per second on a 150 m x 150 m detection region on the Waymo Open Dataset (WOD) while being more accurate than previously published detectors.

Local Metrics for Multi-Object Tracking

Local metrics provide an intuitive mechanism to explicitly specify the trade-off between detection and association for evaluating object trackers.

Range Conditioned Dilated Convolutions for Scale Invariant 3D Object Detection

A novel 3D object detection framework that processes LiDAR data directly in its native range image representation. To overcome scale sensitivity in this perspective view, a range-conditioned dilation (RCD) layer is proposed to dynamically adjust a continuous dilation rate as a function of the measured range. Unparalleled performance is achieved at long-range detection when combined with a second-stage refinement.
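
As a rough illustration of the idea (not the paper's implementation), a dilation rate can be picked so that a convolution kernel covers about the same metric width at any range. The function name, the nominal kernel width, and the angular resolution below are hypothetical values chosen only for this sketch.

```python
import math


def range_conditioned_dilation(range_m, kernel_width_m=1.0, kernel_size=3,
                               angular_res_rad=math.radians(0.2)):
    """Continuous dilation (in pixels) so the kernel spans ~kernel_width_m at range_m.

    All default values are illustrative assumptions, not taken from the paper.
    """
    if range_m <= 0:
        raise ValueError("range must be positive")
    # Approximate arc length covered by one range-image pixel at this range.
    metres_per_pixel = range_m * angular_res_rad
    # Number of pixels the kernel should span to cover the target metric width.
    span_px = kernel_width_m / metres_per_pixel
    # kernel_size taps are separated by kernel_size - 1 gaps.
    return span_px / (kernel_size - 1)


near = range_conditioned_dilation(5.0)   # close object: large dilation
far = range_conditioned_dilation(50.0)   # distant object: small dilation
```

Under these assumptions the dilation is inversely proportional to range, so an object ten times further away is sampled with a kernel ten times tighter in pixel space.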

Large Scale Outdoor Scene Reconstruction and Correction with Vision

The BOR2G system developed at the Oxford Robotics Institute fuses data from multiple sensor modalities (cameras, lidars, or both) and regularizes the resulting 3D model. We use a compressed 3D data structure which allows us to operate over a large scale. A learned correction mechanism takes the global context of the reconstruction and adjusts the constructed mesh, addressing pathological errors.

Learning to Drive from Simulation without Real World Labels

A method for transferring a vision-based lane-following driving policy from simulation to operation on a rural road without any real-world labels. Our approach leverages recent advances in image-to-image translation to achieve domain transfer while jointly learning a single-camera control policy from simulation control labels.

Dropout Distillation for Efficiently Estimating Model Confidence

An efficient way to output better-calibrated uncertainty scores from neural networks. The Distilled Dropout Network makes standard (non-Bayesian) neural networks more introspective by adding a new training loss.

Learning to Drive in a Day with Deep Reinforcement Learning

This work demonstrates model-free deep reinforcement learning on an autonomous car in the real world. With a handful of exploration and optimisation steps performed on the single onboard NVIDIA DRIVE PX2, our model-free algorithm learnt to follow its lane without any prior map.

Neural Stethoscopes: Unifying Analytic, Auxiliary and Adversarial Network Probing

This work unifies auxiliary tasks, adversarial information removal and side tasks analysis with a single multi-task learning framework we call neural stethoscopes. Neural stethoscopes are then used to interrogate specific visual cues a network learns in the context of intuitive physics. Furthermore, we are able to actively de-bias network predictions as well as enhance performance via suitable auxiliary and adversarial stethoscope losses.

Deep Cosine Metric Learning for Person Re-Identification

This work presents a method for learning a feature embedding where the cosine similarity is effectively optimised through a simple re-parametrization of the conventional softmax classification regime. At test time, the final classification layer can be stripped off the network, facilitating nearest neighbour queries on unseen individuals using the cosine similarity metric.
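
A minimal sketch of such a re-parametrization, assuming unit-normalised features and class weights with a scale factor on the logits. The names `cosine_softmax` and `kappa` and the fixed value of the scale are assumptions made for illustration; the paper's exact formulation may differ.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def cosine_softmax(feature, class_weights, kappa=10.0):
    """Class probabilities from scaled cosine similarities.

    kappa is a hypothetical fixed scale; it could equally be a learnt parameter.
    """
    logits = [kappa * cosine_similarity(feature, w) for w in class_weights]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = cosine_softmax([1.0, 0.0], [[2.0, 0.0], [0.0, 3.0]])
```

Because the logits depend only on angles, the same `cosine_similarity` can be used at test time to compare embeddings of identities that never appeared in training.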

Incremental Adversarial Domain Adaptation for Continually Changing Environments

Continuous appearance shifts such as changes in weather and lighting conditions can impact the performance of deployed machine learning models. Unsupervised domain adaptation aims to address this challenge, though current approaches do not utilise the continuity of the occurring shifts. This work presents an adversarial approach for lifelong, incremental domain adaptation which benefits from unsupervised alignment to a series of sub-domains which successively diverge from the labelled source domain.

Meshed Up: Learnt Error Correction in 3D Reconstructions

Dense reconstructions often contain errors that prior work has so far minimised using high-quality sensors and regularising the output. Nevertheless, errors still persist. This paper proposes a machine learning technique to identify errors in three-dimensional (3D) meshes. Beyond simply identifying errors, our method quantifies both the magnitude and the direction of depth estimate errors when viewing the scene.

Hierarchical Attentive Recurrent Tracking

Inspired by how the human visual cortex employs spatial attention and separate “where” and “what” processing pathways to actively suppress irrelevant visual features, this work develops a hierarchical attentive recurrent model for single object tracking in videos.

DeepSORT: Simple Online and Realtime Tracking with a Deep Association Metric

Building on the success of the SORT tracking framework, this work extends the location-based tracker with appearance-based association optimised via metric learning on a deep neural network.

Addressing Appearance Change in Outdoor Robotics with Adversarial Domain Adaptation

Appearance changes due to weather and seasonal conditions represent a strong impediment to the robust implementation of machine learning systems in outdoor robotics. This work develops a framework for applying adversarial techniques to adapt popular, state-of-the-art network architectures with the additional objective to be invariant across conditions.

What Makes a Place? Building Bespoke Place Dependent Object Detectors for Robotics

This paper is about enabling robots to improve their perceptual performance through repeated use in their operating environment, creating local expert detectors fitted to the places through which a robot moves.

Vision based Detection and Tracking in Dynamic Environments with Minimal Supervision

My PhD thesis, in thesis-by-publication format, composed mainly of papers completed between 2013 and 2016. Submitted in late 2016, accepted in 2017, and finally published publicly in 2018.

SORT: Simple Online and Realtime Tracking

This work presents a fast, yet simple, technique for updating trajectory estimates within an online multiple object tracking framework. Furthermore, the impact of detection quality on tracking is highlighted by achieving state-of-the-art performance on a recent tracking benchmark.

Background Modelling with Applications to Visual Object Detection in an Open Pit Mine

This work investigates the use of appearance-based object detection in an open pit mine. Various forms of background modelling techniques are explored for adapting a pretrained detector to the novel environment.

ALExTRAC: Affinity Learning by Exploring Temporal Reinforcement within Association Chains

This paper presents a self-supervised approach for learning to associate object detections in a video sequence as often required in tracking-by-detection systems.

Fine-Grained Classification via Mixture of Deep Convolutional Neural Networks

A novel deep convolutional neural network (DCNN) architecture is proposed for fine-grained image classification. This architecture, called MixDCNN, combines the output of several DCNNs within a mixture model framework and is shown to outperform other methods.

From ImageNet to Mining: Adapting Visual Object Detection with Minimal Supervision

A background modeling approach to reducing the false positive rate of a pre-trained object detector for use in an open-pit mining environment.

Fine-Grained Bird Species Recognition via Hierarchical Subset Learning

This paper presents a novel method to improve fine-grained classification based on hierarchical subset learning. First, a similarity tree is formed where classes with strong visual correlations are grouped into subsets. An expert local classifier with strong discriminative power to distinguish visually similar classes is then learnt for each subset.

Online Self-Supervised Multi-Instance Segmentation of Dynamic Objects

A training-free method for detecting and tracking moving objects is presented and evaluated with video footage from a moving camera.

Advantages of Exploiting Projection Structure for Segmenting Dense 3D Point Clouds

A simple, yet efficient method for finding nearest neighbours in projected 3D point clouds is presented with applications towards object segmentation.
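
The general idea of exploiting projection structure can be sketched as follows: when points are stored in the 2D grid they were measured in, candidate neighbours can be restricted to adjacent pixels instead of searching the whole cloud. This is only an assumed illustration and not necessarily the paper's algorithm; the function name, grid layout, and window size are hypothetical.

```python
import math


def nearest_in_window(grid, row, col, window=1):
    """Nearest 3D neighbour of grid[row][col], searching only nearby pixels.

    grid is a 2D list of (x, y, z) tuples, with None marking missing returns.
    Returns ((row, col), distance), or (None, inf) if the window holds no points.
    """
    query = grid[row][col]
    if query is None:
        raise ValueError("query pixel has no point")
    best, best_dist = None, math.inf
    for r in range(max(0, row - window), min(len(grid), row + window + 1)):
        for c in range(max(0, col - window), min(len(grid[r]), col + window + 1)):
            if (r, c) == (row, col) or grid[r][c] is None:
                continue
            d = math.dist(query, grid[r][c])  # Euclidean distance in 3D
            if d < best_dist:
                best, best_dist = (r, c), d
    return best, best_dist


grid = [
    [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (0.2, 0.0, 5.0)],
    [(0.0, 0.1, 1.0), None, (0.2, 0.1, 5.0)],
]
index, distance = nearest_in_window(grid, 0, 1)
```

The cost per query depends on the window size rather than the number of points, at the price of being approximate: a true 3D nearest neighbour that projects outside the window is missed.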

Development of a Dragline In-Bucket Bulk Density Monitor

This paper details the implementation and trialling of a prototype in-bucket bulk density monitor on a production dragline.

Real-Time Volume Estimation of a Dragline Payload

This paper presents a method for measuring the in-bucket payload volume on a dragline excavator for the purpose of estimating material bulk density in real-time.