All Publications

Active Learning Accelerated Discovery of Stable Iridium Oxide Polymorphs for the Oxygen Evolution Reaction
Energy & Materials | June 18, 2020

The discovery of high-performing and stable materials for sustainable energy applications is a pressing goal in catalysis and materials science. Understanding the relationship between a material’s structure and functionality is an important step in the process, such that viable polymorphs for a given chemical composition need to be identified. Machine-learning-based surrogate models have the potential to accelerate the search for polymorphs that target specific applications. Herein, we report a readily generalizable active-learning (AL) accelerated algorithm for identification of electrochemically stable iridium oxide polymorphs of IrO2 and IrO3. The search is coupled to a subsequent analysis of the electrochemical stability of the discovered structures for the acidic oxygen evolution reaction (OER). Structural candidates are generated by identifying all 956 structurally unique AB2 and AB3 prototypes in existing materials databases (more than 38000). Next, using an active learning approach, we find 196 IrO2 polymorphs within the thermodynamic amorphous synthesizability limit and reaffirm the global stability of the rutile structure. We find 75 synthesizable IrO3 polymorphs and report a previously unknown FeF3-type structure as the most stable, termed α-IrO3. To test the algorithm's performance, we compare it to a random search of the candidate space and report at least a 2-fold increase in the rate of discovery. Additionally, the AL approach can acquire the most stable polymorphs of IrO2 and IrO3 with fewer than 30 density functional theory optimizations. Analysis of the structural properties of the discovered polymorphs reveals that octahedral local coordination environments are preferred for nearly all low-energy structures. Subsequent Pourbaix Ir–H2O analysis shows that α-IrO3 is the globally stable solid phase under acidic OER conditions and supersedes the stability of rutile IrO2. Calculation of theoretical OER surface activities reveals ideal weaker binding of the OER intermediates on α-IrO3 than on any other considered iridium oxide. We emphasize that the proposed AL algorithm can be easily generalized to search for any binary metal oxide structure with a defined stoichiometry.
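
The loop described in this abstract (fit a surrogate, pick the most promising candidate, run the expensive calculation, repeat) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `true_energy` is a hypothetical stand-in for a DFT optimization, the nearest-neighbor surrogate replaces whatever model the authors used, and the one-dimensional candidates are not real crystal prototypes.

```python
import random

def true_energy(x):
    """Hypothetical stand-in for an expensive DFT optimization."""
    return (x - 0.3) ** 2

def predict(x, evaluated):
    """Toy surrogate: energy of the nearest evaluated candidate,
    with the distance to it used as a crude uncertainty."""
    nearest = min(evaluated, key=lambda seen: abs(seen - x))
    return evaluated[nearest], abs(nearest - x)

def active_learning_search(candidates, budget, seed_size=3, kappa=1.0, rng=None):
    rng = rng or random.Random(0)
    # Seed the surrogate with a few randomly chosen candidates.
    evaluated = {x: true_energy(x) for x in rng.sample(candidates, seed_size)}
    while len(evaluated) < budget:
        remaining = [x for x in candidates if x not in evaluated]
        if not remaining:
            break

        def acquisition(x):
            # Lower confidence bound: favor low predicted energy and high uncertainty.
            mean, uncertainty = predict(x, evaluated)
            return mean - kappa * uncertainty

        chosen = min(remaining, key=acquisition)
        evaluated[chosen] = true_energy(chosen)  # the "expensive" step
    return evaluated

def random_search(candidates, budget, rng=None):
    """Baseline: spend the same budget on uniformly sampled candidates."""
    rng = rng or random.Random(0)
    return {x: true_energy(x) for x in rng.sample(candidates, budget)}

candidates = [i / 100 for i in range(101)]
al = active_learning_search(candidates, budget=15)
rs = random_search(candidates, budget=15)
best_al = min(al, key=al.get)
best_rs = min(rs, key=rs.get)
print("active learning best:", best_al, al[best_al])
print("random search best:", best_rs, rs[best_rs])
```

Both searches spend the same evaluation budget, which is the comparison the abstract reports; the sketch makes no claim about reproducing the reported speed-up.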

Machine learning for continuous innovation in battery technologies
Energy & Materials | June 15, 2020

TRI Authors: Muratahan Aykol, Patrick Herring, & Abraham Anapolsky

All Authors: Muratahan Aykol, Patrick Herring, & Abraham Anapolsky

Batteries, as complex materials systems, pose unique challenges for the application of machine learning. Although a shift to data-driven, machine learning-based battery research has started, new initiatives in academia and industry are needed to fully exploit its potential.

Citation: Aykol, Muratahan, Patrick Herring, Abraham Anapolsky. “Machine learning for continuous innovation in battery technologies.” Nature Reviews Materials (2020). https://doi.org/10.1038/s41578-020-0216-y 

 

Spatio‑Temporal Graph for Video Captioning with Knowledge Distillation
Automated Driving, Robotics | June 14, 2020

TRI Authors: KH Lee, A. Gaidon

All Authors: B. Pan, H. Cai, DA Huang, KH Lee, A. Gaidon, E. Adeli, JC Niebles

Video captioning is a challenging task that requires a deep understanding of visual scenes. State-of-the-art methods generate captions using either scene-level or object-level information but without explicitly modeling object interactions. Thus, they often fail to make visually grounded predictions, and are sensitive to spurious correlations. In this paper, we propose a novel spatio-temporal graph model for video captioning that exploits object interactions in space and time. Our model builds interpretable links and is able to provide explicit visual grounding. To avoid unstable performance caused by the variable number of objects, we further propose an object-aware knowledge distillation mechanism, in which local object information is used to regularize global scene features. We demonstrate the efficacy of our approach through extensive experiments on two benchmarks, showing our approach yields competitive performance with interpretable predictions.
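
As a rough illustration of how one branch can regularize another through distillation, the sketch below computes a temperature-softened KL divergence between two sets of logits. It is a generic distillation loss written under assumptions: the names `object_logits` and `scene_logits`, the temperature value, and the choice of KL divergence are illustrative and are not taken from the paper.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def distillation_loss(object_logits, scene_logits, temperature=2.0):
    """Penalize the scene branch for disagreeing with the object branch."""
    teacher = softmax(object_logits, temperature)
    student = softmax(scene_logits, temperature)
    return kl_divergence(teacher, student)

# Hypothetical logits over a three-word vocabulary.
object_logits = [2.0, 0.5, -1.0]
scene_logits = [1.0, 1.0, 0.0]
loss = distillation_loss(object_logits, scene_logits)
same = distillation_loss(object_logits, object_logits)
print("loss when branches disagree:", loss)
print("loss when branches agree:", same)
```

The loss vanishes when the two branches predict the same distribution and grows as they diverge, which is what makes it usable as a regularizer.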

Citation: Pan, Boxiao, Haoye Cai, De-An Huang, Kuan-Hui Lee, Adrien Gaidon, Ehsan Adeli, and Juan Carlos Niebles. "Spatio-Temporal Graph for Video Captioning with Knowledge Distillation." CVPR, 2020.

 

Real‑Time Panoptic Segmentation from Dense Detections
Automated Driving, Robotics | June 14, 2020

TRI Authors: J. Li, A. Bhargava, A. Raventos, V. Guizilini, C. Fang, A. Gaidon

All Authors: R. Hou, J. Li, A. Bhargava, A. Raventos, V. Guizilini, C. Fang, J Lynch, A. Gaidon

Panoptic segmentation is a complex full scene parsing task requiring simultaneous instance and semantic segmentation at high resolution. Current state-of-the-art approaches cannot run in real-time, and simplifying these architectures to improve efficiency severely degrades their accuracy. In this paper, we propose a new single-shot panoptic segmentation network that leverages dense detections and a global self-attention mechanism to operate in real-time with performance approaching the state of the art. We introduce a novel parameter-free mask construction method that substantially reduces computational complexity by efficiently reusing information from the object detection and semantic segmentation sub-tasks. The resulting network has a simple data flow that does not require feature map re-sampling or clustering post-processing, enabling significant hardware acceleration. Our experiments on the Cityscapes and COCO benchmarks show that our network works at 30 FPS on 1024x2048 resolution, trading a 3% relative performance degradation from the current state of the art for up to 440% faster inference.

Citation: Hou, Rui, Jie Li, Arjun Bhargava, Allan Raventos, Vitor Guizilini, Chao Fang, Jerome Lynch, and Adrien Gaidon. "Real-Time Panoptic Segmentation from Dense Detections." CVPR 2020.

 

Autolabeling 3D Objects with Differentiable Rendering of SDF Shape Priors
Automated Driving, Robotics | June 14, 2020

TRI Authors: W. Kehl, A. Bhargava, A. Gaidon

All Authors: S. Zakharov, W. Kehl, A. Bhargava, A. Gaidon

We present an automatic annotation pipeline to recover 9D cuboids and 3D shapes from pre-trained off-the-shelf 2D detectors and sparse LIDAR data. Our autolabeling method solves an ill-posed inverse problem by considering learned shape priors and optimizing geometric and physical parameters. To address this challenging problem, we apply a novel differentiable shape renderer to signed distance fields (SDF), leveraged together with normalized object coordinate spaces (NOCS). Initially trained on synthetic data to predict shape and coordinates, our method uses these predictions for projective and geometric alignment over real samples. Moreover, we also propose a curriculum learning strategy, iteratively retraining on samples of increasing difficulty in subsequent self-improving annotation rounds. Our experiments on the KITTI3D dataset show that we can recover a substantial number of accurate cuboids, and that these autolabels can be used to train 3D vehicle detectors with state-of-the-art results.

Citation: Zakharov, Sergey, Wadim Kehl, Arjun Bhargava, and Adrien Gaidon. "Autolabeling 3D Objects with Differentiable Rendering of SDF Shape Priors." CVPR, 2020.

 

3D Packing for Self‑Supervised Monocular Depth Estimation
Robotics | June 14, 2020

TRI Authors: V. Guizilini, R. Ambrus, S. Pillai, A. Raventos, A. Gaidon

All Authors: V. Guizilini, R. Ambrus, S. Pillai, A. Raventos, A. Gaidon

Although cameras are ubiquitous, robotic platforms typically rely on active sensors like LiDAR for direct 3D perception. In this work, we propose a novel self-supervised monocular depth estimation method combining geometry with a new deep network, PackNet, learned only from unlabeled monocular videos. Our architecture leverages novel symmetrical packing and unpacking blocks to jointly learn to compress and decompress detail-preserving representations using 3D convolutions. Although self-supervised, our method outperforms other self-, semi-, and fully supervised methods on the KITTI benchmark. The 3D inductive bias in PackNet enables it to scale with input resolution and number of parameters without overfitting, generalizing better on out-of-domain data such as the NuScenes dataset. Furthermore, it does not require large-scale supervised pretraining on ImageNet and can run in real-time. Finally, we release DDAD (Dense Depth for Automated Driving), a new urban driving dataset with more challenging and accurate depth evaluation, thanks to longer-range and denser ground-truth depth generated from high-density LiDARs mounted on a fleet of self-driving cars operating worldwide.
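
One way to reduce spatial resolution without discarding pixels is a space-to-depth rearrangement, which folds each small spatial patch into extra channels and can be inverted exactly. The sketch below shows only that lossless rearrangement on nested lists; treating it as the core of a packing block is an assumption made here, and the learned 3D convolutions mentioned in the abstract are omitted.

```python
def space_to_depth(image, block=2):
    """Fold each block x block patch into channels:
    H x W  ->  (block * block) channels of size H/block x W/block."""
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("height and width must be divisible by block")
    channels = []
    for dy in range(block):
        for dx in range(block):
            channels.append([
                [image[y + dy][x + dx] for x in range(0, width, block)]
                for y in range(0, height, block)
            ])
    return channels

def depth_to_space(channels, block=2):
    """Exact inverse of space_to_depth."""
    small_h, small_w = len(channels[0]), len(channels[0][0])
    image = [[None] * (small_w * block) for _ in range(small_h * block)]
    for index, channel in enumerate(channels):
        dy, dx = divmod(index, block)  # channel index encodes the patch offset
        for y in range(small_h):
            for x in range(small_w):
                image[y * block + dy][x * block + dx] = channel[y][x]
    return image

image = [[row * 4 + col for col in range(4)] for row in range(4)]
packed = space_to_depth(image)
restored = depth_to_space(packed)
print(packed[0])          # top-left pixel of every 2 x 2 patch
print(restored == image)  # the round trip loses nothing
```

Because the round trip is exact, any loss of detail in such a design would come from the learned layers around the rearrangement rather than from the downsampling itself.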

Citation: Guizilini, Vitor, Rares Ambrus, Sudeep Pillai, Allan Raventos, and Adrien Gaidon. "PackNet-SfM: 3D Packing for Self-Supervised Monocular Depth Estimation." CVPR, 2020.

A Review on Challenges and Successes in Atomic-Scale Design of Catalysts for Electrochemical Synthesis of Hydrogen Peroxide
Energy & Materials | June 10, 2020

Hydrogen peroxide is a valuable chemical oxidant with a wide range of applications in a variety of industrial processes, especially in water sanitization. Electrochemical synthesis of hydrogen peroxide (H2O2) through a two-electron oxygen reduction reaction (2e-ORR) or a two-electron water oxidation reaction (2e-WOR) has emerged as an appealing process for on-site production of this chemically valuable oxidant. On-site produced H2O2 can be applied for wastewater treatment in remote locations or any applications where H2O2 is needed as an oxidizing agent. This Review surveys theoretical efforts to understand the challenges in catalysis for electrochemical synthesis of H2O2 and to provide design principles for more efficient catalyst materials.

Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction
Automated Driving, Robotics | May 31, 2020

TRI Authors: KH Lee, A. Gaidon

All Authors: B. Liu, E. Adeli, Z. Cao, KH Lee, A. Shenoi, A. Gaidon, JC Niebles

Reasoning over visual data is a desirable capability for robotics and vision-based applications. Such reasoning enables forecasting the next events or actions in videos. In recent years, various models have been developed based on convolution operations for prediction or forecasting, but they lack the ability to reason over spatiotemporal data and infer the relationships of different objects in the scene. In this letter, we present a framework based on graph convolution to uncover the spatiotemporal relationships in the scene for reasoning about pedestrian intent. A scene graph is built on top of segmented object instances within and across video frames. Pedestrian intent, defined as the future action of crossing or not-crossing the street, is a crucial piece of information for autonomous vehicles to navigate safely and more smoothly. We approach the problem of intent prediction from two different perspectives and anticipate the intention-to-cross within both pedestrian-centric and location-centric scenarios. In addition, we introduce a new dataset designed specifically for autonomous-driving scenarios in areas with dense pedestrian populations: the Stanford-TRI Intent Prediction (STIP) dataset. Our experiments on STIP and another benchmark dataset show that our graph modeling framework is able to predict the intention-to-cross of the pedestrians with an accuracy of 79.10% on STIP and 79.28% on the Joint Attention for Autonomous Driving (JAAD) dataset, up to one second earlier than when the actual crossing happens. These results outperform the baseline and previous work.
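
To make the idea of graph convolution over a scene graph concrete, here is a minimal sketch of a single mean-aggregation step: each node averages its own features with those of its neighbors, applies a linear map, and passes the result through a ReLU. The node names, feature values, and identity weight matrix are hypothetical, and this is not the authors' architecture.

```python
def graph_convolution(features, edges, weight):
    """One mean-aggregation graph convolution step followed by a ReLU.

    features: dict mapping node -> list of floats
    edges:    list of undirected (u, v) pairs
    weight:   matrix (list of rows) of shape out_dim x in_dim
    """
    neighbors = {node: {node} for node in features}  # include self-loops
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    updated = {}
    for node, hood in neighbors.items():
        dim = len(features[node])
        # Average the features over the node and its neighbors.
        mean = [sum(features[n][k] for n in hood) / len(hood) for k in range(dim)]
        # Linear projection, then ReLU.
        projected = [sum(w * m for w, m in zip(row, mean)) for row in weight]
        updated[node] = [max(0.0, value) for value in projected]
    return updated

# Hypothetical scene graph: object instances as nodes, spatial relations as edges.
features = {
    "pedestrian": [1.0, 0.0],
    "car": [0.0, 1.0],
    "crosswalk": [1.0, 1.0],
}
edges = [("pedestrian", "crosswalk"), ("car", "crosswalk")]
identity = [[1.0, 0.0], [0.0, 1.0]]

out = graph_convolution(features, edges, identity)
print(out["pedestrian"])
print(out["crosswalk"])
```

After one step the pedestrian node already carries information about the crosswalk it is connected to; stacking such steps, or adding edges across frames, spreads information further through the graph.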

Citation: Liu, Bingbin, Ehsan Adeli, Zhangjie Cao, Kuan-Hui Lee, Abhijeet Shenoi, Adrien Gaidon, and Juan Carlos Niebles. "Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction." IEEE Robotics and Automation Letters 5, no. 2 (2020): 3485-3492.

 

Neural Outlier Rejection for Self‑Supervised Keypoint Learning
Automated Driving | April 27, 2020

TRI Authors: Jiexiong Tang (intern), Hanme Kim, Vitor Guizilini, Sudeep Pillai, Rares Ambrus

All Authors: Jiexiong Tang, Hanme Kim, Vitor Guizilini, Sudeep Pillai, Rares Ambrus

Identifying salient points in images is a crucial component for visual odometry, Structure-from-Motion or SLAM algorithms. Recently, several learned keypoint methods have demonstrated compelling performance on challenging benchmarks. However, generating consistent and accurate training data for interest-point detection in natural images remains challenging, especially for human annotators. We introduce IO-Net (i.e. InlierOutlierNet), a novel proxy task for the self-supervision of keypoint detection, description and matching. By making the sampling of inlier-outlier sets from point-pair correspondences fully differentiable within the keypoint learning framework, we show that we are able to simultaneously self-supervise keypoint description and improve keypoint matching. We also introduce KeyPointNet, a keypoint-network architecture that is especially amenable to robust keypoint detection and description. We design the network to allow local keypoint aggregation to avoid artifacts due to spatial discretizations commonly used for this task, and we improve fine-grained keypoint descriptor performance by taking advantage of efficient sub-pixel convolutions to upsample the descriptor feature-maps to a higher operating resolution. Through extensive experiments and ablative analysis, we show that the proposed self-supervised keypoint learning method greatly improves the quality of feature matching and homography estimation on challenging benchmarks over the state-of-the-art.

Citation: Tang, Jiexiong, Hanme Kim, Vitor Guizilini, Sudeep Pillai, and Rares Ambrus. "Neural Outlier Rejection for Self-Supervised Keypoint Learning." ICLR 2020.

 

Semantically‑Guided Representation Learning for Self‑Supervised Monocular Depth
Automated Driving, Robotics | April 26, 2020

TRI Authors: Vitor Guizilini, Jie Li, Rares Ambrus, Adrien Gaidon

All Authors: Vitor Guizilini, Rui Hou, Jie Li, Rares Ambrus, Adrien Gaidon

Self-supervised learning is showing great promise for monocular depth estimation, using geometry as the only source of supervision. Depth networks are indeed capable of learning representations that relate visual appearance to 3D properties by implicitly leveraging category-level patterns. In this work we investigate how to leverage this semantic structure more directly to guide geometric representation learning, while remaining in the self-supervised regime. Instead of using semantic labels and proxy losses in a multi-task approach, we propose a new architecture leveraging fixed pretrained semantic segmentation networks to guide self-supervised representation learning via pixel-adaptive convolutions. Furthermore, we propose a two-stage training process to overcome a common semantic bias on dynamic objects via resampling. Our method improves upon the state of the art for self-supervised monocular depth prediction over all pixels, fine-grained details, and per semantic category.
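
A pixel-adaptive convolution reweights an ordinary convolution filter by how similar the guidance features are at the two positions involved, so that smoothing does not leak across boundaries visible in the guidance signal. The one-dimensional sketch below assumes a Gaussian similarity kernel, zero padding, and an odd-length filter; the `guidance` values stand in for features from a pretrained semantic network and are hypothetical.

```python
import math

def pixel_adaptive_conv1d(values, guidance, weights):
    """1D pixel-adaptive convolution with zero padding.

    out[i] = sum_j K(guidance[i], guidance[j]) * weights[j - i + radius] * values[j]
    where K is a Gaussian similarity kernel on the guidance features.
    Assumes len(weights) is odd.
    """
    radius = len(weights) // 2
    output = []
    for i in range(len(values)):
        total = 0.0
        for offset, w in enumerate(weights):
            j = i + offset - radius
            if 0 <= j < len(values):  # positions outside the signal contribute zero
                similarity = math.exp(-0.5 * (guidance[i] - guidance[j]) ** 2)
                total += similarity * w * values[j]
        output.append(total)
    return output

values = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
box = [1 / 3, 1 / 3, 1 / 3]

# Constant guidance: every similarity is 1, so this is a plain box filter.
flat_guidance = [0.0] * 6
# Guidance with a sharp boundary between positions 2 and 3.
edge_guidance = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0]

plain = pixel_adaptive_conv1d(values, flat_guidance, box)
guided = pixel_adaptive_conv1d(values, edge_guidance, box)
print("plain filter at the boundary:", plain[2])
print("guided filter at the boundary:", guided[2])
```

With constant guidance the operation reduces to a standard convolution, while a boundary in the guidance suppresses the contribution from the far side of the edge. Note that this sketch does not renormalize the weights, so the guided output at the boundary is smaller than the local average.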

Citation: Guizilini, Vitor, Rui Hou, Jie Li, Rares Ambrus, and Adrien Gaidon. "Semantically-Guided Representation Learning for Self-Supervised Monocular Depth." ICLR 2020.

 
