Featured Publications
All Publications
Recent progress in 3D object detection from single images leverages monocular depth estimation as a way to produce 3D point clouds, turning cameras into pseudo-lidar sensors. These two-stage detectors improve with the accuracy of the intermediate depth estimation network, which can itself be improved without manual labels via large-scale self-supervised learning. However, they tend to suffer from overfitting more than end-to-end methods, are more complex, and the gap with similar lidar-based detectors remains significant. In this work, we propose an end-to-end, single-stage, monocular 3D object detector, DD3D, that can benefit from depth pre-training like pseudo-lidar methods, but without their limitations. Our architecture is designed for effective information transfer between depth estimation and 3D detection, allowing us to scale with the amount of unlabeled pre-training data. Our method achieves state-of-the-art results on two challenging benchmarks, with 16.34% and 9.28% AP for Cars and Pedestrians (respectively) on the KITTI-3D benchmark, and 41.5% mAP on NuScenes.
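For readers unfamiliar with the pseudo-lidar step mentioned in the abstract above, the following is a minimal sketch of how a predicted depth map can be back-projected into a 3D point cloud with a pinhole camera model. It illustrates the two-stage baseline idea only, not DD3D itself; the function name, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the toy depth map are assumptions made for illustration.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H*W, 3) point cloud
    in camera coordinates, assuming a pinhole camera model."""
    h, w = depth.shape
    # u indexes image columns, v indexes image rows; both have shape (H, W)
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

# Hypothetical 2x3 depth map with a constant depth of 10 m
depth = np.full((2, 3), 10.0)
points = depth_to_point_cloud(depth, fx=100.0, fy=100.0, cx=1.0, cy=0.5)
```

A pseudo-lidar detector would then feed `points` to a lidar-style 3D detector as a second stage.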
Simulators can efficiently generate large amounts of labeled synthetic data with perfect supervision for hard-to-label tasks like semantic segmentation. However, they introduce a domain gap that severely hurts real-world performance. We propose to use self-supervised monocular depth estimation as a proxy task to bridge this gap and improve sim-to-real unsupervised domain adaptation (UDA). Our Geometric Unsupervised Domain Adaptation method (GUDA) learns a domain-invariant representation via a multi-task objective combining synthetic semantic supervision with real-world geometric constraints on videos. GUDA establishes a new state of the art in UDA for semantic segmentation on three benchmarks, outperforming methods that use domain adversarial learning, self-training, or other self-supervised proxy tasks. Furthermore, we show that our method scales well with the quality and quantity of synthetic data while also improving depth prediction.
In this work, we present DBgen, a Python library that provides a framework for defining extract-transform-load (ETL) pipelines to create and populate SQL databases. DBgen is most useful when the underlying data has complex relationships, requires multi-step analysis, is large-scale, and when the type of data being collected changes frequently. Scientific data often fits this description. With current tooling, defining ETL pipelines for this particularly difficult-to-manage data is so onerous that a great deal of it does not end up being stored in a database and is opaque. DBgen is designed to fill the gap in the current tooling and reduce the barrier to defining ETL pipelines for such data.
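To make the extract-transform-load pattern named in the abstract above concrete, here is a minimal sketch using only Python's standard `sqlite3` module. It does not use or reproduce DBgen's API; the table name, column names, and sample records are hypothetical.

```python
import sqlite3

def extract():
    # Hypothetical raw records: (sample name, comma-separated readings)
    return [("sample_a", "1.5,2.5,3.5"), ("sample_b", "4.0,6.0")]

def transform(rows):
    # Parse each reading string and derive a count and a mean
    for name, raw in rows:
        values = [float(v) for v in raw.split(",")]
        yield name, len(values), sum(values) / len(values)

def load(conn, records):
    # Create the target table if needed, then insert the transformed rows
    conn.execute(
        "CREATE TABLE IF NOT EXISTS measurement "
        "(name TEXT PRIMARY KEY, n_values INTEGER, mean REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO measurement VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
result = conn.execute(
    "SELECT name, n_values, mean FROM measurement ORDER BY name"
).fetchall()
```

Even in this toy form, the schema, the parsing logic, and the insert statement must be kept in sync by hand, which is the kind of overhead the abstract describes for more complex data.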
Surface adsorption is a crucial step in numerous processes, including heterogeneous catalysis, where the adsorption of key species is often used as a descriptor of efficiency. We present here an automated adsorption workflow for semiconductors which employs density functional theory calculations to generate adsorption data in a high-throughput manner. Starting from a bulk structure, the workflow performs an exhaustive surface search, followed by an adsorption structure construction step, which generates a minimal energy landscape to determine the optimal adsorbate–surface distance. An extensive set of energy-based, charge-based, geometric, and electronic descriptors tailored toward catalysis research are computed and saved to a personal user database. The application of the workflow to zinc telluride, a promising CO2 reduction photocatalyst, is presented as a case study to illustrate the capabilities of this method and its potential as a material discovery tool.
Predicting driver intentions is a difficult and crucial task for advanced driver assistance systems. Traditional confidence measures on predictions often ignore the way predicted trajectories affect downstream decisions for safe driving. In this letter, we propose a novel multi-task intent recognition neural network that predicts not only probabilistic driver trajectories, but also utility statistics associated with the predictions for a given downstream task. We establish a decision criterion for parallel autonomy that takes into account the role of driver trajectory prediction in real-time decision making by reasoning about estimated task-specific utility statistics. We further improve the robustness of our system by considering uncertainties in downstream planning tasks that may lead to unsafe decisions. We test our online system on a realistic urban driving dataset, and demonstrate both its advantage in terms of recall and fall-out metrics compared to baseline methods and its effectiveness in intervention and warning use cases.
The rational solid-state synthesis of inorganic compounds is formulated as catalytic nucleation on crystalline reactants, where contributions of reaction and interfacial energies to the nucleation barriers are approximated from high-throughput thermochemical data and structural and interfacial features of crystals, respectively. Favorable synthesis reactions are then identified by a Pareto analysis of relative nucleation barriers and phase selectivities of reactions leading to the target. We demonstrate the application of this approach in reaction planning for the solid-state synthesis of a range of compounds, including the widely studied oxides LiCoO2, BaTiO3, and YBa2Cu3O7, as well as other metal oxide, oxyfluoride, phosphate, and nitride targets. Pathways for enabling the retrosynthesis of inorganics are also discussed.
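The Pareto analysis mentioned in the abstract above can be illustrated with a generic two-objective sketch. This is not the authors' implementation: the reaction names and scores are invented, and the convention that a lower barrier and a higher selectivity are both preferable is an assumption made for the example.

```python
def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates.

    Each candidate is (name, barrier, selectivity). Assumption for this
    sketch: lower barrier is better and higher selectivity is better.
    """
    front = []
    for name, barrier, selectivity in candidates:
        dominated = any(
            # The other candidate is at least as good on both objectives...
            (b2 <= barrier and s2 >= selectivity)
            # ...and strictly better on at least one of them
            and (b2 < barrier or s2 > selectivity)
            for _, b2, s2 in candidates
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical reactions: (name, relative barrier, selectivity score)
reactions = [("R1", 0.2, 0.9), ("R2", 0.5, 0.95), ("R3", 0.6, 0.7), ("R4", 0.1, 0.5)]
front = pareto_front(reactions)
```

Here `R3` is dropped because `R1` is better on both objectives, while the remaining reactions each represent a different trade-off.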
Nonprecious hydrogen evolution reaction (HER) catalysts commonly suffer from severe dissolution under open-circuit potential (OCP). In this work, using calculated Pourbaix diagrams, we quantitatively analyze the stability of a set of well-known active HER catalysts (MoS2, MoP, CoP, Pt in acid, and Ni3Mo in base) under working conditions. We determine that the large thermodynamic driving force toward decomposition created by the electrode/electrolyte interface potential is responsible for the substantial dissolution of nonprecious HER catalysts at OCP. Our analysis further shows the stability of HER catalysts in acidic solution is ordered as Pt ≈ MoS2 > MoP > CoP, which is confirmed by the measured dissolution rates using an inductively coupled plasma mass spectrometer. On the basis of the gained insights, we suggest strategies to circumvent the catalyst dissolution in aqueous solution.
Scanning transmission electron microscopy (STEM) allows for imaging, diffraction, and spectroscopy of materials on length scales ranging from microns to atoms. By using a high-speed, direct electron detector, it is now possible to record a full two-dimensional (2D) image of the diffracted electron beam at each probe position, with the probe positions typically arranged in a 2D grid. These 4D-STEM datasets are rich in information, including signatures of the local structure, orientation, deformation, electromagnetic fields, and other sample-dependent properties. However, extracting this information requires complex analysis pipelines that include data wrangling, calibration, analysis, and visualization, all while maintaining robustness against imaging distortions and artifacts. In this paper, we present py4DSTEM, an analysis toolkit for measuring material properties from 4D-STEM datasets, written in the Python language and released with an open-source license. We describe the algorithmic steps for dataset calibration and various 4D-STEM property measurements in detail and present results from several experimental datasets. We also implement a simple and universal file format appropriate for electron microscopy data in py4DSTEM, which uses the open-source HDF5 standard. We hope this tool will benefit the research community and help improve the standards for data and computational methods in electron microscopy, and we invite the community to contribute to this ongoing project.
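The 4D data layout described in the abstract above (a 2D diffraction pattern at each position of a 2D probe grid) can be sketched with plain NumPy. The example below does not use the py4DSTEM API; the array sizes, the synthetic Poisson data, and the function name `virtual_image` are assumptions for illustration. It computes a simple virtual image by summing the detector pixels inside a circular mask for every probe position.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical dataset: 4x5 probe positions, one 16x16 diffraction pattern each
data = rng.poisson(2.0, size=(4, 5, 16, 16)).astype(float)

def virtual_image(data, center, radius):
    """Sum the intensity inside a circular detector mask at each probe position."""
    qx, qy = np.indices(data.shape[2:])
    mask = (qx - center[0]) ** 2 + (qy - center[1]) ** 2 <= radius ** 2
    # The mask broadcasts over the two real-space (probe position) axes
    return (data * mask).sum(axis=(2, 3))

bright_field = virtual_image(data, center=(8, 8), radius=3)
# Mean diffraction pattern, averaged over all probe positions
mean_pattern = data.mean(axis=(0, 1))
```

The result `bright_field` has one value per probe position, while `mean_pattern` keeps the detector dimensions.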
Successful robotic operation in stochastic environments relies on accurate characterization of the underlying probability distributions, yet this is often imperfect due to limited knowledge. This work presents a control algorithm that is capable of handling such distributional mismatches. Specifically, we propose a novel nonlinear MPC for distributionally robust control, which plans locally optimal feedback policies against a worst-case distribution within a given KL divergence bound from a Gaussian distribution. Leveraging mathematical equivalence between distributionally robust control and risk-sensitive optimal control, our framework also provides an algorithm to dynamically adjust the risk-sensitivity level online for risk-sensitive control. The benefits of the distributional robustness as well as the automatic risk-sensitivity adjustment are demonstrated in a dynamic collision avoidance scenario where the predictive distribution of human motion is erroneous.
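The equivalence mentioned in the abstract above can be illustrated on a toy discrete problem using a standard duality result: the worst-case expected cost over all distributions q with KL(q || p) ≤ d equals the minimum over θ > 0 of (log E_p[exp(θ·cost)] + d) / θ. The sketch below is not the paper's algorithm; it uses a discrete nominal distribution rather than a Gaussian, a grid search over θ, and invented numbers, all of which are assumptions for illustration.

```python
import numpy as np

p = np.array([0.5, 0.3, 0.2])      # nominal distribution (hypothetical)
cost = np.array([1.0, 2.0, 5.0])   # cost of each outcome (hypothetical)
d = 0.1                            # KL divergence bound

# Risk-sensitive objective for each candidate risk-sensitivity level theta:
# (log E_p[exp(theta * cost)] + d) / theta
thetas = np.linspace(0.01, 5.0, 5000)
log_mgf = np.log(np.exp(np.outer(thetas, cost)) @ p)
objective = (log_mgf + d) / thetas
best = np.argmin(objective)
theta_star, worst_case_cost = thetas[best], objective[best]

# The worst-case distribution is an exponential tilting of the nominal one
q = p * np.exp(theta_star * cost)
q /= q.sum()
kl = np.sum(q * np.log(q / p))     # approximately d at the optimum
nominal_cost = p @ cost
```

At the minimizing θ, the tilted distribution `q` sits approximately on the KL bound and its expected cost `q @ cost` approximately matches `worst_case_cost`, which exceeds `nominal_cost`. Choosing θ by minimizing this objective is one way to picture adjusting the risk-sensitivity level from a given KL bound.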
Multi-object tracking is an important ability for an autonomous vehicle to safely navigate a traffic scene. The current state of the art follows the tracking-by-detection paradigm where existing tracks are associated with detected objects through some distance metric. The key challenges to increase tracking accuracy lie in data association and track life cycle management. We propose a probabilistic, multi-modal, multi-object tracking system consisting of different trainable modules to provide robust and data-driven tracking results. First, we learn how to fuse features from 2D images and 3D LiDAR point clouds to capture the appearance and geometric information of an object. Second, we propose to learn a metric that combines the Mahalanobis and feature distances when comparing a track and a new detection in data association. And third, we propose to learn when to initialize a track from an unmatched object detection. Through extensive quantitative and qualitative results, we show that our method outperforms the current state of the art on the NuScenes Tracking dataset.
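The data association step described in the abstract above can be sketched as follows. This is a simplified illustration, not the paper's method: the combination of Mahalanobis and feature distances is a fixed weighted sum here (the paper learns it), matching is greedy, and the weight `alpha`, the threshold, and all track and detection values are hypothetical.

```python
import numpy as np

def mahalanobis_sq(mean, cov, detection):
    """Squared Mahalanobis distance between a track state and a detection."""
    diff = detection - mean
    return float(diff @ np.linalg.solve(cov, diff))

def cosine_distance(a, b):
    return 1.0 - float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def associate(tracks, detections, alpha=0.5, threshold=5.0):
    """Greedily match tracks to detections using a combined distance."""
    pairs = sorted(
        (alpha * mahalanobis_sq(t["mean"], t["cov"], d["pos"])
         + (1 - alpha) * cosine_distance(t["feat"], d["feat"]), i, j)
        for i, t in enumerate(tracks) for j, d in enumerate(detections)
    )
    matches, used_t, used_d = [], set(), set()
    for dist, i, j in pairs:
        if dist > threshold:
            break  # all remaining pairs are too far apart
        if i not in used_t and j not in used_d:
            matches.append((i, j))
            used_t.add(i)
            used_d.add(j)
    unmatched = [j for j in range(len(detections)) if j not in used_d]
    return matches, unmatched

tracks = [
    {"mean": np.array([0.0, 0.0]), "cov": np.eye(2), "feat": np.array([1.0, 0.0])},
    {"mean": np.array([10.0, 10.0]), "cov": np.eye(2), "feat": np.array([0.0, 1.0])},
]
detections = [
    {"pos": np.array([0.5, 0.0]), "feat": np.array([1.0, 0.0])},
    {"pos": np.array([10.0, 9.0]), "feat": np.array([0.0, 1.0])},
    {"pos": np.array([30.0, 30.0]), "feat": np.array([1.0, 0.0])},
]
matches, unmatched = associate(tracks, detections)
```

The detections left in `unmatched` are the candidates for which a tracker must decide whether to initialize a new track.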