Data Systems Engineer
You’re a data systems engineer who enjoys the challenge of architecting and building amazing, large-scale distributed systems. You have the drive to thrive in a fast-growing environment, crave impactful contributions, and love building technologies that delight stakeholders.
At Toyota Research Institute, we are leading the way on artificial intelligence, autonomous passenger vehicles, and home robots, which all rely on massive amounts of data. Critical to this mission is interactive, distributed big data analytics, including extraction, transformation, high-performance query processing, and rich visualization. In this role, you will work with a top-notch team on the integration of leading data analytics technologies with the big data ecosystem and leading cloud computing frameworks.
- Design and implement features that tightly integrate analytics technologies with the big data ecosystem (Hadoop, Impala, Spark, etc.) to enable highly scalable distributed data processing.
- Engineer smart tools and frameworks that simplify the data stack complexity for our stakeholders.
- Develop and integrate labeling tools, and work with teams to provide ground truth in support of machine learning and simulation.
- Design and implement features for optimal cluster resource management.
- Work on a data platform with AWS and other cloud computing infrastructure.
- In collaboration with our research teams, partner with big data ecosystem vendors and cloud providers to ensure tight and seamless integration with TRI’s data stack.
In addition to your ability to contribute to a collaborative, open, and fun working environment, you should have:
- A BS in Computer Science or related field (MS or PhD is better);
- 2-5+ years of relevant industry experience;
- Experience with database internals, including both query processing and query planning, or with other data processing infrastructure such as SQL, NoSQL, and MapReduce;
- Hands-on experience in core components of the Hadoop stack -- HDFS, Hive, YARN, Impala, Spark;
- Experience building applications on leading cloud computing platforms: AWS (Elastic MapReduce, CloudFormation, CloudFront, S3), Microsoft Azure, or Google Cloud Platform;
- Strong understanding of algorithms in distributed systems, object-oriented design, and scripting using Java, Python, C++, and/or Scala;
- Good troubleshooting skills and willingness to help the field and customer support teams, as needed;
- Ability to juggle priorities, make the right tradeoffs, and deliver features on time while supporting team success and technology leadership for the company;
- Excellent communication, presentation and collaborative problem-solving skills;
- Team player who wants to change the way people use their data and have a huge impact on a growing team working on groundbreaking technologies.
Location: Palo Alto, CA
TRI provides Equal Employment Opportunity without regard to the applicant's race, color, creed, gender, gender identity or expression, sexual orientation, national origin, age, physical or mental disability, medical condition, religion, marital status, genetic information, veteran status, or any other status protected under federal, state or local laws.