AI Runtime Engineer - San Jose California
Company: Advanced Micro Devices
Location: San Jose, California
Posted On: 01/29/2025
WHAT YOU DO AT AMD CHANGES EVERYTHING

We care deeply about transforming lives with AMD technology to enrich our industry, our communities, and the world. Our mission is to build great products that accelerate next-generation computing experiences - the building blocks for the data center, artificial intelligence, PCs, gaming and embedded. Underpinning our mission is the AMD culture. We push the limits of innovation to solve the world's most important challenges. We strive for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives.

The Role

We are building IREE as an open-source compiler and runtime solution to productionize ML across a variety of usage scenarios and hardware targets. Among them, wide and performant GPU support is critical. We aim for broad GPU coverage, from mobile to datacenter, via a unified software stack. This requires us to write the most efficient code to interact with the OS and device drivers, with minimal dependencies and a small binary size. There will be no shortage of intriguing technical challenges to tackle, and there are abundant chances to collaborate with industry experts working at different layers of the stack. If this sounds interesting to you, please don't hesitate to reach out to us!

The Person

An ideal candidate should be familiar with GPU runtime APIs, GPU drivers, GPU architectures, operating systems, parallel/asynchronous programming, and efficient resource management. The candidate should be comfortable performing quantitative analysis of workloads and driving improvements at the suitable layers of the software stack. Most importantly, the candidate is willing to learn and work across boundaries.

Key Responsibilities:
- Design, develop, and maintain GPU-related runtime implementations in IREE over HIP, CUDA, Vulkan, DirectX, and Metal.
- Design, develop, and maintain multi-GPU runtime and communication solutions including collectives.
- Manage testing and releasing of runtime components.
- Quantitatively analyze end-to-end model performance, identify bottlenecks, propose improvements, and prototype and productionize solutions.
- Design and implement compiler passes to better schedule and utilize resources.
- Design and implement Python interactions with runtime components.
- Drive towards general solutions that benefit all GPU targets and the overall community.

Preferred Experience: