Principal Engineer, AIOps - Santa Clara California
Company: NVIDIA Corporation Location: Santa Clara, California
Posted On: 02/02/2025
We are looking for an AIOps Principal Engineer who can design, develop, and deploy AI-powered solutions for IT operations. You will work with a team of engineers, data scientists, and domain experts to create and implement innovative applications that leverage NVIDIA's Observability, Infrastructure and Gen AI platforms. You will also collaborate with internal and external customers to understand their needs, define requirements, and deliver high-quality products.

What you'll be doing:
- Lead the design, development, testing, and deployment of the AIOps platform.
- Apply machine learning, deep learning, natural language processing, and other AI techniques to solve IT operations challenges such as anomaly detection, root cause analysis, incident management, and automation.
- Improve IT Infrastructure and Operations Management by defining and measuring AIOps metrics such as accuracy, reliability, scalability, performance, and efficiency.
- Experience in implementing observability principles and practices such as monitoring, logging, tracing, and alerting.
- Deep knowledge of data science engineering, including data collection, data cleaning, data analysis, data modeling, and data visualization.
- Expertise in integrating AIOps tools with IT operations management (ITOM) and IT service management (ITSM) systems, service desk, change management, configuration management, etc.
- Demonstrate solid leadership skills and ability to lead and empower engineers and data scientists.
- Design and communicate the AIOps roadmap, vision, and strategy to the team and the partners.
- Collaborate effectively with customers, such as IT managers, business users, vendors, and partners, to ensure alignment and satisfaction.
- Play a pivotal role in harnessing AI, generative AI, and machine learning for NVIDIA IT teams.

What we need to see:
- Bachelor's degree or higher in computer science, engineering, or related field (or equivalent experience).
- 15+ years of industry experience in extensive engineering projects, with a particular emphasis on infrastructure automation, distributed systems, and tool development for managing large-scale private or public cloud systems.
- 5+ years of experience working with, and an understanding of, AIOps technologies and platforms.
- Proficient in Python, TensorFlow, PyTorch, or other AI frameworks and libraries.
- Proficiency in Python and Go programming; your coding and debugging expertise are pivotal to your success in this role.
- Demonstrated commitment to sound software engineering principles and a strong willingness to acquire new skills.
- Experience in working with IT systems, tools, and processes such as ITSM, ITOM, monitoring, logging, and alerting.
- Ability to work independently and collaboratively in a fast-paced and dynamic environment.
- Hands-on experience in designing and implementing the end-to-end architecture and large-scale rollout of an AIOps product.
- Experience developing Gen AI applications using LLMs and RAG for incident diagnosis, root cause identification, and incident resolution.

Ways to stand out from the crowd: