Site Reliability Engineer (TS/SCI + CI Poly) - Herndon, Virginia
Company: The DarkStar Group
Location: Herndon, Virginia
Posted On: 02/02/2025
Description

The DarkStar Group is seeking a Site Reliability Engineer with a TS/SCI + CI Poly clearance to join one of our top projects in Herndon, VA. Below is an overview of the project, as well as information on our company and our benefits.

THE PROJECT

The DarkStar Group's team solves unique and challenging intelligence problems for a Special Operations customer. This work is as close to the mission as a technologist can get, so the environment is fast-paced: team members face rapidly changing requirements and priorities as mission needs evolve. If you hate monotony and want to use your skills to have a direct impact on real-world operational success, this is the project for you.

We are a multi-faceted software development and systems administration team working to build and maintain software applications backed by a self-managed cloud infrastructure (OpenStack) with a true big-data footprint (over 10 petabytes). Our diverse background of experience in mission support and software development serves as a catalyst to solve unique and challenging intelligence problems in support of special operations analysts and their ongoing activities. Prototyping and frequent, iterative feedback are core to our delivery approach, anchored by a need to work quickly in support of our missions.

The technical stack is quite robust and includes Java, Python, C#, C/C++, Geospatial tools, Big Data and Graph Products (Hadoop, MapReduce, Spark, Elasticsearch, Neo4j), Linux, OpenStack, AWS, Ansible, SQL/NoSQL, Text Processing, Cloud Services, Containerization, Infrastructure as Code (IaC), and more.

Work on this program takes place in the Herndon, VA area (we cannot support remote work) and requires a TS clearance and a willingness to obtain a CI Poly; a current TS/SCI + CI Poly is preferred.

THE ROLE

The DarkStar Group is seeking a Site Reliability Engineer (SRE) for our OpenShift PaaS organization. You will be responsible for ensuring the availability, performance, and scalability of our OpenShift environments. You will collaborate with development, operations, and product teams to automate processes, build robust monitoring systems, and enhance the overall reliability of our platforms.

Key Responsibilities:
- System Reliability & Scalability: Design, implement, and maintain highly available OpenShift clusters to support mission-critical applications.
- Automation & Infrastructure as Code (IaC): Develop and maintain automation scripts and tools to streamline deployment, scaling, and recovery processes using tools like Ansible, Terraform, and Helm.
- Monitoring & Incident Management: Build and enhance monitoring and alerting systems (e.g., Prometheus, Grafana, ELK). Respond to and resolve incidents, conducting post-mortem analyses to identify root causes.
- Performance Optimization: Analyze and optimize system performance, ensuring minimal latency and maximum throughput.
- Collaboration: Work closely with development teams to implement DevOps best practices, CI/CD pipelines, and platform enhancements.
- Security & Compliance: Ensure platforms meet security and compliance requirements by integrating tools for vulnerability scanning, policy enforcement, and logging.

Required Skills:
- Bachelor's degree in Computer Science, Engineering, or equivalent experience.
- Minimum of 5 years of experience as an SRE, DevOps Engineer, or in a related role.
- Expertise in OpenShift or Kubernetes platform administration.
- Strong knowledge of Linux systems, networking, and containerization technologies (Docker).
- Proficiency in scripting languages such as Python, Bash, or Go.
- Experience with CI/CD pipelines (e.g., Jenkins, GitLab CI/CD).
- Familiarity with monitoring and logging tools like Prometheus, Grafana, ELK, or Splunk.

Desired Skills (Optional):
- OpenShift certification (e.g., Red Hat Certified Specialist in OpenShift Administration).
- Experience with cloud platforms (AWS, Azure, or GCP).
- Knowledge of service mesh technologies (Istio, Linkerd).
- Strong understanding of microservices and distributed systems architecture.

About The DarkStar Group

Our Company

The DarkStar Group is a small business that solves BIG problems. We're one of the Inc. 5000 fastest-growing private companies in the US, and our engineers and scientists support the most critical national security missions in Virginia, Maryland, and elsewhere. Data Science, Software Engineering, Cloud/AWS Infrastructure, and Cyber/CNO are our core areas of expertise. We offer interesting and important work, job security, some of the best and most flexible benefits you'll find in the IC, and salaries so strong that they'll likely surprise you.

Our Benefits

The DarkStar Group offers exceptional compensation and benefits: