Senior DevOps Engineer (Trading Platform)
The team responsible for Trading Infrastructure is currently developing a cutting-edge Trading Platform that supports multi-asset trading with a strong emphasis on low-latency execution, robust risk controls, and seamless integration across various workflows including trading, risk, operations, and finance.
The platform's architecture is modular, comprising key components like market data feeds, order gateways, execution algorithms, risk engines, UI dashboards, middle office reconciliation, and account infrastructure. The design philosophy revolves around event-driven, deterministic system design, real-time observability, and high-level security.
Our stack includes Java for low-latency components, Python, a web UI built with React and AG Grid, Aeron, ClickHouse, and Kubernetes, along with modern CI/CD tools. The team focuses on automation, scalability, and performance, and uses AI-assisted development tools to improve productivity and quality.
Responsibilities:
- Design, set up, and manage scalable trading systems infrastructure encompassing CI/CD pipelines, observability tools, and various runtime environments.
- Administer and optimize databases such as AWS Aurora, PostgreSQL, and ClickHouse.
- Automate provisioning and configuration using tools like Ansible and Terraform for EC2 instances and associated resources.
- Ensure secure and stable environments through best practices in VPC design, IAM governance, and secret management.
- Establish system metrics and alerts through tools like Prometheus, Grafana, and Loki.
- Uphold GitHub repository standards and branching strategies across development teams.
- Monitor infrastructure usage and keep costs under control through continuous optimization and cost-control strategies across AWS and containerized deployments.
- Implement backup and disaster recovery strategies for critical systems.
- Collaborate with engineering teams on containerization efforts and on optimizing runtime performance.
- Assess and integrate AI/ML tools to drive automation, diagnostics, and operational efficiency improvements.
Requirements:
- Minimum of 5 years' DevOps or Site Reliability Engineering experience with high-availability, real-time systems.
- Hands-on experience with AWS services such as EC2, VPC, IAM, CloudWatch, RDS, and cost monitoring tools.
- Proficient in provisioning and configuration management using Ansible; familiarity with Terraform is beneficial.
- Strong skills in ClickHouse and PostgreSQL; knowledge of kdb+ or InfluxDB is a plus.
- Solid scripting abilities in Python, Bash, or equivalent.
- Practical experience with Docker, Kubernetes, Helm, and deployment automation.
- Familiarity with monitoring and logging tools; experience with Prometheus and Grafana is preferred.
- Security-focused with expertise in IAM, encryption, and secure system design.
- Capable of monitoring and optimizing computing resources to maintain performance within budget limits.
- Comfortable utilizing AI tools for enhancing automation, diagnostics, or efficiency gains.
Communication & Collaboration:
- Proficient in English (spoken and written); proficiency in other languages is advantageous but not mandatory.
- Experienced in working within a global team context with colleagues across different regions.
- Strong communication skills with the ability to engage effectively across various levels of the organization.
- Capable of communicating complex technical concepts to both technical and non-technical audiences and coordinating effectively across diverse teams.
Life at the organization:
- Room to think big and explore new opportunities within a supportive and talented team.
- Proactive working environment that empowers innovative solutions.
- Focus on internal growth and skill development for personal and professional advancement.
- Supportive work culture emphasizing collaboration and mutual assistance.
- Unified team ethos working towards a common goal.
- Flexible work arrangements and opportunities for career progression.
Benefits:
- Competitive salary and attractive annual leave entitlements including birthday leave.
- Flexibility in work hours and hybrid or remote work setups.
- Internal mobility program offering diverse career avenues.
- Work perks such as a company Visa card upon joining.
