Research & Development

SRE (Site Reliability Engineering)

- Continuously reduce the risk in system operation and maintenance, and handle emergency incidents around the clock (24/7);
- Communicate with the project team and R&D, give regular feedback on problems in the business operating environment, and drive improvements;
- Drive iterative upgrades of systems and processes based on user feedback and business development needs, and respond quickly to business requirements;
- Assist in the planning, design, implementation and optimization of the automated operation and maintenance platform.

● A strong passion for automation and repeatable processes
● Hands-on experience with containerization, including Kubernetes (EKS or similar), Docker and other container technologies
● Deep exposure to at least one of the following cloud providers: AWS, GCP or Alibaba Cloud
● Experience working with a mainstream programming language such as Java, Go or Python
● Broad experience with modern CI/CD pipelines (GitLab, Jenkins, etc.)
● Demonstrated ability to declaratively automate cloud-based IaaS/PaaS deployments and applications using modern GitOps and DevOps techniques and technologies
● A fervent enthusiasm for infrastructure-as-code
● A love for operational metrics, including MTTD and MTTR, and for the capabilities and practices that allow them to be continually improved
● Excellent understanding of OS, platform and network security practices, patterns and frameworks
● Knowledge of network topology and infrastructure architectures on Linux-based platforms
● Data-driven, with a passion for observability
