SLATE Testing At Uber


  • SLATE – Short-Lived Application Testing Environments
  • Massively successful companies such as Facebook (Meta) and Uber have sophisticated, non-traditional testing strategies that “work”.

How Uber's approach compares with the traditional approaches:

  • Uber’s testing strategy leans more heavily on “online tests” than on “offline tests”.
  • A separate school of thought holds that writing heavy unit tests is of little use, because they do not cover as many real-world cases. Uber runs a microservice architecture with many service dependencies.
  • To test a new service version, the behaviour of its upstream and downstream dependencies needs to be prepared ahead of time and simulated as closely to the production environment as possible.
    • This way the behaviours are already tested against production configurations, which gives a higher level of confidence in deployments.

Testing Requirements:

  • Up-to-date dependencies – they should be as similar to the production environment as possible. Keeping the staging environment similar to production has always been a big challenge, because the data and config preparation differ.
  • Sharing environments between developers – there are many developers, each with their own changes. Having every developer claim dedicated resources to test those changes is not feasible.
  • Live production dependencies separated – the ability to isolate certain dependencies without affecting real users.
  • Cost – reducing the cost of redundant infrastructure, as spinning up services for testing can be costly for a big company like Uber.

SLATE

  • SLATE spins up a testing environment isolated at the Git branch or Git ref level.
  • Multiple SLATE services can be deployed at the same time.
  • Each kind of service can have only one SLATE instance deployed at a time.
  • SLATE environments automatically reclaim their resources after 2 days.
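
The rules above can be sketched in a few lines of Python. This is only an illustration, not Uber's implementation: the class and function names are hypothetical, and the "one instance per service" rule is interpreted here as applying within a single environment.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timedelta

TTL = timedelta(days=2)  # from the notes: resources are reclaimed after 2 days


@dataclass
class SlateEnvironment:
    """Hypothetical model of one test environment (not Uber's actual API)."""

    name: str
    created_at: datetime
    # service name -> Git ref of the single instance deployed for that service
    instances: dict[str, str] = field(default_factory=dict)

    def deploy(self, service: str, git_ref: str) -> None:
        # Assumption: a service may have at most one instance per environment.
        if service in self.instances:
            raise ValueError(f"{service} already has an instance in {self.name}")
        self.instances[service] = git_ref

    def is_expired(self, now: datetime) -> bool:
        # An environment is due for reclamation once its TTL has elapsed.
        return now - self.created_at >= TTL


def reclaim(envs: list[SlateEnvironment], now: datetime) -> list[SlateEnvironment]:
    """Return only the environments that are still within their TTL."""
    return [env for env in envs if not env.is_expired(now)]
```

Several different services can be deployed into the same environment, a second deploy of the same service is rejected, and calling reclaim periodically drops any environment older than two days.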

Tools to support SLATE testing

  • Cadence, Uber's workflow orchestration engine, is used. SLATE has a CLI tool to create environments, along with many other parts, including a SLATE server that the CLI interacts with.
  • Peloton, Uber's resource scheduler, is used to manage resources.
  • Uber’s infrastructure sounds similar to Kubernetes. Request routing is configured via YAML files, and service mesh capability is required.
  • Test accounts are created so that real users are not affected.
  • For observability, Jaeger baggage is used to trace requests end to end.
  • Developers have access to a controller that determines how traffic is routed.
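
As a rough sketch of how such routing could work: assume each test request carries a baggage entry naming its test environment, and a routing table maps (environment, service) pairs to test instances. Both the baggage key "slate-env" and the table layout are hypothetical; the notes above do not specify how the routing rules are actually expressed.

```python
from __future__ import annotations

PRODUCTION = "production"


def resolve_target(
    service: str,
    baggage: dict[str, str],
    overrides: dict[tuple[str, str], str],
) -> str:
    """Pick the instance that should receive a request for `service`."""
    env = baggage.get("slate-env")  # hypothetical baggage key
    if env is None:
        # Ordinary traffic never reaches a test instance.
        return f"{service}.{PRODUCTION}"
    # Test traffic goes to the test instance if one is deployed for this
    # service; otherwise it falls back to the production dependency.
    return overrides.get((env, service), f"{service}.{PRODUCTION}")
```

The fallback is what keeps the cost low in this sketch: only the services under test need their own instances, and every other dependency is served by production.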

Takeaways:

  • SLATE – building short-lived test microservices and freeing up their resources once testing is done.
    • Cadence can be deployed with Kubernetes, and also has a Docker Compose deployment
    • Distributed tracing with Jaeger
    • Peloton for resource management
    • Config cache
  • Improved E2E testing experience and velocity for the company.
  • Staging might eventually be deprecated in favour of SLATE.
