Pragmatic Pineapple 🍍

Building a High-Speed, Scalable, and Fault-Tolerant System A/B Testing Middleware on The Edge


Here’s the outline for the slides and what will be shown in the presentation:

Intro about me and the talk (1-3 slides)

The problem

  • A/B testing
    • What is A/B testing in principle
  • Architecture of the system
    • The previous solution at the organization
  • Previous (slow & heavy) solution
    • Why the previous solution motivated us to look for an alternative
  • Future (fast & lean) solution
    • New solution involving Edge Computing
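
To make "A/B testing in principle" concrete, here is a minimal sketch of deterministic variant assignment. It is an illustration only, not the talk's actual implementation: the names `fnv1a` and `assignVariant`, the two variant labels, and the 50/50 default split are all assumptions. The idea is that hashing the same user and experiment always lands in the same bucket, so a user keeps seeing the same variant without any stored state.

```typescript
// Hypothetical variant labels; the source does not name them.
type Variant = "control" | "treatment";

// FNV-1a 32-bit hash. Any stable string hash would work here.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Maps (experiment, user) to a bucket in [0, 1) and compares it
// against the share of traffic that should see the treatment.
function assignVariant(
  userId: string,
  experiment: string,
  treatmentShare: number = 0.5
): Variant {
  const bucket = fnv1a(`${experiment}:${userId}`) / 0x100000000;
  return bucket < treatmentShare ? "treatment" : "control";
}
```

Calling `assignVariant("user-42", "new-checkout")` repeatedly returns the same variant, which is the property that makes the decision safe to compute close to the user.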

The solution (happy path) V1

  • Cloudflare Worker
    • Cloudflare Edge network and their Workers
  • A/B test decision made in the cloud and stored in the header
    • Using a 3rd-party tool for A/B testing and how the decision happens in the cloud near the end user
  • Origin (backend)
    • Which services are listening and expecting the A/B decision
  • Edge Cache
    • Everything had to be cached properly on the CDN
  • Testing everything
    • How to test that the whole system and its services are working
    • A tool called Artillery for high-load testing
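
The happy path above can be sketched as a small pure function, written here without any Cloudflare-specific APIs so it runs anywhere. Everything named in it is an assumption for illustration: the header name `x-ab-variant`, the `decorateRequest` helper, the cache key format, and the synchronous `decide` callback that stands in for the 3rd-party tool. It shows two things from the outline: the decision travels to the origin in a header, and the cache key includes the variant so cached responses for different variants never mix.

```typescript
// Hypothetical header name; the source does not say which header was used.
const VARIANT_HEADER = "x-ab-variant";

// Stand-in for the 3rd-party A/B testing tool's decision call.
type Decide = (userId: string) => string;

interface EdgeDecision {
  // Headers to forward to the origin, including the A/B decision.
  originHeaders: Record<string, string>;
  // Key under which the edge cache would store the response.
  cacheKey: string;
}

function decorateRequest(
  url: string,
  headers: Record<string, string>,
  userId: string,
  decide: Decide
): EdgeDecision {
  const variant = decide(userId);

  // Copy the incoming headers instead of mutating them.
  const originHeaders: Record<string, string> = {
    ...headers,
    [VARIANT_HEADER]: variant,
  };

  // One cache entry per (URL, variant) pair.
  const cacheKey = `${url}::${variant}`;

  return { originHeaders, cacheKey };
}
```

In a real Worker, the returned headers would be applied to the request sent to the origin, and the key would drive the cache lookup. Keeping the logic in a plain function like this also makes it easy to unit test before load testing the full system.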

The solution (when things go left) V2

  • What happens if the 3rd-party A/B testing service is down?
    • We need to store events and resubmit them when the service comes back online
    • How to store events?
  • Options for storing user events - KV, Durable Objects, Kafka, Amazon SQS
  • Durable Objects
    • A feature from Cloudflare that introduces distributed computing/programming
  • Emptying the queue from Durable Object
  • Distributed Computing with Durable Object Groups
  • Testing everything (again)
    • How we simulated 1000 req/s from our machines
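
As a rough sketch of "emptying the queue", the class below keeps events in memory and resubmits them in order. It is not a Durable Object and does not use Cloudflare's storage API; `EventQueue`, `TrackedEvent`, and the `send` callback are hypothetical names standing in for the state a Durable Object would hold and for the call to the 3rd-party service. The point it illustrates is the draining rule: stop at the first failure and leave the remaining events queued, so nothing is lost while the service is still down.

```typescript
// Hypothetical event shape; the source does not describe the payload.
interface TrackedEvent {
  id: string;
  payload: string;
}

// In-memory stand-in for the queue a Durable Object would own.
class EventQueue {
  private events: TrackedEvent[] = [];

  enqueue(event: TrackedEvent): void {
    this.events.push(event);
  }

  get size(): number {
    return this.events.length;
  }

  // Resubmits events oldest first. `send` resolves to true on success.
  // Returns how many events were delivered in this pass.
  async drain(send: (event: TrackedEvent) => Promise<boolean>): Promise<number> {
    let delivered = 0;
    while (this.events.length > 0) {
      const next = this.events[0];
      if (next === undefined) break;
      const ok = await send(next);
      if (!ok) break; // service still down: keep the rest for later
      this.events.shift();
      delivered++;
    }
    return delivered;
  }
}
```

A later `drain` call picks up exactly where the previous one stopped, which is what makes retrying on a timer straightforward.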

The solution (common sense) V1.1

  • Cloudflare Worker does the main job
  • If the 3rd-party service is down, save events to a Durable Object and retry later
  • Re-estimate the cost of such a fault-tolerant system and put a limit on how tolerant it is
  • Backup plan for future development
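
One way to read "put a limit on how tolerant it is" is a capped backlog: try the 3rd-party service first, fall back to storing the event, and start dropping events once the backlog is full. The sketch below assumes that reading. `trackWithFallback`, the `Outcome` labels, and `maxBacklog` are hypothetical, and a plain array stands in for the Durable Object.

```typescript
// Hypothetical event shape; the source does not describe the payload.
interface TrackedEvent {
  id: string;
  payload: string;
}

type Outcome = "sent" | "queued" | "dropped";

async function trackWithFallback(
  event: TrackedEvent,
  send: (event: TrackedEvent) => Promise<void>, // rejects when the service is down
  backlog: TrackedEvent[],                      // stand-in for Durable Object state
  maxBacklog: number                            // the tolerance limit
): Promise<Outcome> {
  try {
    await send(event);
    return "sent";
  } catch {
    // The service is unavailable: keep the event only if there is room.
    if (backlog.length >= maxBacklog) {
      return "dropped";
    }
    backlog.push(event);
    return "queued";
  }
}
```

The value of `maxBacklog` is the business decision: it bounds both the storage cost and how long an outage the system can absorb before losing events.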

Takeaways

  • Speed and Scalability on Edge
  • Fault tolerance and where to put a limit - figure out what you are building and for whom, and talk to management and the people responsible for business decisions
  • Have a backup plan
  • Feel free to experiment
  • Advocate for common sense (even when it is not that common)


Written by Nikola Đuza who helps developers improve their productivity by sharing pragmatic advice & applicable knowledge on JavaScript and Ruby. You can connect with him on Twitter.


© 2026 Nikola Đuza