The context: an unprecedented pandemic
In Australia, the pandemic resulted in our government agency clients becoming the go-to source of critical, daily information.
In a short time, this led to a massive increase in web traffic, amounting to approximately 360 million requests per day. Meanwhile, error rates remained very low, at less than 0.01%.
So, how do you ensure your websites continue to perform during high-traffic events, whether in a pandemic or another crisis?
Through our hosting platform, Skpr, and its underlying AWS infrastructure, we learned three key lessons that you can apply to safeguard your environment through prolonged traffic spikes.
- A solid caching strategy.
  - Maintain an efficient cache with a hit ratio above 75% (sites on our platform reached 99% during the pandemic).
  - Serve cached items for longer and reuse them.
  - Use CloudFront to protect the user experience.
  - Reduce cache key variation.
  - Run automated tests for critical pages using CI tools.
  - Use an external in-memory cache to avoid database deadlocks.
- Scaling out on demand.
  - Manage a Kubernetes cluster and other AWS-managed services.
  - Use Kubernetes horizontal pod autoscaling.
  - Autoscale the cluster, using CloudFront, Elastic Load Balancing and Elastic File System.
  - Decouple state from the application.
  - Fine-tune PHP resources to cap the maximum number of processes.
- A protected database.
  - Use Amazon Aurora MySQL for smoother, faster scaling and infrastructure changes.
  - Use Amazon RDS Proxy to pool connections, so new connections are established faster.
  - Use k6 for load testing.
  - Use New Relic to find and fix bottlenecks.
  - Use CloudWatch dashboards to monitor the impact on infrastructure.
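To make the "reduce cache key variation" point more concrete, here is a minimal Python sketch of the idea: URLs that render the same page should collapse to a single cache key, so the cache stores one copy and the hit ratio goes up. This is an illustration only, not how Skpr or CloudFront implements it. The `IGNORED_PARAMS` list and both function names are hypothetical, and in practice this kind of rule usually lives in the CDN's cache policy rather than in application code.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameters that do not change the rendered page.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid", "gclid"}


def normalise_cache_key(url: str) -> str:
    """Return a canonical form of `url` to use as a cache key."""
    parts = urlsplit(url)
    # Keep only the parameters that affect the response, in a stable order.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    # Lowercase the host and drop the fragment, which is never sent to the server.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path or "/", urlencode(query), "")
    )


def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of requests served from cache (0.0 when there is no traffic)."""
    total = hits + misses
    return hits / total if total else 0.0


if __name__ == "__main__":
    a = normalise_cache_key("https://Example.gov.au/news?utm_source=x&b=2&a=1")
    b = normalise_cache_key("https://example.gov.au/news?a=1&b=2&fbclid=abc#top")
    print(a == b)  # True: both requests share one cache entry
    print(hit_ratio(hits=99, misses=1) > 0.75)  # True: above the 75% target
```

Without normalisation, the two example URLs would be cached separately and each would cost a request to the origin.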
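For readers new to horizontal pod autoscaling, the sketch below shows the scaling rule described in the Kubernetes documentation: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. It is a simplified model for building intuition, not our production configuration. The function name, the replica bounds and the example numbers are assumptions chosen for illustration; the 0.1 tolerance mirrors the documented Kubernetes default.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Simplified model of the horizontal pod autoscaler's scaling rule."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Skip scaling when the metric is already close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (illustrative values by default).
    return max(min_replicas, min(max_replicas, wanted))


if __name__ == "__main__":
    # 4 pods averaging 90% CPU against a 60% target: scale out to 6.
    print(desired_replicas(4, current_metric=90, target_metric=60))
    # 10 pods averaging 30% CPU against a 60% target: scale in to 5.
    print(desired_replicas(10, current_metric=30, target_metric=60))
```

Because the rule is proportional, a sudden doubling of load asks for roughly double the pods, which is why decoupling state from the application matters: new pods must be able to serve traffic as soon as they start.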
The following video gives more technical detail on how Skpr provisions and manages this infrastructure to help our clients respond to unprecedented traffic spikes.