A year ago I set out to make WordPress super scalable. Not that the WordPress.com guys aren’t doing a good enough job or anything, but just because… Because WordPress is a great platform to build shit on. It’s easy for non-technical people to grok. It’s powerful to build things with (think actions and filters). But it’s built on PHP, and it’s slow, CPU hungry, and RAM hungry. How do you scale it? I’ve found the answer …
The Docker Way
Docker is an infrastructure tool. It allows you to build homogeneous nodes in a cluster and deploy (almost) anything to them. There are two catches when it comes to WordPress and Docker. The first: Docker containers are ephemeral by nature, so whatever runs in them needs to be immutable. The other catch: WordPress is effing mutable as F.ck.
Although, it’s not specifically WordPress that makes WordPress mutable. While WordPress does write some things to the filesystem (like .htaccess files and uploads), it is the plugins that are bad. While engineering a solution was difficult, implementing it will be 100x the effort.
Making the WordPress file structure stay immutable isn’t too difficult though, not compared to making the surrounding infrastructure (Memcached, MySQL, etc.) behave in an ephemeral environment. How does that work?
Basically, the solution (if you choose to deploy these things with Docker) is to deploy a ton of them. Make everything as redundant and available as possible. That way, if a node fails, you won’t lose all your data. That way, if your orchestration decides to move things around, your site won’t go down. Which brings up a good point: how do you orchestrate all these things?
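To make the "deploy a ton of them" idea a bit more concrete, here is a minimal sketch in Python. It is not what any real orchestrator does; it just spreads replicas round-robin across nodes and checks that every service survives the loss of any single node. The service names, replica counts, and node names are made up for illustration.

```python
from collections import defaultdict
from itertools import cycle


def spread_replicas(services, nodes):
    """Assign replicas to nodes round-robin.

    services: dict of service name -> replica count (hypothetical values).
    nodes: list of node names.
    Consecutive replicas of one service land on different nodes as long
    as the replica count does not exceed the number of nodes.
    """
    placement = defaultdict(list)  # node name -> list of service names
    node_cycle = cycle(nodes)
    for service, replicas in services.items():
        for _ in range(replicas):
            placement[next(node_cycle)].append(service)
    return dict(placement)


def survives_failure(placement, failed_node, services):
    """True if every service keeps at least one replica once failed_node is gone."""
    remaining = {
        service
        for node, hosted in placement.items()
        if node != failed_node
        for service in hosted
    }
    return all(service in remaining for service in services)


# Illustrative cluster: three nodes, a few replicas of each backing service.
services = {"mysql-galera": 3, "memcached": 2, "php-fpm": 3}
nodes = ["node-a", "node-b", "node-c"]
placement = spread_replicas(services, nodes)

for node in nodes:
    print(node, "down -> site still up:", survives_failure(placement, node, services))
```

With a single replica of anything, the same check fails for whichever node hosts it, which is the whole argument for redundancy.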
Find The Right Orchestra
It all comes down to orchestration. You don’t want to put memcached on a node that is starving for RAM. You don’t want to put php-fpm on a node starving for CPU, or MySQL on a full hard drive. The cool thing about Docker is that you can move things about; it’s like having a datacenter inside a datacenter…
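Here is a rough sketch, in Python, of the kind of placement decision described above: filter out nodes that can't satisfy a workload's RAM, CPU, or disk needs, then pick the one with the most headroom. The `Node` and `Workload` classes, the `pick_node` function, and all the numbers are hypothetical, and the sketch doesn't subtract resources after placing something, which any real scheduler would.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_ram_mb: int
    free_cpu: float   # free CPU cores
    free_disk_gb: int


@dataclass
class Workload:
    name: str
    ram_mb: int
    cpu: float
    disk_gb: int


def headroom(node, workload):
    """Smallest free/required ratio across the resources the workload needs."""
    ratios = []
    if workload.ram_mb:
        ratios.append(node.free_ram_mb / workload.ram_mb)
    if workload.cpu:
        ratios.append(node.free_cpu / workload.cpu)
    if workload.disk_gb:
        ratios.append(node.free_disk_gb / workload.disk_gb)
    return min(ratios) if ratios else float("inf")


def pick_node(workload, nodes):
    """Return the fitting node with the most headroom, or None if nothing fits."""
    candidates = [
        n for n in nodes
        if n.free_ram_mb >= workload.ram_mb
        and n.free_cpu >= workload.cpu
        and n.free_disk_gb >= workload.disk_gb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda n: headroom(n, workload))


# Made-up cluster: each node is short on a different resource.
nodes = [
    Node("ram-starved", free_ram_mb=256, free_cpu=4.0, free_disk_gb=100),
    Node("cpu-starved", free_ram_mb=8192, free_cpu=0.2, free_disk_gb=100),
    Node("disk-full", free_ram_mb=8192, free_cpu=4.0, free_disk_gb=1),
    Node("healthy", free_ram_mb=4096, free_cpu=2.0, free_disk_gb=50),
]

for workload in [
    Workload("memcached", ram_mb=1024, cpu=0.5, disk_gb=0),
    Workload("php-fpm", ram_mb=512, cpu=1.0, disk_gb=0),
    Workload("mysql", ram_mb=2048, cpu=1.0, disk_gb=20),
]:
    chosen = pick_node(workload, nodes)
    print(workload.name, "->", chosen.name if chosen else "nowhere to put it")
```

Memcached doesn't care about disk, so a node with a full hard drive is a fine home for it, while MySQL is steered away from that node entirely.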
And so, I set about learning how to build the stack inside of Docker. What do all the moving pieces look like? Well, they look like this:
It’s actually pretty simple. The hard part is all the supporting magic with dynamic IPs, dynamic nodes, etc., which ends up looking like this:
Alright, so that’s quite a bit of stuff… and a lot to manage as a human being. So, I started evaluating various platforms that can schedule and manage the placement of docker containers.
Mesos was one of the first ones I looked at. It was in beta at the time, and it looked promising. It came with too much that I didn’t need, and I had a lot of trouble getting it (and keeping it) running. Things are looking much better over there now. For a larger company, it is perfect. For just me, it is a bit overkill. I loved the fact that it can manage more than just Docker workloads, and Chronos is an amazing tool.
Kube is sooooo low level. I’ll probably end up using it, but I haven’t had the need quite yet. It’s an amazing infrastructure though. With it being so low level, it still needs help from one of the other players mentioned here.
Deis is brilliant, especially if you like the Heroku workflow. It’s 100% compatible with Heroku buildpacks and can do some really awesome things. I left it behind because of what it couldn’t do (which may no longer be the case): host backing services like MySQL (Galera) and memcached.
To put it bluntly, it’s a leaky abstraction on top of kubernetes. I’m a fan, but it just wasn’t a good fit for the workload I have.
Nope. I was on AWS when it went down in 2012 … never again will I have cloud vendor lock-in. No sir. Not to mention you can’t run it locally, so how do you trust that it will work when you deploy? The beauty of Docker containers is that you can run them anywhere, so too should I be able to run the scheduler! Left this one behind on principle alone.
Docker bought Tutum fairly recently. I’ve had to fork his containers and modify them so many times to be half-way secure and fix bugs that I don’t know why I thought about trusting his orchestration tool. Well, I did. It wasn’t half-bad. It was a bit buggy, and there’s no multi-user aspect at all. A good portion isn’t open source (that I could easily find). I still hang out on the Tutum Slack channel…
Rancher is a leaky abstraction on top of docker/compose, which isn’t a bad thing to be a leaky abstraction over. They even built their own OS on top of Docker … that’s pretty cool. IMHO, this is the best one I’ve used thus far. They understand the idea of persistent storage, using convoy. If you can run it in Docker, you can run it on Rancher. Thus, you can run Kubernetes if you’re up for it. They’ve got a great community of images, load balancers are built in … the whole thing makes sense. These guys came to the game late, so they learned from all the others.
Where to Go From Here
Over the last couple of weeks, I’ve been working on clam to create the basic building blocks of hosting WordPress. Over the next couple of weeks, I’ll be aiming for building a catalog of services that will be “actually useful” to host WordPress. Then, I’ll move this here blog over to it (as well as a few other sites I host) … and then life might just get a bit interesting…
But seriously, why Docker?
Docker is a great tool that allows super fast iteration. Instead of running about and reconfiguring dozens of web servers to run PHP 7.0, I changed one line of code in a Dockerfile. This is time saved, which is likely equal to money saved… When the costs have to be low, getting things done expediently is the name of the game. Not to mention I have revision history of those images, often with a note as to why it changed. That’s ultimately why I chose Docker…
Until next time,