Boing Boing 3.0, and Ask the Sysadmin!


#1

I thought I’d start a thread to discuss the infrastructure at Boing Boing, current and future, and to answer any questions about running a pretty high-profile blog. I’ve been threatening to do this for the Boingers as posts, but I never expected to find much of an audience that way. This is a better way to do so, IMHO.

For some background: I took over the hosting of Boing Boing on Halloween, 2003. I had already been hosting Cory’s personal site for some time before this, and the Boingers were in a pinch, with the site down for several days. I took my personal desktop, a white-box P233MMX, down to my friend’s cage at the carrier hotel at 151 Front Street West in Toronto, and the rest is history.

As for my (personal) background - I’ve been a sysadmin since 1994, and managing large-scale infrastructure since 2001, where I designed the infrastructure for First Data’s first internet-enabled payment terminals (on Linux, which was a bit of a maverick move at the time - my boss made me put Solaris systems in front of them at first until they were actually proven to be faster). In the years since I’ve worked as the Director of Technology Operations for what was once the fourth largest online advertising network, as well as a stint as DTO for the Wikimedia Foundation. I’ve managed some big networks :slight_smile:

Anyway, ask me your questions! I’m going to add two more posts below, one about the current BB infrastructure, and one about the new setup, which I’ve dubbed “BB 3.0”, and keep this as a running journal of the transition.


#2

Current setup:

The current Boing Boing runs on six servers, as follows:

  • 3 web front-end servers handling dynamic calls to Wordpress
  • 1 admin server handling the admin interface for Wordpress, and acting as origin for media.boingboing.net
  • 2 master-master replicated MySQL 5.6 servers for the database

The servers are currently managed hosting boxes from Priority Colo in Toronto, who have been awesome partners of ours for years now. All the servers run Red Hat Enterprise Linux 6.

Sync between the frontends and the admin server is handled through a mess of rsync scripts and manual update scripts our devs use to keep things in lockstep. At the time I set this up there wasn’t any good software-based sync setup that didn’t risk being a single point of failure.
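
To give a rough idea of what that kind of rsync push looks like, here is a minimal sketch. It is not the actual Boing Boing tooling: the hostnames, the paths, and the function names `build_rsync_commands` and `push` are all made up for illustration. It only builds one rsync command per front-end, and it defaults to a dry run so nothing is copied or deleted by accident.

```python
import subprocess
from typing import List


def build_rsync_commands(source_dir: str, hosts: List[str], dest_dir: str,
                         dry_run: bool = True) -> List[List[str]]:
    """Return one rsync argv per front-end host (hypothetical helper)."""
    # A trailing slash tells rsync to copy the directory's contents,
    # not the directory itself.
    src = source_dir.rstrip("/") + "/"
    commands = []
    for host in hosts:
        # -a: archive mode, -z: compress in transit,
        # --delete: remove files on the target that are gone from the source.
        cmd = ["rsync", "-az", "--delete"]
        if dry_run:
            cmd.append("--dry-run")
        cmd += [src, f"{host}:{dest_dir}"]
        commands.append(cmd)
    return commands


def push(commands: List[List[str]]) -> None:
    """Run each command in turn, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hostnames and paths below are placeholders, not real BB servers.
    frontends = ["web1.example.net", "web2.example.net", "web3.example.net"]
    for cmd in build_rsync_commands("/var/www/wordpress", frontends,
                                    "/var/www/wordpress/"):
        print(" ".join(cmd))
```

Running the front-ends one after another like this is simple, but it also shows the weakness: the admin server is the only source of truth, and a failed push leaves the front-ends out of step until someone notices.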

Boing Boing is fronted by Fastly as our CDN. They are seriously awesome and offer control over content right down to the Varnish VCL level. I’m a huge fan of their setup and design.

The current system uses Monit for daemon restarts and outage reporting, Munin for systems analytics and long-term trend analysis (including mysql), and a hodgepodge of shell scripts for a backup infrastructure (as well as Vaultpress for an offsite solution which I highly recommend).

HTTPS and load balancing are handled using F5 LTMs.


#3

Boing Boing 3.0

The new Boing Boing infrastructure is going to implement several new advancements that have come up since 2008, which is when I built the existing setup. Most of the gear was left over from a migration contract I handled for Federated Media, where they paid me partially in equipment, which leads to some strange architecture decisions (like 15K drives instead of SSDs!).

This is still a work in progress!

Servers:

  • 6 HP DL360p G7 servers, dual-hexa-core Xeons, 64GB RAM, 6 15k drives in RAID10, 1 hot spare, configured as follows:
  • 2 web front-ends with 64GB RAM, 6 15k drives in RAID10, 1 hot spare
  • 1 admin server (for SSH Agent Forwarding, as well as Wordpress admin interface for the Boingers)
  • 1 “tools” server for monitoring, backup, misc admin functions
  • 2 DB servers. Unlike the rest, they’re configured with 4 SSDs and 128GB RAM.

Software:

  • We’ll be running Red Hat Enterprise Linux 7, including its pretty awesome support for docker-latest
  • Database will be Percona Server 5.7
  • Monitoring via Icinga 2, munin, and PMM for the database
  • Backups using Duplicity (offsite with Vaultpress)
  • Config management for the whole thing using the super-awesome Ansible
  • Shared storage using either GlusterFS or Ceph, I haven’t decided yet
  • Encryption via LUKS and encrypted OwnCloud (Can’t wait for client-side encryption for the latter!)
  • LVS/HAProxy load-balancing with Let’s Encrypt automated HTTPS certs for origins

CDN setup - We’ll continue to use Fastly as our reverse proxy CDN talking to our origins, however we’re adding two more layers to the mix:

  • Wordpress Photon for image caching, and
  • Google has graciously offered us access to Project Shield for DDoS protection, in front of everything

Whew! So yeah, that’s my current project. :slight_smile:


#4

So what’s it run on now?


#5

I’ve updated the post above with that answer! :smiley:


#6

Only 6 boxes? Wow.


#7

Is a hot-dog a sandwich?

(OK, I got no serious questions but, it is seriously cool to see the nuts-and-bolts behind the scene, even if I don’t have a clue about it. Thanks!)


#8

Pics?


#9

For sure. Next time I’m down at Priority Colo I’ll nab some shots of both the old setup and the new :slight_smile:


#10

Thank you for your expert guidance and advice; you are a true wolf/hunter, oh orenwolf. bow

P.S. I have relied on orenwolf in the past, and I have always been guided true…


#11

What is your password?


#12

Boinboingadmin/Boingboingadmin1 of course.

Hey @orenwolf quick question, I’ve been running 5.5 for a long time now (security patches are applied of course). In your opinion if performance is acceptable, are there any reasons to move to 5.6+, or lessons learned?

Also, just as your opinion (not necessarily the opinion of BB), how was the move to discourse? That couldn’t have been easy.


#13

So, first of all, I heartily recommend running Percona’s mysql builds - they’re FLOSS, and the Percona team is half the folks who split off from mysql AB when Oracle bought mysql. They’ve incorporated patches from google and facebook as well as their own tweaks, and they provide a ton of really useful tools to the community (including percona toolkit, which used to be called maatkit until they hired its author!)

5.6 has some features that matter in big installations, and don’t really matter for moderate sized installs. Things like GTIDs (global transaction identifiers) so that if you have a cascading chain of servers in master-slave-master loops (oh the fun), you can track specific transactions down, as well as the PERFORMANCE_SCHEMA virtual database, which many tools make use of for statistical analysis of what’s going on in your database. Percona, on top of that, added NUMA handling that was a lifesaver if you have systems that are basically huge gobs of RAM just running mysql, but have two processors with their own RAM banks. The new BB mysql servers have 128GB of RAM each and were annoying the crap out of me with swapping (!) in that situation until Google’s backported NUMA patches were added. (Good news for you is, Percona even backported it to 5.5!)
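
For anyone who hasn’t met them, the transaction identifiers mentioned above (GTIDs, global transaction identifiers) are just a server UUID plus a transaction number, and a server reports what it has executed as a set like `uuid:1-5:11-18`. Here is a small sketch of how you could check whether a given transaction is in such a set. The helper names `parse_gtid_set` and `contains` are my own for illustration, not part of any mysql or Percona tool, and the UUID is only an example value.

```python
from typing import Dict, List, Tuple


def parse_gtid_set(gtid_set: str) -> Dict[str, List[Tuple[int, int]]]:
    """Map each server UUID to its list of inclusive (start, end) ranges."""
    result: Dict[str, List[Tuple[int, int]]] = {}
    # Entries for different source servers are separated by commas.
    for part in gtid_set.split(","):
        part = part.strip()
        if not part:
            continue
        # The UUID comes first; every following field is an interval.
        uuid, *intervals = part.split(":")
        ranges = result.setdefault(uuid.lower(), [])
        for interval in intervals:
            start, _, end = interval.partition("-")
            # A bare number such as "7" is the single-transaction range 7-7.
            ranges.append((int(start), int(end) if end else int(start)))
    return result


def contains(gtid_set: str, gtid: str) -> bool:
    """True if the single GTID 'uuid:number' falls inside the set."""
    uuid, _, number = gtid.partition(":")
    n = int(number)
    ranges = parse_gtid_set(gtid_set).get(uuid.lower(), [])
    return any(lo <= n <= hi for lo, hi in ranges)


if __name__ == "__main__":
    executed = "3E11FA47-71CA-11E1-9E33-C80AA9429562:1-5:11-18"
    print(contains(executed, "3e11fa47-71ca-11e1-9e33-c80aa9429562:12"))
    print(contains(executed, "3e11fa47-71ca-11e1-9e33-c80aa9429562:7"))
```

Because the UUID identifies the server where the transaction originated, the same check works no matter how many hops the transaction took through the replication chain.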

So, if 5.5 is working for you, stick with it. BB 3.0 is actually going with Percona’s build of mysql 5.7 mainly so I can play with TokuDB and its online backup tools, but one of the awesome additions to mysql 5.7 is the JSON datatype; lacking it was previously a big reason people maintained both mysql and mongodb instances.


#14

Fastly, and the HTTPS offloading to the Big-IP takes a ton of load off. So much so that in fact for BB 3.0, I’m only building two web frontends instead of three (but more on that later!)


#15

I’ve been thinking about going down the percona route. I just have this, I dunno, old school mentality: if your database can’t handle billions of records and reasonable joins with 4gb of RAM, something else is the problem.

Alternatively, 128gb of RAM costs as much as a pumpkin spice latte at Starbucks.

The latest master I set up was 32GB with eight cores, and it runs white hot. It is the auditing I think that is killing me, and percona appears to have better tools for figuring out what exactly is eating everything (explain, top, slow query logs, hell even strace aren’t optimal), so I think I’ll look at percona this weekend.

Thanks!


#16

I can tell you that there have been at least two instances in recent memory for me where installing the Percona release of 5.5 (5.6 wasn’t general release yet) literally saved the DB stack from crumbling under the transaction load. A big part of why is the Google and Facebook patches. 5.6 incorporates most of those changes into mainline, which is why the 5.6 percona release focuses on other things, but they’re a group of super smart folks. Their performance blog is always stuffed with lots of amazing info, often from Peter Zaitsev himself. It’s great to see the high-performance-dev-turned-CEO still getting his hands dirty with code, which is another reason I think these guys are awesome. :slight_smile:

That being said, though, I don’t want to take away from the other mysql dev spinoff either, the MariaDB folk are also awesome. It’s a great time to be a mysql DBA, really. :smiley:


#17

Oh FSM it is good to hear one other sane person say that.

“Oh, but we will publish all of our info as Kafka topics, who needs a database?”


#18

Updated the third post with information on BB 3.0! Current status is I’ve installed the servers and base OS, and begun the Ansible configs for the servers themselves. My goal is to have everything moved over by end of March, we’ll see how it goes.


#19

#20

The funniest part for me is that during key ceremonies (yes, they are called that) everyone just looked bored.

If I am ever a key officer again, bowing, shaking hands, pomp, and circumstance will be mandatory.

(Oh, and the keys need to be Degaussed by The Hawk)