USSM scalability and general geekery

DMZ · April 18, 2007 at 8:41 pm · Filed Under Site information 

During the first inning, I got to see load go from 1 to 15 to about 60, and then stay there for four innings. At one point, I actually blocked new connections for a while until the load dropped, then let people back in — and it jumped back to 60 again.

As Dave says, there’s got to be a solution for this. But I haven’t found it yet. So if you’re interested in another one of these technical wonkery threads, read on.

Stuff tried:
– I tried caching via .htaccess on a previous day and didn’t see any benefit at all.
– We’ve got wp-cache turned on with standard settings.
– I’ve got MySQL query caching turned on; that worked really well, though I have no idea whether the values are optimized.
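For reference, the MySQL query cache knobs live in the [mysqld] section of my.cnf; the values below are illustrative starting points, not tuned recommendations:

```ini
# my.cnf, [mysqld] section -- query cache settings (illustrative values)
query_cache_type  = 1     # cache every cacheable SELECT
query_cache_size  = 32M   # total memory set aside for cached results
query_cache_limit = 1M    # skip any single result larger than this
```

You can sanity-check whether it’s earning its keep with SHOW STATUS LIKE 'Qcache%'; — a high Qcache_hits relative to Qcache_inserts is good, and a climbing Qcache_lowmem_prunes means the cache is too small.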

I had top and systat open and watching, fortunately. Before you note the 50% idle, I’ve discovered that’s a known bug – if you’ve got a CPU that supports hyperthreading and it’s not turned on, it reports 2x the actual ticks and won’t ever go past 50% usage even when it’s maxed.

Here’s the start of the spike:

last pid: 89071;  load averages:  6.50,  2.77,  1.60   up 119+00:35:37 22:26:29
231 processes: 29 running, 202 sleeping
CPU states: 44.7% user,  0.0% nice,  5.0% system,  0.2% interrupt, 50.1% idle
Mem: 859M Active, 833M Inact, 203M Wired, 87M Cache, 112M Buf, 20M Free
Swap: 4070M Total, 94M Used, 3977M Free, 2% Inuse, 64K In

  PID USERNAME    THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
  473 mysql       107  20    0 93872K 45360K kserel 0 188.2H  5.47% mysqld
88450 www           1  98    0 23696K 14056K RUN    0   0:11  2.25% httpd
89031 www           1   4    0 23352K 13696K sbwait 0   0:01  2.05% httpd
88441 www           1  98    0 23456K 13820K RUN    0   0:12  1.76% httpd
88472 www           1  98    0 23456K 13824K RUN    0   0:12  1.51% httpd
88707 www           1  97    0 23632K 14004K RUN    0   0:08  1.46% httpd
89015 www           1   4    0 23232K 13580K sbwait 0   0:01  1.46% httpd
88711 www           1   4    0 23576K 13948K sbwait 0   0:07  1.32% httpd
89040 www           1   4    0 23352K 13696K sbwait 0   0:00  1.30% httpd
89014 www           1   4    0 23624K 13972K sbwait 0   0:02  1.26% httpd
89008 www           1   4    0 23380K 13732K sbwait 0   0:01  1.20% httpd
89043 www           1  97    0 23676K 14000K RUN    0   0:00  1.18% httpd
89039 www           1   4    0 23688K 14000K sbwait 0   0:00  1.12% httpd
89022 www           1   4    0 23380K 13720K sbwait 0   0:01  1.10% httpd
89030 www           1   4    0 23340K 13696K sbwait 0   0:01  1.10% httpd
67008 www           1   4    0 25432K 15864K sbwait 0  69:58  1.07% httpd
89034 www           1   4    0 23268K 13616K sbwait 0   0:01  1.05% httpd

I think all those sbwaits are people on slow connections or whatever.

    1 users    Load 15.92  5.13  2.47                  Apr 18 22:26

Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER
        Tot   Share      Tot    Share    Free         in  out     in  out
Act 1142680    3708  1543904    15292  103800 count
All 2041832    7080159528636    29196         pages
                                                                 Interrupts
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt     78 cow    4056 total
    64  3  1212  1   1029 225333822  233  218 2133 210436 wire        14: ata
                                                  1137448 act      30 16: uhc
10.4%Sys   0.0%Intr 39.6%User  0.0%Nice 50.0%Idl   641776 inact       18: ata
|    |    |    |    |    |    |    |    |    |      57864 cache       23: ehc
=====>>>>>>>>>>>>>>>>>>>>                            2996 free     30 32: em0
                                                          daefr  1998 cpu0: time
Namei         Name-cache    Dir-cache                     prcfr  1998 cpu1: time
    Calls     hits    %     hits    %                   3 react
     1236     1236  100                                   pdwake
                                     1992 zfod            pdpgs
Disks   ad4                          1277 ozfod           intrn
KB/t   0.00                            64 %slo-z   114464 buf
tps       0                          2271 tfree        37 dirtybuf
MB/s   0.00                                        100000 desiredvnodes
% busy    0                                         35734 numvnodes
                                                    24737 freevnodes

Then here we are as the server really slows:

last pid: 89130;  load averages: 57.98, 24.99, 10.76   up 119+00:38:04 22:28:56
284 processes: 51 running, 233 sleeping
CPU states: 49.1% user,  0.0% nice,  0.7% system,  0.1% interrupt, 50.0% idle
Mem: 1446M Active, 288M Inact, 209M Wired, 57M Cache, 112M Buf, 3292K Free
Swap: 4070M Total, 94M Used, 3977M Free, 2% Inuse

  PID USERNAME    THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
88835 www           1  98    0 23724K 14056K RUN    0   0:04  0.88% httpd
89115 www           1   4    0 23280K 13312K sbwait 0   0:01  0.68% httpd
89033 www           1  98    0 23468K 13800K select 0   0:01  0.68% httpd
88705 www           1   4    0 23400K 13740K sbwait 0   0:08  0.68% httpd
89096 www           1   4    0 23728K 13676K sbwait 0   0:01  0.54% httpd
89024 www           1  98    0 23528K 13868K RUN    0   0:01  0.54% httpd
89084 www           1   4    0 23472K 13444K sbwait 0   0:01  0.49% httpd
89071 www           1   4    0 23504K 13820K sbwait 0   0:01  0.49% httpd
89068 www           1   4    0 23500K 13836K sbwait 0   0:01  0.49% httpd
88707 www           1  97    0 23632K 13980K RUN    0   0:08  0.49% httpd
88804 www           1   4    0 23560K 13872K sbwait 0   0:03  0.49% httpd
89119 www           1   4    0 23284K 13312K sbwait 0   0:01  0.44% httpd
89078 www           1   4    0 23280K 13324K sbwait 0   0:01  0.44% httpd
44773 www           1  97    0 25620K 15656K RUN    0 707:43  0.44% httpd
89059 www           1  98    0 23692K 14008K RUN    0   0:01  0.39% httpd
88605 www           1  97    0 23692K 14040K RUN    0   0:09  0.39% httpd
89118 www           1  97    0 23284K 13320K RUN    0   0:01  0.34% httpd
    1 users    Load 47.59 19.17  8.20                  Apr 18 22:28

Mem:KB    REAL            VIRTUAL                     VN PAGER  SWAP PAGER
        Tot   Share      Tot    Share    Free         in  out     in  out
Act 1463764    3900  1873252    15520   84592 count
All 2041240    7380159858352    29680         pages
                                                                 Interrupts
Proc:r  p  d  s  w    Csw  Trp  Sys  Int  Sof  Flt        cow    4139 total
    72    25183  2   1340  28332608  336  174  133 213620 wire        14: ata
                                                  1468448 act      65 16: uhc
 3.8%Sys   0.2%Intr 46.1%User  0.0%Nice 50.0%Idl   283864 inact     7 18: ata
|    |    |    |    |    |    |    |    |    |      81328 cache       23: ehc
==>>>>>>>>>>>>>>>>>>>>>>>                            3264 free     65 32: em0
                                                          daefr  2001 cpu0: time
Namei         Name-cache    Dir-cache                     prcfr  2001 cpu1: time
    Calls     hits    %     hits    %                     react
    15076    15069  100                                   pdwake
                                      119 zfod            pdpgs
Disks   ad4                           101 ozfod           intrn
KB/t  56.56                            84 %slo-z   114464 buf
tps       5                           196 tfree        52 dirtybuf
MB/s   0.26                                        100000 desiredvnodes
% busy    1                                         35740 numvnodes
                                                    24742 freevnodes

It looks like we’re getting killed just on raw CPU power and the number of connections, and not disk usage (though memory is sometimes tight).

In any event — suggestions, thoughts, ideas welcome.

Comments

38 Responses to “USSM scalability and general geekery”

  1. kenarneson on April 18th, 2007 8:57 pm

    Simple question: have you tried reducing the number of httpd processes?

    It sounds counterintuitive, but I’ve found on the Toaster that when traffic spikes up, I usually need to reduce the number of httpd processes, rather than increase them. Of course, mine is a totally different piece of software, but the same principle may apply: if you let the total CPU load get too high, none of the httpd processes get enough resources to do anything, and things crawl to a halt. Better to make the client wait for a connection than to let the process wait for CPU time.

  2. DMZ on April 18th, 2007 9:03 pm

    I had not considered that — how’d you go about doing that, just set it in httpd.conf?

  3. kenarneson on April 18th, 2007 9:23 pm

    Yeah, in httpd.conf. It depends which module you are running: usually it’s in the section, although if you’re running a threaded server, it might be the . Just cut all the numbers in half and see what happens when you restart httpd. You should see half the number of httpd threads when you run a ps command. If you don’t, it’s probably the other module you need to edit.

  4. kenarneson on April 18th, 2007 9:24 pm

    Oops, the tags got edited out. That should be:

    usually it’s in the IfModule prefork.c section

    although it might be the IfModule worker.c
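    For anyone following along, the directives Ken means look roughly like this (the numbers are illustrative, not recommendations; in Apache 1.3 these same directives just sit at the top level of httpd.conf without the IfModule wrapper):

```apacheconf
<IfModule prefork.c>
    StartServers         8
    MinSpareServers      5
    MaxSpareServers     20
    MaxClients          60      # halve whatever is here now and compare
    MaxRequestsPerChild  1000
</IfModule>
```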

  5. sidereal on April 18th, 2007 9:54 pm

    Before you note the 50% idle, I’ve discovered that’s a known bug

    Are you sure? You’re probably right, but another possibility is that you have a multiproc machine, but your kernel isn’t SMP, in which case apache won’t get access to the other proc.

    I know this was brought up in the previous thread about site performance, and I’m not sure if it went anywhere, but the long run solution is to never hit the database on a page load. There’s no reason to. There are no user-specific or otherwise custom features on the site. Except for the ‘Logged in as foo’ if you’re logged in, the page looks exactly the same to everyone that hits it, but you’re grinding through php and your db on every hit regardless.

    It would take about 30 lines of php to do the following:
    1) On hit, load the rendered article and comment data from a cachefile on disk if it exists. Otherwise, build it normally and then drop the rendered page (except possibly the comment form) to disk (using ob_start, ob_get_contents, etc).
    2) On comment, delete the cache file.
    3) Party

    I suspect your performance would go up at least 10x. If you’re interested you can pop me an email at sidereal@gmail.com and I can send along some code snippets. It’s code I’ve written numerous times.

  6. sidereal on April 18th, 2007 10:03 pm

    Whoops, I just found out wp-cache is supposed to do basically what I’m describing. But if it’s working, I can’t imagine how you’re having problems. Serving up a static file is a really fast operation for apache.

  7. DMZ on April 18th, 2007 10:11 pm

    wp-cache is there, it’s turned on, it seems to be working, and yet…

    I know. I wonder if it’s the “Logged in as DMZ. Logout »” line — that’s going to vary for every user, right? I wonder if that forces generation of a new version for each person/login.

  8. sidereal on April 18th, 2007 10:25 pm

    I don’t know much about wp-cache, but my guess would be that it drops the major portion of the page to disk, but allows for a few dynamic elements (like the ‘Logged in as’ line) and then reassembles them out the door. Not a completely static operation, but not slow either.

    Also,

    44773 www 1 97 0 25620K 15656K RUN 0 707:43 0.44% httpd

    Holy crap. That’s a lot of cpu time taken up by one (apparently really old) apache thread. Is it still running? Kill it for kicks.

  9. Jimmie the Geek on April 18th, 2007 10:38 pm

    Are you sure? You’re probably right, but another possibility is that you have a multiproc machine, but your kernel isn’t SMP, in which case apache won’t get access to the other proc.

    Yeah, the last time DMZ had a thread like this, I had him look at that very thing. They do have an SMP kernel, and they both show up in dmesg and /proc/cpuinfo.

    It’s gotta be some sort of Apache or WordPress tuning, but I’ve been out of the Open Source end of things for too long to be of much use anymore. 🙁

    Jimmie

  10. Jimmie the Geek on April 18th, 2007 10:43 pm

    DMX, have you taken a look at this?

    http://httpd.apache.org/docs/1.3/misc/perf-tuning.html

    Specifically the section on process creation.

    Jimmie

  11. DMZ on April 18th, 2007 10:44 pm

    Nah, it’s a single proc, I had this discussion with the dudes.

    If you search for “50% idle BSD” or something similar, you can find the bug described.

  12. scotje on April 18th, 2007 10:45 pm

    I saw this article the other day and thought of you Derek. 🙂

    Also, I notice you’re running Apache 1.3. Any reason you’re not running 2.2?

    What are your httpd.conf values for MinSpareServers and MaxSpareServers? How about MaxRequestsPerChild and KeepAliveTimeout?

    Lastly, are you running a PHP bytecode cache like Zend Optimizer?

  13. scotje on April 18th, 2007 10:49 pm

    Oh yeah, don’t know if you’ve read this guide.

  14. scotje on April 18th, 2007 10:56 pm

    Jimmie totally beat me to that one, didn’t he? 🙂

  15. Jimmie the Geek on April 18th, 2007 10:58 pm

    Yeah, but at least you didn’t call him DMX! 😉

    Good luck Derek. I gotta go feed the baby and then off to bed.

    Jimmie

  16. DMZ on April 18th, 2007 11:11 pm

    Also, I notice you’re running Apache 1.3. Any reason you’re not running 2.2?

    No clue. Are there performance advantages?

    Also, httpd.conf appears to not be in /etc, which is awesome

  17. scotje on April 18th, 2007 11:17 pm

    You can try:

    find / -name "httpd.conf"

    What OS/distro is it? I think you said BSD of some sort?

  18. scotje on April 18th, 2007 11:24 pm

    Also, for your comparison, here are some synthetic Apache 1.3 vs 2.2 benchmarks.

    (YMMV, offer void on Tuesdays and not applicable to clergymen, etc.)

  19. Adam S on April 18th, 2007 11:31 pm

    I wonder if it’s the “Logged in as DMZ. Logout »” line — that’s going to vary for every user, right? I wonder if that forces generation of a new version for each person/login.
    You could comment out that line for a couple of days and see if it affects the performance/load during game threads. I could see how that would break caching.

  20. DMZ on April 18th, 2007 11:36 pm

    It’s FreeBSD, I think both Apache and MySQL are running straight default (except for the mysql caching work I did).

  21. scotje on April 18th, 2007 11:48 pm

    Looks like /usr/local/etc/apache/httpd.conf would be the place to look if it’s just a ports install of Apache.

  22. DMZ on April 19th, 2007 12:09 am

    And lo, there they are. Nice.

  23. scotje on April 19th, 2007 12:20 am

    I gotta get to sleep here, but I’d start by playing with MaxRequestsPerChild and KeepAliveTimeout in the Apache config.

    MaxRequestsPerChild (henceforth to be known as MRPC because I’m lazy) defaults to 0 (unless the default FreeBSD conf explicitly sets it) which means Apache never kills off a worker because it’s too old. This can be bad if the worker threads are leaking memory or doing anything else nefarious. So MRPC imposes a “Logan’s Run” type solution where once a worker has served a certain number of requests, it gets killed off and a new worker is spawned to replace it.

    KeepAliveTimeout has to do with how long a client can hold open a connection to the server. While the cost of establishing a new TCP connection is somewhat high, we don’t want the server sitting there waiting on the client forever. I believe this defaults to 15 seconds if not specified, so you can try tweaking that down a bit.
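    Concretely, that tweak is a couple of lines in httpd.conf (values here are illustrative, not tuned recommendations):

```apacheconf
MaxRequestsPerChild 500   # recycle each worker after 500 requests (0 = never)
KeepAlive           On
KeepAliveTimeout    3     # down from the 15-second default
```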

    Any idea what ballpark of req/sec you were looking at during that peak time?
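    One rough way to get that number after the fact is to bucket the Apache access log by second. The log path and entries below are stand-ins (a few fake Common Log Format lines) so the pipeline runs as-is; point the awk at the real access_log.

```shell
# Stand-in access log with a few fake Common Log Format entries, so the
# pipeline below is runnable as-is; point the awk at the real access_log.
LOG=/tmp/access_sample.log
printf '%s\n' \
  '1.2.3.4 - - [18/Apr/2007:22:26:01 -0700] "GET / HTTP/1.1" 200 15000' \
  '1.2.3.5 - - [18/Apr/2007:22:26:01 -0700] "GET / HTTP/1.1" 200 15000' \
  '1.2.3.6 - - [18/Apr/2007:22:26:01 -0700] "GET / HTTP/1.1" 200 15000' \
  '1.2.3.7 - - [18/Apr/2007:22:26:02 -0700] "GET / HTTP/1.1" 200 15000' \
  > "$LOG"

# Field 4 is the "[day/month/year:HH:MM:SS" chunk, so counting on it buckets
# hits by second; the busiest seconds sort to the top.
awk '{count[$4]++} END {for (t in count) print count[t], t}' "$LOG" | sort -rn | head
```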

  24. scotje on April 19th, 2007 12:26 am

    Oh, and it wouldn’t hurt to see if you can figure out a way to definitively rule out “MySQL thrashing the disks” as a cause of the performance problems.

    I think iostat would be the program that would give you that insight. I’m not really very well versed in FreeBSD internals though, so I’m not familiar with how to interpret iostat’s output.

  25. oNeiRiC232 on April 19th, 2007 9:39 am

    #24: MySQL query caching solves that.

    DMZ: Regarding the following lines —

    Mem: 1446M Active, 288M Inact, 209M Wired, 57M Cache, 112M Buf, 3292K Free
    Swap: 4070M Total, 94M Used, 3977M Free, 2% Inuse

    This tells me two things — 1.) There’s only 2GB of RAM in your server, and it looks to be getting eaten up. Nowadays that’s not a lot, especially for a server under heavy load (Hell that’s what’s recommended for a ho-hum desktop running Vista). Double it if there’s cash in the USSM fund for it. 2.) Swap file = BAD for performance. Turn it right off!!!

  26. oNeiRiC232 on April 19th, 2007 9:43 am

    Also, you’re not going to want to hear it, but WP just isn’t that scalable. Sorry guy.

    Something built on a more robust architecture, like a JSP/EJB framework, would run laps around WP. Yes, getting this would require having software development skills, or the money to buy them, but on the other hand a blog isn’t the hardest thing to make.

  27. DMZ on April 19th, 2007 10:08 am

    Heh. Don’t think that hasn’t occurred to me.

    I have, in relative secrecy, experimented with Movable Type and other options, and the results were pretty awful.

  28. Tom Davis on April 19th, 2007 10:09 am

    You do not want to turn the swap file off. Yes, it’s slow, but you don’t want to see server performance without it when you have a memory crunch. Instead of moving things to disk (swapping), it will discard them from memory entirely and they will have to be brought back in at quite a cost to service time.

    Also, the fact that the swap file is barely in use suggests that memory isn’t a large problem.

    Unfortunately, it looks like the CPUs are your primary bottleneck, and not by a little bit. There’s not much you can do to reduce your load and still provide decent service time to the readers without more/faster CPUs. Were cost not an option, is clustering (adding another server) even an option with WordPress?

  29. DMZ on April 19th, 2007 10:26 am

    Yeah, CPU #2 is the top of my list of things to do.

    Here’s the problem with that, though: that’s it for CPU slots. After that I’m looking at… another server? We just went into the black on this one. And, essentially, only for game threads? That doesn’t seem like a wise use of resources.

  30. Mat on April 19th, 2007 11:34 am

    Is the bottleneck in performance MySQL?

    Looking at the pre-spike table, we see that MySQL seems to be using about 5% of the CPU, and it wouldn’t appear to need much more than that since you still have some available CPU left at that point. Then in the next table, MySQL is nowhere to be found, which would naively seem like a big problem to me.

    I would suspect that if MySQL wants 5% of the CPU when the server isn’t choking, it probably still wants at least 5% of the CPU when the server is choking, but isn’t getting it. If it really is happy using less than 0.3% of the CPU during the slowdown, then even if you can get it to run 100 times faster through better caching or whatever, then you’re only going to open up enough space for one more httpd process.

    Basically, I guess I’d say that if MySQL is the bottleneck, you might want to look into getting it more CPU time by giving it a higher priority or something. If it isn’t the bottleneck, then it seems like you won’t really gain anything by spending any extra time optimizing the MySQL side of things.

  31. scotje on April 19th, 2007 12:02 pm

    Part of the problem is that we still haven’t identified exactly what the performance bottleneck is, other than that there winds up being way too much work pending for the CPU.

    #25: Query caching doesn’t fix everything: when the table changes (like when a comment is posted), the cached queries for that table are invalidated. On a game thread, this happens a lot. MySQL could very well still be thrashing the disk, leaving a lot of processes in an IOWAIT state. This is by far the most common performance bottleneck on an Apache/MySQL/PHP architecture.

    2GB of RAM and a decent CPU should get you into 7 figures in terms of hits/day unless something really complicated is going on with the pages. Disabling swap is… not wise. FreeBSD and other Unix systems are a lot better at memory management than Windows is. All the memory being in use but none of the swap being in use is an indication of efficient memory management. Unix doesn’t like to waste RAM, so it finds something useful to put in there and doesn’t take it out until it needs to.

    Stepping back for a second though, the game threads are pretty much a doomsday scenario in terms of scalability: constantly changing data with teeming throngs pounding the refresh button. If I was going to look at something from a software standpoint, I would look into finding or developing a real simple (but high performance) solution that would handle the game thread discussions exclusively and leave the rest of the site in WordPress’ capable hands. This could probably even be developed in the context of a WP plugin as something that didn’t even touch MySQL for those discussions.

  32. Nat Irons on April 19th, 2007 12:20 pm

    I wonder if it’s the “Logged in as DMZ. Logout »” line — that’s going to vary for every user, right? I wonder if that forces generation of a new version for each person/login.

    Does wp-cache arrange to serve static files, or just produce a highly optimized database result? I would hope it’d be serving static files, because no amount of optimization would save the server when Felix comes out in the first inning. And in that case the results should be plainly apparent in the apache logs — you’ll see lots and lots of the cached .html files being pushed over the wire.

    I have, in relative secrecy, experimented with Movable Type and other options, and the results were pretty awful.

    What flavor of awful? I know MT a lot better than WordPress, and USSM isn’t doing anything that MT can’t handle. Switching over would still be a pain, though, so if wp-cache can arrange to serve static files with only ~300 or so database updates per game thread, that should be a big win.

  33. oNeiRiC232 on April 19th, 2007 1:14 pm

    #31:

    A game thread is a few hundred posts long, made over a few hours. Invalidating the MySQL cache ~300 times in a few hours isn’t going to thrash anything. That’s an average of two or three invalidations a minute. I’d be shocked if MySQL is thrashing the HD.

    Anyways… I just pulled this from the bottom of the source HTML of the page right now, when it’s pretty calm and we’re just BS’ing about server stuff —

    Even when there’s no game thread it takes WP almost a second just to generate the page it sends you. 0.75 sec/page * 2,000 people slamming refresh = Melted plastic.

    WordPress just blows. The only way to make it work under such heavy load would be to throw a lot of resources at it via replication and load balancing — put MySQL on its own box and then replicate the webserver. Or chuck WordPress…

  34. oNeiRiC232 on April 19th, 2007 1:15 pm

    Sorry for the botched quote. It should say —

    Dynamic Page Served (once) in 0.753 seconds

  35. scotje on April 19th, 2007 1:47 pm

    I believe that comment is WP-Cache’s notification of how long it took to build the page initially, that is, the copy that it cached.

    Also, keep in mind that, although the game thread is only 300-400 posts long, those posts don’t come at a regular rate throughout the game.

    I’m not saying that this is definitely a disk IO issue (or even that I think that, in this specific situation, it’s likely to be the cause), I’m just saying that “Oh we have query caching on” doesn’t mean you shouldn’t still inspect the IO situation.

    A methodical approach of definitively ruling out various potential bottlenecks seems like the best way to approach things.

    Derek, I would definitely take a look at installing a byte code cache like the aforementioned Zend Optimizer (which I believe is still free) or the eAccelerator that is mentioned in the WP optimization article. These things really can work wonders for PHP execution times.
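    For reference, wiring up eAccelerator is just a few php.ini lines; the paths and sizes here are illustrative, so adjust to the box:

```ini
; php.ini -- eAccelerator bytecode cache (illustrative settings)
extension              = "eaccelerator.so"
eaccelerator.enable    = "1"
eaccelerator.shm_size  = "16"                 ; MB of shared memory for compiled scripts
eaccelerator.cache_dir = "/tmp/eaccelerator"  ; must exist and be writable by the web user
```

    After restarting Apache, phpinfo() should list an eAccelerator section if it loaded.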

  36. oNeiRiC232 on April 19th, 2007 2:15 pm

    Well, now the source has two lines:

    (!– Dynamic Page Served (once) in 0.748 seconds –)
    (!– Cached page served by WP-Cache –)

    It looks like the wp-cache was actually off before and someone’s tinkering with the on button right now. ;-P

  37. sidereal on April 19th, 2007 3:36 pm

    Yay for caching.

    A couple other quick hits:
    We use xcache as our php bytecode cacher. It’s pretty tight and is friendly with php5.

    Run ‘vmstat 10 > /root/vmstat.txt & tail -f /root/vmstat.txt’ during a game and watch the stats. This is the clearest way I know of to find your bottleneck. If it’s disk, you’ll see paging operations here. If it’s cpu, you’ll see it here.

  38. dks on April 19th, 2007 3:40 pm

    Yeah, CPU #2 is the top of my list of things to do.

    Here’s the problem with that, though: that’s it for CPU slots. After that I’m looking at… another server? We just went into the black on this one. And, essentially, only for game threads? That doesn’t seem like a wise use of resources.

    I tried searching the site, but I couldn’t find any mention of exactly what box you got. If it’s a recent Intel-based server, though, you can probably replace the CPU with one of the E53xx quad-cores (and eventually have quad-cores in both sockets). Not cheap, though — anywhere from $500 to $1200, each.
