Survive heavy traffic with your webserver
Recently two of my articles reached the Digg frontpage on the same day. My web server isn't state of the art, and it had to handle gigantic amounts of traffic. Still, it served pages to visitors swiftly thanks to a lot of optimizations. This is how you can prevent heavy traffic from killing your server.
About this article
There are many things you can do to speed up your website. This article focuses on practical things that I used, without spending any money on additional hardware or commercial software.
In this article I assume that you're already familiar with system administration and hosting / creating websites. In examples I use Ubuntu, but if you use another distro, just make some minor adjustments (like package management) and it should work as well.
Beware, if you don't know what you're doing you could seriously mess up your system.
Cache PHP output
Every time a request hits your server, PHP has to do a lot of processing: all of your code has to be compiled & executed for every single visit, even though the outcome of all this processing is identical for both visitor 21600 and 21601. So why not save the flat HTML generated for visitor 21600, and serve that to 21601 as well? This will relieve your web server and database server, because less PHP often means fewer database queries.
Now you could write such a system yourself, but there's a neat package in PEAR called Cache_Lite that can do this for us. Benefits:
- it saves us the time of reinventing the wheel
- it's been thoroughly tested
- it's easy to implement
- it's got some cool features like lifetime, read/write control, etc.
Installing is like taking candy from a baby. On Ubuntu I would:
sudo aptitude install php-pear
sudo pear install Cache_Lite
And we're ready to use one of our most important assets!
To learn exactly how to implement Cache_Lite into your code I've written another article called: Speedup your website with Cache_Lite ».
Create turbo charged storage
With the PHP caching mechanism in place, we take away a lot of stress from your CPU & RAM, but not from your disk. This can be solved by creating a storage device with your system's RAM, like this:
mkdir -p /var/www/www.mysite.com/ramdrive
mount -t tmpfs -o size=500M,mode=0744 tmpfs /var/www/www.mysite.com/ramdrive
Now the directory /var/www/www.mysite.com/ramdrive is not located on your disk, but in your system's memory. And that's about 30 times faster :) So why not store your PHP cache files in this directory? You could even copy all static files (images, css, js) to this device to minimize disk IO. Two things to remember:
- All files in your ramdrive are lost on reboot, so create a script to restore files from disk to RAM
- The ramdrive itself is lost on reboot, but you can add an entry to /etc/fstab to prevent that
To learn exactly how to tackle the above, I've written another article called: Create turbocharged storage using tmpfs ».
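The two reminders above could be handled along these lines. This is only a minimal sketch and not necessarily what the linked article does: the fstab line simply reuses the options from the mount command shown earlier, and the backup directory /var/backups/ramdrive and the function name restore_ramdrive are made up for illustration.

```shell
#!/bin/sh
# /etc/fstab entry (one line) so the ramdrive is mounted again at boot.
# Same options as the mount command above:
#
#   tmpfs  /var/www/www.mysite.com/ramdrive  tmpfs  size=500M,mode=0744  0  0

# restore_ramdrive SRC DST
# Copy everything from an on-disk copy (SRC) into the ramdrive (DST).
# Both paths are hypothetical; run this at boot, for example from /etc/rc.local.
restore_ramdrive() {
    src=$1
    dst=$2
    if [ ! -d "$src" ]; then
        echo "restore_ramdrive: source directory $src not found" >&2
        return 1
    fi
    mkdir -p "$dst" || return 1
    # -p keeps modes and timestamps, -R recurses; "$src/." copies the
    # directory's contents (dotfiles included), not the directory itself.
    cp -pR "$src/." "$dst/"
}

# Example call (commented out so that sourcing this file has no side effects):
# restore_ramdrive /var/backups/ramdrive /var/www/www.mysite.com/ramdrive
```

Keeping the master copy on disk and treating the ramdrive as a disposable mirror means a reboot costs you nothing more than one copy run.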
Leave heavy processing to cronjobs
For example: I count the number of visits for every single article. But instead of updating a counter for an article on every visit (which involves row locking and a WHERE clause), I use simple and relatively cheap SQL INSERTs into a separate table.
The gathered data is processed every 5 minutes by a separate PHP script that's automatically run by my server. It counts the hits per article, then deletes the gathered data and updates the grand totals in a separate field in my article table. So finally accessing the hit count of an article takes no extra processing time or heavy queries.
If you want more in-depth information on writing cronjobs, I've written another article called: Schedule tasks on Linux using crontab ».
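The collect-then-aggregate idea described above can be sketched as follows. Note the swap: the article uses a PHP script, while this sketch feeds plain SQL to the mysql command line client. The table blog_post_hits, its hit_id and blog_post_id columns, the hit_count column and the database name mydb are all assumptions made for the example.

```shell
#!/bin/sh
# Per visit, the page only does a cheap insert, for example:
#   INSERT INTO blog_post_hits (blog_post_id) VALUES (42);

# aggregate_hits_sql
# Print the SQL that folds the collected hits into the grand totals.
# All table and column names are hypothetical.
aggregate_hits_sql() {
    cat <<'SQL'
-- Remember the newest hit we are about to process, so hits that arrive
-- while this runs are kept for the next run instead of being deleted uncounted.
SET @max_id := (SELECT COALESCE(MAX(hit_id), 0) FROM blog_post_hits);

UPDATE blog_posts p
JOIN (
    SELECT blog_post_id, COUNT(*) AS new_hits
    FROM blog_post_hits
    WHERE hit_id <= @max_id
    GROUP BY blog_post_id
) h ON h.blog_post_id = p.blog_post_id
SET p.hit_count = p.hit_count + h.new_hits;

DELETE FROM blog_post_hits WHERE hit_id <= @max_id;
SQL
}

# Crontab entry (crontab -e) to run this every 5 minutes, assuming the file is
# saved as /usr/local/bin/aggregate_hits.sh and credentials live in ~/.my.cnf:
#   */5 * * * * . /usr/local/bin/aggregate_hits.sh && aggregate_hits_sql | mysql mydb
```

Reading a hit count then stays a plain column lookup on the article row, no matter how many visits came in since the last run.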
Optimize your database
Use the InnoDB storage engine
If you use MySQL, the default storage engine for tables is MyISAM (at least in versions before MySQL 5.5, where InnoDB became the default). That's not ideal for a high traffic website because MyISAM uses table level locking, which means that during an UPDATE, nobody can access any other record of the same table. It puts everyone on hold!
InnoDB, however, uses row level locking. Row level locking ensures that during an UPDATE only that particular row is locked, until the locking transaction issues a COMMIT. The rest of the table stays available to everyone else.
phpMyAdmin allows you to easily change the table type in the Operations tab. Though it never caused me any problems, it's wise to first create a backup of the table you're going to ALTER.
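If you prefer the command line over phpMyAdmin, the same change could look roughly like this. It is a sketch under assumptions: the database name mydb and the helper innodb_convert_sql are invented for the example, and blog_posts just stands in for one of your own tables.

```shell
#!/bin/sh
# innodb_convert_sql TABLE
# Print the ALTER statement that switches TABLE to the InnoDB engine.
innodb_convert_sql() {
    table=$1
    # Only accept plain identifiers so nothing unexpected ends up in the SQL.
    case $table in
        ''|*[!A-Za-z0-9_]*)
            echo "innodb_convert_sql: invalid table name: $table" >&2
            return 1
            ;;
    esac
    printf 'ALTER TABLE `%s` ENGINE=InnoDB;\n' "$table"
}

# Usage, with a hypothetical database "mydb":
#   mysqldump mydb blog_posts > blog_posts.backup.sql    # backup first
#   innodb_convert_sql blog_posts | mysql mydb
#   mysql mydb -e "SHOW TABLE STATUS LIKE 'blog_posts'"  # check the Engine column
```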
Use optimal field types
Wherever you can, make integer fields as small as possible (not by changing the length but by changing its actual integer type). Here's an overview:
| fieldtype | signed min | signed max | unsigned min | unsigned max |
| TINYINT | -128 | 127 | 0 | 255 |
| SMALLINT | -32,768 | 32,767 | 0 | 65,535 |
| MEDIUMINT | -8,388,608 | 8,388,607 | 0 | 16,777,215 |
| INT | -2,147,483,648 | 2,147,483,647 | 0 | 4,294,967,295 |
| BIGINT | -9,223,372,036,854,775,808 | 9,223,372,036,854,775,807 | 0 | 18,446,744,073,709,551,615 |
So if you don't need negative numbers in a column, always make the field unsigned. That way you can store maximum values with minimum space: a TINYINT takes 1 byte, a SMALLINT 2, a MEDIUMINT 3, an INT 4 and a BIGINT 8 bytes. Also make sure foreign keys have matching field types, and place indexes on them. This will greatly speed up queries.
In phpMyAdmin there's a link called Propose Table Structure. Take a look sometime; it will try to tell you which fields can be optimized for your specific db layout.
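To make the table above a bit more hands-on, here is a small sketch that picks the smallest unsigned type for the largest value you expect a column to hold. The function name smallest_unsigned_type is made up, and it only covers the unsigned ranges listed in the table.

```shell
#!/bin/sh
# smallest_unsigned_type MAX
# Print the smallest unsigned MySQL integer type that can hold MAX.
# Values beyond the BIGINT UNSIGNED maximum are not detected.
smallest_unsigned_type() {
    max=$1
    case $max in
        ''|*[!0-9]*)
            echo "smallest_unsigned_type: not a non-negative integer: $max" >&2
            return 1
            ;;
    esac
    # More than 10 digits cannot fit in INT UNSIGNED (max 4,294,967,295);
    # checking the length first also avoids overflowing shell arithmetic.
    if [ "${#max}" -gt 10 ]; then
        echo BIGINT
    elif [ "$max" -le 255 ]; then
        echo TINYINT
    elif [ "$max" -le 65535 ]; then
        echo SMALLINT
    elif [ "$max" -le 16777215 ]; then
        echo MEDIUMINT
    elif [ "$max" -le 4294967295 ]; then
        echo INT
    else
        echo BIGINT
    fi
}

# Example: smallest_unsigned_type 50000   prints SMALLINT
```

Leave yourself some headroom when you pick the maximum: a counter that outgrows its type is more painful than one extra byte per row.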
Queries
Never select more fields than strictly necessary. Sometimes when you're lazy you might do a:
SELECT * FROM `blog_posts`
even though a
SELECT `blog_post_id`,`title` FROM `blog_posts`
would suffice. Normally that's OK, but not when performance is your no.1 priority.
Tweak the MySQL config
Furthermore there are quite a few things you can do to the my.cnf file, but I'll save that for another article as it's a bit out of this article's scope.
Save some bandwidth
Save some sockets first
Small optimizations make for big bandwidth savings when volumes are high. If traffic is a big issue, or you really need that extra server capacity, you could throw all CSS code into one big .css file. Do this with the JS code as well. This will save you some Apache sockets that other visitors can use for their requests. It will also give you better compression ratios, should you choose to use mod_deflate or compress your JavaScript with Dean Edwards' Packer.
I know what you're thinking. No, don't throw all the CSS and JS in the main page. You still really want this separation to:
- make use of the visitor's browser cache. Once they've got your CSS, it won't be downloaded again
- not pollute your HTML with that stuff
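Merging the separate CSS (or JS) files into one can be as simple as the sketch below. The function name combine_files and the file names in the usage example are hypothetical. Keep the files in the order your pages load them, since later CSS rules override earlier ones and scripts may depend on each other.

```shell
#!/bin/sh
# combine_files OUTPUT INPUT...
# Concatenate the INPUT files, in the order given, into one OUTPUT file.
combine_files() {
    out=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "combine_files: no input files given" >&2
        return 1
    fi
    # Build the result in a temporary file first, so visitors never
    # download a half-written file.
    cf_tmp="$out.tmp.$$"
    : > "$cf_tmp" || return 1
    for f in "$@"; do
        # Add a newline after each file so the last line of one file
        # cannot run into the first line of the next.
        cat "$f" >> "$cf_tmp" && echo >> "$cf_tmp" || { rm -f "$cf_tmp"; return 1; }
    done
    mv "$cf_tmp" "$out"
}

# Usage (hypothetical file names):
#   combine_files all.css reset.css layout.css blog.css
#   combine_files all.js  library.js site.js
```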
And now some bandwidth ;)
- Limit the number of images on your site
- Compress your images
- Eliminate unnecessary whitespace or even compress JS with tools available everywhere.
- Apache can compress the output before it's sent back to the client through mod_deflate. This results in a smaller page being sent over the Internet at the expense of CPU cycles on the web server. For those servers that can afford the CPU overhead, this is an excellent way of saving bandwidth. But if your CPU is already the bottleneck, I would turn all compression off to save those cycles.
Store PHP sessions in your database
If you use PHP sessions to keep track of your logged-in users, then you may want to have a look at PHP's session_set_save_handler function. With this function you can override PHP's session handling system with your own class, and store sessions in a database table or in Memcached.
Now a key to success is to make this table's storage engine MEMORY (also known as HEAP). This stores all session information (which should be tiny variables) in the database server's RAM. That takes disk IO stress away from your web server, and it allows you to share sessions between multiple web servers in the future: if you're logged in on server A, you're also logged in on server B, which makes load balancing possible.
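Such a session table could be created roughly like this. Treat it as a sketch: the table name, column names and column sizes are assumptions, as is the database name mydb. Two properties of the MEMORY engine shape the layout: it cannot store TEXT or BLOB columns, and it stores rows at a fixed length, so the data column is a VARCHAR that you want to keep as small as your sessions allow.

```shell
#!/bin/sh
# session_table_sql
# Print a CREATE TABLE statement for an in-memory session table.
# Table name, column names and sizes are hypothetical.
session_table_sql() {
    cat <<'SQL'
-- MEMORY tables cannot hold TEXT or BLOB columns, hence the VARCHAR.
-- Their contents are lost when MySQL restarts, which logs everyone out.
CREATE TABLE sessions (
    session_id    VARCHAR(128)  NOT NULL,
    session_data  VARCHAR(4096) NOT NULL,
    last_access   INT UNSIGNED  NOT NULL,  -- Unix timestamp of the last request
    PRIMARY KEY (session_id),
    -- BTREE instead of the default HASH, so cleaning up old sessions
    -- with "last_access < ..." can use the index.
    KEY idx_last_access USING BTREE (last_access)
) ENGINE=MEMORY;
SQL
}

# Usage, with a hypothetical database "mydb":
#   session_table_sql | mysql mydb
```

Your session_set_save_handler class then reads and writes rows in this table instead of files on disk.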
Sessions on tmpfs
If it's too much of a hassle to store sessions in a MEMORY table, storing session files on a ramdisk is also a good option to gain some performance. Just make /var/lib/php5 live in RAM. To learn exactly how to do this, I've written another article called: Create turbocharged storage using tmpfs ».
Sessions in Memcached
I recently (22nd June 2008) found another (better) way to store sessions in a cluster-proof, resource-cheap way and dedicated a separate article to it called: Enhance PHP session management ».
More tips
Some other things to google for if you want even more:
- eAccelerator
- memcached
- tweak the Apache config
- squid
- turn off Apache logging
- Add 'noatime' in /etc/fstab on your web and data drives to prevent disk writes on every read
I'll update this section with useful comments by people such as yourself ;)
tags: linux, apache, php, PEAR, caching, disk IO, performance, mysql
category: Howto - Webserver
#36. Kevin on 23 February 2010
If you want something slower but easier, have a look at pound. That doesn't rewrite IP packets at kernel level, but just forwards level 7 traffic. So yeah: slower, but easier and in some cases (different networks/whatever) the only option.
#35. Andrew on 23 February 2010
#34. Kevin on 21 February 2010
#33. Andrew on 04 February 2010
As far as performance goes, what would you suggest is the best way to add another server into the Apache mix?
Would it be installing a private cloud with Eucalyptus? Perhaps an Ubuntu cluster - more for reliability really? What about that old SETI concept, the cluster of workstations (COW)? Does that exist in any form today?
#32. azhar on 27 January 2010
#31. Kevin on 07 January 2010
#30. Tom on 06 January 2010
Memcached is best used for high read/low write situations. However sessions are re-written every script execution which means it's faster to store your sessions in a DB. However if you have data sets that update infrequently, then it's better to use Memcached.
A problem I've also discovered with Memcached is when using multiple Memcached servers (using the php binary, not the pecl module) and one of those servers loses connectivity, Apache starts throwing segfaults. This includes cases where you flush 1 Memcached server, but not all of them.
#29. Matt Kukowski on 16 December 2009
e.g. SELECT username FROM table WHERE id='1' LIMIT 1
That way MySQL will end the query as soon as the WHERE clause is satisfied with 1 record (or however many records you need).
Also, always use persistent MySQL connections like pconnect()...
#28. aobeda on 19 October 2009
vary good
#27. aobeda on 19 October 2009
very good
#26. Shawn on 17 October 2009
#25. earth host on 09 October 2009
#24. Kevin on 17 September 2009
#23. DV on 08 September 2009
This is a great article!
#22. Kevin on 22 August 2009
#21. Julius Beckmann on 12 August 2009
You did not mention PHP Op-Code Cachers like APC and XCache - They can reduce the load by simply installing them and let them cache your PHP Scripts.
Also your MyISAM and InnoDB tip is no general fact, it has to be selected wisely on your setup and website.
You also forget to mention moving static files to Amazon S3 cloud or simply using Lighttpd or Nginx for static files.
#20. Kevin on 12 August 2009
#19. brant on 04 August 2009
Not only are these good tips for heavy traffic'd sites, but good tips in general for a speedy and responsive website.
#18. Kevin on 15 July 2009
#17. ephman on 10 July 2009
#16. M A Hossain Tonu on 24 May 2009
This could be good part of server load balancing.
Tonu
Software Engr.
#15. Kevin on 01 February 2009
#14. brainextender on 27 January 2009
Just to mention it: tmpfs is allowed to use virtual memory to swap pages back to disk. Maybe you don't want that? ramfs won't behave in that manner.
If you have enough RAM, your files are cached by the OS (here Ubuntu).
Check the free command. So there is no need to put them in a ramdisk. iostat will show no disk activity then.
#13. ahydra on 20 November 2008
#12. Dilli R. Maharjan on 16 May 2008
Thanks.
#11. Robin Speekenbrink on 27 February 2008
Also: on the note of using deflate in apache: most webservers have CPU to spare but no memory to spare and thus mod_deflate might be handy (connections are handled faster etc and thus apache can handle requests more quickly, thus reducing the concurrent load)
#10. Gregory on 12 February 2008
Regards,
http://www.olemera.com/loans/home-loans/integer-home-loans/
#9. Dave on 19 September 2007
If you receive steady, regular traffic, make sure your server's CPU usage rarely goes above 30%. This might seem low but remember that Digg can drive a lot of traffic to your site in a very short amount of time. Whenever the load on our servers reaches 40% at the peak time we buy another one and put it into the load balancer. This is a rule-of-thumb and works fairly well for us. We run 70-odd websites this way, some of which receive over 300,000 unique visitors per day and have survived day-long front page Diggings without degrading performance. We also go over 130MBits/sec while being Dugg although, to be fair, some of the pages could be a little lighter... 4MB is normal for a home page isn't it ? :-P
#8. Kevin on 16 September 2007
#7. Ray on 16 September 2007
http://www.oneunified.net/blog
#6. Yang on 07 September 2007
#5. Kevin on 06 September 2007
#4. Simon on 06 September 2007
Surely if you cache the compressed result then you only have to do it once. Everyone's a winner then.
#3. Kevin on 06 August 2007
#2. Nima on 06 August 2007
#1. Unrated.be on 02 August 2007