
[RESOLVED] Caching Dynamic Content?

Does content that comes from PHP and MySQL, like a forum or blog, get cached by HTTP headers?

PHP

8 Comments

@Mindzai (Oct 27, 2009): By default, most servers are not configured to send an Expires or Cache-Control header for anything. A properly configured server (which most aren't) should send one or both of these headers for some resources, usually "static" content such as images, scripts and CSS. The pages/scripts themselves normally shouldn't be cached by the browser, otherwise updated content won't be picked up until the cached copy expires. It's easy enough to check what your particular server is doing by looking at the response headers. Caching of dynamic content is normally done server-side, by writing the evaluated output to disk and serving it from there for a certain time. This doesn't avoid the HTTP request, but it does mean you don't have to regenerate the page on every request.
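To make the server-side idea concrete, here is a minimal sketch in Python (not what vBulletin or any particular forum software actually does). The function name `cached_render`, the cache file location and the 60-second lifetime are all assumptions chosen for illustration; `render_forum_index` is a hypothetical stand-in for the expensive PHP/MySQL work.

```python
import os
import tempfile
import time
from typing import Callable


def cached_render(cache_path: str, ttl_seconds: float, render: Callable[[], str]) -> str:
    """Return the cached page if it is younger than ttl_seconds, else rebuild it."""
    try:
        age = time.time() - os.path.getmtime(cache_path)
        if age < ttl_seconds:
            with open(cache_path, encoding="utf-8") as fh:
                return fh.read()
    except OSError:
        pass  # no cache file yet (or it is unreadable): fall through and rebuild

    html = render()
    # Write to a temporary file and rename it, so readers never see a half-written page.
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        fh.write(html)
    os.replace(tmp_path, cache_path)
    return html


calls = []


def render_forum_index() -> str:
    # Hypothetical stand-in for the database queries and templating.
    calls.append(1)
    return f"<h1>Forum index, build {len(calls)}</h1>"


cache_file = os.path.join(tempfile.mkdtemp(), "forum_index.html")
first = cached_render(cache_file, 60, render_forum_index)   # rendered and written to disk
second = cached_render(cache_file, 60, render_forum_index)  # served from disk
```

The trade-off is the one described above: every visitor still makes a request, but the expensive work happens at most once per lifetime, and new posts can take up to that long to appear.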
@Joseph_Witchard (author, Oct 27, 2009): What I'm concerned about is my vBulletin forum. I've set it up not to cache, but because of that (I think, anyway), an Amazon ad has to load every time a new page is requested. I'm worried about setting up caching, though, because I don't want it to keep people from seeing new posts.
@Mindzai (Oct 27, 2009): Set your server to send the Expires header only for images, then. You can use the FilesMatch directive with Apache. You should find that the image is not actually being downloaded again, though: the server should be sending a 304 response if the user has it in cache and it hasn't changed on the server. The purpose of the Expires header is to avoid the HTTP request altogether, which is an obvious performance boost.
@Mindzai (Oct 27, 2009): I just checked your forum and it's rotating images from Amazon, so a certain number of new requests is inevitable, but some were returning 304 responses. If you cache these images (and CSS + JS) with a far-future Expires header I expect your users will thank you. You could also enable gzipping for the CSS and JS.
@Joseph_Witchard (author, Oct 29, 2009): [QUOTE]I just checked your forum and it's rotating images from Amazon, so a certain number of new requests is inevitable, but some were returning 304 responses. If you cache these images (and CSS + JS) with a far-future Expires header I expect your users will thank you. You could also enable gzipping for the CSS and JS.[/QUOTE]

Okay, say that again, but this time a little more slowly and in a bit more detail (code examples would help) :p Haha, sorry, but I'm not nearly as good at this as most of the people here.
@Mindzai (Oct 29, 2009): When a client (e.g. a web browser) requests a web page, the first thing it asks the server for is the HTML (or whatever) document. It then starts parsing this document, looking for other resources such as JavaScript files, CSS files, images etc. When it comes across one of these resources which needs to be loaded, the client fires off another request to the relevant server, which returns the resource.

Take an <img> tag as an example. The client will send a request to the relevant server (specified in the src attribute of the <img> tag) asking for the image, which the server then returns. When the browser gets the image, it saves it in its cache (assuming its cache is enabled, but I expect it's only us web developers who disable caches!).

The next time it comes across the same image anywhere, the client checks in its cache to see if it already has the image. It does, but how does it know that the image hasn't been changed on the server? To be certain, it takes the modified time of the image (which is sent by the server along with the image) and makes a request to the server along the lines of "I have a version of image x which was modified on this date at this time. Do I have the latest version?". The server then responds by either confirming that the client does indeed have the latest version (by sending a 304 response code), or else if the image has been modified, sending the new version.
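The exchange described above can be sketched from the server's side in Python. This is an illustration under assumptions, not how Apache implements it: the function name `respond` is made up, and real servers also handle ETags and other validators. The header names `Last-Modified` and `If-Modified-Since` are the standard HTTP ones.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional, Tuple


def respond(last_modified: datetime, if_modified_since: Optional[str]) -> Tuple[int, dict]:
    """Decide between 200 (send the file) and 304 (the client's copy is current)."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    client_time = None
    if if_modified_since is not None:
        try:
            client_time = parsedate_to_datetime(if_modified_since)
            if client_time.tzinfo is None:
                # Treat a date without a usable zone as UTC so the comparison works.
                client_time = client_time.replace(tzinfo=timezone.utc)
        except (TypeError, ValueError):
            client_time = None  # unparseable header: ignore it and send the file
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    if client_time is not None and last_modified.replace(microsecond=0) <= client_time:
        return 304, headers
    return 200, headers


modified = datetime(2009, 10, 27, 12, 0, 0, tzinfo=timezone.utc)
status_first, headers = respond(modified, None)                 # first visit: full download
status_repeat, _ = respond(modified, headers["Last-Modified"])  # later visit: revalidation
```

On the repeat visit the browser sends back the date it was given, the server sees the file is no newer, and only the small 304 response crosses the wire.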

This revalidation is the default behaviour of most web servers. It saves bandwidth when a file is cached, since the few bytes of text it takes to check that the cached version is current are insignificant compared to downloading the resource again each time. However, the client still needs to make an HTTP request to check it has a current version, and since browsers only make a limited number of concurrent HTTP requests, these round trips are often the biggest cause of slow-loading pages.

The fix for this is to configure the server so that when it sends a resource to the browser, it also sends an HTTP header (a small bit of textual info) which tells the client "I promise this image won't be modified for x hours/days/weeks/months/years". This means the client can skip the HTTP request to check it has the latest version, because the server told it right from the start how long the resource would be valid for.

The best way to set this up is with Apache's mod_expires. For example, to specify that all images, CSS and JavaScript are valid for the next month, you would add the following to your Apache config:

[CODE]
# mod_expires does nothing until it is switched on
ExpiresActive On
<FilesMatch "\.(gif|jpe?g|png|js|css)$">
ExpiresDefault "access plus 1 month"
</FilesMatch>
[/CODE]
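To show roughly what a rule like this puts on the wire, here is a small Python sketch. It assumes "1 month" is counted as 30 days and that both an `Expires` date and a `Cache-Control: max-age` value are sent; the function names `freshness_headers` and `is_fresh` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

MONTH_SECONDS = 30 * 24 * 60 * 60  # assumption: "1 month" counted as 30 days


def freshness_headers(accessed_at: datetime, lifetime_seconds: int) -> dict:
    """Build the response headers that promise a resource stays valid for a while."""
    expires_at = accessed_at + timedelta(seconds=lifetime_seconds)
    return {
        "Expires": format_datetime(expires_at, usegmt=True),
        "Cache-Control": f"max-age={lifetime_seconds}",
    }


def is_fresh(fetched_at: datetime, now: datetime, lifetime_seconds: int) -> bool:
    """Client side: while this is True, no request needs to be sent at all."""
    return (now - fetched_at).total_seconds() < lifetime_seconds


fetched = datetime(2009, 10, 27, 12, 0, 0, tzinfo=timezone.utc)
headers = freshness_headers(fetched, MONTH_SECONDS)
```

Because the lifetime is counted from the moment of access, each visitor gets their own month, and nobody contacts the server for that file again until it runs out.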


Now the problem you have is that you don't control Amazon's server config, so you can't enable caching for their images. However, on your forum home page I count 95 image requests coming from your own domain. These took 6 seconds to load over a 20 Mbit/s connection with my browser's cache disabled, and almost 2 seconds with the cache enabled, and I'm using a modern browser which runs multiple HTTP requests for images concurrently. If your users are on a (not much) older browser which only does 2 at once, they are in for a long wait. JavaScript is even worse: since the browser holds up other requests while scripts are being fetched (to ensure they run in the correct order), checking 6 scripts took me as long as all of the images.

In total I spent 4.5 out of 6 seconds on unnecessary requests. 4.5 seconds doesn't sound like a very long time, but it really is quite a lot, and I'm on a very quick connection. Given that you could avoid over 100 round trips to the server with just a few lines of config, it makes sense to do it.

Even better, if you serve these resources from a subdomain, browsers will request more items at once, because the limit on concurrent connections applies per hostname.

As for gzipping, it compresses textual content before it is sent to the client. You have over 100KB of textual content (HTML, CSS and JS) on your forum home page, and gzipping will probably cut this in half. Gzipping can be enabled via Apache's mod_deflate. To set it up, add the following to your server's config:

[CODE]
<Location />
# Insert filter
SetOutputFilter DEFLATE

# Netscape 4.x has some problems...
BrowserMatch ^Mozilla/4 gzip-only-text/html

# Netscape 4.06-4.08 have some more problems
BrowserMatch ^Mozilla/4\.0[678] no-gzip

# MSIE masquerades as Netscape, but it is fine
# BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

# NOTE: Due to a bug in mod_setenvif up to Apache 2.0.48
# the above regex won't work. You can use the following
# workaround to get the desired effect:
BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html

# Don't compress images
SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary

# Make sure proxies don't deliver the wrong content
Header append Vary User-Agent env=!dont-vary
</Location>
[/CODE]
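If you want a feel for how much gzip saves on markup, this small Python sketch compresses some repetitive HTML. The markup is a made-up stand-in, so the ratio it prints says nothing definite about your forum's pages; it only shows that text with repeated tags and attributes shrinks a lot.

```python
import gzip

# Hypothetical stand-in for a page's markup; real pages will compress differently.
html = "".join(
    f'<tr class="post"><td><a href="/thread/{i}">Thread number {i}</a></td></tr>\n'
    for i in range(2000)
).encode("utf-8")

compressed = gzip.compress(html)
ratio = len(compressed) / len(html)
print(f"{len(html)} bytes raw, {len(compressed)} bytes gzipped ({ratio:.0%} of original)")

# The browser reverses this transparently when it sees Content-Encoding: gzip.
restored = gzip.decompress(compressed)
```

Images such as GIF, JPEG and PNG are already compressed, which is why the config above excludes them.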


Hopefully that clarifies it a bit.
@Joseph_Witchard (author, Oct 30, 2009): Okay, this is what I have in my .htaccess file for the rest of my site (i.e. not the forum).

[CODE]
# start caching
ExpiresActive On

# expire images one hour after access
ExpiresByType image/gif A3600
ExpiresByType image/jpeg A3600
ExpiresByType image/png A3600

# expire html after 30 minutes, css and javascript after two days
ExpiresByType text/html A1800
ExpiresByType text/css A172800
ExpiresByType text/javascript A172800
[/CODE]
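For anyone reading along, the compact values above are a letter (`A` for access time, `M` for modification time) followed by a number of seconds. The Python helper below, whose name `describe_expiry` is made up for illustration, spells such a value out in the longer "access plus ..." wording; treat it as a reading aid rather than anything Apache itself provides.

```python
UNITS = (("days", 86400), ("hours", 3600), ("minutes", 60), ("seconds", 1))


def describe_expiry(code: str) -> str:
    """Spell out a compact expiry value such as 'A3600' in words."""
    bases = {"A": "access", "M": "modification"}
    if len(code) < 2 or code[0] not in bases or not code[1:].isdigit():
        raise ValueError(f"not a compact expiry value: {code!r}")
    remaining = int(code[1:])
    parts = []
    for name, size in UNITS:
        count, remaining = divmod(remaining, size)
        if count:
            parts.append(f"{count} {name}")
    return f"{bases[code[0]]} plus {' '.join(parts) or '0 seconds'}"


for value in ("A3600", "A1800", "A172800"):
    print(value, "=", describe_expiry(value))
```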


Could I do the first bit of code you gave me by this method? This is another Apache Module I think. I honestly don't remember what it's called, but I figured it must be different than the syntax in your code.

And thanks! I'll definitely look into mod_deflate.
@Joseph_Witchard (author, Nov 01, 2009): Actually, now that I think about it, do you need the Location tags in the .htaccess file? You said config. I'm not sure if that means .htaccess or a file that only my host has access to.

EDIT: Never mind. I decided to try it, and after removing the Location tags it works, and it cut the page size by a fair number of bytes! Thanks so much, Mindzai!