HTTP content-type and browser support
Posted by Neil on 22nd June 2007
What's the deal?
Depending on which Web browser you use, you may have noticed that files from certain sites do not always render as you expect. This is particularly noticeable with images. The cause is almost always an incorrectly specified content-type HTTP header, and what you see depends on how closely the browser adheres to the HTTP/1.1 specification.
For reference, RFC 2616 (Hypertext Transfer Protocol -- HTTP/1.1) states the following:
Any HTTP/1.1 message containing an entity-body SHOULD include a Content-Type header field defining the media type of that body. If and only if the media type is not given by a Content-Type field, the recipient MAY attempt to guess the media type via inspection of its content and/or the name extension(s) of the URI used to identify the resource. If the media type remains unknown, the recipient SHOULD treat it as type "application/octet-stream".
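To make the quoted rule concrete, here is a minimal Python sketch of the decision a spec-following recipient would make. The function name `effective_media_type` and the `guess` callback are invented for this illustration, and treating an empty header value the same as a missing one is an assumption on my part, not something the RFC text above spells out.

```python
from typing import Callable, Optional

FALLBACK_TYPE = "application/octet-stream"


def effective_media_type(
    content_type: Optional[str],
    guess: Optional[Callable[[], Optional[str]]] = None,
) -> str:
    """Pick the media type for a response body, following the rule quoted above."""
    # Assumption: a blank header value is treated like a missing header.
    if content_type is not None and content_type.strip():
        # A declared type wins; parameters such as charset are dropped.
        return content_type.split(";", 1)[0].strip().lower()
    # Only when no type was declared MAY the recipient guess.
    guessed = guess() if guess is not None else None
    # If the type is still unknown, fall back to a generic binary type.
    return guessed or FALLBACK_TYPE
```

Note that the guess is never consulted when a header is present, which is exactly the step that the less strict browsers skip.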
The most common form of the problem is when images are sent to the browser with the content-type set to "text/plain". Most image formats are binary in nature, so a browser that properly follows the spec should respect the header and render the data as plain text (usually resulting in gibberish). Some image formats, such as XPM and PNM, can be ASCII-based, but they have their own content-types to remove any ambiguity.
Some browsers that do not follow the spec choose to ignore the content-type and use their own methods for guessing what the data is. The simplest is to examine the file extension of the URL, if present. Another is to sniff the data looking for signatures that identify the format (libmagic does this).
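The two guessing methods just described can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular browser or libmagic implements it: the function names and the small tables of extensions and signatures are my own choices and cover just three image formats.

```python
import posixpath
from typing import Optional
from urllib.parse import urlsplit

# Leading bytes ("magic numbers") of a few common image formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]

# A deliberately tiny extension table for the same formats.
EXTENSIONS = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
}


def guess_from_extension(url: str) -> Optional[str]:
    """Guess a media type from the file extension in the URL path, if any."""
    path = urlsplit(url).path  # ignores the query string and fragment
    extension = posixpath.splitext(path)[1].lower()
    return EXTENSIONS.get(extension)


def guess_from_signature(data: bytes) -> Optional[str]:
    """Guess a media type by comparing the first bytes against known signatures."""
    for magic, media_type in SIGNATURES:
        if data.startswith(magic):
            return media_type
    return None
```

A PNG served from a URL ending in `.jpg` shows why the two methods can disagree: the extension says JPEG while the signature says PNG.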
To test browser behaviour, I used Apache's mod_rewrite to transparently redirect requests to a PHP script that served a PNG image with a configurable content-type header. I then navigated to each "fake" URL in every browser and noted what it did.
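If you want to try something similar without Apache and PHP, the following Python sketch is a rough stand-in for that setup, not the script used for the results below. It serves the same tiny PNG for every path and takes the content-type from a `type` query parameter; that parameter, the function names and the port are all assumptions made for this example.

```python
import struct
import zlib
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type and data."""
    body = kind + data
    return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body))


def tiny_png() -> bytes:
    """Return a 1x1 black greyscale PNG so no image file is needed on disk."""
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    idat = zlib.compress(b"\x00\x00")  # filter byte plus one pixel
    return (b"\x89PNG\r\n\x1a\n" + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", idat) + _chunk(b"IEND", b""))


def requested_type(path: str, default: str = "image/png") -> str:
    """Read the content-type to serve from the hypothetical `type` parameter."""
    value = parse_qs(urlsplit(path).query).get("type", [default])[0]
    # Refuse control characters so the value cannot inject extra headers.
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in value):
        return default
    return value


class ImageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = tiny_png()
        self.send_response(200)
        self.send_header("Content-Type", requested_type(self.path))
        self.send_header("Content-Length", str(len(body)))
        # Avoid cached copies influencing later test requests.
        self.send_header("Cache-Control", "no-store")
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8000) -> HTTPServer:
    """Create the test server, bound to localhost only."""
    return HTTPServer(("127.0.0.1", port), ImageHandler)
```

Running `make_server().serve_forever()` and visiting `http://127.0.0.1:8000/image.jpg?type=text/plain` would then correspond to the "text/plain, incorrect ext" column, `/image.png` to "correct ext", and `/image` to "no ext".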
Legend
| followed spec |
| guessed subtype correctly although incorrect type was supplied |
| completely ignored content-type and guessed |
| Browser | Operating system | image/png, correct ext | image/png, incorrect ext | image/png, no ext | image/jpeg, correct ext | image/jpeg, incorrect ext | image/jpeg, no ext | text/plain, correct ext | text/plain, incorrect ext | text/plain, no ext |
|---|---|---|---|---|---|---|---|---|---|---|
| Internet Explorer 6 | Windows | image | image | image | image | image | image | image | image | image |
| Internet Explorer 7 | Windows | image | image | image | image | image | image | image | image | image |
| Firefox 2.0.0.4 | Windows | image | image | image | image | image | image | text | text | text |
| Firefox 2.0.0.4 | Linux | image | image | image | image | image | image | text | text | text |
| Opera 9.21 | Windows | image | image | image | image | image | image | text | text | text |
| Opera 9.21 | Linux | image | image | image | image | image | image | text | text | text |
| Konqueror 3.5.7 | Linux | image | image | image | image | image | image | text | text | text |
| Safari 3 beta | Windows | image | image | image | image | image | image | image | image | download [1] |
What did we learn?
Browsers readily ignore the content-type header, depending on the context in which a resource is requested. The browsers traditionally regarded as more standards-compliant (Firefox, Opera, Konqueror) do so less often. However, all browsers ignore the content-type when they are expecting an image (and only an image will do), such as within an <img> element.
You can run into problems if your HTTP server issues a completely wrong content-type for images. While they will be displayed when embedded in an HTML document, linking directly to them will not work in Opera, Firefox or Konqueror (and likely Safari) unless the browser already has the image in its cache (with the correct content-type assigned). This should be easy to fix with most servers.
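Before fixing the server, it helps to know which URLs are affected. Here is a small Python sketch that flags a response whose declared content-type disagrees with what the URL's extension suggests. It relies on the standard `mimetypes` module; the function names are made up for this example, and URLs without a recognised extension are simply skipped.

```python
import mimetypes
from typing import Optional
from urllib.parse import urlsplit


def expected_type(url: str) -> Optional[str]:
    """Media type suggested by the extension in the URL path, if recognised."""
    guessed, _encoding = mimetypes.guess_type(urlsplit(url).path)
    return guessed


def is_mislabelled(url: str, served_content_type: str) -> bool:
    """True when the served content-type contradicts the URL's extension."""
    expected = expected_type(url)
    if expected is None:
        return False  # no extension to compare against
    # Compare only the type/subtype, ignoring parameters such as charset.
    served = served_content_type.split(";", 1)[0].strip().lower()
    return served != expected
```

Feeding it the URL and the Content-Type header from each response on your site would pick out cases such as a `.png` file served as "text/plain".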
Notes
1. Safari 3 beta on Windows seems to be the only tested browser that examined the URL for file extensions. It downloaded the image served as text/plain with no extension without warning the user. In fact, it seems to do this with any file it cannot display, including executables...
