karl, 16 Nov 2009

Issues with bad implementations of “Accept” HTTP header

The DDoS which was not a DDoS. We recently had an issue between the Web sites of RDS and Le Grand Club. Le Grand Club was literally killed by the traffic coming from RDS, which is around 1 million hits a day. When checking the HTTP logs of Le Grand Club, we noticed that every request had the home page of RDS as its referer, but that the requests came from many different IP addresses and user agents. What was happening? A security issue? An implementation issue in the Rails code of Le Grand Club?

No. Just a human error in the markup, combined with bad HTTP implementations in browsers.

An IMG element on the RDS home page was pointing at a dynamic HTML page of Le Grand Club. Each request for the home page of RDS therefore triggered a request to Le Grand Club.

How can we prevent this from happening again?

Accept header in HTTP 1.1

When you type a URL in a browser address bar or follow a link in a Web page, the client (a browser such as Firefox, Opera, or Safari) sends information to the server. It’s a “business card” giving the server enough details to know how it should handle the client. This is a common pattern in social relationships before entering into a dialog.

The Accept header is defined in the HTTP specification as follows.

14.1 Accept

The Accept request-header field can be used to specify certain media types which are acceptable for the response. Accept headers can be used to indicate that the request is specifically limited to a small set of desired types, as in the case of a request for an in-line image.

       Accept         = "Accept" ":"
                        #( media-range [ accept-params ] )
       media-range    = ( "*/*"
                        | ( type "/" "*" )
                        | ( type "/" subtype )
                        ) *( ";" parameter )
       accept-params  = ";" "q" "=" qvalue *( accept-extension )
       accept-extension = ";" token [ "=" ( token | quoted-string ) ]

The asterisk "*" character is used to group media types into ranges, with "*/*" indicating all media types and "type/*" indicating all subtypes of that type. The media-range MAY include media type parameters that are applicable to that range.

via HTTP/1.1: Header Field Definitions.
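To make the grammar concrete, here is a minimal sketch in Ruby of how such a header value could be split into media ranges and q values. The method name `parse_accept` is hypothetical, and the sketch is deliberately simplified: it ignores media type parameters other than q and does not handle quoted strings containing commas.

```ruby
# Hypothetical helper (not part of Rails or of the original code):
# split an Accept header value into [media_range, q] pairs.
# Simplification: parameters other than q are ignored, and quoted
# strings containing "," or ";" are not supported.
def parse_accept(header)
  header.to_s.split(',').map do |item|
    range, *params = item.strip.split(';').map(&:strip)
    next if range.nil? || range.empty?

    q = 1.0 # a range without a q parameter counts as q=1
    params.each do |param|
      name, value = param.split('=', 2).map(&:strip)
      q = value.to_f if name == 'q' && value
    end
    [range.downcase, q]
  end.compact
end

p parse_accept('audio/*; q=0.2, audio/basic')
# => [["audio/*", 0.2], ["audio/basic", 1.0]]
```

The result keeps the order of the header, so a caller can still sort it by q value afterwards.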

Accept header in practice

Let’s try to visit the W3C Web site. The following is what Firefox is sending to the Apache server of W3C.

Host:              www.w3.org
User-Agent:        Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; fr; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5
Accept:            text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language:   fr,fr-fr;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding:   gzip,deflate
Accept-Charset:    ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive:        300
Connection:        keep-alive
If-Modified-Since: Sun, 15 Nov 2009 13:42:32 GMT
If-None-Match:     "703c-478691095e200;89-3f26bd17a2f00"
Cache-Control:     max-age=0

The Accept header informs the server that the client is able to process documents of certain types. In this case, Firefox is saying: “I accept documents which are HTML (text/html) or XHTML (application/xhtml+xml), then XML (application/xml, with q=0.9), or if everything else fails, try to send me something in another format (*/*, with q=0.8).”

If the server has what is needed, it will send the requested document for this specific URL in the right format (here html). So far no magic. Everything is perfect. Simple!

We receive the document, which contains calls to other resources on the Web, such as stylesheets, scripts, and images. The HTML document contains IMG elements, and the browser takes the URL to GET from the src attribute of each one. Firefox again sends an Accept header with the request for this image.

Accept: image/png,image/*;q=0.8,*/*;q=0.5

It clearly says that Firefox accepts images, with PNG as the preferred format, then other types of images if no PNG is available, and finally anything. The server receiving this request can send back an image in PNG, GIF, or JPEG.
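As an illustration of how a server could read this header, the sketch below looks up the quality a client assigns to a concrete type and picks the best format among those the server has. The method names `quality_for` and `best_type` are hypothetical, and the list of available formats is an assumption made for the example.

```ruby
# Hypothetical helper: quality the client assigns to a concrete type.
# The most specific matching range wins: type/subtype, then type/*, then */*.
def quality_for(type, accept)
  ranges = accept.to_s.split(',').map do |item|
    range, *params = item.split(';').map(&:strip)
    q = params.map { |param| param[/\Aq\s*=\s*([\d.]+)\z/, 1] }.compact.first
    [range.to_s.downcase, q ? q.to_f : 1.0]
  end.to_h

  major = type.split('/').first
  ranges[type] || ranges["#{major}/*"] || ranges['*/*'] || 0.0
end

# Hypothetical helper: the available type the client likes most.
def best_type(available, accept)
  available.max_by { |type| quality_for(type, accept) }
end

accept = 'image/png,image/*;q=0.8,*/*;q=0.5'
%w[image/png image/gif text/html].each do |type|
  puts "#{type} => #{quality_for(type, accept)}"
end
puts best_type(%w[image/gif image/jpeg image/png], accept)
```

Note that text/html still gets a non-zero quality through the */* range: read strictly, this header prefers images but does not refuse HTML.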

A solution in Rails

We know that specific URLs serve specific types of content. So when Firefox sends an image Accept header for a URL which in fact serves HTML, we can easily decide that the server should reply “406 Not Acceptable”, the proper way in HTTP to say that you can’t provide the right format to the client. The next morning, I sat down with Benoit Goyette and we discussed how to handle this HTTP corner case in Rails. Better to be ready for the next time. In a short time, Benoit had finished a prototype that worked perfectly in a test environment.

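before_filter :accept_headers

# Reply 406 when the client asks for an image at a URL that is not an image.
def accept_headers
  wants_image = request.env['HTTP_ACCEPT'] =~ /\Aimage\//
  # A group (jpg|jpeg|gif|png) is needed here: square brackets would
  # define a character class and match any single one of those letters.
  image_path = request.env['REQUEST_PATH'] =~ /\.(jpg|jpeg|gif|png)\z/i
  render :nothing => true, :status => 406 if wants_image && !image_path
end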

Hurrah? Not exactly… That was without reckoning with the poor landscape of browser implementations. What about Opera?

text/html, application/xml;q=0.9, application/xhtml+xml, application/x-obml2d, image/png, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1

And WebKit? Like Internet Explorer… worse.

*/*

The Accept header is therefore not the solution for blocking this case of bad markup. We are still exploring an elegant and flexible solution.
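To see why, here is a rough sketch that applies one possible heuristic to the three headers quoted above: treat a request as an image request only when every top-ranked media range is an image type. The method name `prefers_images_only?` and the heuristic itself are assumptions made for illustration, not something taken from Rails or from our prototype.

```ruby
# Hypothetical heuristic: true when all of the highest-ranked media
# ranges in the Accept header are image types.
def prefers_images_only?(accept)
  ranges = accept.to_s.split(',').map do |item|
    range, *params = item.split(';').map(&:strip)
    q = params.map { |param| param[/\Aq\s*=\s*([\d.]+)\z/, 1] }.compact.first
    [range.to_s.downcase, q ? q.to_f : 1.0]
  end
  return false if ranges.empty?

  best = ranges.map(&:last).max
  ranges.select { |_, q| q == best }.all? { |range, _| range.start_with?('image/') }
end

headers = {
  'Firefox' => 'image/png,image/*;q=0.8,*/*;q=0.5',
  'Opera'   => 'text/html, application/xml;q=0.9, application/xhtml+xml, ' \
               'application/x-obml2d, image/png, image/jpeg, image/gif, ' \
               'image/x-xbitmap, */*;q=0.1',
  'WebKit'  => '*/*'
}

headers.each do |browser, accept|
  puts "#{browser}: #{prefers_images_only?(accept)}"
end
```

Only the Firefox header is recognised. Opera ranks text/html as high as its image types, and a bare */* carries no information at all, so a filter built on this header would let most of the unwanted traffic through.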

2 opinions

  1. Your problem is not that your solution would not work, but that the (rails) implementation is a little poor. The little experience I’ve had with content-negotiation tells me you do NOT want to rely on regular expressions (at least not simplistic ones) to parse Accept headers.

    If you take the time to properly parse the accept headers, then you can implement an algorithm that could send a 406 if the Accept headers neither include image/ nor */*.

  2. It would not work, because you do not know if it is an img tag which requests the resource or an iframe for example. :)
