Closed

Web API Not fully handling Accept-Charset

description

Currently the Web API does not understand Accept-Charset headers where a preference is stated. A value of:

Accept-Charset: utf-16

works, but a value of:

Accept-Charset: utf-8; q=0.2, utf-16;q=1.8

does not. Also, if an unsupported charset is requested, the server should respond with HTTP 406 Not Acceptable. Instead, it returns UTF-8.
Closed Sep 26, 2012 at 7:09 AM by HongmeiG
Given this works fine in most cases, we will not fix this for the current release.

comments

kichalla wrote Sep 14, 2012 at 6:08 PM

Hi,

I am unable to repro your issue. This is most probably because your quality factor for utf-16 is greater than 1.0; the quality factor needs to be between 0.0 and 1.0.

For example, the following works:

Accept-CharSet: utf-8;q=0.2, utf-16; q=0.8
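
To illustrate the point about quality factors, here is a minimal Python sketch of a strict Accept-Charset parser. It is not the Web API implementation; the function name `parse_accept_charset` and the choice to drop entries whose q falls outside 0.0 to 1.0 are assumptions made for illustration. Under that assumption, the header from the original report loses its utf-16 entry and only utf-8 remains.

```python
def parse_accept_charset(header_value):
    """Return [(charset, q)] sorted by descending q.

    Hypothetical helper: entries whose q is missing default to 1.0,
    and entries whose q is not a number in [0.0, 1.0] are dropped.
    """
    prefs = []
    for item in header_value.split(","):
        parts = [p.strip() for p in item.split(";")]
        charset = parts[0].lower()
        if not charset:
            continue
        q = 1.0  # default quality when no q parameter is present
        valid = True
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    valid = False
                if not 0.0 <= q <= 1.0:
                    valid = False  # e.g. q=1.8 is out of range
        if valid:
            prefs.append((charset, q))
    # list.sort is stable, so entries with equal q keep header order
    prefs.sort(key=lambda p: p[1], reverse=True)
    return prefs


if __name__ == "__main__":
    print(parse_accept_charset("utf-8;q=0.2, utf-16; q=0.8"))
    print(parse_accept_charset("utf-8; q=0.2, utf-16;q=1.8"))
```

With a valid header, utf-16 is ranked first; with q=1.8, this sketch ignores the utf-16 entry, which would be consistent with a UTF-8 response.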

Regarding the status code, it's actually a known issue.

Thanks,
Kiran

pierslawson wrote Sep 15, 2012 at 10:18 PM

You are right... I had just been looking at the Accept-Charset header spec, which doesn't set an upper bound on q, but I guess it can be inferred from the Accept header spec, which does set boundaries.

It is good to know the other minor issue is being looked at.

pierslawson wrote Sep 15, 2012 at 10:20 PM

Having said that... I can't see the Accept-Charset 406 issue reported anywhere else. The other mentions of 406 appear to be for different scenarios.

HenrikN wrote Sep 16, 2012 at 5:34 AM

The media type is the main thing we negotiate on, and for that you can ask the DefaultContentNegotiator to send either a default formatter response or a 406 response. For Accept-Charset, we ask the formatter that was picked based on media type to pick the best charset encoding; if it can't find one, it uses the default charset encoding.

This behavior is fine from an HTTP perspective. Using the media type as the key component in the negotiation algorithm helps provide a bounded algorithm.
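
The two behaviours under discussion (fall back to a default charset versus answer with a 406) can be sketched in Python as follows. This is an illustration only, not the formatter code in Web API; `select_charset` and its `strict` flag are hypothetical names, and the input is assumed to be an already-parsed list of (charset, q) pairs.

```python
def select_charset(prefs, supported, default="utf-8", strict=False):
    """Pick the response charset for an already-chosen formatter.

    prefs:     list of (charset, q) pairs from Accept-Charset
    supported: charsets the formatter can write
    strict:    hypothetical switch; when True, an unsatisfiable
               Accept-Charset yields None so the caller can send 406
    """
    supported_lower = [s.lower() for s in supported]
    # Try the client's preferences from highest to lowest q
    for charset, q in sorted(prefs, key=lambda p: p[1], reverse=True):
        if q <= 0:
            continue  # q=0 means the charset is not acceptable
        if charset.lower() in supported_lower:
            return charset.lower()
    if strict and prefs:
        return None  # caller would map this to 406 Not Acceptable
    return default  # lenient behaviour described in this thread


if __name__ == "__main__":
    formats = ["utf-8", "utf-16"]
    print(select_charset([("utf-16", 0.8), ("utf-8", 0.2)], formats))
    print(select_charset([("iso-8859-5", 1.0)], formats))
    print(select_charset([("iso-8859-5", 1.0)], formats, strict=True))
```

The lenient path always produces a response, at the cost of the client having to check the charset it actually received.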

Henrik

pierslawson wrote Sep 18, 2012 at 4:14 PM

Henrik, I agree the server doesn't strictly need to obey the Accept-Charset header, but if it doesn't return a 406 and instead returns a default UTF-8, user agents might feel the need to inspect the Content-Type header in order to decide whether they can safely decode the response (rather than simply handling an obvious error).

Granted, there will not be many generic user agents out there that will ask for an arbitrary charset against any resource... but I'm sure the change to make the server behave a little better would not be too complex.

HenrikN wrote Sep 18, 2012 at 5:16 PM

In practice my sense is that it mostly works out OK: if no Accept-Charset is indicated, then either the default charset for a particular media type is used, or, if there is a request entity body with a given charset, we match that in the response. That is, if the client sends UTF-16 then we return UTF-16 unless something else is requested explicitly with an Accept-Charset header.
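
The precedence described above could be sketched in Python like this. The function `response_charset` and its parameters are hypothetical, and the exact ordering is an assumption pieced together from the comments in this thread rather than taken from the Web API source.

```python
def response_charset(accept_charsets, request_charset, media_default, supported):
    """Decide the response charset (illustrative precedence only).

    accept_charsets: charsets from Accept-Charset, best first (may be empty)
    request_charset: charset of the request entity body, or None
    media_default:   default charset for the negotiated media type
    supported:       charsets the chosen formatter can write
    """
    supported_lower = {s.lower() for s in supported}
    if accept_charsets:
        # An explicit Accept-Charset wins when it can be satisfied
        for candidate in accept_charsets:
            if candidate.lower() in supported_lower:
                return candidate.lower()
        # Unsatisfiable: fall back to the default rather than sending 406
        return media_default
    if request_charset and request_charset.lower() in supported_lower:
        # No Accept-Charset: mirror the request body's charset
        return request_charset.lower()
    return media_default


if __name__ == "__main__":
    formats = ["utf-8", "utf-16"]
    print(response_charset([], "UTF-16", "utf-8", formats))
    print(response_charset(["utf-8"], "UTF-16", "utf-8", formats))
    print(response_charset([], None, "utf-8", formats))
```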

That said, you are more than welcome to come up with a fix to support 406 and submit a pull request. It would be great to see what it looks like.

Henrik

pierslawson wrote Sep 19, 2012 at 11:39 PM

Hi Henrik... I was about to put my money where my mouth is and download the code, until I read the instructions. Is that correct: I need Windows 8 and Visual Studio 2012? If so, that rules me out for quite a while!