WebApi 2.1 ReadAsMultipartAsync saves to more than one location at once (bug)

Topics: ASP.NET Web API
Jan 30, 2014 at 6:16 PM
Edited Jan 30, 2014 at 6:26 PM
Since 2.1, ReadAsMultipartAsync saves posted files in two locations at once if the streamProvider overload is used. This is on Win2008R2 with .NET 4.0. You can use any example code for ReadAsMultipartAsync to repro the issue, either from this site or from elsewhere on the internet.

If I pass in no streamProvider (Request.Content.ReadAsMultipartAsync() ), it will save the file parts to [System.Web.HttpRuntime.CodegenDir]\uploads as it should. If I pass in any form of a MultipartFormDataStreamProvider (which requires a rootPath parameter) it will save the file to the CodegenDir AND my rootPath as the file is uploading.

The behavior in 2.0 was that the file would go into CodegenDir under the covers, and then GetLocalFileName in a custom class would tell it where the final location should be.

(Sort of related - I rather like that it CAN save the file to my GetLocalFileName location during the upload instead of into CodegenDir first. If you could leave that as a "feature" that would be great. Would allow for simple monitoring of the upload progress, if needed.)
Developer
Jan 30, 2014 at 6:43 PM

Hi,

I would like some clarification on this. You mention “If I pass in no streamProvider (Request.Content.ReadAsMultipartAsync() ), it will save the file parts to [System.Web.HttpRuntime.CodegenDir]\uploads”, but in this case ReadAsMultipartAsync will cause all the content to be read into memory and not into the local file system.

So can you please provide a standalone repro or share the code you are using?

Thanks,

Kiran

Jan 30, 2014 at 6:49 PM
Sure. Here's the standard code that uses the default MultipartFormDataStreamProvider:
        [HttpPost()]
        public async Task<HttpResponseMessage> Post()
        {
            if (!Request.Content.IsMimeMultipartContent())
            {
                throw new HttpResponseException(HttpStatusCode.UnsupportedMediaType);
            }

            var streamProvider = new MultipartFormDataStreamProvider(@"C:\Uploads");
            List<string> files = new List<string>();

            try
            {
                // Read the MIME multipart content using the stream provider we just created.
                await Request.Content.ReadAsMultipartAsync(streamProvider);
                //await Request.Content.ReadAsMultipartAsync();

                foreach (MultipartFileData file in streamProvider.FileData)
                {
                    files.Add(file.LocalFileName);
                }

                // Send OK Response along with saved file names to the client. 
                return Request.CreateResponse(HttpStatusCode.OK, files);
            }
            catch (Exception e)
            {
                return Request.CreateErrorResponse(HttpStatusCode.InternalServerError, e);
            }
        }
This uploads it to both locations at the same time, as I described. If you switch the commented ReadAsMultipartAsync lines, it still stores the file in [System.Web.HttpRuntime.CodegenDir]\uploads according to my tests. Thanks.
Developer
Jan 30, 2014 at 8:28 PM

Thanks for the details. I am unable to repro the issue myself.

The following link points to a Web Project and a Fiddler request that you can use to see whether you can repro the issue at your end. Take a look and let me know if you have any questions.

http://sdrv.ms/1bD5Ydj

We recently uncovered a problem where request streams targeted for Web API are read by other components of ASP.NET, which causes issues. I am not sure if your issue is related to that. I would compare the Web.config entries in my project and yours to see what the major differences are. It would also help if you could share your Web.config file.

Starting with Web API 2.0, .NET 4.5 is a requirement, so I am not sure what you mean when you mention using .NET 4.0.

Thanks,

Kiran

Jan 30, 2014 at 10:06 PM
Edited Jan 30, 2014 at 10:19 PM
The .NET version I reported was a mistake -- we are running .NET 4.5 on that server. So no biggie there.

Further testing reveals the problem still exists in your test project. I went back to basics and created a brand new WebAPI project from scratch in VS2013. This gave me the WebAPI 2.0 and MVC 5.0 bits. This was published to the same development server and it functioned as I expected: the file is uploaded to CodegenDir and then moved out to the PATH specified in the stream provider.

Next, I downloaded your project and published it over that first project. One change was made to web.config to allow larger files to get POST-ed, but otherwise it's exactly your code. When I upload a file, it appears in both places at the same time during upload and the processor usage goes up. I wonder if you were unable to detect it because enough bytes weren't sent for it to appear in CodegenDir between refreshes. Were you using any monitoring besides watching the folder for the file to appear? Sysinternals procmon.exe with a filter on w3wp.exe is very helpful to see which files are written.

Please give your code another try, but send a very large file - maybe something around 4-5MB or more and watch the folders. I believe there is still an issue here, and now that I've tried the older WebAPI/MVC again, the processor usage issue I posted also seems to be real (proc usage was less than 5% in the API 2.0 test).

I'm more than happy to provide anything else you need. We were almost ready for production, but this puts the brakes on. It could be that the request streams targeting issue is coming into play here as well. Thanks.
Jan 30, 2014 at 10:16 PM
Edited Jan 30, 2014 at 10:35 PM
More -- in the default WebAPI project in the first test, we updated MVC to 5.1 from 5.0 and there was no problem. Next, we updated WebAPI to 2.1 from 2.0 and then the problem happens. No other code or config changes were done. Processor usage also jumps up to 15 or 20%.

It appears something in System.Net.Http.Formatting, System.Web.Http, or System.Web.Http.WebHost is causing the issue. Thanks
Developer
Jan 30, 2014 at 11:29 PM

Thanks for the additional information. I tried the following; here are my observations:

1. Changed the Web.config settings to allow uploading a ~100 MB file. My settings in Web.config:

<system.web>
  <httpRuntime targetFramework="4.5" maxRequestLength="153600" /> <!-- 150 MB in kilobytes -->
</system.web>
<system.webServer>
  <security>
    <requestFiltering>
      <requestLimits maxAllowedContentLength="157286400" /> <!-- 150 MB in bytes -->
    </requestFiltering>
  </security>
</system.webServer>

Observation:

What I noticed in this case is that an “uploads” folder is created in CodegenDir. I see a few files getting created, but they get deleted very quickly.

2. Changed the Web API default buffer policy to explicitly set it to non-buffered mode.

Observation: I do not see the files getting generated in CodegenDir.

Question: In case of #1, don’t you see this file getting deleted at all?

Thanks,

Kiran

Jan 31, 2014 at 12:12 AM
If I understand your question, I see a file in the format [randomname].post get created in the CodegenDir\uploads directory, while at the exact same time a c:\uploads\BodyPart_blah file is created in the stream provider path.

The CodegenDir\uploads file is deleted after it's done, and the "BodyPart_blah" file remains as it should. But, I continue to see both files get created at the same time and high CPU. If you can't see the file in CodegenDir get created, you might have to simulate a Thread-delayed upload.

Again, the normal loop from 2.0 is for only [randomname].post to get created, and then moved based on my MultipartFormDataStreamProvider.GetLocalFileName override. Ideally, I would just rather it use the stream provider path + GetLocalFileName first and never store it in CodegenDir, but that might be a feature request. Thanks.
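
For reference, a minimal sketch of the kind of GetLocalFileName override mentioned above. The class name NamedFileStreamProvider is hypothetical and the naming rule (reuse the client's file name) is only an assumption for illustration; it is not the actual class from our project.

```csharp
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical class name, for illustration only.
public class NamedFileStreamProvider : MultipartFormDataStreamProvider
{
    public NamedFileStreamProvider(string rootPath) : base(rootPath) { }

    // Decides the file name used under rootPath for each uploaded file part.
    public override string GetLocalFileName(HttpContentHeaders headers)
    {
        ContentDispositionHeaderValue disposition = headers.ContentDisposition;
        string name = disposition != null ? disposition.FileName : null;

        // No usable client file name: fall back to the default BodyPart_{guid} name.
        if (string.IsNullOrWhiteSpace(name))
        {
            return base.GetLocalFileName(headers);
        }

        // The header value is usually quoted; strip the quotes and any
        // directory portion the client may have sent.
        return Path.GetFileName(name.Trim('"'));
    }
}
```

It would be passed to the read call in place of the default provider, e.g. await Request.Content.ReadAsMultipartAsync(new NamedFileStreamProvider(@"C:\Uploads")). A real implementation would probably also need to guard against name collisions and invalid characters.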
Jan 31, 2014 at 11:37 PM
I'm fairly confident that this is an issue. If you agree, can this be transferred to a work item in the Issues section? We can't really release our code into production until this is fixed. We intend for this code to handle a few dozen uploads at a time and the CPU load issue alone probably won't allow that.

It's possible this is caused by other request stream access issues, as you indicated, and we're happy to wait until those are resolved. But I'm surprised nobody else has noticed at least the additional CPU load when using this function. Thanks.
Developer
Feb 1, 2014 at 12:29 AM
Edited Feb 1, 2014 at 12:31 AM
Hi, I am not sure if this is an issue, but can you try the following to see if it gives you better performance? By default, Web API’s policy, when hosted in IIS, is to buffer the request’s content. So if you are trying to upload large files, you would be consuming a lot of memory. To avoid this, you can change the policy to non-buffered.

Following is an example:
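// Typically placed in WebApiConfig.Register(HttpConfiguration config).
// Requires: using System.Web.Http.Hosting; using System.Web.Http.WebHost;
config.Services.Replace(typeof(IHostBufferPolicySelector), new CustomPolicy());

    public class CustomPolicy : WebHostBufferPolicySelector
    {
        // Returning false asks the host for an unbuffered request stream.
        public override bool UseBufferedInputStream(object hostContext)
        {
            return false;
        }
    }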
Please try this and let us know. Thanks, Kiran
Feb 1, 2014 at 3:39 AM
Edited Feb 1, 2014 at 3:44 AM
If "filename" is detected in the Content-Disposition header, MultipartFormDataStreamProvider doesn't buffer to memory -- it buffers to a FileStream (disk):
MultipartFormDataStreamProvider Class

I took the Web Project you linked to me in the post above and ran the test again. Like your last test, I allowed larger uploads through web.config but in all other ways it is your code. Here is what I see on my server when uploading a 921MB file (so it goes slow):

Here is the upload at one point:
First screenshot

Here it is at another point later (so you know it's the same upload):
Second screenshot

Notice how it's saving the file to two places at once, as I have described. Also note the processor usage. This is only one file and it shouldn't require over 10% of the CPU -- I've seen it go into the 20's.

We were about to release the server-side part of our uploading system, but wanted to wait for 2.1 to fix the Flash issue. So, if there is anything else I can provide to demonstrate this issue let me know. Thanks.
Developer
Feb 1, 2014 at 5:21 AM
The Web API framework depends on the ASP.NET runtime to provide the request stream to it. Since Web API's buffer policy is Buffered by default, the ASP.NET runtime provides the request stream in buffered mode. How the ASP.NET runtime does this buffering (creating temp files, etc.) is outside my knowledge.

As I understand it, when the buffer policy is Buffered, Web API (e.g. MultipartFormDataStreamProvider in this case) starts reading the request stream only after the entire request content has been put into ASP.NET's buffers.

So what I would like to ask is: have you tried switching Web API's default buffer policy to non-buffered mode? You can make this change quickly using the sample code I posted in my previous reply.

We would like to know if you still see bad performance even with the non-buffered policy. If you do, we will investigate further.
Feb 1, 2014 at 2:19 PM
I did as you asked and uploaded the same file again. Here's a new screenshot:

Third screenshot

Note the processor usage at 24%. Memory consumption stayed almost constant at about 150MB for this process. But, you are correct that now ASP.NET no longer buffers the request so CodegenDir\uploads no longer contains a file. So why would memory consumption never go up? Related to this, it does not explain why the BodyPart_blah file is created at this point. MultipartFormDataStreamProvider is not supposed to create this file during upload per the specs for this class.

In the WebAPI 2.0 classes for this, ASP.NET does handle the initial disk buffering and then an override of this class would move the file to its final location. It's an older example, but versions of the CustomMultipartFormDataStreamProvider class demo the procedure: http://www.strathweb.com/2012/08/a-guide-to-asynchronous-file-uploads-in-asp-net-web-api-rtm/ (strathweb.com). If the buffering is turned off, why wouldn't the entire file go into memory?

Disabling the buffer in ASP.NET doesn't seem to change the issue and actually presents new questions. All we did was stop ASP.NET from buffering the file to disk. Something doesn't seem quite right with the new classes and I know somebody else will come across this at some point. Again, I'll try any test you want to work this out. Thanks.
Developer
Feb 1, 2014 at 3:56 PM
Edited Feb 1, 2014 at 3:57 PM
Thanks for trying out the non-buffered scenario. Regarding "MultipartFormDataStreamProvider is not supposed to create this file during upload per the specs for this class.", this is incorrect. Some info below:

MultipartFormDataStreamProvider or other providers deriving from the abstract MultipartStreamProvider depend on Web API's multipart parser. The way this parser works is that it parses each part one-by-one in a forward only manner and the moment it sees a part's body it asks the supplied provider to give a stream. Once this stream is supplied by the provider, the parser starts writing into that stream.

So in this scenario, MultipartFormDataStreamProvider provides a FileStream when asked by the parser (the BodyPart_blah file is created at this moment), and the parser keeps writing to this FileStream as it reads data from the non-buffered request stream that the ASP.NET runtime has provided.

So, as you can imagine, the way the parser and MultipartFormDataStreamProvider work is consistent in both cases, i.e. Web API doesn't buffer by itself here and depends only on the underlying hosting layer (ASP.NET) to provide either a buffered or an unbuffered stream.

Now if you consider MultipartMemoryStreamProvider, it supplies a buffered stream (MemoryStream) to the parser, and the parser then starts writing to this stream. So the behavior or logic is consistent across different providers.
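
To see this hand-off yourself, one option is a small provider that logs each GetStream call. This is only a sketch: the class name TracingFormDataStreamProvider is made up for illustration, and writing to Trace is just one possible way to surface the timing.

```csharp
using System.Diagnostics;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical class name, for illustration only.
public class TracingFormDataStreamProvider : MultipartFormDataStreamProvider
{
    public TracingFormDataStreamProvider(string rootPath) : base(rootPath) { }

    // The multipart parser calls this once per body part, before it writes
    // any of that part's bytes.
    public override Stream GetStream(HttpContent parent, HttpContentHeaders headers)
    {
        // The base class returns a FileStream for parts that carry a file name
        // (the file is created on disk here) and a MemoryStream for form fields.
        Stream stream = base.GetStream(parent, headers);

        var file = stream as FileStream;
        Trace.WriteLine(file != null
            ? "GetStream: file part -> " + file.Name
            : "GetStream: form field -> " + stream.GetType().Name);

        return stream;
    }
}
```

If the trace line for a file part appears while the request is still uploading, the parser is reading the request stream as it arrives rather than after ASP.NET has buffered the whole request.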

Thanks, Kiran
Feb 1, 2014 at 6:32 PM
"So in this scenario MultipartFormDataStreamProvider provides a FileStream when asked by the parser(and the BodyPart_blah file is created at this moment) and the parser keeps writing to this filestream as it keeps reading the data from the non-buffered request stream that ASP.Net runtime has provided it."
Agreed -- for the non-buffered scenario. But now I need to convince you that the BUFFERED scenario in 2.1 has an issue. :)

The next test I just tried (and last one I can think to do) used the WebAPI 2.0 bits from the boiler-plate VS2013 WebAPI project (New Project -> ASP.NET Web Application -> Web API). I copied your ValuesController HttpPost code and modified the request limits in web.config. Everything else was unmodified.

First test: buffered through ASP.NET:
http://sdrv.ms/1dh4BB6

Note the low processor usage and that it only creates the "buffered" file [randomname].post in CodegenDir\uploads. At the end of the upload (not shown here), the file was moved out to c:\uploads because GetLocalFileName in my custom class was called afterwards.

Second test: unbuffered (I inserted your CustomPolicy code) so ASP.NET doesn't intercept it first:
http://sdrv.ms/1bhAL2L

Higher processor usage and it writes directly to c:\uploads since ASP.NET was taken out of the parsing loop. This is the scenario you discuss.

Finally, contrast both of those images with the second image I put in an earlier post that uses the 2.1 bits (Second screenshot):
Second screenshot

Note three things:
  1. This is buffered (no CustomPolicy implemented) so ASP.NET is supposed to be in the loop
  2. The file is saved simultaneously to c:\uploads
  3. CPU usage is higher, and consistent with unbuffered, 2.0 uploads
Bottom line - buffered uploading no longer stops the WebAPI 2.1 parser from intercepting the upload and immediately saving its own file.

Now that I've run the 2.0 boiler-plate test, it's apparent that WebAPI's parser isn't as efficient as ASP.NET with CPU usage. Few custom things are and it's not a knock against you guys. But, the consequence is this might not scale to dozens of uploads at a time as I had hoped if unbuffered is the only way for this to work. (For our purposes, it would actually be nice to immediately save the file in a known location with a known filename for monitoring purposes.)

I'll run more stress tests and maybe try to figure out how many uploads WebAPI's parser can handle before CPU load is too great. Do you guys recommend running unbuffered all the time for this type of operation? Is there some way to optimize the parser a bit more?

I still believe there's something going on, but many thanks for all your time on this.
Feb 21, 2014 at 6:06 PM
Edited Feb 21, 2014 at 6:07 PM
Got around to installing 5.1.1 and ran the same test. I still see the behavior where two files are created for each upload -- one in CodegenDir and simultaneously another based on my override for GetLocalFileName. This all still seems a little curious.

Does 5.1.x parse the GetLocalFileName file from the CodegenDir file as the upload happens? Sysinternals Process Monitor seems to show both getting WriteFile operations at the same time, so that was one possible conclusion to draw. But I also see the GetLocalFileName file getting writes first in the list, and CPU usage is out of whack. Are they both reading the upload buffer at the same time? Still a bit confused.

Thanks