Download a large file from SQL via WebApi after a custom MultipartFormDataStreamProvider upload

This is a follow-up to a question I asked previously, which was closed for being too broad. Previous question

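In that question I explained that I needed to upload a large file (1-3GB) to the database by storing chunks as individual rows. I did this by overriding the MultipartFormDataStreamProvider.GetStream method. That method returns a custom stream that writes the buffered chunks to the database.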

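The problem is that the overridden GetStream method writes the entire request to the database (including the headers). It writes the data successfully while keeping memory usage flat, but when I download the file, the downloaded content contains all of the header information in addition to the file contents, so the file can't be opened.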

Is there a way, in the overridden GetStream method, to write just the contents of the file to the database without writing the headers?

API

    [HttpPost]
    [Route("file")]
    [ValidateMimeMultipartContentFilter]
    public Task<HttpResponseMessage> PostFormData()
    {
        var provider = new CustomMultipartFormDataStreamProvider();

        // Read the form data and return an async task.
        var task = Request.Content.ReadAsMultipartAsync(provider).ContinueWith(t =>
        {
            if (t.IsFaulted || t.IsCanceled)
            {
                // Return the error response; without the return it is discarded
                return Request.CreateErrorResponse(HttpStatusCode.InternalServerError, t.Exception);
            }

            return Request.CreateResponse(HttpStatusCode.OK);
        });

        return task;
    }

    [HttpGet]
    [Route("file/{id}")]
    public async Task<HttpResponseMessage> GetFile(string id)
    {
        if (string.IsNullOrEmpty(id))
        {
            return new HttpResponseMessage(HttpStatusCode.BadRequest);
        }

        var result = new HttpResponseMessage()
        {
            Content = new PushStreamContent(async (outputStream, httpContent, transportContext) =>
            {
                await WriteDataChunksFromDBToStream(outputStream, httpContent, transportContext, id);
            }),
            StatusCode = HttpStatusCode.OK
        };

        result.Content.Headers.ContentType = new MediaTypeHeaderValue("application/zipx");
        result.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment") { FileName = "test response.zipx" };

        return result;
    }

    private async Task WriteDataChunksFromDBToStream(Stream responseStream, HttpContent httpContent, TransportContext transportContext, string fileIdentifier)
    {
        // PushStreamContent requires the responseStream to be closed
        // for signaling it that you have finished writing the response.
        using (responseStream)
        {
            using (var myConn = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["TestDB"].ConnectionString))
            {
                await myConn.OpenAsync();

                using (var myCmd = new SqlCommand("ReadAttachmentChunks", myConn))
                {
                    myCmd.CommandType = System.Data.CommandType.StoredProcedure;

                    var fileName = new SqlParameter("@Identifier", fileIdentifier);
                    myCmd.Parameters.Add(fileName);

                    // Read data back from db in async call to avoid OutOfMemoryException when sending file back to user
                    using (var reader = await myCmd.ExecuteReaderAsync(CommandBehavior.SequentialAccess))
                    {
                        while (await reader.ReadAsync())
                        {
                            if (!(await reader.IsDBNullAsync(3)))
                            {
                                using (var data = reader.GetStream(3))
                                {
                                    // Asynchronously copy the stream from the server to the response stream
                                    await data.CopyToAsync(responseStream);
                                }
                            }
                        }
                    }
                }
            }
        } // close response stream
    }

Custom MultipartFormDataStreamProvider GetStream method implementation

    public override Stream GetStream(HttpContent parent, HttpContentHeaders headers)
    {
        // For form data, Content-Disposition header is a requirement
        ContentDispositionHeaderValue contentDisposition = headers.ContentDisposition;
        if (contentDisposition != null)
        {
            // If we have a file name then write contents out to the SQL stream. Otherwise just write to MemoryStream
            if (!String.IsNullOrEmpty(contentDisposition.FileName))
            {
                var identifier = Guid.NewGuid().ToString();
                var fileName = contentDisposition.FileName; // GetLocalFileName(headers);

                if (fileName.Contains("\\"))
                {
                    fileName = fileName.Substring(fileName.LastIndexOf("\\") + 1).Replace("\"", "");
                }

                // We won't post process files as form data
                _isFormData.Add(false);

                var stream = new CustomSqlStream();
                stream.Filename = fileName;
                stream.Identifier = identifier;
                stream.ContentType = headers.ContentType.MediaType;
                stream.Description = (_formData.AllKeys.Count() > 0 && _formData["description"] != null) ? _formData["description"] : "";

                return stream;
                //return new CustomSqlStream(contentDisposition.Name);
            }

            // We will post process this as form data
            _isFormData.Add(true);

            // If no filename parameter was found in the Content-Disposition header then return a memory stream.
            return new MemoryStream();
        }

        throw new InvalidOperationException("Did not find required 'Content-Disposition' header field in MIME multipart body part..");
    }

The Write method of the Stream, as implemented by CustomSqlStream

    public override void Write(byte[] buffer, int offset, int count)
    {
        //write buffer to database
        using (var myConn = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["TestDB"].ConnectionString))
        {
            using (var myCmd = new SqlCommand("WriteAttachmentChunk", myConn))
            {
                myCmd.CommandType = System.Data.CommandType.StoredProcedure;

                var pContent = new SqlParameter("@Content", buffer);
                myCmd.Parameters.Add(pContent);

                myConn.Open();
                myCmd.ExecuteNonQuery();

                if (myConn.State == System.Data.ConnectionState.Open)
                {
                    myConn.Close();
                }
            }
        }

        ((ManualResetEvent)_dataAddedEvent).Set();
    }

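The "ReadAttachmentChunks" stored procedure fetches the rows belonging to the file from the database, ordered by the time they were inserted. So the code works by pulling those chunks back and then asynchronously writing them to the PushStreamContent to return them to the user.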

So my question is:

Is there a way to write ONLY the content of the file being uploaded, rather than the headers in addition to the content?

Any help would be greatly appreciated. Thank you.

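I finally figured it out. I had over-complicated the write process, which caused most of the struggle. Here is my solution to my initial issue: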

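To keep .NET from buffering the file in memory (so that you can handle large file uploads), you first need to override the WebHostBufferPolicySelector so that it doesn't buffer the input stream for your controller, and then replace the buffer policy selector.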

    public class NoBufferPolicySelector : WebHostBufferPolicySelector
    {
        public override bool UseBufferedInputStream(object hostContext)
        {
            var context = hostContext as HttpContextBase;

            if (context != null)
            {
                if (context.Request.RequestContext.RouteData.Values["controller"] != null)
                {
                    // Do not buffer requests routed to the "upload" controller
                    if (string.Equals(context.Request.RequestContext.RouteData.Values["controller"].ToString(), "upload", StringComparison.InvariantCultureIgnoreCase))
                        return false;
                }
            }

            return true;
        }

        public override bool UseBufferedOutputStream(HttpResponseMessage response)
        {
            return base.UseBufferedOutputStream(response);
        }
    }

Then replace the buffer policy selector:

    GlobalConfiguration.Configuration.Services.Replace(typeof(IHostBufferPolicySelector), new NoBufferPolicySelector());

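Then, to avoid the default behavior of writing the file stream to disk, you need to provide a stream provider that writes to the database instead. To do this, you inherit from MultipartStreamProvider and override the GetStream method to return the stream that will write to your database.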

    public override Stream GetStream(HttpContent parent, HttpContentHeaders headers)
    {
        // For form data, Content-Disposition header is a requirement
        ContentDispositionHeaderValue contentDisposition = headers.ContentDisposition;
        if (contentDisposition != null && !String.IsNullOrEmpty(contentDisposition.FileName))
        {
            // We won't post process files as form data
            _isFormData.Add(false);

            //create unique identifier for this file upload
            var identifier = Guid.NewGuid();
            var fileName = contentDisposition.FileName;

            var boundaryObj = parent.Headers.ContentType.Parameters.SingleOrDefault(a => a.Name == "boundary");
            var boundary = (boundaryObj != null) ? boundaryObj.Value : "";

            if (fileName.Contains("\\"))
            {
                fileName = fileName.Substring(fileName.LastIndexOf("\\") + 1).Replace("\"", "");
            }

            //write parent container for the file chunks that are being stored
            WriteLargeFileContainer(fileName, identifier, headers.ContentType.MediaType, boundary);

            //create an instance of the custom stream that will write the chunks to the database
            var stream = new CustomSqlStream();
            stream.Filename = fileName;
            stream.FullFilename = contentDisposition.FileName.Replace("\"", "");
            stream.Identifier = identifier.ToString();
            stream.ContentType = headers.ContentType.MediaType;
            stream.Boundary = (!string.IsNullOrEmpty(boundary)) ? boundary : "";

            return stream;
        }
        else
        {
            // We will post process this as form data
            _isFormData.Add(true);

            // If no filename parameter was found in the Content-Disposition header then return a memory stream.
            return new MemoryStream();
        }
    }
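One detail in the GetStream override above: the surrounding quotes are stripped from the file name only when it contains a backslash, so a plain `"report.zipx"` would keep its quotes in `Filename`. A minimal sketch of a helper that handles both cases might look like the following. The names `FileNameHelper` and `CleanFileName` are hypothetical and not part of the original answer or of Web API.

```csharp
using System;

// Hypothetical helper, not part of the original answer: normalizes the raw
// Content-Disposition file name before it is stored.
public static class FileNameHelper
{
    public static string CleanFileName(string raw)
    {
        if (string.IsNullOrEmpty(raw))
        {
            return string.Empty;
        }

        // Browsers usually send the value wrapped in double quotes.
        var name = raw.Trim().Trim('"');

        // Some clients send a full client-side path; keep only the last segment.
        var lastSeparator = name.LastIndexOfAny(new[] { '\\', '/' });
        return lastSeparator >= 0 ? name.Substring(lastSeparator + 1) : name;
    }
}

public static class FileNameDemo
{
    public static void Main()
    {
        Console.WriteLine(FileNameHelper.CleanFileName("\"C:\\temp\\report.zipx\""));
        Console.WriteLine(FileNameHelper.CleanFileName("\"report.zipx\""));
    }
}
```

Both calls in `Main` yield `report.zipx`. Whether you want to keep the full client path separately (as `FullFilename` does above) is a design choice; the sketch only covers the short name.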

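The custom stream you create needs to inherit from Stream and override the Write method. This is where I overthought the problem and believed I needed to parse out the boundary headers that were passed in via the buffer parameter. But that is actually done for you if you honor the offset and count parameters.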

    public override void Write(byte[] buffer, int offset, int count)
    {
        //no boundary is included in buffer[offset .. offset + count)
        byte[] fileData = new byte[count];
        Buffer.BlockCopy(buffer, offset, fileData, 0, count);
        WriteData(fileData);
    }
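To see why offset and count matter, here is a small self-contained sketch. It assumes a hypothetical `ChunkCollectingStream` that stands in for CustomSqlStream and keeps chunks in a list instead of calling a stored procedure. The caller passes a shared buffer in which only a slice is valid payload; persisting the whole buffer would also store the stale bytes around it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for CustomSqlStream: collects chunks in memory
// instead of writing them to the database.
public sealed class ChunkCollectingStream : Stream
{
    private readonly List<byte[]> _chunks = new List<byte[]>();

    public List<byte[]> Chunks { get { return _chunks; } }

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        // Only buffer[offset .. offset + count) belongs to this write call.
        var chunk = new byte[count];
        Buffer.BlockCopy(buffer, offset, chunk, 0, count);
        _chunks.Add(chunk);
    }

    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}

public static class WriteDemo
{
    public static void Main()
    {
        // 0xFF marks bytes in the shared buffer that are not part of the payload.
        var shared = new byte[] { 0xFF, 0xFF, 1, 2, 3, 0xFF };

        using (var stream = new ChunkCollectingStream())
        {
            stream.Write(shared, 2, 3);
            Console.WriteLine(BitConverter.ToString(stream.Chunks[0])); // prints "01-02-03"
        }
    }
}
```

Passing `buffer` directly as the SQL parameter value would have stored all six bytes here, including the three `0xFF` bytes that were never part of the write.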

From there, it's just a matter of plugging in the API methods for upload and download. For upload:

    public Task<HttpResponseMessage> PostFormData()
    {
        var provider = new CustomMultipartLargeFileStreamProvider();

        // Read the form data and return an async task.
        var task = Request.Content.ReadAsMultipartAsync(provider).ContinueWith(t =>
        {
            if (t.IsFaulted || t.IsCanceled)
            {
                // Return the error response; without the return it is discarded
                return Request.CreateErrorResponse(HttpStatusCode.InternalServerError, t.Exception);
            }

            return Request.CreateResponse(HttpStatusCode.OK);
        });

        return task;
    }

For download, to keep the memory footprint low, I used PushStreamContent to push the chunks back to the user:

    [HttpGet]
    [Route("file/{id}")]
    public async Task<HttpResponseMessage> GetFile(string id)
    {
        string mimeType = string.Empty;
        string filename = string.Empty;

        if (!string.IsNullOrEmpty(id))
        {
            //get the headers for the file being sent back to the user
            using (var myConn = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["PortalBetaConnectionString"].ConnectionString))
            {
                using (var myCmd = new SqlCommand("ReadLargeFileInfo", myConn))
                {
                    myCmd.CommandType = System.Data.CommandType.StoredProcedure;

                    var pIdentifier = new SqlParameter("@Identifier", id);
                    myCmd.Parameters.Add(pIdentifier);

                    await myConn.OpenAsync();

                    // Dispose the reader when done; the loop is skipped if there are no rows
                    using (var dataReader = await myCmd.ExecuteReaderAsync())
                    {
                        while (await dataReader.ReadAsync())
                        {
                            mimeType = dataReader.GetString(0);
                            filename = dataReader.GetString(1);
                        }
                    }
                }
            }

            var result = new HttpResponseMessage()
            {
                Content = new PushStreamContent(async (outputStream, httpContent, transportContext) =>
                {
                    //pull the data back from the db and stream the data back to the user
                    await WriteDataChunksFromDBToStream(outputStream, httpContent, transportContext, id);
                }),
                StatusCode = HttpStatusCode.OK
            };

            result.Content.Headers.ContentType = new MediaTypeHeaderValue(mimeType); // "application/octet-stream");
            result.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment") { FileName = filename };

            return result;
        }

        return new HttpResponseMessage(HttpStatusCode.BadRequest);
    }

    private async Task WriteDataChunksFromDBToStream(Stream responseStream, HttpContent httpContent, TransportContext transportContext, string fileIdentifier)
    {
        // PushStreamContent requires the responseStream to be closed
        // for signaling it that you have finished writing the response.
        using (responseStream)
        {
            using (var myConn = new SqlConnection(System.Configuration.ConfigurationManager.ConnectionStrings["PortalBetaConnectionString"].ConnectionString))
            {
                await myConn.OpenAsync();

                //stored proc to pull the data back from the db
                using (var myCmd = new SqlCommand("ReadAttachmentChunks", myConn))
                {
                    myCmd.CommandType = System.Data.CommandType.StoredProcedure;

                    var fileName = new SqlParameter("@Identifier", fileIdentifier);
                    myCmd.Parameters.Add(fileName);

                    // The reader needs to be executed with the SequentialAccess behavior to enable network streaming
                    // Otherwise ReadAsync will buffer the entire BLOB into memory which can cause scalability issues or even OutOfMemoryExceptions
                    using (var reader = await myCmd.ExecuteReaderAsync(CommandBehavior.SequentialAccess))
                    {
                        while (await reader.ReadAsync())
                        {
                            //confirm the column that has the binary data of the file returned is not null
                            if (!(await reader.IsDBNullAsync(0)))
                            {
                                //read the binary data of the file into a stream
                                using (var data = reader.GetStream(0))
                                {
                                    // Asynchronously copy the stream from the server to the response stream
                                    await data.CopyToAsync(responseStream);

                                    // Flush the response stream (not the read-only source) so the chunk is sent on
                                    await responseStream.FlushAsync();
                                }
                            }
                        }
                    }
                }
            }
        } // close response stream
    }

Ugh. This is nasty. With an upload, you have to make sure that you

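  1. Separate the headers from the content portion: you must follow the HTTP RFC requirements.
  2. Allow for chunked transfers.
  3. Of course, the content portion (unless you are transmitting text) will be binary encoded into strings.
  4. Allow for transfers that are compressed, i.e. GZIP or DEFLATE.
  5. Maybe, just maybe, take into account the encoding (ASCII, Unicode, UTF8, etc.).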

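You can't be sure you are persisting the right information to the database without looking at all of these. For the latter items, all of the metadata about what to do will be somewhere in the headers, so they aren't just throwaway data.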