Quart: sync vs async parsing form data


I'm investigating the performance of file uploads to a Quart web server.

I'm looking at how Quart parses file data when the request content type is "multipart/form-data", and I see that the uploaded file data is written to synchronous temporary files:

def default_stream_factory(
    total_content_length: int | None,
    content_type: str | None,
    filename: str | None,
    content_length: int | None = None,
) -> t.IO[bytes]:
    max_size = 1024 * 500

    if SpooledTemporaryFile is not None:
        return t.cast(t.IO[bytes], SpooledTemporaryFile(max_size=max_size, mode="rb+"))
    elif total_content_length is None or total_content_length > max_size:
        return t.cast(t.IO[bytes], TemporaryFile("rb+"))

    return BytesIO()
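To illustrate what that factory returns, here is a minimal sketch using only the standard library. It assumes the same 500 KiB threshold as above; `rolled_after_write` is a hypothetical helper, and `_rolled` is a private CPython attribute that I read only to show when the data leaves memory. In both cases the `write` call itself is an ordinary blocking call.

```python
from tempfile import SpooledTemporaryFile

MAX_SIZE = 1024 * 500  # same threshold as default_stream_factory above


def rolled_after_write(n_bytes: int) -> bool:
    """Write n_bytes to a spooled file and report whether it moved to disk."""
    with SpooledTemporaryFile(max_size=MAX_SIZE, mode="rb+") as f:
        f.write(b"x" * n_bytes)  # plain synchronous write
        # _rolled is a private CPython attribute, used here for illustration only
        return f._rolled


small = rolled_after_write(1024)          # stays in an in-memory buffer
large = rolled_after_write(MAX_SIZE + 1)  # exceeds max_size, rolls over to a real file
print("small rolled:", small, "large rolled:", large)
```

So small uploads never touch the disk, and only uploads above the threshold pay for real file I/O.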

My question: does the form parser (quart.formparser.MultiPartParser) intentionally write to the container (the file data stream) synchronously, rather than asynchronously, for example with aiofiles?

I want to confirm that writing multipart chunks synchronously is a deliberate choice.

If so, what are the reasons? Does synchronous writing work better under high load with many coroutines?

I tried implementing an aiofiles-based version and measured a slight performance reduction (5-10%) using the wrk tool with this script:

function read_file(path)
    local file, errorMessage = io.open(path, "rb")
    if not file then
        error("Could not read the file:" .. errorMessage .. "\n")
    end

    local content = file:read "*all"
    file:close()
    return content
end

local Boundary = "----WebKitFormBoundaryePkpFF7tjBAqx29L"
local BodyBoundary = "--" .. Boundary
local LastBoundary = "--" .. Boundary .. "--"
local CRLF = "\r\n"
local FileBody = read_file("./text_file.txt")
local Filename = "file"
local ContentDisposition = 'Content-Disposition: form-data; name="file"; filename="' .. Filename .. '"'
local ContentType = 'Content-Type: text/plain'

wrk.method = "POST"
wrk.headers["Content-Type"] = "multipart/form-data; boundary=" .. Boundary
wrk.headers["Transfer-Encoding"] = "chunked"
wrk.body = BodyBoundary .. CRLF .. ContentDisposition .. CRLF .. ContentType .. CRLF .. CRLF .. FileBody .. CRLF .. LastBoundary

(Note: I set wrk.headers["Transfer-Encoding"] = "chunked", which required building wrk with this PR: https://github.com/wg/wrk/pull/504)

with command:

wrk -t3 -c3 -d100s -s ufile.lua http://localhost:8000/

With a file size of 100 MB the results are:

without aiofiles (native quart.formparser.MultiPartParser):

Running 2m test @ http://localhost:8000/
  3 threads and 3 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.62s    99.26ms   1.93s    70.65%
    Req/Sec     0.00      0.00     0.00    100.00%
  184 requests in 1.67m, 32.16KB read
Requests/sec:      1.84
Transfer/sec:     329.10B

with aiofiles:

Running 2m test @ http://localhost:8000/
  3 threads and 3 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.67s   116.26ms   2.00s    70.30%
    Req/Sec     0.00      0.00     0.00    100.00%
  176 requests in 1.67m, 30.77KB read
  Socket errors: connect 0, read 0, write 0, timeout 11
Requests/sec:      1.76
Transfer/sec:     314.79B

With a file size of 10 MB the results are:

without aiofiles (native quart.formparser.MultiPartParser):

Running 2m test @ http://localhost:8000/
  3 threads and 3 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   173.16ms   38.79ms 503.89ms   81.60%
    Req/Sec     6.42      2.55    10.00     62.39%
  1739 requests in 1.67m, 303.99KB read
Requests/sec:     17.38
Transfer/sec:      3.04KB

with aiofiles:

Running 2m test @ http://localhost:8000/
  3 threads and 3 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   195.36ms   42.38ms 519.98ms   80.43%
    Req/Sec     5.49      2.16    10.00     71.86%
  1539 requests in 1.67m, 269.16KB read
Requests/sec:     15.38
Transfer/sec:      2.69KB

The route is implemented like this:

@app.route("/", methods=["POST"])
async def upload_file() -> str:
    await request.files
    return "uploaded"

The modified part of the parser looks like this:

...
    if isinstance(event, Field):
        current_part = event
        container = []
        _write = run_sync(container.append)
    elif isinstance(event, File):
        current_part = event
        container: AsyncBufferedReader = await self.start_file_streaming(event, content_length)
        _write = container.write
    elif isinstance(event, Data):
        await _write(event.data)
...
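As far as I understand, aiofiles delegates each file operation to a thread pool, so every `await _write(event.data)` would add a hand-off to a worker thread and back. The following is only a sketch of that suspected overhead, not the parser itself: it uses `loop.run_in_executor` from the standard library as a stand-in for aiofiles, and the chunk size, chunk count, and function names are hypothetical values I picked for illustration.

```python
import asyncio
import time
from tempfile import TemporaryFile

CHUNK = b"x" * (64 * 1024)  # hypothetical chunk size
N_CHUNKS = 200              # hypothetical number of chunks per upload


async def write_direct(f) -> int:
    """Blocking writes on the event loop thread, like the stock parser."""
    total = 0
    for _ in range(N_CHUNKS):
        total += f.write(CHUNK)
    return total


async def write_via_executor(f) -> int:
    """One thread-pool hand-off per chunk (stand-in for aiofiles)."""
    loop = asyncio.get_running_loop()
    total = 0
    for _ in range(N_CHUNKS):
        total += await loop.run_in_executor(None, f.write, CHUNK)
    return total


async def timed(writer) -> tuple[int, float]:
    """Run one writer against a fresh temporary file and time it."""
    with TemporaryFile("rb+") as f:
        start = time.perf_counter()
        written = await writer(f)
        return written, time.perf_counter() - start


async def main():
    direct = await timed(write_direct)
    threaded = await timed(write_via_executor)
    return direct, threaded


(direct_bytes, direct_s), (threaded_bytes, threaded_s) = asyncio.run(main())
print(f"direct:   {direct_bytes} bytes in {direct_s:.4f}s")
print(f"executor: {threaded_bytes} bytes in {threaded_s:.4f}s")
```

Both variants write the same number of bytes; the timings depend on the machine and the filesystem, so I would treat them as a rough indication rather than a measurement of Quart.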

If my tests are correct, I'd like to understand why this happens, and whether there is any useful way to apply aiofiles when parsing form data.
