Improving Django Application Performance: Comparing Blocking and Non-blocking Implementations


I encountered an unexpected error (apr_socket_recv: Connection reset by peer (54)) while load testing my Django application with Apache Benchmark (ab). The error occurs when I increase the load to 50 concurrent requests. Surprisingly, even at 20 concurrent requests there is no significant performance improvement with the non-blocking implementation: I see the same throughput as with the blocking one.

The application has two implementations: a blocking one served by the default WSGI development server, and a non-blocking asynchronous one served by Daphne over ASGI. The blocking implementation makes synchronous requests with the requests library, while the non-blocking one uses aiohttp inside an asynchronous view function. Both make a POST request to an external API endpoint and return the response. Despite expecting better performance from the asynchronous approach, the error persists and throughput does not improve.

I'm seeking insights into the root cause of this error and potential solutions to improve performance under heavier loads. Additionally, any advice on optimizing the asynchronous implementation, or suggestions for better load-testing strategies, would be greatly appreciated. I'm using ab as the benchmarking tool, with the following command:

ab -c 50 -n 600 -s 800007 -T application/json "http://127.0.0.1:8001/test"

Blocking Code:


from rest_framework.decorators import api_view
from rest_framework.response import Response
import requests


@api_view(['GET'])
def index(request):
    res = make_api_request("http://{host}/v1/completions")
    print("blocking response is ---->", res)
    return Response(res, status=200)


def make_api_request(url, method="POST", headers=None, params=None, json_data=None, timeout=None):
    try:
        # Hard-coded test payload (overrides any json_data passed in).
        json_data = {'prompt': 'Hi, How are you?', 'max_new_tokens': 700, 'temperature': 0.1,
                     'top_p': 1, 'max_tokens': 700, 'model': 'meta-llama/Llama-2-7b-chat-hf'}

        response = requests.request(method, url, headers=headers, params=params, json=json_data, timeout=timeout)
        # Return the parsed body rather than the Response object so DRF can serialize it.
        if 'json' in response.headers.get('Content-Type', ''):
            return response.json()
        return response.text
    except requests.exceptions.Timeout:
        raise TimeoutError("Request timed out. The server did not respond within the specified timeout period.")
    except requests.exceptions.RequestException as e:
        raise ConnectionError(f"Request error: {str(e)}")
    except Exception as e:
        raise Exception(f"Exception error: {str(e)}")

Non-blocking Code:

import asyncio
import aiohttp
import logging
from rest_framework.response import Response
from adrf.decorators import api_view
import json

logger = logging.getLogger(__name__)


@api_view(['GET'])
async def index(request):
    # logger.info(f"is async: {iscoroutinefunction(index)}")
    res = await make_api_request("http://{host}/v1/completions")
    logger.info("res is ----> %s", res)
    return Response(res, status=200)


async def make_api_request(url, method="POST", headers=None, params=None, json_data=None, timeout=None):
    try:
        # Hard-coded test payload (overrides any json_data passed in).
        json_data = {'prompt': 'Hi, How are you?', 'max_new_tokens': 700, 'temperature': 0.1,
                     'top_p': 1, 'max_tokens': 700, 'model': 'meta-llama/Llama-2-7b-chat-hf'}

        async with aiohttp.ClientSession() as session:
            async with session.request(method, url, headers=headers, params=params, json=json_data,
                                       timeout=timeout, ssl=False) as response:
                content = await response.read()
                if 'json' in response.headers.get('Content-Type', ''):
                    content = json.loads(content)
                return content
    except asyncio.TimeoutError:
        raise TimeoutError("Request timed out. The server did not respond within the specified timeout period.")
    except aiohttp.ClientError as e:
        raise ConnectionError(f"Request error: {str(e)}")
    except Exception as e:
        raise Exception(f"Exception error: {str(e)}")

1 Answer

AKX
  • You should not benchmark the Django runserver development server; it's not meant for heavy load. If you need to benchmark a WSGI app, use e.g. gunicorn or uwsgi to serve the app instead.
  • If you want to compare the performance of a synchronous server to an asynchronous one, be sure to configure the synchronous server with enough workers as well; otherwise it's not a level playing field at all.
  • Since the remote call takes 2 seconds, changing the surrounding code to be faster or async won't help much with per-request throughput.
    • async code with an async application server does mean that a single async worker can process multiple requests concurrently (while those requests are awaiting e.g. the aforementioned remote call), as the sketch below illustrates.
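
To illustrate that last point, here is a minimal, self-contained sketch of the concurrency behaviour. It is not the asker's code: asyncio.sleep(2) stands in for the slow upstream call and the request count is an assumption, but it shows how one event loop (one async worker) overlaps many I/O-bound waits:

import asyncio
import time


async def fake_upstream_call(i):
    # Stand-in for the external API call: ~2 seconds of pure I/O wait.
    await asyncio.sleep(2)
    return f"response {i}"


async def main():
    start = time.perf_counter()
    # One event loop awaits 20 in-flight calls at the same time.
    results = await asyncio.gather(*(fake_upstream_call(i) for i in range(20)))
    elapsed = time.perf_counter() - start
    # Prints roughly "20 responses in 2.0s", not ~40s: the waits overlap.
    print(f"{len(results)} responses in {elapsed:.1f}s")


asyncio.run(main())

A single synchronous worker would need roughly 20 × 2 s for the same workload, which is why the advice above is to serve the blocking variant with a proper WSGI server and enough workers (for example gunicorn --workers 8 project.wsgi, with the project name adjusted to your own) before comparing it against Daphne serving the ASGI app.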