I encountered an unexpected error (apr_socket_recv: Connection reset by peer (54)) while load testing my Django application with ApacheBench (ab). The error occurs when I increase the load to 50 concurrent requests. Surprisingly, even with 20 concurrent requests there is no significant performance improvement from the non-blocking implementation (I see the same throughput as with the blocking one).
The application has two implementations: one using a blocking approach with the default WSGI development server, and the other using a non-blocking asynchronous approach with ASGI and the Daphne server. The blocking implementation makes synchronous requests with the requests library, while the non-blocking one uses aiohttp inside an asynchronous view function. Both implementations make a POST request to an external API endpoint and return the response. Although I expected better performance from the asynchronous approach, the error persists and throughput does not improve.
I'm seeking insights into the root cause of this error and potential solutions to improve performance under heavier loads. Additionally, any advice on optimizing the asynchronous implementation, or suggestions for better load-testing strategies, would be greatly appreciated.
I'm using ab as the benchmarking tool. The command I used: ab -c 50 -n 600 -s 800007 -T application/json "http://127.0.0.1:8001/test"
Blocking Code:
from rest_framework.decorators import api_view
from rest_framework.response import Response
import requests


@api_view(['GET'])
def index(request):
    res = make_api_request("http://{host}/v1/completions")
    print("blocking response is ---->", res)
    return Response(res.json(), status=200)


def make_api_request(url, method="POST", headers=None, params=None, json_data=None, timeout=None):
    try:
        json_data = {'prompt': 'Hi, How are you?', 'max_new_tokens': 700, 'temperature': 0.1,
                     'top_p': 1, 'max_tokens': 700, 'model': 'meta-llama/Llama-2-7b-chat-hf'}
        response = requests.request(method, url, headers=headers, params=params, json=json_data, timeout=timeout)
        return response
    except requests.exceptions.Timeout:
        raise TimeoutError("Request timed out. The server did not respond within the specified timeout period.")
    except requests.exceptions.RequestException as e:
        raise ConnectionError(f"Request error: {str(e)}")
    except Exception as e:
        raise Exception(f"Exception error: {str(e)}")
Non-Blocking Code:
import asyncio
import json
import logging

import aiohttp
from adrf.decorators import api_view
from rest_framework.response import Response

logger = logging.getLogger(__name__)


@api_view(['GET'])
async def index(request):
    # logger.info(f"is async: {iscoroutinefunction(index)}")
    res = await make_api_request("http://{host}/v1/completions")
    logger.info("res is ----> %s", res)
    return Response(res, status=200)


async def make_api_request(url, method="POST", headers=None, params=None, json_data=None, timeout=None):
    try:
        json_data = {'prompt': 'Hi, How are you?', 'max_new_tokens': 700, 'temperature': 0.1,
                     'top_p': 1, 'max_tokens': 700, 'model': 'meta-llama/Llama-2-7b-chat-hf'}
        async with aiohttp.ClientSession() as session:
            async with session.request(method, url, headers=headers, params=params, json=json_data,
                                       timeout=timeout, ssl=False) as response:
                content = await response.read()
                if 'json' in response.headers.get('Content-Type', ''):
                    content = json.loads(content)
                return content
    except asyncio.TimeoutError:
        raise TimeoutError("Request timed out. The server did not respond within the specified timeout period.")
    except aiohttp.ClientError as e:
        raise ConnectionError(f"Request error: {str(e)}")
    except Exception as e:
        raise Exception(f"Exception error: {str(e)}")
runserver is Django's development server; it's not meant for heavy load. If you need to benchmark a WSGI app, use e.g. gunicorn or uwsgi to serve the app instead. async won't help much with per-request throughput, but async code with an async application server does mean that a single async worker can process multiple requests concurrently (when the requests are awaiting e.g. the aforementioned remote call).
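As a rough sketch of how each app could be served for a more realistic benchmark (the module paths below assume a project package called myproject, which is not named in the question, and that gunicorn and daphne are installed):

# 'myproject' is a placeholder for your actual project package
# WSGI app (blocking views): multiple worker processes handle requests in parallel
gunicorn myproject.wsgi:application --workers 4 --bind 127.0.0.1:8001

# ASGI app (async views): a single worker can interleave many requests while they await the remote API
daphne -b 127.0.0.1 -p 8001 myproject.asgi:application

With an ASGI server such as Daphne in front of the async view, the await on the aiohttp call is what lets a single worker accept new requests while earlier ones are still waiting on the completion endpoint; the development server gives you neither multiple workers nor that kind of concurrency, which is consistent with the flat throughput you observed.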