I'm trying to port some code from Python 2.7, which uses urllib2.Request, to Python 3.10, using the Python 3 json.dumps() function and the urllib.request.Request class. My problem is that the server accepts POSTs from the original code, but fails with an internal 500 error when I use the updated classes. I suspect the issue is that I am not encoding my data into the format the server requires. Both versions of post() here are supposed to pass the bytes of a PNG image, within a dict, to a web service running on localhost.
Here is the original function, which works in python 2.7:
    def post(self, endpoint, data={}, params={}, headers=None):
        try:
            url = self.base_url + endpoint + "?" + urllib.urlencode(params)
            data = json.dumps(data, False) # which parameter gets the False value?
            headers = headers or {"Content-Type": "application/json", "Accept": "application/json"}
            request = urllib2.Request(url=url, data=data, headers=headers)
            response = urllib2.urlopen(request)
            data = response.read()
            data = json.loads(data)
            return data
        except Exception as ex:
            logging.exception("ERROR: ApiClient.post")
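On the question in the comment above: in Python 2.7 the second positional parameter of json.dumps() is skipkeys, so json.dumps(data, False) only restates the default. In Python 3.10 every option after the first argument is keyword-only, so the same call raises TypeError. A minimal sketch for Python 3, with a made-up payload used purely for illustration:

```python
import json

# Hypothetical payload, for illustration only.
payload = {"prompt": "a cat", "steps": 20}

# Python 3: options after the first argument must be passed by keyword.
explicit = json.dumps(payload, skipkeys=False)

# skipkeys=False is the default, so this matches a plain dumps() call.
assert explicit == json.dumps(payload)

# The Python 2.7 spelling is rejected by Python 3.10.
try:
    json.dumps(payload, False)
    positional_ok = True
except TypeError:
    positional_ok = False
```

So when porting, the False can simply be dropped, or spelled out as skipkeys=False if the intent should stay visible.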
Here are the docs for the Python 2.7 Request class: https://docs.python.org/2.7/library/urllib2.html
class urllib2.Request(url[, data][, headers][, origin_req_host][, unverifiable])
This class is an abstraction of a URL request.
url should be a string containing a valid URL.
data may be a string specifying additional data to send to the server, or None if no such data is needed. Currently HTTP requests are the only ones that use data; the HTTP request will be a POST instead of a GET when the data parameter is provided. data should be a buffer in the standard application/x-www-form-urlencoded format. The urllib.urlencode() function takes a mapping or sequence of 2-tuples and returns a string in this format.
Here are the docs for python 3.10 https://docs.python.org/3.10/library/urllib.request.html#module-urllib.request
class urllib.request.Request(url, data=None, headers={}, origin_req_host=None, unverifiable=False, method=None)
This class is an abstraction of a URL request.
url should be a string containing a valid URL.
data must be an object specifying additional data to send to the server, or None if no such data is needed. Currently HTTP requests are the only ones that use data. The supported object types include bytes, file-like objects, and iterables of bytes-like objects. If no Content-Length nor Transfer-Encoding header field has been provided, HTTPHandler will set these headers according to the type of data. Content-Length will be used to send bytes objects, while Transfer-Encoding: chunked as specified in RFC 7230, Section 3.3.1 will be used to send files and other iterables. For an HTTP POST request method, data should be a buffer in the standard application/x-www-form-urlencoded format. The urllib.parse.urlencode() function takes a mapping or sequence of 2-tuples and returns an ASCII string in this format. It should be encoded to bytes before being used as the data parameter.
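Read together, the two descriptions say that a Python 3 request body must be bytes, where Python 2 accepted a str. A minimal sketch of building a JSON body under that rule follows. The URL path and payload are placeholders, and nothing is sent because urlopen() is never called:

```python
import json
import urllib.request

# Placeholder endpoint and payload, for illustration only.
url = "http://127.0.0.1:7860/example"
payload = {"name": "test", "count": 3}

# json.dumps returns str; Request needs bytes for the body.
body = json.dumps(payload).encode("utf-8")

request = urllib.request.Request(
    url=url,
    data=body,
    headers={"Content-Type": "application/json", "Accept": "application/json"},
)

# Supplying data switches the default method from GET to POST.
method = request.get_method()

# The bytes decode back to the same dict, so the encode step is lossless.
round_trip = json.loads(request.data.decode("utf-8"))
```

If this step is done correctly and the server still rejects the request, the difference is more likely in the contents of the payload than in the str-to-bytes conversion of the JSON text.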
And this is my replacement function and encoder, which fails when I submit the request to urlopen():
    class BytesEncoder(json.JSONEncoder):
        def default(self, o):
            if isinstance(o, bytes):
                return base64.b64encode(o).decode('utf-8')
            else:
                return super().default(o)
    def post(self, endpoint, dict_in={}, params={}, headers=None):
        try:
            url = self.base_url + endpoint + "?" + urllib.parse.urlencode(params)
            # NOTE: In the original code, the string returned by json.dumps is sent directly to urllib2.Request.
            # That changes in urllib.request.Request, where the data parameter specifies "bytes"
            # Original line was "data = json.dumps(data, False)"
            json_str: str = json.dumps(dict_in, sort_keys=True, indent=2, cls=BytesEncoder)
            json_bytes: bytes = json_str.encode('utf-8')
            headers = headers or {"Content-Type": "application/json", "Accept": "application/json"}
            request_out = urllib.request.Request(url=url, data=json_bytes, headers=headers)
            """
            FIXME: The following urlopen raises HTTPError
            url=http://127.0.0.1:7860/sdapi/v1/img2img?, code=500, reason=Internal Server Error
            #/components/schemas/StableDiffusionProcessingImg2Img
            """
            response = urlopen(request_out)
            response_json_str = response.read()
            response_dict = json.loads(response_json_str)
            return response_dict
        except urllib.error.HTTPError as ex:
            StabDiffAuto1111.LOGGER.exception("ERROR: ApiClient.post()")
            message = "url=%s, code=%d, reason=%s" % (ex.url, ex.code, ex.reason)
            StabDiffAuto1111.LOGGER.error(message)
            StabDiffAuto1111.LOGGER.error(str(ex.headers))
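The handler above logs the status line and headers but not the response body. HTTPError is also a file-like response object, so the body can be read from it, and a server returning 500 may put a description of the failure there. A minimal sketch, where describe_http_error is a hypothetical helper and the error is built by hand (with a made-up body) so the example runs without a server:

```python
import io
import urllib.error
from email.message import Message


def describe_http_error(ex: urllib.error.HTTPError) -> str:
    """Format an HTTPError, including whatever body the server sent."""
    body = ex.read().decode("utf-8", errors="replace")
    return "url=%s, code=%d, reason=%s, body=%s" % (ex.url, ex.code, ex.reason, body)


# Hand-built error standing in for a real failed request (made-up body).
fake_error = urllib.error.HTTPError(
    url="http://127.0.0.1:7860/sdapi/v1/img2img",
    code=500,
    msg="Internal Server Error",
    hdrs=Message(),
    fp=io.BytesIO(b'{"detail": "example failure"}'),
)

summary = describe_http_error(fake_error)
```

In a real except block the same helper could be passed the caught exception, and its result sent to the logger alongside the existing messages.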
My replacement function does several things differently. First, I pass a custom BytesEncoder to json.dumps(). If I don't use a custom encoder, then json.dumps() fails with a "cannot serialize bytes" error. But perhaps I should use something besides base64 to encode the bytes.
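For reference, base64 is a common way to carry binary data inside JSON, since JSON has no bytes type. In Python 3, base64.b64encode() returns bytes rather than str, which is why the encoder needs the .decode() step. A minimal sketch of the round trip follows. The image bytes and the "image" key are stand-ins, not what any particular server expects:

```python
import base64
import json

# Stand-in for PNG file contents (just the 8-byte PNG signature).
image_bytes = b"\x89PNG\r\n\x1a\n"

# b64encode returns bytes in Python 3; decode to get a JSON-safe str.
encoded = base64.b64encode(image_bytes)
encoded_text = encoded.decode("ascii")

# "image" is a hypothetical key, for illustration only.
json_text = json.dumps({"image": encoded_text})

# The receiving side reverses the steps to recover the original bytes.
decoded = base64.b64decode(json.loads(json_text)["image"])
```

Base64 output is pure ASCII, so decoding it with 'ascii' or 'utf-8' gives the same string.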
Next, instead of re-using whatever was passed as the "data" argument, I create intermediate variables with explicit type hints, so that the type at each step is clear.
Finally, I take the (unicode) string from json.dumps, and I encode that into utf-8 bytes. That was my understanding of the documentation sentence "It should be encoded to bytes before being used as the data parameter." But I expect this is what the server won't accept.
How do I post the data in the same format that the previous function did?
My problem was not just in the one function I was trying to upgrade, but basically everywhere the old code converts from str to bytes or vice-versa, AND wherever Python 3 functions specify bytes where Python 2 assumes str. In particular, old code that generated a base64 encoded str in Python 2.7 generates a List[bytes] in Python 3. So I needed to start by appending .decode('utf-8') to all calls to base64.b64encode().
This migration issue is well documented, once you know what the issue is :/ And until you know, it's tough to figure out, because in Python 3, str, bytes, and List[bytes] are sometimes treated as the same, and sometimes differently.
Once I tracked down all the confusions of bytes vs. str, the following function works as expected: