I am having trouble obtaining all of the fields in a dataset from the Socrata Open Data Network. I am using urllib3 to make the request as such:
url = 'https://data.sfgov.org/resource/g8m3-pdis.json'
http = urllib3.PoolManager(ca_certs=certifi.where())
req = http.request_encode_url('GET', url)
data = json.loads(req.data.decode('utf-8'))
This request returns all but the last 6 fields in the dataset. I have tried using the
$select=*,*
statement to specify which fields I would like to retrieve but that still leaves out the last 6 fields even if I specify them.
However, when I use the
$limit
statement and set it to something greater than 1000 I get some of those fields that I am missing. But, still not all of them.
One thing that does work is specifying which row to grab, so if I use the following:
https://data.sfgov.org/resource/g8m3-pdis.json?ttxid=0000028-02-001
It will return all of the fields that I am looking for.
I guess I could make a request for all of the ttxid fields and then send out a separate request for each one in order to grab all the fields but there has to be a better way to do this. Does anyone have any ideas?
It appears there are a number of fields that are frequently blank:
location,business_corridor,lic_code_description, etc. When using JSON documents, Socrata omits blank fields from the response (which is different from the CSV endpoint).It looks like the default call,
https://data.sfgov.org/resource/g8m3-pdis.json, just happens to return fields that have blanks so does not appear. However, when$limitis added above the default of 1,000, the API happens to returns some rows with complete data so the field appears in limited circumstances.This can also be confirmed by grabbing a row which does not have all fields. The OP showed one example that returned all fields:
https://data.sfgov.org/resource/g8m3-pdis.json?ttxid=0000028-02-001But this example has blank fields so is omitted in the response:
https://data.sfgov.org/resource/g8m3-pdis.json?ttxid=1079490-06-161One way around this is to grab data from the CSV endpoint:
https://data.sfgov.org/resource/g8m3-pdis.csv