TL;DR - Is there a way to use Metaflow to download data that an outside source makes available via URL, or is that not possible at this time?
Full version:
I'm trying to set up a flow that downloads data through an API (USDA-NASS) and saves it to S3. For most of the data this works fine. However, one subset (gridded condition/progress) can't be accessed through the API. Instead, I have to fetch the file's URL ("https://www.nass.usda.gov/Research_and_Science/Crop_Progress_Gridded_Layers/datasets/{file_name}.zip") with requests and then process the data. This creates a problem in Metaflow: it seems to look for the file in S3 and then raises a FileNotFound error. I've asked my colleagues and gone through the Metaflow documentation, but nothing I've found explains how to do this. Most resources only cover getting data from S3 and nowhere else. Is there a way to make this work?
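
To make the question concrete, here is a minimal sketch of the download-and-unpack part on its own, written with only the standard library so it doesn't depend on Metaflow or S3 at all. The helper names (`build_url`, `fetch_bytes`, `unpack_zip`) are made up for illustration, and the sketch assumes the server returns the zip directly for a plain GET on the URL pattern above.

```python
import io
import zipfile
from typing import Dict
from urllib.request import urlopen

# URL pattern quoted in the question; {file_name} is filled in per dataset.
BASE_URL = (
    "https://www.nass.usda.gov/Research_and_Science/"
    "Crop_Progress_Gridded_Layers/datasets/{file_name}.zip"
)


def build_url(file_name: str) -> str:
    """Return the download URL for one gridded-layer archive."""
    return BASE_URL.format(file_name=file_name)


def fetch_bytes(url: str, timeout: float = 60.0) -> bytes:
    """Download the URL and return the raw response body (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read()


def unpack_zip(payload: bytes) -> Dict[str, bytes]:
    """Unpack a zip archive held in memory into {member name: file contents}."""
    with zipfile.ZipFile(io.BytesIO(payload)) as zf:
        # Skip directory entries; keep only real files.
        return {
            name: zf.read(name)
            for name in zf.namelist()
            if not name.endswith("/")
        }
```

The idea would be to call `unpack_zip(fetch_bytes(build_url(name)))` from inside a step with a real file name, so that nothing is looked up in S3 until I explicitly write the results there. Whether that is the intended way to do it in Metaflow is exactly what I'm unsure about.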