I have a CSV file with x and y coordinates, as well as variable values for three different time steps, as follows:
x, y, var_t1, var_t2, var_t3
1, 1, 8, 8, 6
1, 2, 6, 1, 2
2, 1, 5, 3, 7
2, 2, 7, 2, 6
I have learned to create a NetCDF file with the following method:
import xarray as xr
xr.Dataset.from_dataframe(df.set_index(['x', 'y'])).to_netcdf('filename.nc')
This results in a NetCDF file with x and y as dimensions and three separate variables (var_t1, var_t2, var_t3).
My goal was to create a NetCDF with x, y and t as dimensions with a single variable.
I managed to achieve this but I feel like I did it in a very complicated fashion.
My solution was to play with the CSV file and make it 3 times longer, while adding a "t" column to represent time steps:
x, y, t, var_t1, var_t2, var_t3
1, 1, 0, 8, 0, 0
1, 2, 0, 6, 0, 0
2, 1, 0, 5, 0, 0
2, 2, 0, 7, 0, 0
1, 1, 1, 0, 8, 0
1, 2, 1, 0, 1, 0
2, 1, 1, 0, 3, 0
2, 2, 1, 0, 2, 0
1, 1, 2, 0, 0, 6
1, 2, 2, 0, 0, 2
2, 1, 2, 0, 0, 7
2, 2, 2, 0, 0, 6
Now when I apply
import xarray as xr
xr.Dataset.from_dataframe(df.set_index(['x', 'y', 't'])).to_netcdf('filename.nc')
I get a NetCDF file with x, y and t dimensions, in which each of the three variables is nonzero only at its own time step (i.e. when t = 1, only var_t2 != 0).
Is there a much simpler way to achieve this, in case I encounter a similar problem in the future? This was manageable with only 3 time steps, but I would be in trouble with tens or thousands of them.
Thank you!
Say you have the dataframe df. You can set x and y as the index, convert it to xarray, merge the variables var_t1... into a new dimension, and set new_times as the coordinates of the time dimension:
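The code that originally followed this sentence is not present here, so what follows is only a minimal sketch of the steps it describes, not necessarily the exact code of the answer. It assumes the data from the question is loaded into df (built here from an inline string instead of a file), that new_times is simply [0, 1, 2], and that the merged variable is called "var"; all three are illustrative choices.

```python
import io

import pandas as pd
import xarray as xr

# Sample data from the question, inlined so the sketch is self-contained.
csv_text = """x,y,var_t1,var_t2,var_t3
1,1,8,8,6
1,2,6,1,2
2,1,5,3,7
2,2,7,2,6
"""
df = pd.read_csv(io.StringIO(csv_text))

# Assumed time coordinate values, one per var_t* column (illustrative).
new_times = [0, 1, 2]

# 1. Use x and y as the index and convert to an xarray Dataset.
#    At this point the Dataset has dims (x, y) and three variables.
ds = df.set_index(["x", "y"]).to_xarray()

# 2. Stack the three variables along a new dimension "t".
#    The name "var" for the resulting array is an assumption.
da = ds.to_array(dim="t", name="var")

# 3. Replace the labels 'var_t1', 'var_t2', 'var_t3' with the time values
#    and reorder the dimensions to (x, y, t).
da = da.assign_coords(t=new_times).transpose("x", "y", "t")

# Writing the result needs a NetCDF backend (e.g. netCDF4 or scipy):
# da.to_netcdf("filename.nc")
```

Because the columns are stacked rather than listed by hand, the same three steps should work unchanged for any number of var_t* columns, as long as new_times has one entry per column and is in the same order.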