This tutorial helps you get started with the Sea Surface Temperature Daily Analysis dataset: European Space Agency Climate Change Initiative product version 2.1, hosted by AWS opendata.
import fsspec
import matplotlib.pyplot as plt
import s3fs
import xarray as xr
import numpy as np
The following code will load the metadata for the dataset. It is really fast because it uses 'lazy loading'. Data is only loaded when it is needed, and only the chunks that are needed are accessed.
# if you are running on AWS cloud you can use the line below and it will be faster to read and access the data
#ds = xr.open_zarr(fsspec.get_mapper("s3://surftemp-sst/data/sst.zarr"),consolidated=True)
# if you are running anywhere other than AWS cloud (eg. your local computer or mybinder.org) this is the line that will work
ds = xr.open_zarr('https://surftemp-sst.s3.us-west-2.amazonaws.com/data/sst.zarr')
Xarray will print out all the variables in the dataset so you can explore them. Click on the icons to the right of the data to examine the data variable attributes and see some values for the data variable. At the bottom you can explore the global attributes for the dataset. This dataset contains:
Variable name | Description | Units |
---|---|---|
analysed_sst | Analysed sea surface temperature (Kelvin) | kelvin |
analysis_uncertainty | Estimated error standard deviation of analysed_sst | kelvin |
sea_ice_fraction | The estimated fraction of the area covered by sea ice | - |
mask | Bit mask (bit0:sea,bit1:land:bit2:lake,bit3:ice) | - |
ds
<xarray.Dataset> Dimensions: (time: 14367, lat: 3600, lon: 7200, bnds: 2) Coordinates: * lat (lat) float32 -89.97 -89.93 -89.88 ... 89.93 89.97 * lon (lon) float32 -180.0 -179.9 -179.9 ... 179.9 180.0 * time (time) datetime64[ns] 1981-09-01T12:00:00 ... 2... Dimensions without coordinates: bnds Data variables: analysed_sst (time, lat, lon) float32 dask.array<chunksize=(50, 360, 720), meta=np.ndarray> analysed_sst_uncertainty (time, lat, lon) float32 dask.array<chunksize=(50, 360, 720), meta=np.ndarray> lat_bnds (lat, bnds) float32 dask.array<chunksize=(360, 2), meta=np.ndarray> lon_bnds (lon, bnds) float32 dask.array<chunksize=(720, 2), meta=np.ndarray> mask (time, lat, lon) float32 dask.array<chunksize=(50, 360, 720), meta=np.ndarray> sea_ice_fraction (time, lat, lon) float32 dask.array<chunksize=(50, 360, 720), meta=np.ndarray> Attributes: (12/61) Conventions: CF-1.5, Unidata Observation Dataset v1.0 Metadata_Conventions: Unidata Dataset Discovery v1.0 acknowledgment: Funded by the Copernicus Climate Change ... cdm_data_type: grid comment: These data were produced by the Met Offi... contact: http://copernicus-support.ecmwf.int ... ... time_coverage_resolution: P1D time_coverage_start: 20201215T000000Z title: C3S SST L4 product tracking_id: 7fdf2639-26e5-4d4f-a60e-0bcfc9744204 uuid: 7fdf2639-26e5-4d4f-a60e-0bcfc9744204 westernmost_longitude: -180.0
analyzed_sst
over a specific time range¶The data covers the earth's surface at a resolution of 0.05 degrees. If you are interested in temperatures at a single location, extract a timeseries at a specific location. This is easiest to do with two .sel calls. First we select our time range. Next we select the latitude and longitude specifing method="nearest"
to find the nearest point.
ds["analysed_sst"] \
.sel(time=slice("1982-01-01", "1985-12-31")) \
.sel(lat=51, lon=-38, method="nearest") \
.plot(figsize=(10, 5))
[<matplotlib.lines.Line2D at 0x7fa80901af10>]
it is easy to select a subset of the data for given range of latitudes and longitudes. For example, to see the data for a region at a single point in time:
ds["analysed_sst"] \
.sel(time="2006-02-23") \
.sel(lat=slice(30, 45), lon=slice(115, 140)) \
.plot(cmap="coolwarm", figsize=(5, 5))
<matplotlib.collections.QuadMesh at 0x7fa809405050>
ds["analysed_sst"].sel(time="1983-01-01").plot(cmap="coolwarm", figsize=(10, 5))
<matplotlib.collections.QuadMesh at 0x7fa808e7b310>
In this dataset, the uncertainty is calculated for each pixel and included as another data variable. This can be useful when you are trying to understand how accurate the data is or how to blend it with other data or inject into a model. Here we plot the SST uncertainty data for a region (the gulf of guinea) at a single point in time:
ds["analysed_sst_uncertainty"] \
.sel(time="2006-02-23") \
.sel(lat=slice(30, 45), lon=slice(115, 140)) \
.plot(vmin=0, vmax=1)
<matplotlib.collections.QuadMesh at 0x7fa75125e350>
This dataset calculates SST for all ocean pixels, including areas covered by sea ice. In those regions the SST relaxes back to -2 C (271.15 K). For some science applications, it is important to remove data covered by sea ice. This dataset has a variable for sea ice fraction that can be used to mask the data. Here we plot the sea ice fraction for 01 Jan 1983.
ds["sea_ice_fraction"] \
.sel(time="1983-01-01") \
.plot(vmin=0, vmax=1, figsize=(10, 5))
<matplotlib.collections.QuadMesh at 0x7fa755a1eb90>
There is also a mask
in the data. To see what the different values of the mask means, you can print out the attributes.
ds["mask"].attrs
{'comment': 'b0: 1=grid cell is open sea water b1: 1=grid cell is land b2: 1=grid cell is lake surface b3: 1=grid cell is sea ice b4-b7: reserved for future grid mask data', 'flag_masks': [1, 2, 4, 8, 16], 'flag_meanings': 'water land optional_lake_surface sea_ice optional_river_surface', 'long_name': 'sea/land/lake/ice field composite mask', 'source': 'NAVOCEANO_landmask_v1.0 EUMETSAT_OSI-SAF_icemask ARCLake_lakemask', 'valid_max': 31, 'valid_min': 1}
Note in the attributes that this is a bit mask flag. The bits are given in the comments. Below we set up the bit flags and then plot each bit to show the different masks individually over Scandanavia
sea_bit,land_bit,lake_bit,seaice_bit = 0,1,2,3
bits = [sea_bit,land_bit,lake_bit,seaice_bit]
bitstr = ['Sea','Land','Lake','Sea Ice']
fig, axes = plt.subplots(ncols=4, figsize=(15, 5))
subset = (
ds["mask"]
.sel(time="2006-01-18")
.sel(lon=slice(2, 42), lat=slice(53, 72))
)
for ib,bit in enumerate(bits):
print(bitstr[bit])
subset_mask = np.bitwise_and(subset.astype('uint8'), int(2**bit)) > 0
subset_mask.plot(ax=axes[bit],add_colorbar=False)
axes[bit].set_title(bitstr[bit])
Sea Land Lake Sea Ice