Computing transports in ECCO with ECCOv4-py

by Tim Smith for the ECCO Townhall at Ocean Sciences 2020

Note: this is not a tutorial!!

It is more of an advertisement. For a thorough tutorial, covering everything from learning Python to computing the AMOC, see the ECCOv4-py documentation, largely put together by Ian Fenty.

In [1]:
import os
from glob import glob
import numpy as np
import matplotlib.pyplot as plt
import xarray as xr
from dask.distributed import Client
import ecco_v4_py as ecco

import warnings
In [2]:
warnings.simplefilter('ignore')
os.environ['PYTHONWARNINGS'] = 'ignore'

Read the model grid and variables as NetCDF files downloaded from the ECCO Drive

See also: load_ecco_grid_nc and these functions

For MITgcm binary output, see load_ecco_vars_from_mds, which is a wrapper around open_mdsdataset from xmitgcm

In [3]:
download_dir = f'{os.getenv("STOCKYARD")}/ecco/v4r4'
In [4]:
%%time
ds = xr.open_mfdataset(glob(f'{download_dir}/nctiles_monthly/*/*/*.nc'))
CPU times: user 50.5 s, sys: 10.6 s, total: 1min 1s
Wall time: 2min 20s
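A side note on the glob pattern above: glob returns paths in an arbitrary, OS-dependent order. xarray's open_mfdataset can usually reconcile the files by their coordinates regardless, but sorting the list is a cheap safeguard whenever file order matters. A minimal stdlib sketch (the file names here are made up for illustration):

```python
import os
import tempfile
from glob import glob

# Create a few fake "tile" files to glob over (hypothetical names).
tmpdir = tempfile.mkdtemp()
for name in ["tile003.nc", "tile001.nc", "tile002.nc"]:
    open(os.path.join(tmpdir, name), "w").close()

# glob makes no ordering guarantee; sorted() pins it down.
paths = sorted(glob(os.path.join(tmpdir, "*.nc")))
names = [os.path.basename(p) for p in paths]
print(names)  # ['tile001.nc', 'tile002.nc', 'tile003.nc']
```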
In [5]:
ds.data_vars
Out[5]:
Data variables:
    ADVx_TH   (time, k, tile, j, i_g) float32 dask.array<chunksize=(1, 50, 13, 90, 90), meta=np.ndarray>
    ADVy_TH   (time, k, tile, j_g, i) float32 dask.array<chunksize=(1, 50, 13, 90, 90), meta=np.ndarray>
    DFxE_TH   (time, k, tile, j, i_g) float32 dask.array<chunksize=(1, 50, 13, 90, 90), meta=np.ndarray>
    DFyE_TH   (time, k, tile, j_g, i) float32 dask.array<chunksize=(1, 50, 13, 90, 90), meta=np.ndarray>
    UVELMASS  (time, k, tile, j, i_g) float32 dask.array<chunksize=(1, 50, 13, 90, 90), meta=np.ndarray>
    VVELMASS  (time, k, tile, j_g, i) float32 dask.array<chunksize=(1, 50, 13, 90, 90), meta=np.ndarray>
In [6]:
ds.UVELMASS.data
Out[6]:
Array Chunk
Bytes 6.57 GB 21.06 MB
Shape (312, 50, 13, 90, 90) (1, 50, 13, 90, 90)
Count 1248 Tasks 312 Chunks
Type float32 numpy.ndarray
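The numbers in that repr are easy to verify by hand: the full array holds 312 × 50 × 13 × 90 × 90 float32 values (4 bytes each), and each chunk is a single monthly time step. A quick back-of-the-envelope check:

```python
# Shape of UVELMASS: (time, k, tile, j, i_g), stored as float32 (4 bytes).
shape = (312, 50, 13, 90, 90)
chunk = (1, 50, 13, 90, 90)  # one chunk per monthly time step

nbytes = 4
for n in shape:
    nbytes *= n
chunk_bytes = 4
for n in chunk:
    chunk_bytes *= n

print(f"total: {nbytes/1e9:.2f} GB")       # total: 6.57 GB
print(f"chunk: {chunk_bytes/1e6:.2f} MB")  # chunk: 21.06 MB
```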
In [7]:
dsg = xr.open_dataset(f'{download_dir}/nctiles_grid/ECCO-GRID.nc')
In [8]:
print(f'Grid coordinate dataset is {dsg.nbytes/1e9:.02f} GB')
Grid coordinate dataset is 0.09 GB

The grid coordinates are necessary for the computations, and at less than 1 GB the dataset is small enough to load explicitly (eagerly).

In [9]:
dsg.load();
In [10]:
ds = ds.merge(dsg)

Speed things up on an HPC cluster using dask

This notebook is running on Frontera, a cutting-edge supercomputer at the Texas Advanced Computing Center. Thanks to TACC for the computing resources!

In [11]:
client = Client(n_workers=56)
In [12]:
client
Out[12]:

Client

  • Scheduler: tcp://127.0.0.1:42440

Cluster

  • Workers: 56
  • Cores: 56
  • Memory: 200.62 GB

The LLC grid is somewhat complicated

In [13]:
ds.UVELMASS.isel(k=0).mean('time').plot(col='tile',col_wrap=5,figsize=(16,8));

One goal of this package is to make plotting and grid operations standardized and transparent

In [14]:
plt.rcParams.update({'figure.figsize':(12,6),'font.size':20,'axes.linewidth':1.5,'lines.linewidth':3});
In [15]:
ecco.plot_proj_to_latlon_grid(ds.XG,ds.YC,ds.UVELMASS.isel(k=0).mean('time'),show_colorbar=True,cmin=-.4,cmax=.4);
plt.title('Time mean surface currents showing how\nthe rotated tiles make velocity fields tricky');

Get U and V as an oceanographer would expect
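Under the hood, UEVNfromUXVY interpolates the staggered velocity components to cell centers and rotates them to zonal/meridional directions using the grid-angle cosines and sines (CS and SN in the ECCO grid file). The rotation step alone can be sketched with numpy (synthetic angle here, not the real grid angles):

```python
import numpy as np

def rotate_to_east_north(u, v, cs, sn):
    """Rotate grid-relative velocities (u, v) to zonal/meridional
    components using the cosine/sine of the local grid angle."""
    u_east  = u * cs - v * sn
    v_north = u * sn + v * cs
    return u_east, v_north

# On a tile rotated 90 degrees, purely "grid-x" flow is actually northward.
cs, sn = np.cos(np.pi / 2), np.sin(np.pi / 2)
u_east, v_north = rotate_to_east_north(1.0, 0.0, cs, sn)
print(round(float(u_east), 6), round(float(v_north), 6))  # 0.0 1.0
```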

In [16]:
%%time
uvel,vvel = ecco.vector_calc.UEVNfromUXVY(ds.UVELMASS,ds.VVELMASS,ds)
CPU times: user 2.04 s, sys: 30.3 ms, total: 2.07 s
Wall time: 2.06 s
In [17]:
ecco.plot_proj_to_latlon_grid(ds.XG,ds.YC,uvel.isel(k=0).mean('time'),show_colorbar=True,cmin=-.4,cmax=.4);

Compute meridional heat transport across 26$^\circ$N in the Atlantic

See here for how to define other ocean basins, and for related functions such as volume and salt transport.
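Roughly speaking, the quantity computed below is the advective plus diffusive temperature flux across the masked section, scaled by ρ₀·c_p and converted to petawatts. A toy numpy version of the summation (synthetic fluxes; the constants are common seawater reference values and the masking details are simplified here, so check the package for its exact conventions):

```python
import numpy as np

RHO0 = 1029.0   # reference density [kg/m^3] (typical value)
CP   = 3994.0   # specific heat of seawater [J/kg/K] (typical value)

def heat_transport_pw(advective, diffusive):
    """Sum temperature fluxes [degC m^3/s] across a section and
    convert to petawatts."""
    total = np.sum(advective + diffusive)   # degC m^3/s
    return RHO0 * CP * total * 1e-15        # W -> PW

# Synthetic fluxes for a 3-cell section
adv = np.array([1.0e5, 2.0e5, 1.5e5])
dif = np.array([1.0e4, -5.0e3, 2.0e4])
print(f"{heat_transport_pw(adv, dif):.3f} PW")  # 0.002 PW
```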

In [18]:
%%time
mht = ecco.calc_meridional_heat_trsp(ds,lat_vals=26,basin_name='atlExt')
loading  basins.data
data shape  (1170, 90)
dims, num_dims, llc  (1170, 90) 2 90
2 dimensions
f3 shape  (90, 90)
f5 shape  (90, 270)
2D, data_compact shape  (1170, 90)
data_tiles shape  (13, 90, 90)
data_tiles shape =  (13, 90, 90)
CPU times: user 7.65 s, sys: 700 ms, total: 8.35 s
Wall time: 8.73 s
In [20]:
%%time 
mht.load();
CPU times: user 0 ns, sys: 41 µs, total: 41 µs
Wall time: 49.6 µs
Out[20]:
<xarray.Dataset>
Dimensions:      (k: 50, lat: 1, time: 312)
Coordinates:
  * time         (time) datetime64[ns] 1992-01-16T12:00:00 ... 2017-12-16
  * k            (k) int64 0 1 2 3 4 5 6 7 8 9 ... 40 41 42 43 44 45 46 47 48 49
  * lat          (lat) int64 26
    PHrefC       (k) float32 49.05 147.15 245.25 ... 53574.863 57940.312
    drF          (k) float32 10.0 10.0 10.0 10.0 ... 387.5 410.5 433.5 456.5
    Z            (k) float32 -5.0 -15.0 -25.0 ... -5039.25 -5461.25 -5906.25
Data variables:
    heat_trsp_z  (time, k, lat) float64 0.5193 0.03655 ... 0.002312 -0.0004046
    heat_trsp    (time, lat) float64 0.6414 0.8293 1.076 ... 0.8356 0.9267
In [21]:
mht.heat_trsp.plot(xlim=('1992','2018'))
plt.grid();
In [22]:
def quick_depth_plot(xda): 
    fig,axs = plt.subplots(1,4,figsize=(18,6),sharey=True,sharex=True)
    
    # Time evolving
    plt.subplot(1,4,(1,3))
    xda.plot(y='Z',x='time',robust=True)
    plt.xlabel('')
    
    # time mean
    plt.subplot(1,4,4)
    xda.mean('time').plot(y='Z')
    plt.ylabel('')
    plt.grid();
    
    plt.tight_layout();
In [23]:
def depth_slice_plot(xda,z_mid=-500):

    for xda_slice in [xda.where(xda.Z>z_mid,drop=True),xda.where(xda.Z<z_mid,drop=True)]:
        quick_depth_plot(xda_slice)
In [24]:
depth_slice_plot(mht.heat_trsp_z)

Compute Drake Passage volume transport

See here for pre-defined and arbitrarily designed sections
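For reference, a section volume transport is just the velocity normal to the section integrated over the section's area, with 1 Sv = 10⁶ m³/s. A hedged numpy sketch with made-up cell geometry (the real calculation uses the model's masks and cell areas):

```python
import numpy as np

def section_vol_transport_sv(u_normal, cell_width, cell_thickness):
    """Integrate normal velocity [m/s] over cell areas [m^2] and
    convert to Sverdrups (1 Sv = 1e6 m^3/s)."""
    area = np.outer(cell_thickness, cell_width)  # (nz, nx) cell areas
    return np.sum(u_normal * area) * 1e-6

# Synthetic section: 2 depth levels x 3 horizontal cells
u  = np.array([[0.10, 0.20, 0.10],
               [0.05, 0.10, 0.05]])  # m/s through the section
dx = np.array([5e4, 5e4, 5e4])      # cell widths [m]
dz = np.array([100.0, 200.0])       # cell thicknesses [m]
print(f"{section_vol_transport_sv(u, dx, dz):.1f} Sv")  # 4.0 Sv
```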

In [25]:
%%time
# easy as: ecco.calc_section_vol_trsp(ds,'Drake Passage')
# or:
dp = ecco.calc_section_vol_trsp(ds,pt1=[-68,-54],pt2=[-63,-66])
CPU times: user 152 ms, sys: 12.2 ms, total: 165 ms
Wall time: 142 ms
In [26]:
%%time
dp.load();
CPU times: user 10.2 s, sys: 724 ms, total: 10.9 s
Wall time: 11.1 s
Out[26]:
<xarray.Dataset>
Dimensions:     (i: 90, i_g: 90, j: 90, j_g: 90, k: 50, tile: 13, time: 312)
Coordinates:
  * time        (time) datetime64[ns] 1992-01-16T12:00:00 ... 2017-12-16
  * k           (k) int64 0 1 2 3 4 5 6 7 8 9 ... 40 41 42 43 44 45 46 47 48 49
    PHrefC      (k) float32 49.05 147.15 245.25 ... 53574.863 57940.312
    drF         (k) float32 10.0 10.0 10.0 10.0 10.0 ... 387.5 410.5 433.5 456.5
    Z           (k) float32 -5.0 -15.0 -25.0 ... -5039.25 -5461.25 -5906.25
    timestep    (time) int64 732 1428 2172 2892 ... 225708 226452 227172 227904
  * tile        (tile) int64 0 1 2 3 4 5 6 7 8 9 10 11 12
  * j           (j) int64 0 1 2 3 4 5 6 7 8 9 ... 80 81 82 83 84 85 86 87 88 89
  * i_g         (i_g) int64 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * j_g         (j_g) int64 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i           (i) int64 0 1 2 3 4 5 6 7 8 9 ... 80 81 82 83 84 85 86 87 88 89
Data variables:
    vol_trsp_z  (time, k) float64 1.83 1.553 1.239 1.107 ... 0.0 0.0 0.0 0.0
    vol_trsp    (time) float64 150.0 145.8 150.9 150.6 ... 147.3 148.3 148.3
    maskW       (tile, j, i_g) int64 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
    maskS       (tile, j_g, i) int64 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0
In [27]:
dp.vol_trsp.plot(xlim=['1992','2018'])
plt.grid();
In [28]:
quick_depth_plot(dp.vol_trsp_z)
In [29]:
plt.rcParams.update({'figure.figsize':(10,10)})
In [30]:
ecco.plot_proj_to_latlon_grid(ds.XC,ds.YG,dp.maskS,projection_type='stereo',lat_lim=-45,show_grid_lines=False,cmap='gray_r');
plt.title('the mask used for computing\nDrake Passage transport');

Shoutouts

Much of the functionality here mirrors the MATLAB package gcmfaces, for which we have Gael Forget to thank.

This package builds on existing infrastructure that makes the underlying computations transparent, easy to read or modify, and scalable from laptops to servers. For this we rely on:

  • xarray, which makes use of dask
  • xgcm, which makes basic grid operations intuitive
  • xmitgcm, which handles I/O for MITgcm users

For these we especially thank Ryan Abernathey for collaboration.

Contributions are always welcome!
