Converting to OceanSITES format

This document describes some of the native data formats present in AMOC datasets provided by different observing arrays.

In amocarray, the first step is to convert each dataset to an OceanSITES-compatible format. That target format is outlined in the OceanSITES format documentation.

Note: This is a work in progress and not all arrays are fully described. The goal is to provide a summary of the data formats and how they could be transformed into a common format. The common format is not yet defined but will ideally be able to capture most if not all of the original data.

OSNAP conversion thoughts

At OSNAP, we have variables like MOC_ALL, MOC_EAST and MOC_WEST which are time series (TIME), but these could be represented as MOC (N_PROF, TIME) where instead of the three different variables, N_PROF=3. This would be somewhat more difficult to communicate to the user, since LATITUDE and LONGITUDE are not single points per N_PROF but instead may represent end points of a section.

Error variables such as MOC_ALL_ERR are also provided, which could be translated to MOC_ERR (N_PROF, TIME) with LATITUDE (N_PROF) or LATITUDE_BOUND (N_PROF, 2).

Heat fluxes also exist, as MHT_ALL, MHT_EAST and MHT_WEST, so these could be MHT (N_PROF, TIME).
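The stacking idea above can be sketched with NumPy. This is only an illustration: the numbers are made-up placeholders rather than OSNAP data, and the variable names `moc_all`, `moc_east`, `moc_west` and `section_names` are assumptions chosen to mirror the file's variable names.

```python
import numpy as np

# Made-up placeholder values standing in for the three OSNAP time series.
# In practice each (TIME,) array would be read from the OSNAP file.
moc_all = np.array([15.0, 16.2, 14.8, 17.1])
moc_east = np.array([14.1, 15.0, 13.9, 16.0])
moc_west = np.array([2.3, 2.9, 2.1, 3.0])

# Keep the section labels alongside the data so row i can be identified.
section_names = ["ALL", "EAST", "WEST"]

# Stack along a new leading axis: the result has shape (N_PROF, TIME).
moc = np.stack([moc_all, moc_east, moc_west], axis=0)

print(moc.shape)  # (3, 4)
```

The same pattern would apply to MHT_ALL, MHT_EAST and MHT_WEST. The label list is what tells the user which row is which, since LATITUDE and LONGITUDE alone do not identify a section.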

Potential reformats:

  • Overturning:

    • MOC and MOC_ERR: time series (dimensions: TIME, N_LOCATION=3), e.g. from MOC_ALL, MOC_EAST, MOC_WEST

    • STREAMFUNCTION: (N_LEVELS, TIME, N_PROF=3) - This would be from OSNAP_Streamfunction_201408_202006_2023.nc and is the overturning streamfunction in sigma-theta coordinates.

    • MHT and MHT_ERR: same dimensions as MOC

    • MFT and MFT_ERR: same dimensions as MOC

    • LATITUDE_BOUND: (N_LOCATION, 2) - this would be the latitude bounds (two end points) for each of the west, east and full sections.

    • LONGITUDE_BOUND: (N_LOCATION, 2) - this would be the longitude bounds (two end points) for each of the west, east and full sections.

  • Gridded sections: TEMPERATURE, SALINITY, VELOCITY

    • Dimensions: TIME (71), N_PROF (longitude, 256), N_LEVELS (depth, 199)

    • Coordinates: LATITUDE, LONGITUDE (N_PROF=longitude grid,), TIME in datetime, and DEPTH (N_LEVELS,)

    • Variables: TEMPERATURE, SALINITY, VELOCITY (TIME, N_PROF, N_LEVELS). Attributes would specify units and the version of temperature/salinity. Any flags would have an attribute describing what the values mean (e.g. “1=good, 2=bad, etc”).

RAPID conversion thoughts

At 26°N, the RAPID array produces an AMOC transport time series (volume transport in depth space), which is a 1-dimensional time series with a single registered latitude (26.5) and no registered longitude. It also provides profiles of temperature, salinity and dynamic height representing individual locations (single latitude, single nominal longitude) on a vertical grid with 20 dbar spacing. Several locations are provided, with names like WB, MAR_WEST, MAR_EAST, EB. So there are N_PROF locations, with N_LEVELS and TIME as the other dimensions, and LATITUDE would have dimension N_PROF (a small number, like 4, representing mooring locations).

More recently, they have started providing a section of temperature, salinity and velocity with dimensions N_PROF, TIME and N_LEVELS. Here N_PROF (and both LONGITUDE and LATITUDE) would be on a regular grid, or at least cover more locations (longer N_PROF), though it’s possible LATITUDE would be a single latitude (26.5).

RAPID also provides layer transports which are single time series with names like t_therm10, t_aiw10, t_ud10, t_ld10, etc, which are between specified depth ranges. These could be simply: TRANSPORT (N_LEVELS, TIME) with DEPTH_BOUND (N_LEVELS, 2) to give an upper and lower bound on the depths used to produce transport in layers? It would also need something like TRANSPORT_NAME (N_LEVELS) of type string.
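A minimal sketch of that layout, assuming NumPy arrays. The transport values are made up, and the depth ranges are placeholders for illustration only; the real bounds would need to come from the RAPID documentation. The names `series`, `transport`, `depth_bound` and `transport_name` are hypothetical.

```python
import numpy as np

# Layer variable names as they appear in the RAPID product.
layer_names = ["t_therm10", "t_aiw10", "t_ud10", "t_ld10"]

# ASSUMPTION: placeholder depth ranges in metres, one (upper, lower) pair
# per layer. Replace with the documented bounds before real use.
depth_bound = np.array([
    [0.0, 800.0],
    [800.0, 1100.0],
    [1100.0, 3000.0],
    [3000.0, 5000.0],
])

# Made-up transports, one (TIME,) series per layer.
series = {
    "t_therm10": np.array([18.0, 17.5, 19.1]),
    "t_aiw10": np.array([0.4, 0.6, 0.5]),
    "t_ud10": np.array([-11.0, -12.3, -10.8]),
    "t_ld10": np.array([-6.5, -5.9, -7.2]),
}

# TRANSPORT (N_LEVELS, TIME): row i belongs to the layer layer_names[i],
# whose depth range is depth_bound[i].
transport = np.stack([series[name] for name in layer_names], axis=0)
transport_name = np.array(layer_names, dtype=str)

print(transport.shape, depth_bound.shape, transport_name.shape)
```

Building the array from an ordered list of names keeps TRANSPORT, DEPTH_BOUND and TRANSPORT_NAME aligned along N_LEVELS.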

Check CF conventions for standard names: https://github.com/cf-convention/vocabularies/issues. Note that standard names consist of lower-case letters, digits and underscores, and begin with a letter. Upper case is not used. See the guidelines at https://cfconventions.org/Data/cf-standard-names/docs/guidelines.html.
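The character rules quoted above can be checked with a short regular expression. This sketch only tests the syntax of a candidate name. It does not check whether the name exists in the CF standard name table, and the helper `looks_like_standard_name` is a hypothetical name, not part of amocarray.

```python
import re

# Lower-case letters, digits and underscores only, starting with a letter.
_STANDARD_NAME_PATTERN = re.compile(r"[a-z][a-z0-9_]*")


def looks_like_standard_name(name: str) -> bool:
    """Return True if `name` satisfies the character rules for standard names."""
    return _STANDARD_NAME_PATTERN.fullmatch(name) is not None


print(looks_like_standard_name("sea_water_temperature"))  # True
print(looks_like_standard_name("MOC_ALL"))                # False (upper case)
print(looks_like_standard_name("10m_wind"))               # False (starts with a digit)
```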

  • moc_vertical.nc:

    • Convert to OceanSITES: Here, we should change the dimension names to all-caps DEPTH and TIME. Units on the streamfunction should be sverdrup to avoid a conflict with Sv for sievert. According to OceanSITES, the order of the dimensions should be T, Z, Y, X, so the streamfunction should be (TIME, DEPTH). The filename should be something like OS_RAPID_YYYYMMDD-YYYYMMDD_DPR_moc_vertical.nc. Here, we are using the OS prefix, RAPID as the PlatformCode, the start and end dates for the DeploymentCode, and DPR (derived product) as the data mode. The additional text after that is the original filename, moc_vertical.nc.

  • ts_gridded.nc:

    • Convert to OceanSITES: Dimensions should be TIME and DEPTH, where the coordinate name can be PRES for pressure. The featureType global attribute can be timeSeriesProfile.

  • moc_transports.nc:

  • meridional_transports.nc:
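The filename pattern described for moc_vertical.nc above could be assembled as follows. This is a sketch: the function name `oceansites_filename` is hypothetical, and the dates in the usage line are placeholders, not the actual extent of the RAPID record.

```python
from datetime import date


def oceansites_filename(platform_code: str, start: date, end: date,
                        data_mode: str, original_name: str) -> str:
    """Build OS_<PlatformCode>_<DeploymentCode>_<DataMode>_<original filename>."""
    # DeploymentCode is taken here to be the start and end dates, YYYYMMDD-YYYYMMDD.
    deployment_code = f"{start:%Y%m%d}-{end:%Y%m%d}"
    return f"OS_{platform_code}_{deployment_code}_{data_mode}_{original_name}"


# Placeholder dates for illustration only.
name = oceansites_filename("RAPID", date(2004, 4, 2), date(2020, 12, 31),
                           "DPR", "moc_vertical.nc")
print(name)  # OS_RAPID_20040402-20201231_DPR_moc_vertical.nc
```

Taking the dates as `date` objects rather than strings means the YYYYMMDD formatting is applied in one place.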

Potential reformats:

Key Products:

  • Overturning:

    • MOC: time series (dimension: TIME)

    • STREAMFUNCTION: (DEPTH, TIME) - this is the vertical profile of MOC (originally stream_function_mar in moc_vertical.nc, note that this extends deeper than the depth grid in ts_gridded.nc due to the incorporation of an AABW profile).

  • Profiles: TEMPERATURE, SALINITY, vertically gridded at mooring locations.

    • Dimensions: TIME, N_PROF, N_LEVELS (242,1)

    • Coordinates: LATITUDE, LONGITUDE (N_PROF=5,) - these would be the locations of the profiles, which are currently in the "long name" of each of TG_west, TG_east, TG_wb3, TG_MARWEST, TG_mareast, etc. TIME in datetime, and PRESSURE (N_LEVELS,), which is the depth grid in ts_gridded.nc.

    • Variables: TEMPERATURE, SALINITY, TEMPERATURE_FLAG, SALINITY_FLAG (TIME, N_PROF, N_LEVELS). Attributes would specify units and the version of temperature/salinity. The flags would have an attribute describing what the values mean (e.g. “1=good, 2=bad, etc”).

  • Gridded sections: TEMPERATURE, SALINITY, VELOCITY

    • Dimensions: TIME, N_PROF, N_LEVELS (13000, longitude grid, 242?)

    • Coordinates: LATITUDE, LONGITUDE (N_PROF=longitude grid,), TIME in datetime, and PRESSURE (N_LEVELS,)

    • Variables: TEMPERATURE, SALINITY, VELOCITY (TIME, N_PROF, N_LEVELS). Attributes would specify units and the version of temperature/salinity. Any flags would have an attribute describing what the values mean (e.g. “1=good, 2=bad, etc”).

  • Layer transports:

    • Dimensions: TIME, N_LEVELS (13779, 5)

    • Coordinates: LATITUDE, LONGITUDE_BOUNDS (scalar, x2), TIME in datetime. And DEPTH_BOUND (N_LEVELS, 2) - this would be the depth bounds for the transport layers.

    • Variables: TRANSPORT (TIME, N_LEVELS) - this would be the time series of transport in layers. This would also have DEPTH_BOUND (N_LEVELS, 2) to give an upper and lower bound on the depths used to produce transport in layers. It would also need something like TRANSPORT_NAME (N_LEVELS, string) to indicate what the layer is (e.g. t_therm10, t_aiw10, etc).

  • Component transports:

    • Dimensions: TIME, N_COMPONENT (13779, 5)

    • Coordinates: LATITUDE, LONGITUDE_BOUNDS (scalar, x2), TIME in datetime. N_COMPONENT for the number of components.

    • Variables: TRANSPORT (TIME, N_COMPONENT) - This would also have TRANSPORT_NAME (N_COMPONENT, string) to indicate what the component is (e.g. t_gs10, t_ek10, etc). This would be similar to the layer transport but without the depth bounds.

MOVE conversion thoughts

MOVE provides the TRANSPORT_TOTAL which corresponds to the MOC, but also things like transport_component_internal (TIME,), transport_component_internal_offset (TIME,), and transport_component_boundary (TIME,). This would be similar to RAPID’s version of “interior transport” and “western boundary wedge”, but it’s not so clear how to make these similarly named.

  • Notes: Similar in structure to RAPID layer decomposition but naming is inconsistent between RAPID and MOVE.

Potential reformats:

  • Overturning:

    • MOC: time series (dimension: TIME)

  • Component transports:

    • Dimensions: TIME, N_COMPONENT (13779, 3)

    • Coordinates: LATITUDE, LONGITUDE_BOUNDS (scalar, x2), TIME in datetime. N_COMPONENT for the number of components.

    • Variables: TRANSPORT (TIME, N_COMPONENT) - This would also have TRANSPORT_NAME (N_COMPONENT, string) to indicate what the component is (e.g. transport_component_internal, transport_component_internal_offset, transport_component_boundary, etc).

SAMBA conversion thoughts

SAMBA (Upper_Abyssal_Transport_Anomalies.txt) has two main variables with dimension (TIME,), with names like ‘upper-cell volume transport anomaly’. This suggests a quantity TRANSPORT_ANOMALY (N_LEVELS, TIME), where we would again have a DEPTH_BOUND (N_LEVELS, 2).

But the other SAMBA product (MOC_TotalAnomaly_and_constituents.asc) also has a “Total MOC anomaly” (MOC) and a “Relative (density gradient) contribution”, which is like MOVE’s internal or RAPID’s interior. There is a “Reference (bottom pressure gradient) contribution”, which is like MOVE’s offset or RAPID’s compensation. There is an Ekman contribution (all arrays have this, and it will need an attribute with the source of the wind fields used). There are also separate “Western density contribution” and “Eastern density contribution” variables, which are not available in the RAPID project and are not the same idea as OSNAP west and OSNAP east, but could suggest an (N_PROF=2, TIME) layout for west and east.

Potential reformats:

  • Overturning:

    • MOC: time series (dimension: TIME)

Note: Check the readme to see what the relationship is between the upper, abyssal and MOC transports.

  • Component transports:

    • Dimensions: TIME, N_COMPONENT (1404, 7)

    • Coordinates: LATITUDE, LONGITUDE_BOUNDS (scalar, x2), TIME in datetime. N_COMPONENT for the number of components.

    • Variables: TRANSPORT (TIME, N_COMPONENT) - This would also have TRANSPORT_NAME (N_COMPONENT, string) to indicate what the component is (e.g. RELATIVE_MOC, BAROTROPIC_MOC, EKMAN, WESTERN_DENSITY, etc).

Note: It would be good to verify how these components should (or shouldn’t) add up to the total transports.
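One way to run that check, sketched with NumPy on made-up numbers: subtract the sum over N_COMPONENT from the total at each time step and look at the residual. Which components are expected to sum to the total is not established here, so the function `closure_residual` and its inputs are purely illustrative.

```python
import numpy as np


def closure_residual(total, components):
    """Return total minus the sum of the components at each time step.

    total has shape (TIME,) and components has shape (TIME, N_COMPONENT).
    """
    total = np.asarray(total, dtype=float)
    components = np.asarray(components, dtype=float)
    return total - components.sum(axis=1)


# Made-up values: the first two time steps close exactly, the third does not.
total = np.array([1.0, -2.0, 0.5])
components = np.array([
    [0.5, 0.5],
    [-1.0, -1.0],
    [0.25, 0.0],
])

residual = closure_residual(total, components)
print(np.isclose(residual, 0.0))  # [ True  True False]
```

A residual that is consistently near zero would suggest the chosen components add up to the total. A systematic offset would suggest that some selected components overlap or that a term is missing.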