AMOCatlas Format AC1
This document defines the AC1 (Atlantic Circulation) standard data format, an OceanSITES variant with small deviations. The format provides interoperability between moored estimates of overturning transport from the RAPID, OSNAP, MOVE and SAMBA arrays while ensuring compliance with international oceanographic data standards. The deviations are listed in 2.1 Deviations from OceanSITES Standard.
Relationship to Other Format Documents:
Array Format (Native / Original) - Documents native data formats from each array
Converting to OceanSITES format - Describes conversion strategies from native to standardized formats
AMOCatlas Format AC1 - Current standardized output format implementation
This document (format_AC1) - AC1 format with full OceanSITES integration
1. Overview & Context
The AC1 format builds on OceanSITES v1.4 for discoverability and interoperability, and adds controlled-vocabulary metadata and standardised naming conventions.
The AC1 format provides:
Full Standards Compliance: Complete implementation of CF Conventions 1.8, OceanSITES 1.4, and ACDD 1.3
Enhanced Discoverability: Rich metadata using controlled vocabularies for global data catalogs
Workflow Integration: Compatible with existing AMOCatlas workflows
International Interoperability: Compliance with OceanSITES and GDAC requirements
Provenance Tracking: Comprehensive attribution to original data providers
Extensibility: Supports future variables and array additions
1.1 Relationship to Standards
The AC1 format represents the standardization level in the AMOCatlas hierarchy:
Native Formats → Internal standardised → AC1 Standard
(format_orig) (format_Atlas) (format_AC1)
Compliance Framework: OceanSITES AC1 datasets are designed to meet:
CF Conventions 1.8 compliance (validation tools to be implemented)
OceanSITES 1.4 compatibility (with documented deviations as specified in 2.1 Deviations from OceanSITES Standard)
AMOCatlas-specific validation rules (see amocatlas.compliance_checker)
ACDD-1.3 metadata structure
Standards Integration: The format integrates multiple international standards:
CF Conventions 1.8: Climate and Forecast metadata conventions
OceanSITES 1.4: Ocean observing time series data format
ACDD 1.3: Attribute Convention for Data Discovery
NERC Vocabularies: Controlled vocabularies for oceanographic parameters
2. Key Design Decisions
The AC1 format incorporates several design decisions that enhance interoperability while maintaining scientific accuracy and usability.
2.1 Deviations from OceanSITES Standard
AC1 implements OceanSITES 1.4 with the following deviations optimized for AMOC array data:
| Feature | OceanSITES Standard | AC1 Format |
|---|---|---|
| Date Format | | Compact ISO 8601: |
| Contributor Metadata | | |
| Density Coordinates | Depth/pressure coordinates only | |
| Component Dimension | Not specified | |
| Coordinate Units | | |
| Transport Units | | |
Rationale for Deviations:
Contributor Pattern: Unified contributor_* approach simplifies metadata management while providing equivalent functionality
Sigma Coordinates: Essential for density-based transport calculations in some arrays
Component Dimension: Enables systematic representation of transport decompositions across arrays
Coordinate Units: UDUNITS-2 singular forms provide better tool compatibility than OceanSITES plural forms
Sverdrup Unit: Full spelling prevents confusion with Sv (sievert radiation unit)
These deviations maintain CF compliance and ISO 8601 compatibility while optimizing for AMOC-specific scientific requirements.
3. File Organisation & Naming
3.1 File Naming Convention
Files follow the OceanSITES naming pattern with AMOC-specific modifications:
Basic Pattern: OS_[PLATFORM]_[DEPLOYMENT]_[MODE]_[PARAMS].nc
Components:
OS = OceanSITES prefix (maintains compatibility)
[PLATFORM] = Platform identifier (e.g., “RAPID”, “OSNAP”)
[DEPLOYMENT] = Deployment code (e.g., “20040401-20230211” for date range)
[MODE] = Data mode: R (real-time), P (provisional), D (delayed-mode)
[PARAMS] = Parameter identifier (e.g., “transports_T12H”, “sections_T1M”)
Examples:
OS_RAPID_20040401-20230211_D_transports_T12H.nc - Delayed-mode transport data
OS_OSNAP_20140801-20200601_D_sections_T1M.nc - Delayed-mode section data
Reference: See OceanSITES file naming in “4.1.1 Deployment Data files Naming Convention” of the OceanSITES manual (https://ocean-uhh.github.io/oceanarray/oceanSITES_manual.html#data-files).
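The naming pattern above can be sketched in a few lines of Python. This is an illustration only: `build_ac1_filename` and `parse_ac1_filename` are hypothetical helpers, not part of amocatlas, and the sketch assumes that the platform and deployment components never contain underscores.

```python
VALID_MODES = {"R", "P", "D"}  # real-time, provisional, delayed-mode


def build_ac1_filename(platform, deployment, mode, params):
    """Assemble OS_[PLATFORM]_[DEPLOYMENT]_[MODE]_[PARAMS].nc (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown data mode: {mode!r}")
    if "_" in platform or "_" in deployment:
        # Assumption of this sketch: only PARAMS may contain underscores.
        raise ValueError("platform and deployment must not contain underscores")
    return f"OS_{platform}_{deployment}_{mode}_{params}.nc"


def parse_ac1_filename(filename):
    """Split a filename into its five components (hypothetical helper)."""
    if not (filename.startswith("OS_") and filename.endswith(".nc")):
        raise ValueError(f"not an OS_*.nc filename: {filename!r}")
    # Split at most four times so underscores inside PARAMS are preserved.
    parts = filename[:-3].split("_", 4)
    if len(parts) != 5:
        raise ValueError(f"expected five components in {filename!r}")
    _, platform, deployment, mode, params = parts
    return {
        "platform": platform,
        "deployment": deployment,
        "mode": mode,
        "params": params,
    }


name = build_ac1_filename("RAPID", "20040401-20230211", "D", "transports_T12H")
fields = parse_ac1_filename("OS_OSNAP_20140801-20200601_D_sections_T1M.nc")
```

The parser deliberately does not validate the mode token, so it can be used to inspect existing filenames before deciding whether they conform.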
4. Global Attributes
Following OceanSITES 1.4, ACDD 1.3, and CF 1.8 requirements for comprehensive metadata.
Note
Requirement Status: M = Mandatory, HD = Highly Desired, S = Suggested
4.1 Discovery and Identification
| Attribute | Example | Description | Vocabulary | RS |
|---|---|---|---|---|
| | “RAPID” | OceanSITES site identifier | OceanSITES Registry | M |
| | “RAPID” | Array grouping identifier | Custom AMOCatlas | M |
| | “D” | Data mode: R=real-time, P=provisional, D=delayed | OceanSITES Standard | M |
| | “RAPID-MOCHA Transport Time Series” | Human-readable dataset title | Free text | HD |
| | “Transport Moored Arrays” | OceanSITES theme classification | OceanSITES Themes | S |
| | “AMOCatlas” | Authority providing the dataset ID | Reverse DNS recommended | S |
| | “OS_RAPID_20040402-20240327_DPR_transports_T12H” | Unique dataset identifier (filename without .nc) | OceanSITES Pattern | M |
| | “Oceanographic mooring data from the RAPID array at 26°N…” | Extended description for discovery (≤100 words) | Free text | S |
| | “subsurface mooring” | Platform type from controlled vocabulary | SeaVoX L06 | HD |
| | “EARTH SCIENCE > Oceans > Ocean Circulation” | Discovery keywords (comma-separated) | GCMD preferred | S |
| | “GCMD Science Keywords” | Vocabulary source for keywords | Standards reference | S |
| | “Preliminary version; subject to revision” | Miscellaneous information | Free text | S |
4.2 Provenance
Consolidates OceanSITES creator_* and principal_investigator_* fields into unified contributor_* attributes supporting multiple contributors following OG1 patterns.
| Attribute | Example | Description | Vocabulary | RS |
|---|---|---|---|---|
| | “Dr. Jane Doe, Dr. John Smith” | Names of dataset contributors (comma-separated) | Free text | M |
| | | Email addresses (aligned with names) | Email format | M |
| | | Persistent IDs (ORCID preferred) | ORCID/ISNI URLs | HD |
| | “principalInvestigator, creator” | Roles (aligned with names) | NERC W08 | M |
| | | Vocabulary for contributor roles | Standards reference | M |
| | “University of Hamburg, National Oceanography Centre” | Institutional contributors | Free text | M |
| | | Institutional identifier vocabulary | ROR/EDMO preferred | HD |
| | “operator, dataProvider” | Institutional roles | NERC W08 | M |
| | | Vocabulary for institutional roles | Standards reference | M |
Standard Contributor Roles: Data scientist, Manufacturer, PI, Technical Coordinator, Operator, Owner
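Because names, emails and roles are stored as parallel comma-separated strings, they must stay aligned by position. A minimal sketch of how such strings might be assembled follows; `contributor_attributes` is a hypothetical helper, and the attribute names `contributor_name`, `contributor_email` and `contributor_role` are taken from the example header in section 8.1.

```python
def contributor_attributes(people):
    """Build position-aligned contributor_* strings (hypothetical helper).

    `people` is a list of dicts with the keys "name", "email" and "role".
    """
    for person in people:
        # A comma inside any field would break the positional alignment.
        for key in ("name", "email", "role"):
            if "," in person[key]:
                raise ValueError(f"{key} must not contain a comma: {person[key]!r}")
    return {
        "contributor_name": ", ".join(p["name"] for p in people),
        "contributor_email": ", ".join(p["email"] for p in people),
        "contributor_role": ", ".join(p["role"] for p in people),
    }


def split_contributors(attrs):
    """Recover one dict per contributor from the aligned strings."""
    names = attrs["contributor_name"].split(", ")
    emails = attrs["contributor_email"].split(", ")
    roles = attrs["contributor_role"].split(", ")
    if not (len(names) == len(emails) == len(roles)):
        raise ValueError("contributor_* attributes are not aligned")
    return [
        {"name": n, "email": e, "role": r}
        for n, e, r in zip(names, emails, roles)
    ]


people = [
    {"name": "Dr. Jane Doe", "email": "jane.doe@example.org", "role": "principalInvestigator"},
    {"name": "Dr. John Smith", "email": "john.smith@example.org", "role": "creator"},
]
attrs = contributor_attributes(people)
```

The email addresses here are placeholders; the round trip through `split_contributors` is a cheap way to detect misaligned attributes in an existing file.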
Provenance and Data History
| Attribute | Example | Description | Format | RS |
|---|---|---|---|---|
| | “RAPID data collected and made freely available by the RAPID program…” | Attribution to original data providers (semicolon-separated) | Free text | M |
| | “https://doi.org/10.35090/gatech/70342; https://doi.org/10.1029/2018GL077408” | DOIs of source datasets (semicolon-separated) | DOI URLs | M |
| | “0.3.0” | Version of amocatlas used for processing | Semantic version | M |
| | | Links to project websites (semicolon-separated) | URLs | S |
| | “2004-04-02T00:00:00Z” | Overall dataset start time | ISO 8601 | M |
| | | DOI assigned to converted dataset (if available) | DOI URL | S |
4.3 Geospatial and Temporal Coverage
| Attribute | Example | Description | Format | RS |
|---|---|---|---|---|
| | 26.0 | Southernmost latitude | Decimal degrees | M |
| | 26.5 | Northernmost latitude | Decimal degrees | M |
| | “degrees_north” | Latitude units | UDUNITS-2 | S |
| | -80.0 | Westernmost longitude | Decimal degrees | M |
| | -13.0 | Easternmost longitude | Decimal degrees | M |
| | “degrees_east” | Longitude units | UDUNITS-2 | S |
| | 0.0 | Minimum depth/height | Meters | M |
| | 5000.0 | Maximum depth/height | Meters | M |
| | “down” | Vertical direction convention | “up” or “down” | S |
| | “m” | Vertical coordinate units | UDUNITS-2 | S |
| | “2004-04-02T00:00:00Z” | Dataset start time | ISO 8601 | M |
| | “2024-03-27T23:59:59Z” | Dataset end time | ISO 8601 | M |
| | “P19Y11M25D” | Dataset duration | ISO 8601 Duration | S |
| | “PT12H” | Temporal resolution | ISO 8601 Duration | S |
| | “North Atlantic Ocean” | Geographical coverage | SeaVoX C19 | S |
Time Format Rationale: The compact YYYYmmddTHHMMss format reduces attribute string length while maintaining human readability and ISO 8601 compatibility.
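As an illustration of the compact form, the sketch below converts between a timezone-aware `datetime`, the compact `YYYYmmddTHHMMss` string and the extended ISO 8601 string used in the attribute examples. It assumes all times are UTC; `to_compact`, `from_compact` and `to_extended` are hypothetical helper names.

```python
from datetime import datetime, timezone

COMPACT = "%Y%m%dT%H%M%S"          # YYYYmmddTHHMMss
EXTENDED = "%Y-%m-%dT%H:%M:%SZ"    # extended form with a trailing Z


def to_compact(moment):
    """Format a timezone-aware datetime as a compact UTC string."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime(COMPACT)


def from_compact(text):
    """Parse a compact string, interpreting it as UTC (assumption of this sketch)."""
    return datetime.strptime(text, COMPACT).replace(tzinfo=timezone.utc)


def to_extended(moment):
    """Format a timezone-aware datetime in the extended ISO 8601 form."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime(EXTENDED)


start = datetime(2004, 4, 2, tzinfo=timezone.utc)
compact_start = to_compact(start)      # "20040402T000000"
extended_start = to_extended(start)    # "2004-04-02T00:00:00Z"
```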
File dates: The file dates, date_created and date_modified, follow our interpretation of the ACDD definitions. date_created is the time stamp on the file; date_modified may be used to represent the ‘version date’ of the geophysical data in the file. date_created may change when, for example, metadata is added or the file format is updated, so the optional date_modified MAY be earlier than date_created.
Geospatial extents: The geospatial extents (geospatial_lat_min, geospatial_lat_max, geospatial_lon_min, geospatial_lon_max) are preferably stored as strings for use in the GDAC software, although numeric fields are acceptable. This information is linked to the site information and may not be specific to the platform deployment.
4.4 Publication and Licensing
| Attribute | Example | Description | Format | RS |
|---|---|---|---|---|
| | “AMOCatlas Development Team” | Data publisher name | Free text | S |
| | | Publisher web address | URL | S |
| | “http://www.oceansites.org, https://doi.org/10.1029/2018GL077408” | Relevant publications and resources (semicolon-separated) | URLs/DOIs | S |
| | “CC-BY-4.0” | Data license | License identifier | S |
| | “These data were collected and made freely available by the OceanSITES program…” | Recommended citation text | Free text | S |
| | “Principal funding provided by Horizon Europe EPOC project…” | Funding and support acknowledgements | Free text | S |
4.5 Technical and Processing Information
| Attribute | Example | Description | Format | RS |
|---|---|---|---|---|
| | “timeSeries” | CF discrete sampling geometry type | CF Standard | M |
| | “OceanSITES time-series data” | OceanSITES data type classification | OceanSITES Standard | M |
| | “1.4” | OceanSITES format version | Version string | M |
| | “CF-1.8, OceanSITES-1.4, ACDD-1.3” | Metadata conventions followed | Standards list | S |
| | “RAPID26N” | Unique platform identifier | Free text | M |
| | “excellent” | Overall quality assessment | OceanSITES QC levels | S |
| | “Data verified against model or other contextual information” | Processing level description | OceanSITES levels | S |
| | “2025-01-15T10:30:00Z” | File creation timestamp | ISO 8601 | M |
| | “2025-01-15T10:30:00Z” | Last modification timestamp | ISO 8601 | S |
| | “2025-01-15T10:30:00Z: Converted to AC1 using amocatlas v0.3.0” | Processing history log | Timestamped entries | S |
5. Dimensions & Coordinates
Following CF conventions, dimensions are ordered as T, Z, Y, X with component dimensions leftmost:
| Category | Dimensions | Description |
|---|---|---|
| Component | | Transport components (optional) |
| Temporal | | Time coordinate (unlimited) |
| Vertical | | Vertical coordinates (optional) |
| Horizontal | | Horizontal coordinates (optional) |
Warning
All datasets must include the TIME dimension. Other dimensions are optional depending on data type (timeSeries vs timeSeriesProfile).
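The ordering rule (component dimensions leftmost, then T, Z, Y, X) and the mandatory TIME dimension can be expressed as a small sort. This is a sketch under assumptions: `order_dimensions` is a hypothetical helper, each dimension is described by its CF `axis` letter, and component dimensions are marked with `None`. The dimension names are those of the example in section 8.1.

```python
# Sort rank per CF axis letter; None marks a component dimension (leftmost).
AXIS_RANK = {None: 0, "T": 1, "Z": 2, "Y": 3, "X": 4}


def order_dimensions(dims):
    """Return dimension names ordered component, T, Z, Y, X.

    `dims` maps a dimension name to its CF axis letter, or to None
    for a component dimension.
    """
    if "T" not in dims.values():
        raise ValueError("a TIME dimension (axis T) is required")
    unknown = set(dims.values()) - set(AXIS_RANK)
    if unknown:
        raise ValueError(f"unsupported axis values: {sorted(unknown)}")
    # sorted() is stable, so dimensions sharing a rank keep their input order.
    return sorted(dims, key=lambda name: AXIS_RANK[dims[name]])


ordered = order_dimensions({"LATITUDE": "Y", "TIME": "T", "N_COMPONENT": None})
```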
Variable |
Dimension |
Attributes and Requirements |
RS |
|---|---|---|---|
|
|
Data Type: double (datetime64[ns]) Required Attributes:
|
M |
|
scalar or |
Data Type: float32 Required Attributes:
|
HD |
|
scalar or |
Data Type: float32 Required Attributes:
|
HD |
|
|
Data Type: float32 Required Attributes:
|
S |
|
|
Data Type: float32 Required Attributes:
|
S |
|
|
Data Type: float32 Required Attributes:
|
S |
6. Data Variables & QC
6.1 Transport Variables
Variable Name |
Variable Attributes |
RS |
|---|---|---|
|
|
HD |
|
|
HD |
|
|
S |
|
|
S |
6.2 Hydrographic Variables
Variable Name |
Variable Attributes |
RS |
|---|---|---|
|
|
HD |
|
|
HD |
|
|
S |
|
|
S |
6.3 Descriptive Variables
Variable Name |
Variable Attributes |
RS |
|---|---|---|
|
|
HD |
|
|
S |
Note
Requirement Status: M = Mandatory, HD = Highly Desired, S = Suggested
6.4 Variable-Level Quality Control
For variables requiring quality control, implement OceanSITES QC conventions:
| QC Variable | Dimensions | Attributes and Values | RS |
|---|---|---|---|
| | Same as parent variable | Data Type: byte. Required Attributes: long_name = “Quality flag for <parameter_name>”; flag_values = [0, 1, 2, 3, 4, 7, 8, 9]; flag_meanings = “unknown good_data probably_good_data potentially_correctable_bad_data bad_data nominal_value interpolated_value missing_value”; valid_min = 0; valid_max = 9 | S |
| | Same as parent variable | Data Type: float32. Required Attributes: long_name = “Uncertainty estimate for <parameter_name>”; units = same as parent variable; technique_title = “Description of uncertainty estimation method” | S |
7. Conversion Tools and Implementation
7.1 Enhanced Conversion Function
To produce AC1 compliant datasets from standardized inputs:
from amocatlas.convert import to_AC1

# Convert with enhanced metadata
ds_ac1 = to_AC1(
    ds_standardized,
    array_metadata_yaml="metadata/rapid_array.yml",
    validate=True,
    gdac_compliant=True
)
7.2 Conversion Process
The conversion function performs these operations:
Input Validation: Verify standardized dataset structure
Metadata Integration: Load and apply array-specific YAML metadata
Attribute Enhancement: Add comprehensive global attributes following OceanSITES/ACDD standards
Variable Standardization: Ensure proper standard names, units, and vocabularies
Quality Control: Apply QC flags and uncertainty estimates where available
File Naming: Generate OceanSITES-compliant filename
Compliance Validation: Run CF Checker and OceanSITES validation
Output Generation: Write NetCDF4 file with optimal compression and chunking
7.3 Validation Tools
All AC1 datasets must pass comprehensive validation:
| Validation Category | Requirements |
|---|---|
| File Naming | Must match OceanSITES pattern: |
| Global Attributes | All mandatory (M) attributes must be present with valid values |
| Coordinate Variables | TIME dimension required; appropriate axis attributes; valid units |
| Data Variables | Valid standard_name attributes; UDUNITS-2 compliant units; appropriate _FillValue |
| CF Compliance | Must pass CF Checker with zero errors |
| OceanSITES Compliance | Must meet OceanSITES 1.4 requirements for GDAC submission |
| Vocabulary Compliance | All controlled vocabulary references must resolve to valid terms |
from amocatlas.validation import validate_AC_proposed

# Comprehensive validation
validation_result = validate_AC_proposed(
    "OS_RAPID_20040402-20240327_DPR_transports_T12H.nc",
    checks=["cf", "oceansites", "acdd", "amocatlas"]
)
if validation_result.is_valid:
    print("Dataset is fully compliant with AC1 format")
else:
    print("Validation errors:", validation_result.errors)
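The "Global Attributes" requirement in the table above can be approximated without any special tooling. The sketch below is not the amocatlas validator: `missing_mandatory` is a hypothetical helper, and `EXAMPLE_MANDATORY` is only an illustrative subset of attribute names taken from the example header in section 8.1, not the authoritative list of mandatory attributes.

```python
# Illustrative subset only (assumption): names taken from the section 8.1 example.
EXAMPLE_MANDATORY = (
    "site_code",
    "array",
    "data_mode",
    "id",
    "platform_code",
    "featureType",
    "data_type",
    "format_version",
    "time_coverage_start",
    "time_coverage_end",
    "date_created",
)


def missing_mandatory(attrs, required=EXAMPLE_MANDATORY):
    """Return the required attribute names that are absent or blank in `attrs`."""
    missing = []
    for name in required:
        value = attrs.get(name)
        # Numeric zero is a valid value, so only None and blank strings count.
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


header = {
    "site_code": "RAPID",
    "array": "RAPID",
    "data_mode": "D",
    "id": "",
    "featureType": "timeSeries",
}
problems = missing_mandatory(header)
```

Here `problems` lists `id` (blank) together with every required name that is not in `header` at all, in the order given by `required`.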
8. Examples and Use Cases
8.1 RAPID Transport Time Series Example
File: OS_RAPID_20040402-20240327_DPR_transports_T12H.nc
netcdf OS_RAPID_20040402-20240327_DPR_transports_T12H {
dimensions:
    TIME = UNLIMITED ; // (14600 currently)
    N_COMPONENT = 8 ;
    LATITUDE = 1 ;
variables:
    double TIME(TIME) ;
        TIME:long_name = "Time" ;
        TIME:standard_name = "time" ;
        TIME:units = "seconds since 1970-01-01T00:00:00Z" ;
        TIME:calendar = "gregorian" ;
        TIME:axis = "T" ;
    float LATITUDE(LATITUDE) ;
        LATITUDE:long_name = "Latitude" ;
        LATITUDE:standard_name = "latitude" ;
        LATITUDE:units = "degrees_north" ;
        LATITUDE:valid_min = -90.0f ;
        LATITUDE:valid_max = 90.0f ;
        LATITUDE:axis = "Y" ;
    float MOC_TRANSPORT(TIME) ;
        MOC_TRANSPORT:long_name = "Maximum meridional overturning circulation transport" ;
        MOC_TRANSPORT:standard_name = "ocean_volume_transport_across_line" ;
        MOC_TRANSPORT:units = "sverdrup" ;
        MOC_TRANSPORT:coordinates = "TIME" ;
        MOC_TRANSPORT:_FillValue = NaNf ;
        MOC_TRANSPORT:vocabulary = "http://vocab.nerc.ac.uk/collection/P07/current/W946809H/" ;
    float TRANSPORT(N_COMPONENT, TIME) ;
        TRANSPORT:long_name = "Ocean volume transport components across line" ;
        TRANSPORT:standard_name = "ocean_volume_transport_across_line" ;
        TRANSPORT:units = "sverdrup" ;
        TRANSPORT:coordinates = "TIME" ;
        TRANSPORT:_FillValue = NaNf ;
    string TRANSPORT_NAME(N_COMPONENT) ;
        TRANSPORT_NAME:long_name = "Transport component names" ;
        TRANSPORT_NAME:coordinates = "N_COMPONENT" ;

// global attributes:
        :Conventions = "CF-1.8, OceanSITES-1.4, ACDD-1.3" ;
        :format_version = "1.4" ;
        :data_type = "OceanSITES time-series data" ;
        :featureType = "timeSeries" ;
        :data_mode = "D" ;
        :site_code = "RAPID" ;
        :array = "RAPID" ;
        :platform_code = "RAPID26N" ;
        :naming_authority = "AMOCatlas" ;
        :id = "OS_RAPID_20040402-20240327_DPR_transports_T12H" ;
        :title = "RAPID-MOCHA Transport Time Series at 26°N" ;
        :summary = "Meridional overturning circulation and component transports from the RAPID mooring array at 26°N in the Atlantic Ocean. Data processed to 12-hourly resolution with comprehensive quality control." ;
        :geospatial_lat_min = 26.0 ;
        :geospatial_lat_max = 26.5 ;
        :geospatial_lon_min = -80.0 ;
        :geospatial_lon_max = -13.0 ;
        :time_coverage_start = "2004-04-02T00:00:00Z" ;
        :time_coverage_end = "2024-03-27T23:59:59Z" ;
        :contributor_name = "Dr. David Smeed, Dr. Molly Baringer" ;
        :contributor_email = "david.smeed@noc.ac.uk, molly.baringer@noaa.gov" ;
        :contributor_role = "principalInvestigator, principalInvestigator" ;
        :source_acknowledgement = "RAPID data were collected and made freely available by the RAPID program and the national programs that contribute to it" ;
        :source_doi = "https://doi.org/10.5285/8cd7e7bb-9a20-05d8-e053-6c86abc012c2" ;
        :amocatlas_version = "0.3.0" ;
        :date_created = "2025-01-15T10:30:00Z" ;
        :history = "2025-01-15T10:30:00Z: Converted to AC1 using amocatlas v0.3.0" ;
}
9. Reference Tables
9.1 UDUNITS-2 Compliance
All units must follow the UDUNITS-2 standard for maximum compatibility and interoperability.
| Quantity | UDUNITS Format | Notes |
|---|---|---|
| Coordinates | | |
| Time | | ISO 8601 epoch reference (Unix timestamp) |
| Latitude | | UDUNITS-2 standard (singular form) |
| Longitude | | UDUNITS-2 standard (singular form) |
| Depth | | Standard SI unit, positive downward |
| Pressure | | Standard oceanographic unit (decibars) |
| Density | | SI derived unit for sigma coordinates |
| Physical Variables | | |
| Temperature | | Preferred over |
| Salinity | | Dimensionless (practical salinity scale) |
| Velocity | | SI derived unit (not |
| Transport Variables | | |
| Ocean Volume Transport | | 1 sverdrup = 10^6 m³/s (avoid |
| Heat Transport | | 1 PW = 10^15 W (preferred over |
| Freshwater Transport | | Same as volume transport |
Warning
Use lowercase sverdrup (not Sv) to avoid confusion with the sievert radiation unit. UDUNITS-2 recognizes sverdrup as the standard oceanographic transport unit.
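The transport conversion factors quoted in the table (1 sverdrup = 10^6 m³/s, 1 PW = 10^15 W) can be applied as follows. The function names are illustrative and not part of amocatlas.

```python
M3_PER_S_PER_SVERDRUP = 1.0e6   # 1 sverdrup = 10^6 m^3/s
WATT_PER_PETAWATT = 1.0e15      # 1 PW = 10^15 W


def m3s_to_sverdrup(volume_transport_m3s):
    """Convert a volume transport from m^3/s to sverdrup."""
    return volume_transport_m3s / M3_PER_S_PER_SVERDRUP


def sverdrup_to_m3s(volume_transport_sv):
    """Convert a volume transport from sverdrup to m^3/s."""
    return volume_transport_sv * M3_PER_S_PER_SVERDRUP


def watt_to_petawatt(heat_transport_w):
    """Convert a heat transport from W to PW."""
    return heat_transport_w / WATT_PER_PETAWATT


moc_sverdrup = m3s_to_sverdrup(17.0e6)      # 17.0 sverdrup
heat_petawatt = watt_to_petawatt(1.2e15)    # about 1.2 PW
```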
9.2 OceanSITES Reference table 1: data_type
The data_type global attribute should have one of the valid values listed here.
Data type |
|---|
OceanSITES profile data |
OceanSITES time-series data |
OceanSITES trajectory data |
9.3 OceanSITES Reference table 2: QC_indicator
The quality control flags indicate the data quality of the data values in a file. The byte codes in column 1 are used only in the <PARAM>_QC variables to describe the quality of each measurement, the strings in column 2 (‘meaning’) are used in the attribute <PARAM>:QC_indicator to describe the overall quality of the parameter.
When the numeric codes are used, the flag_values and flag_meanings attributes are required and should contain lists of the codes (comma-separated) and their meanings (space separated, replacing spaces within each meaning by ‘_’).
| Code | Meaning | Comment |
|---|---|---|
| 0 | unknown | No QC was performed. |
| 1 | good data | All QC tests passed. |
| 2 | probably good data | |
| 3 | potentially correctable bad data | These data are not to be used without scientific correction or re-calibration. |
| 4 | bad data | Data have failed one or more tests. |
| 5 | Not used | |
| 6 | Not used | |
| 7 | nominal value | Data were not observed but reported (e.g. instrument target depth). |
| 8 | interpolated value | Missing data may be interpolated from neighboring data in space or time. |
| 9 | missing value | This is a fill value. |
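The rule for deriving `flag_values` and `flag_meanings` from this table (codes listed in order, meanings space-separated with inner spaces replaced by underscores) can be sketched as follows. `qc_attributes` is a hypothetical helper; the codes and meanings are copied from the table above, omitting the unused codes 5 and 6.

```python
# Codes and meanings from Reference table 2 (5 and 6 are not used).
QC_FLAGS = {
    0: "unknown",
    1: "good data",
    2: "probably good data",
    3: "potentially correctable bad data",
    4: "bad data",
    7: "nominal value",
    8: "interpolated value",
    9: "missing value",
}


def qc_attributes(parameter_name):
    """Build the attribute dict for a <PARAM>_QC variable (hypothetical helper)."""
    return {
        "long_name": f"Quality flag for {parameter_name}",
        "flag_values": list(QC_FLAGS),
        # Spaces inside a meaning become underscores; meanings are space-separated.
        "flag_meanings": " ".join(m.replace(" ", "_") for m in QC_FLAGS.values()),
        "valid_min": min(QC_FLAGS),
        "valid_max": max(QC_FLAGS),
    }


def describe_flag(code):
    """Translate a numeric QC code into its meaning string."""
    return QC_FLAGS[code]


qc_attrs = qc_attributes("MOC_TRANSPORT")
```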
9.4 OceanSITES Reference table 3: Processing level
This table describes the quality control and other processing procedures applied to all the measurements of a variable. The string values are used as an overall indicator (i.e. one summarizing all measurements) in the attributes of each variable in the processing_level attribute.
Processing Level |
|---|
Raw instrument data |
Instrument data that has been converted to geophysical values |
Post-recovery calibrations have been applied |
Data has been scaled using contextual information |
Known bad data has been replaced with null values |
Known bad data has been replaced with values based on surrounding data |
Ranges applied, bad data flagged |
Data interpolated |
Data manually reviewed |
Data verified against model or other contextual information |
Other QC process applied |
9.5 OceanSITES Reference table 4: Data mode
The values for the variables “<PARAM>_DM”, the global attribute “data_mode”, and variable attributes “<PARAM>:DM_indicator” are defined as follows:
| Value | Meaning | Description |
|---|---|---|
| R | Real-time data | Data coming from the (typically remote) platform through a communication channel without physical access to the instruments, disassembly or recovery of the platform. Example: for a mooring with a radio communication, this would be data obtained through the radio. |
| P | Provisional data | Data obtained after instruments have been recovered or serviced; some calibrations or editing may have been done, but the data is not thought to be fully processed. Refer to the history attribute for more detailed information. |
| D | Delayed-mode data | Data published after all calibrations and quality control procedures have been applied on the internally recorded or best available original data. This is the best possible version of processed data. |
| M | Mixed | This value is only allowed in the global attribute “data_mode” or in attributes to variables in the form “<PARAM>:DM_indicator”. It indicates that the file contains data in more than one of the above states. In this case, the variable(s) <PARAM>_DM specify which data is in which data mode. |
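The relationship between per-sample modes and the global `data_mode` attribute can be illustrated with a short function: a file whose samples all share one mode reports that mode, and anything else reports "M". `overall_data_mode` is a hypothetical helper, not an amocatlas function.

```python
SAMPLE_MODES = {"R", "P", "D"}  # "M" is never valid for an individual sample


def overall_data_mode(sample_modes):
    """Derive the global data_mode from an iterable of per-sample modes."""
    modes = set(sample_modes)
    if not modes:
        raise ValueError("no sample modes given")
    unknown = modes - SAMPLE_MODES
    if unknown:
        raise ValueError(f"invalid sample modes: {sorted(unknown)}")
    # One distinct mode: report it. Several: the file is mixed.
    return modes.pop() if len(modes) == 1 else "M"


uniform = overall_data_mode(["D", "D", "D"])
mixed = overall_data_mode(["D", "D", "P"])
```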
9.6 OceanSITES Reference Table 6: Identifying data variables (subset)
| Parameter | CF Standard name or suggested Long name |
|---|---|
| CDIR | direction_of_sea_water_velocity |
| CNDC | sea_water_electrical_conductivity |
| CSPD | sea_water_speed |
| DEPTH | depth |
| DOX2 | moles_of_oxygen_per_unit_mass_in_sea_water (was dissolved_oxygen) |
| DOXY | mass_concentration_of_oxygen_in_sea_water (was dissolved_oxygen) |
| DOXY_TEMP | temperature_of_sensor_for_oxygen_in_sea_water |
| DYNHT | dynamic_height |
| FLU2 | fluorescence |
| HCSP | sea_water_speed |
| HEAT | heat_content |
| ISO17 | isotherm_depth |
| PCO2 | surface_partial_pressure_of_carbon_dioxide_in_air |
| PRES | sea_water_pressure |
| PSAL | sea_water_practical_salinity |
| TEMP | sea_water_temperature |
| UCUR | eastward_sea_water_velocity |
| VCUR | northward_sea_water_velocity |
10. Metadata Requirements and YAML Integration
10.1 Array-Specific Metadata Files
Metadata are provided as enhanced YAML files for each array, defining variable mappings, unit conversions, attributes, and contributor information.
Enhanced YAML Structure (osnap_array.yml):
# Array identification
array:
  name: "OSNAP"
  site_code: "OSNAP"
  platform_code: "OSNAP60N"
  sea_area: "North Atlantic Ocean"

# Spatial coverage
geospatial:
  lat_min: 59.0
  lat_max: 61.0
  lon_min: -45.0
  lon_max: -10.0
  vertical_min: 0.0
  vertical_max: 3000.0

# Contributors
contributors:
  - name: "Susan Lozier"
    email: "susan.lozier@duke.edu"
    orcid: "https://orcid.org/0000-0002-1234-5678"
    role: "PI"
    institution: "Duke University"
    institution_ror: "https://ror.org/00py81415"
    institution_role: "operator"

# Variable definitions
variables:
  temp:
    name: TEMPERATURE
    long_name: "Sea water temperature"
    standard_name: "sea_water_temperature"
    units: "degree_Celsius"
    vocabulary: "https://vocab.nerc.ac.uk/collection/P07/current/CFSN0335/"
    valid_min: -2.0
    valid_max: 40.0
  sal:
    name: SALINITY
    long_name: "Sea water practical salinity"
    standard_name: "sea_water_practical_salinity"
    units: "1"
    vocabulary: "http://vocab.nerc.ac.uk/collection/P07/current/IADIHDIJ/"
    valid_min: 0.0
    valid_max: 50.0
  moc_transport:
    name: MOC_TRANSPORT
    long_name: "Atlantic meridional overturning circulation transport"
    standard_name: "ocean_volume_transport_across_line"
    units: "sverdrup"
    vocabulary: "http://vocab.nerc.ac.uk/collection/P07/current/W946809H/"

# Provenance
provenance:
  source_acknowledgement: "OSNAP data were collected and made freely available by the OSNAP project and all the national programs that contribute to it (www.o-snap.org)"
  source_doi: "https://doi.org/10.35090/gatech/70342"
  web_link: "https://www.o-snap.org/"

# Processing
processing:
  qc_indicator: "excellent"
  processing_level: "Data verified against model or other contextual information"
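Once such a file has been parsed into nested dictionaries (for example with a YAML loader), part of it maps directly onto global attributes. The sketch below shows one possible mapping for the `array`, `geospatial` and `provenance` blocks only. `global_attributes_from_config` is a hypothetical helper, and the `geospatial_` prefixing is an assumption based on the ACDD attribute names used elsewhere in this document; the actual amocatlas mapping may differ.

```python
def global_attributes_from_config(config):
    """Map a parsed array metadata file onto a subset of global attributes."""
    array = config["array"]
    attrs = {
        "array": array["name"],
        "site_code": array["site_code"],
        "platform_code": array["platform_code"],
    }
    # lat_min -> geospatial_lat_min, vertical_max -> geospatial_vertical_max, ...
    for key, value in config.get("geospatial", {}).items():
        attrs[f"geospatial_{key}"] = float(value)
    # Provenance entries that already use the attribute name are copied as-is.
    provenance = config.get("provenance", {})
    for key in ("source_acknowledgement", "source_doi"):
        if key in provenance:
            attrs[key] = provenance[key]
    return attrs


# Stand-in for the parsed contents of osnap_array.yml (subset).
config = {
    "array": {"name": "OSNAP", "site_code": "OSNAP", "platform_code": "OSNAP60N"},
    "geospatial": {"lat_min": 59.0, "lat_max": 61.0, "vertical_max": 3000.0},
    "provenance": {"source_doi": "https://doi.org/10.35090/gatech/70342"},
}
osnap_attrs = global_attributes_from_config(config)
```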
11. Future Development and Extensions
11.1 Planned Enhancements
Multi-Array Integration: Support for datasets combining multiple arrays
Real-Time Data Streams: Extensions for operational oceanography
Machine-Readable Provenance: Integration with Research Data Alliance metadata standards
Cloud-Optimized Formats: Zarr and COG variants for cloud computing
11.2 Community Integration
AC1 format is designed for:
OceanSITES GDAC Submission: Full compliance for global data archive
CMIP Integration: Compatible with climate model evaluation workflows
ARGO Coordination: Harmonized with autonomous profiling float data standards
Regional Programs: Adaptable for other ocean observing arrays globally
12. Summary and Recommendations
The AC1 format extends AMOCatlas data standardization with comprehensive international standards compliance. Key benefits include:
For Data Providers:
- Simplified workflow for OceanSITES GDAC submission
- Enhanced discoverability through rich metadata
- Maintained compatibility with existing tools
For Data Users:
- Consistent interface across all AMOC arrays
- Full metadata for proper data citation and attribution
- Interoperability with international tools and standards
For the Community:
- Foundation for global AMOC data integration
- Template for other observing array programs
- Future-ready architecture for emerging requirements
We recommend adopting the AC1 format for all new AMOCatlas releases while keeping existing workflows supported. The enhanced metadata and standards compliance provide immediate value for data discovery and long-term preservation while ensuring continued scientific productivity.
—
Project Funding: AC1 format development is supported by the Horizon Europe project EPOC - Explaining and Predicting the Ocean Conveyor (Grant Agreement No. 101081012).
Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.