Migrating from DCAT-US v1.1 to v3.0
Overview
A step-by-step guide to updating your existing v1.1 data.json file to be valid against the DCAT-US v3.0 schema.
Source
Category
Keywords
Details
THIS PAGE IS A TECHNICAL GUIDE AND NOT MEANT TO PROVIDE IMPLEMENTATION GUIDANCE FOR M25-05 OR SPECIFIC FIELD DEFINITIONS. Want to know when that happens? Email us at DataGovHelp@gsa.gov, and we’ll let you know!
See an error on this page or have other feedback? Email us at DataGovHelp@gsa.gov
Overview
This guide covers the changes needed to make your existing v1.1 data.json file valid against the v3.0 schema. Each step below describes what needs to change and why.
If your agency has a developer or IT team, there is also a Python script developed to implement these changes and verify DCAT-US3.0 validity. See Using the migration script below.
Data.gov has tested this migration against 128 federal harvest sources covering 174,847 datasets. Before transformation, 15,955 datasets had validation errors against the v1.1 schema. After running the migration script and validating against DCAT-US3.0, that number dropped to 4,641 – a 71% reduction in invalid records and a 79% reduction in total validation errors. If your agency has a developer or IT team, hand them the migration script and point them to this page.
Using the migration script
A Python script is available that transforms the fields described in this guide. If your agency has a developer or IT staff, they can use this script to convert your agency’s data.json file automatically instead of making changes manually. This can also be integrated into your catalog system to export these fields correctly at catalog (DCAT-US3.0 file) creation time.
There are two files in the GSA/dcat-us repository that relate to the transformation from DCAT-US1.1 to 3.0:
-
transforms.py contains the individual translation functions, one per field. Developers who want to integrate the migration logic into their own systems can use these functions directly and/or review the logic for implementation.
-
convert_dcat_1_1_to_3_0.py is the full conversion tool. It fetches your agency’s catalog from a URL, runs all the field transformations, validates the result against the v3.0 schema, and writes the converted catalog to a file.
Important: The conversion tool fetches your catalog from a live URL. This was designed to test current DCAT-US1.1 implementations. The script will require modification for any direct implementation in your metadata pipeline.
Your agency can use these scripts as-is, adapt them, or use them as a reference when building your own migration process. Data.gov does not run these scripts on behalf of agencies.
The current conversion tool applies the catalog-level updates in convert_dcat_1_1_to_3_0.py, then runs dataset-level transforms. The overview of each step is below.
Please note not all steps may be necessary for all agencies or metadata providers, however applying all of these changes will make the DCAT-US1.1 metadata 3.0 compliant.
Transformation Steps
Step 1 — Fix modified
If you use repeating intervals like R/P1D or R/P1Y in modified, replace them with the actual date the data last changed. Move your update frequency to accrualPeriodicity using a plain-language code.
| v1.1 | v3.0 |
|---|---|
"modified": "R/P1Y" | "modified": "2024-10-01" |
"modified": "R/P1D" | "modified": "2024-10-15" |
Add accrualPeriodicity to express update frequency:
"accrualPeriodicity": "annually"
In the current script, repeating durations are mapped to plain-language values including daily, weekly, fortnightly, monthly, quarterly, biannually, annually, and continual. If the script cannot map the duration, it falls back to irregular. ISO 8601 repeating duration format (for example, R/P1Y) is also still accepted by the v3.0 schema.
See transform_modified() in transforms.py for how this translation is implemented.
Step 2 — Fix temporal
Replace the v1.1 ISO 8601 interval string with an array of PeriodOfTime objects. At least one of startDate or endDate is required per object. Open-ended periods are valid — omit endDate for ongoing datasets. The current script also strips a repeating prefix such as R/ or R5/ before evaluating the interval.
| v1.1 | v3.0 |
|---|---|
"temporal": "2000-01-15T00:00:00Z/2010-01-15T00:00:00Z" | |
"temporal": "2020-01-01T00:00:00Z/P1Y" | |
See transform_temporal() in transforms.py for how this translation is implemented.
Step 3 — Revise spatial
Replace the v1.1 plain string or comma-separated bounding box string with an array of Location objects. The current script converts bbox values in the format <minLon>,<minLat>,<maxLon>,<maxLat> into a WKT polygon. Other strings are carried into prefLabel.
| v1.1 | v3.0 |
|---|---|
"spatial": "United States" | |
"spatial": "-125,24,-66,50" | |
See transform_spatial() in transforms.py for how this translation is implemented.
Step 4 — Fix language
Replace RFC 5646 language tags with two-letter ISO 639-1 codes. The v3.0 schema enforces a maximum length of two characters — values like en-US will fail validation.
| v1.1 | v3.0 |
|---|---|
"language": ["en-US"] | "language": ["en"] |
"language": ["es-MX"] | "language": ["es"] |
"language": ["fr-CA"] | "language": ["fr"] |
In the current conversion script, this applies to language on Dataset and nested Distribution objects.
See transform_language() in transforms.py for how this translation is implemented.
Step 5 — Manage accessLevel alongside accessRights
accessLevel is not in the v3.0 schema. Add accessRights as a free-text string alongside your existing accessLevel field. Until updated OMB guidance is issued, keep accessLevel in your records — the v3.0 schema will not reject it.
v1.1 accessLevel | v3.0 accessRights (add alongside) |
|---|---|
"public" | "public" |
"restricted public" | "Access restricted. Contact the publisher to request access." |
"non-public" | "Not available for public release. Contact the publisher for more information." |
See transform_access_rights() in transforms.py for how this translation is implemented.
Step 6 — Add license to Distribution objects
In v3.0, license is defined at the Distribution level per W3C DCAT. The recommended approach is to add license to each Distribution object. You do not need to remove it from the Dataset level during the transition period — the v3.0 schema will not reject records that include it there.
| v1.1 (on Dataset) | v3.0 (add to each Distribution) |
|---|---|
| |
If all your distributions share the same license, the conversion script copies it to each Distribution that does not already declare its own license.
See propagate_license() in transforms.py for how this translation is implemented.
Step 7 — Convert rights to an array
In v3.0, rights is an array of strings. The conversion script wraps a single string value in an array and removes rights if the value is neither a string nor an array.
| v1.1 | v3.0 |
|---|---|
"rights": "This data is in the public domain." | "rights": ["This data is in the public domain."] |
See transform_rights() in transforms.py for how this translation is implemented.
Step 8 — Upgrade describedBy and remove describedByType
The conversion script upgrades describedBy from a plain URL string to a Distribution-like object at both the Dataset level and on nested Distribution objects. If describedByType is present, it is removed and carried into mediaType.
| v1.1 | v3.0 |
|---|---|
| |
See transform_described_by() in transforms.py for how this translation is implemented.
Step 9 — Wrap publisher.subOrganizationOf in arrays
In v1.1, publisher.subOrganizationOf is typically a single Organization object. In v3.0, it must be an array of Organization objects. The conversion script walks the full parent chain and wraps each level.
| v1.1 | v3.0 |
|---|---|
| |
See transform_sub_organization_of() in transforms.py for how this translation is implemented.
Step 10 — Upgrade conformsTo on Dataset and Distribution
The conversion script upgrades conformsTo from a plain URI string to an array containing a Standard object on both the Dataset and each nested Distribution. The current script preserves the identifier but does not add a title.
| v1.1 | v3.0 |
|---|---|
"conformsTo": "https://www.iso.org/standard/53798.html" | "conformsTo": [{"@type": "Standard", "identifier": "https://www.iso.org/standard/53798.html"}] |
See transform_conforms_to() in transforms.py for how this translation is implemented.
Step 11 — Upgrade landingPage
The conversion script upgrades a plain landingPage URL string to a Document object with accessURL. If the Dataset already has a title, that title is copied into the Document.
| v1.1 | v3.0 |
|---|---|
| |
See transform_landing_page() in transforms.py for how this translation is implemented.
Step 12 — Normalize issued
In v3.0, issued must be a valid ISO date or date-time. The conversion script keeps plain dates as dates, normalizes valid date-times to UTC Zulu format, and removes issued if it cannot be parsed.
| v1.1 | v3.0 |
|---|---|
"issued": "2015-06-02T00:00:00" | "issued": "2015-06-02" |
"issued": "2018-09-28T02:00:00-04:00" | "issued": "2018-09-28T06:00:00Z" |
See transform_issued() in transforms.py for how this translation is implemented.
Step 13 — Update conformsTo on the Catalog
Change the plain string URI to a Standard object pointing to DCAT-US v3.0.
| v1.1 | v3.0 |
|---|---|
"conformsTo": "https://project-open-data.cio.gov/v1.1/schema" | |
This catalog-level change is handled by the conversion script directly. See convert_dcat_1_1_to_3_0.py for the implementation.
Step 14 — Remove @context and describedBy from the Catalog
Both fields have been removed at the catalog level in v3.0. Delete these lines from your catalog object.
// Remove these lines from your catalog: "@context": "https://project-open-data.cio.gov/v1.1/schema/catalog.jsonld", "describedBy": "https://project-open-data.cio.gov/v1.1/schema/catalog.json"
This catalog-level change is handled by the conversion script directly. See convert_dcat_1_1_to_3_0.py for the implementation.
Step 15 — Normalize catalog-level modified
If the Catalog itself has a modified value, the conversion script normalizes valid date-times to UTC Zulu format. If the value cannot be parsed as a date, the script removes it.
| v1.1 | v3.0 |
|---|---|
"modified": "2024-10-01T12:30:00-04:00" | "modified": "2024-10-01T16:30:00Z" |
"modified": "unknown" | // removed |
This catalog-level change is handled by the conversion script directly. See convert_dcat_1_1_to_3_0.py for the implementation.
You are done with the minimum migration
After completing steps 1 through 15, your records should be much closer to the current conversion script output and should validate against the v3.0 schema more consistently. Run your updated data.json against the validation script at https://harvest.data.gov/validate/ to confirm.
Changelog
| Date | Change |
|---|---|
| 2026-06-24 | Rewrote overview; aligned the migration steps to the current conversion script; removed the split between breaking and structural changes; added coverage for catalog modified, rights, describedBy, subOrganizationOf, Dataset and Distribution conformsTo, landingPage, and issued. |
| 2026-05-27 | Added deep-link anchors to all table rows. No content changes. |