- Data Discovery and Inventory
- Records Retention Schedules
- File Formats and Data Conversion
- Archival Ingest Process and Data Validation
- Business Cases
- Data in Motion
Data Discovery and Inventory
Before geospatial data can be archived and shared, it must be identified. This process is unique for each state partner, affected by the current state of geographic data and how it is retained and shared. In North Carolina, retention schedules are being reviewed to identify items that include geospatial data. This information is then compared to the data currently ingested in NC OneMap, the state's geospatial data repository. In Utah, a similar process is taking place. Kentucky's GeoNet, however, is the authoritative source for GIS data; therefore the state's discovery and inventory process is reduced to reviewing the data in the repository.
Records Retention Schedules
State agencies all have records retention schedules; however, geospatial data may not be mentioned in all relevant cases. These disparities will be identified in the data discovery and inventory process. Records retention schedules must then be updated to reflect current practices in creation and use of geospatial data, and communicated to the agencies.
Learn more:
- Kentucky Geospatial Records Retention Schedule
- Michigan Geospatial Records Retention Schedule
- Maine Geospatial Records Retention Schedule
- NARA Transfer Instructions for Permanent Digital Geospatial Data Records
Retaining geospatial data, as with any other digital objects, is only effective if the information can be found. Metadata is essential to access: it not only provides background (provenance) for the data, but can also confirm or disprove authenticity and accuracy. Geospatial metadata may also include relationships and context for the files, along with creation information such as date created, intended use, and creator. An additional challenge for GIS metadata is that it must be gathered from both vector and raster data, and its quality may significantly affect the usefulness of the information.
Learn more:
- Federal Geographic Data Committee Web site: www.fgdc.gov/metadata
Includes the FGDC's Content Standard for Digital Geospatial Metadata and information about Geography Markup Language (GML), developed by the Open Geospatial Consortium.
- OpenGIS Standards and Specifications: www.opengeospatial.org/standards
These are technical documents that detail interfaces or encodings.
- Other open standards include:
- International Hydrographic Organization Transfer Standard for Digital Hydrographic Data (IHO S-57)
- Spatial Data Standard for Facilities, Infrastructure and Environment (SDSFIE)
- Military Standard: The Interface Standard for Vector Product Format (MIL-STD-2407)
- PREMIS Preservation Metadata: www.loc.gov/standards/premis/
The PREMIS data dictionary was developed by the Preservation Metadata: Implementation Strategies Working Group, convened by OCLC and RLG, with the goal of creating a set of core preservation metadata elements with broad applicability within the digital preservation community.
- Article: "DHS Promotes Open Geospatial Data Standards", Government Technology www.govtech.com/gt/365850?topic=117676
- Article: "National Archives to Include Earth Imagery", Government Technology www.govtech.com/gt/371836?topic=117676
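As a rough illustration of the creation information mentioned above (creator, date created, intended use), the following sketch pulls those fields out of a metadata record. It assumes the record is XML that uses the short element names of the FGDC Content Standard for Digital Geospatial Metadata (`origin`, `pubdate`, `title`, `purpose`); the sample record, its values, and the `summarize` helper are hypothetical and are not taken from any state partner's repository.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed record for illustration only.
SAMPLE_RECORD = """<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Hypothetical County GIS Office</origin>
        <pubdate>20080115</pubdate>
        <title>Hypothetical Parcel Boundaries</title>
      </citeinfo>
    </citation>
    <descript>
      <abstract>Placeholder abstract.</abstract>
      <purpose>Placeholder statement of intended use.</purpose>
    </descript>
  </idinfo>
</metadata>"""

# Friendly field name -> element path relative to the <metadata> root.
FIELDS = {
    "creator": "idinfo/citation/citeinfo/origin",
    "date": "idinfo/citation/citeinfo/pubdate",
    "title": "idinfo/citation/citeinfo/title",
    "intended_use": "idinfo/descript/purpose",
}


def summarize(xml_text: str) -> dict:
    """Return the selected fields; a missing element yields None."""
    root = ET.fromstring(xml_text)
    return {name: root.findtext(path) for name, path in FIELDS.items()}


if __name__ == "__main__":
    for name, value in summarize(SAMPLE_RECORD).items():
        print(f"{name}: {value}")
```

A field that comes back as `None` marks a gap in the record, which is the kind of omission an inventory review would want to flag before ingest.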
File Formats and Data Conversion
Geospatial data exists in multiple formats. Identifying best practices in archiving geospatial data requires reviewing the available file formats and identifying the preferred ones. Possible differences between the source format and the archival format should be considered, especially with regard to long-term preservation of and access to geospatial data. If the two formats differ, the process must accommodate data conversions.
Archival Ingest Process and Data Validation
Data validation creates information to confirm the authenticity of the data. This information can be stored with the other metadata to serve as a reference for the authenticity and integrity of the files. Data validation is an integral part of the archival ingest process, for which the archivist must consider storage, access, and disaster preparedness. Data validation is frequently accomplished using hash sums: fixed-length values computed from a file's contents that are, for practical purposes, unique to those contents. The sums are created at the bit-stream level, so that running a hash after transfer, ingest, or migration and comparing it with the stored value will reveal any change to the file.
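The hash comparison described above can be sketched in a few lines. This is a minimal example and not the project's actual ingest tooling: it assumes SHA-256 as the hash algorithm (the text does not name one), and the function names `sha256_of` and `is_unchanged` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that large raster or vector files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path: Path, recorded_digest: str) -> bool:
    """Compare a file's current digest with the one recorded earlier
    (for example, the value stored with its metadata at ingest)."""
    return sha256_of(path) == recorded_digest
```

In use, the digest would be computed once before transfer and stored with the file's metadata; calling `is_unchanged` after transfer, ingest, or migration then returns `False` if even a single bit of the file differs.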
Business Cases
Business drivers for the preservation of geospatial data are essential to convince those who can fund and otherwise support these initiatives that geospatial data is relevant and will be important in the future. Business cases, once identified, can be used to engage those new to geospatial data and the challenges it presents to archivists. The state partners are working to identify business cases by talking with agencies that frequently use GIS data, to find those that take advantage of superseded data. These business cases will be added to the Using Temporal Geospatial Data page as they are discovered.
Data in Motion
The purpose of preserving geospatial data is to encourage access and enable sharing within and among state agencies and archives. To do this, best practices must be identified for "data in motion" to ensure its safety, integrity, and efficient delivery. These practices cover data validation, physical security, and appropriate file formats and permissions.