Update (2026-06-06): the previously attached SPLASH_Data_Dictionary_v1.0.pdf has been removed, and an earlier version of this post has been corrected (see note at the end).
When validated against the live SPLASH database, that draft turned out to contain several inaccuracies. For example, it described an ~80-field schema with SPLASH ID / TRL / Linked OSDR Datasets fields and a “Flown / Operational / Retired” status vocabulary, none of which match the database. To avoid propagating errors, it is replaced below with facts reverse-engineered directly from the live database and submission form.
What the live SPLASH database actually looks like (verified 2026-06-06):
- 147 instruments in the public catalog, 61 fields per record (public API: `/splash-api/public/instruments`). The submission schema itself defines 92 fields total, including some not exposed in the public catalog (e.g. per-platform TRL fields, cost/schedule, and collaborator fields).
- Minimum required to publish: six fields are enforced by the submission form (empty submit blocks on all of them): Instrument/Platform Name, Background/Purpose, OPERATIONS, Category, Status, and Date of First Launch. (These are the form’s `required` set; everything else is optional.)
- Of those six, OPERATIONS, Category, Status, and Date of First Launch are controlled/structured (“static”) fields rather than free text.
- Status uses a clean controlled vocabulary already: `Active`, `Legacy`, `In Development`.
- Category is currently free text (~45 variants, many near-duplicates), which makes it a candidate for a controlled, faceted vocabulary (science domain × hardware function × platform).
- The ~30 capability fields (Centrifuge, Imaging, Genomics, Cell Culture, …) follow a `Y/N/M` convention, frequently with a free-text qualifier after a comma.
- Physical specs (Mass, Power, Size, Temperature, Pressure, …) are free text with inconsistent units.
- The provenance fields (Sources, Study Links, Schematics) are empty in all 147 records — so no entry is currently independently verifiable. Requiring at least one source/dataset link may be a high-value standardization step.
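
To make the findings above concrete, here is a minimal Python sketch of how a record could be audited against them: the six required fields, the Status vocabulary, the `Y/N/M` capability convention, and the empty provenance fields. It rests on assumptions: the dictionary keys simply reuse the form labels (the JSON key names returned by `/splash-api/public/instruments` may well differ), and `parse_capability` and `audit_record` are hypothetical helper names, not part of SPLASH. The sketch makes no network calls.

```python
from typing import List, NamedTuple, Optional

# Assumption: keys reuse the form labels; the real API keys may differ.
REQUIRED_FIELDS = [
    "Instrument/Platform Name",
    "Background/Purpose",
    "OPERATIONS",
    "Category",
    "Status",
    "Date of First Launch",
]
STATUS_VOCAB = {"Active", "Legacy", "In Development"}
PROVENANCE_FIELDS = ["Sources", "Study Links", "Schematics"]


class Capability(NamedTuple):
    flag: Optional[str]       # "Y", "N", "M", or None if empty/unparseable
    qualifier: Optional[str]  # free text after the first comma, if any


def parse_capability(raw: Optional[str]) -> Capability:
    """Split a 'Y/N/M, qualifier' value into its flag and qualifier."""
    if raw is None or not raw.strip():
        return Capability(None, None)
    head, _, tail = raw.partition(",")
    flag = head.strip().upper()
    if flag not in {"Y", "N", "M"}:
        # Outside the convention: keep the whole text for manual review.
        return Capability(None, raw.strip())
    return Capability(flag, tail.strip() or None)


def _is_blank(value) -> bool:
    """Treat None, empty strings, and whitespace-only strings as blank."""
    return not str(value or "").strip()


def audit_record(record: dict) -> List[str]:
    """Return human-readable issues found in one instrument record."""
    issues = []
    for field in REQUIRED_FIELDS:
        if _is_blank(record.get(field)):
            issues.append(f"missing required field: {field}")
    status = record.get("Status")
    if not _is_blank(status) and status not in STATUS_VOCAB:
        issues.append(f"status outside vocabulary: {status}")
    if all(_is_blank(record.get(f)) for f in PROVENANCE_FIELDS):
        issues.append("no provenance link")
    return issues


# Usage with a made-up record (values are illustrative, not from SPLASH).
example = {
    "Instrument/Platform Name": "Example Incubator",
    "Background/Purpose": "Illustrative entry",
    "OPERATIONS": "Crew-tended",
    "Category": "Cell biology",
    "Status": "Retired",
    "Date of First Launch": "",
}
for issue in audit_record(example):
    print(issue)
print(parse_capability("Y, up to 2 g"))
```

Run over all 147 records, a check like this would flag every entry for missing provenance, which is the gap noted above. Values that do not start with `Y`, `N`, or `M` are deliberately kept intact rather than guessed at, so they can be reviewed by hand.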
Correction note: an earlier version of this post stated that only Instrument Name and Background/Purpose were enforced. That was wrong: the form enforces all six fields listed above. Manual testing caught the error, and it is corrected here for the record.
Shared as a data-grounded starting point for the Data Standardization Subgroup’s discussion. Feedback very welcome — OSDR is Open and we’re building in public. @HardwareAWG