Specform

Datasets (DS + DatasetRef DAO)

Create snapshots, read versions, and move alias pointers safely.

Create the object

brca = sf.dataset("brca")

Add a snapshot from a CSV path

brca.add("data.csv", note="raw")

Return behavior (important)

  • add(...) returns the same DatasetRef for fluent notebook flow.
  • Use add_version(...) if you want an immutable handle back.

v = brca.add_version("data.csv", note="raw")
v.ds_id, v.fingerprint_short, v.row_count, v.presence
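
The difference between the two return styles can be illustrated with a small stand-alone toy. This is not Specform's implementation: the Ref and VersionHandle classes, the field types, and the "ds-N" id format are all hypothetical, chosen only to show a method that returns the same object next to one that returns a frozen handle.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class VersionHandle:
    """Hypothetical immutable handle; field names echo the ones listed above."""
    ds_id: str
    row_count: int


class Ref:
    """Hypothetical stand-in for a dataset reference."""

    def __init__(self):
        self.versions = []

    def add_version(self, rows):
        # Record a snapshot and hand back a frozen description of it.
        handle = VersionHandle(ds_id=f"ds-{len(self.versions) + 1}",
                               row_count=len(rows))
        self.versions.append(handle)
        return handle

    def add(self, rows):
        # Same work, but return the reference itself so calls can be chained.
        self.add_version(rows)
        return self


ref = Ref()
same = ref.add([1, 2, 3])          # returns ref, so further calls can be chained
handle = ref.add_version([1, 2])   # returns a frozen handle

try:
    handle.row_count = 99          # frozen dataclasses reject assignment
    frozen = False
except FrozenInstanceError:
    frozen = True
```

Returning the reference suits notebook cells that chain calls; returning a handle suits code that needs to keep a stable description of one specific snapshot.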

Load as DataFrame (non-mutating)

df = brca.df()
df_old = brca.df(version=1)
df_prev = brca.df(version="prev")

Checkpoint a DataFrame

brca.checkpoint(df, note="post-QC")

Version handles (immutable views)

v1 = brca.v(1)
v1.columns
v1.df().head()

Move the alias pointer (mutating)

brca.use(2, note="roll forward")
brca.use("prev")
brca.use("latest")
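
As a rough mental model only, an alias pointer can be pictured as one movable index over an append-only list of snapshots. The sketch below is a toy, not Specform code. The AliasPointer class is hypothetical, and it assumes that versions are numbered from 1, that "prev" means the version just before the current one, and that adding a snapshot moves the pointer to it; the real library may define any of these differently.

```python
class AliasPointer:
    """Toy model: append-only snapshots plus one movable 'current' pointer."""

    def __init__(self):
        self._snapshots = []   # index i holds version i + 1 (assumed 1-based)
        self.current = 0       # 0 means "no version selected yet"

    def add(self, payload):
        # Assumption: a new snapshot becomes the current version.
        self._snapshots.append(payload)
        self.current = len(self._snapshots)
        return self

    def resolve(self, target):
        # Turn a number, "prev", or "latest" into a concrete version number.
        if target == "latest":
            version = len(self._snapshots)
        elif target == "prev":
            version = self.current - 1
        else:
            version = int(target)
        if not 1 <= version <= len(self._snapshots):
            raise ValueError(f"no such version: {target!r}")
        return version

    def use(self, target):
        # Mutating: only the pointer moves, snapshots are never rewritten.
        self.current = self.resolve(target)
        return self

    def read(self, version=None):
        # Non-mutating: reads a snapshot without touching the pointer.
        chosen = self.current if version is None else version
        return self._snapshots[self.resolve(chosen) - 1]


ds = AliasPointer().add("raw").add("post-QC").add("filtered")
ds.use(2)
after_pin = ds.current
ds.use("prev")
after_prev = ds.current
ds.use("latest")
after_latest = ds.current
old = ds.read(version=1)   # reading an old version leaves the pointer alone
```

In this model, moving the pointer is safe because no snapshot is ever modified or deleted; rolling back and rolling forward only change which version the alias names.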

Inspect quickly

brca.head(10)
brca.describe()

Export a snapshot to CSV (non-mutating)

brca.to_csv("out.csv")               # current
brca.to_csv("v1.csv", version=1)     # old version

Verify integrity

brca.verify_current()

What verification means

verify_current() checks the stored bytes of the current DS against its recorded fingerprint. A mismatch means the stored snapshot no longer matches what was recorded.
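
A fingerprint check of this kind can be sketched in a few lines. The example below is an illustration under assumptions, not Specform's code: it uses SHA-256 from the standard library as the fingerprint, and the helper names fingerprint and verify are hypothetical. The hash function Specform actually uses is not stated here.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Hex digest of the raw bytes (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, recorded: str) -> bool:
    """True when the stored bytes still hash to the recorded fingerprint."""
    return fingerprint(data) == recorded


stored = b"id,label\n1,a\n2,b\n"
recorded = fingerprint(stored)   # captured when the snapshot was written

ok = verify(stored, recorded)                     # unchanged bytes
tampered = verify(stored + b"3,c\n", recorded)    # bytes changed after recording
```

Any change to the stored bytes, even a single appended row, produces a different digest, which is what makes the comparison a useful integrity check.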