
Reference

Python & JS API

Everything the CLI does, you can do in code. The shape is the same in both languages: ingest a source, wrap each transform so it signs a receipt, and verify the chain. The contract for a transform is always records in, records out.

Python: receipt_step

A decorator that turns any record-to-record function into a signed pipeline stage. It verifies the chain tail before running, refuses foreign input, runs your function, then signs and appends a receipt.

from tamper_signal import receipt_step

@receipt_step(chain_dir="receipts/", key_path="keys/signing.key")
def transform_clean(records):
    return [r for r in records if r.get("campaign_name")]

The wrapped function takes and returns either a list of dicts or a pandas DataFrame. Frames are hashed as records (with NaN canonicalized to null) and pass through untouched, so you can keep working in pandas and still sign every step.
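To make "hashed as records, with NaN canonicalized to null" concrete, here is a minimal sketch using only the standard library. It is not the library's actual hashing code: the helper names `canonicalize` and `semantic_hash`, the choice of SHA-256, and the sorted-key JSON encoding are all assumptions made for illustration. With pandas you would first get the records with `df.to_dict(orient="records")`.

```python
import hashlib
import json
import math


def canonicalize(value):
    """Map float NaN to None so it serializes as JSON null (hypothetical helper)."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


def semantic_hash(records):
    """Hash a list of dict records independently of key order (assumed scheme)."""
    canonical = [
        {key: canonicalize(val) for key, val in record.items()}
        for record in records
    ]
    # allow_nan=False makes json.dumps raise on any non-finite float left
    # over (for example infinity) instead of emitting invalid JSON.
    payload = json.dumps(
        canonical, sort_keys=True, separators=(",", ":"), allow_nan=False
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# A row as pandas would hand it over (missing value is NaN) ...
from_frame = [{"campaign_name": "spring", "spend": float("nan")}]
# ... and the same row written as plain dicts with an explicit null.
from_dicts = [{"spend": None, "campaign_name": "spring"}]

same = semantic_hash(from_frame) == semantic_hash(from_dicts)
```

Under these assumptions both representations produce the same digest, which is the property that lets a DataFrame stage and a list-of-dicts stage share one chain.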

ChainTailMismatch

If the data handed to a wrapped stage does not match the previous stage's output, the wrapper raises ChainTailMismatch before your code runs. That is the guard working as intended: the stage was given data that did not come from the chain. To recover, re-run the pipeline from ingest.
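The guard can be pictured with a small self-contained sketch. Everything here is a stand-in: `guarded_step`, `digest`, the in-memory `chain` list, and the locally defined `ChainTailMismatch` class are hypothetical and exist only so the example runs on its own. The real wrapper also signs each receipt and persists it to the chain directory, which this sketch leaves out.

```python
import functools
import hashlib
import json


class ChainTailMismatch(Exception):
    """Stand-in for the library's exception, defined locally for this sketch."""


def digest(records):
    """Assumed hashing scheme: SHA-256 over sorted-key JSON."""
    payload = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def guarded_step(chain):
    """Hypothetical decorator; `chain` is an in-memory list of receipt dicts."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(records):
            tail = chain[-1]["output_semantic_hash"]
            # Refuse input that is not the previous stage's output.
            if digest(records) != tail:
                raise ChainTailMismatch(
                    f"input to {fn.__name__} does not match the chain tail"
                )
            output = fn(records)
            # Append an (unsigned) receipt linking input to output.
            chain.append({
                "name": fn.__name__,
                "input_semantic_hash": tail,
                "output_semantic_hash": digest(output),
            })
            return output
        return wrapper
    return decorate


source = [{"campaign_name": "spring"}, {"campaign_name": ""}]
chain = [{"name": "ingest", "output_semantic_hash": digest(source)}]


@guarded_step(chain)
def transform_clean(records):
    return [r for r in records if r.get("campaign_name")]


cleaned = transform_clean(source)  # accepted: matches the ingest receipt

try:
    transform_clean([{"campaign_name": "edited by hand"}])
    rejected = False
except ChainTailMismatch:
    rejected = True
```

In this sketch the second call is rejected because its input hash differs from the tail left by the first call, which mirrors why the fix is to start again from ingest.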

JavaScript: receiptStep

The same idea, but async: receiptStep takes your transform and returns a wrapped function that you await.

import { receiptStep, loadCsv } from "tamper-signal";

const clean = receiptStep(
  (records) => records.filter((r) => Boolean(r.campaign_name)),
  { chainDir: "receipts/", keyPath: "keys/signing.key", codeFile: "pipeline.js" }
);

const output = await clean(loadCsv("export.csv"));

JavaScript building blocks

The npm package exposes the pieces directly so you can compose a pipeline without the CLI. TypeScript declarations ship for every entry point.

Function | What it does
loadCsv(path) | Read a .csv / .tsv / .json / .ndjson file into records. (No xlsx; convert first or ingest once with Python.)
ingestFile({ file, origin, chainDir, keyPath }) | Reset the chain and write the source receipt. The JS counterpart to receipts ingest.
receiptStep(fn, opts) | Wrap a transform so it signs and appends a receipt.
rebuildChain({ file, stages, chainDir, keyPath }) | Re-ingest the source and replay every stage from a clean tail. The idempotent way to rebuild when data changes.
canonicalDocument(records) | Produce the canonical attested table. Write it to table.json for the data tab.
verifyChain({ chainDir, pub, data }) | Verify in process; returns the same structured verdict as the CLI.

Rebuilding idempotently

Pipelines re-run whenever the source changes. Appending to an existing chain would stack stale receipts, so reset first. Two equivalent patterns:

// Option 1: one call rebuilds the whole chain
await rebuildChain({
  file: "export.csv",
  stages: [clean, aggregate],
  chainDir: "receipts/",
  keyPath: "keys/signing.key",
});

// Option 2: compose by hand
await ingestFile({
  file: "export.csv",
  origin: "export",
  chainDir: "receipts/",
  keyPath: "keys/signing.key",
});
let records = loadCsv("export.csv");
records = await clean(records);
records = await aggregate(records);

Emit the data tab table

import { writeFileSync } from "node:fs";
import { canonicalDocument } from "tamper-signal";

writeFileSync(
  "public/receipts/table.json",
  JSON.stringify(canonicalDocument(finalRecords), null, 2) + "\n"
);

What a receipt holds

You will rarely open one by hand, but it helps to know the shape. A transform receipt records the link to the previous stage, the code that ran, and control totals for the output.

{
  "kind": "transform_receipt",
  "spec_version": "1.1",
  "transform": {
    "name": "transform_clean",
    "code_hash": "<sha256 of the function source>",
    "code_file": "pipeline.py"
  },
  "input_semantic_hash": "<must equal prior output hash>",
  "output_semantic_hash": "…",
  "output_control_totals": {
    "row_count": 48190,
    "numeric_sums": { "spend_(usd)": "98342.62" },
    "null_counts": { "campaign_name": 0 }
  },
  "signature": { "alg": "ed25519", "key_fingerprint": "…", "value": "…" }
}

Column names for metric flagging

The keys under numeric_sums and null_counts are the normalized column names (lowercased, spaces to underscores). Those exact strings are what you pass to data-receipt-column when you want the signal to flag a broken metric in place. See Mounting the signal.
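As an illustration of how those keys come about, the sketch below normalizes column names and computes totals in the same shape as output_control_totals. It assumes the normalization is exactly "lowercase, then replace spaces with underscores" as described above; the library may apply further rules (trimming, collapsing repeated spaces) that are not covered here. `normalize_column` and `control_totals` are hypothetical names, not part of the package.

```python
from decimal import Decimal


def normalize_column(name):
    """Lowercase and turn spaces into underscores (assumed to be the full rule)."""
    return name.lower().replace(" ", "_")


def control_totals(records):
    """Row count, per-column numeric sums, and per-column null counts."""
    numeric_sums = {}
    null_counts = {}
    for record in records:
        for raw_name, value in record.items():
            column = normalize_column(raw_name)
            null_counts.setdefault(column, 0)
            if value is None:
                null_counts[column] += 1
            elif isinstance(value, (int, float, Decimal)) and not isinstance(value, bool):
                # Going through str() keeps binary float noise out of the sum.
                running = numeric_sums.get(column, Decimal("0"))
                numeric_sums[column] = running + Decimal(str(value))
    return {
        "row_count": len(records),
        # Sums are emitted as strings, matching the receipt shown earlier.
        "numeric_sums": {col: str(total) for col, total in numeric_sums.items()},
        "null_counts": null_counts,
    }


rows = [
    {"Campaign Name": "spring", "Spend (USD)": 10.25},
    {"Campaign Name": None, "Spend (USD)": 5.5},
]
totals = control_totals(rows)
key = normalize_column("Spend (USD)")  # "spend_(usd)"
```

Here a header such as "Spend (USD)" becomes "spend_(usd)", which is the string you would pass to data-receipt-column.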