Structural Diff API
A REST API that compares an AI-generated transcript against its annotator post-edit — detecting row-level structural changes (splits, merges, modifications, additions, deletions), with per-column diff detail, CER/WER/SER scoring, and a composite quality grade per batch.
Quick Start
No SDK needed. Send a POST request with your two arrays of transcript rows and receive a full diff in JSON. The API is in testing phase; request an API key to get started.
1. Verify the service is live:
curl https://structural-diff-engine.onrender.com/v1/health
2. Run a comparison:
curl -X POST https://structural-diff-engine.onrender.com/v1/diff \
-H "Content-Type: application/json" \
-H "x-api-key: YOUR_API_KEY" \
-d '{
"original": [
{ "speaker": "Alice", "start_time": 0, "end_time": 1, "transcript": "Hello world" },
{ "speaker": "Bob", "start_time": 1, "end_time": 3, "transcript": "Good morning everyone" }
],
"reworked": [
{ "speaker": "Alice", "start_time": 0, "end_time": 1, "transcript": "Hello there" },
{ "speaker": "Bob", "start_time": 1, "end_time": 2, "transcript": "Good morning" },
{ "speaker": "Bob", "start_time": 2, "end_time": 3, "transcript": "everyone" }
]
}'
Base URL
All endpoints are prefixed with /v1.
https://structural-diff-engine.onrender.com
Authentication
Include your API key in the x-api-key request header on every call to /v1/diff.
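For callers working in Python, the same header can be attached with the standard library alone. The sketch below only builds the request and does not send it; `build_diff_request` is a hypothetical helper name used for illustration, not part of the API.

```python
import json
import urllib.request

BASE_URL = "https://structural-diff-engine.onrender.com"


def build_diff_request(api_key, original, reworked):
    """Build (but do not send) an authenticated POST to /v1/diff."""
    body = json.dumps({"original": original, "reworked": reworked}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/diff",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "x-api-key": api_key},
    )


req = build_diff_request(
    "YOUR_API_KEY",
    [{"transcript": "Hello world"}],
    [{"transcript": "Hello there"}],
)
# To send it:
#   with urllib.request.urlopen(req) as resp:
#       envelope = json.load(resp)
```

Keeping request construction separate from sending makes it easy to inspect headers and body in tests without touching the network.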
curl -H "x-api-key: YOUR_API_KEY" -H "Content-Type: application/json" \
-X POST https://structural-diff-engine.onrender.com/v1/diff -d '{...}'
Rate Limits
Two independent tiers are enforced per API key, falling back to IP when no key is present. Exceeding either tier returns 429 Too Many Requests.
| Tier | Limit | Response header |
|---|---|---|
| Burst | 10 requests / minute | RateLimit-Limit |
| Window | 60 requests / 15 minutes | RateLimit-Remaining |
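One way to stay under both tiers is to throttle on the client before sending. The sketch below assumes the server counts requests over sliding windows, which is not stated here, so treat it as a conservative approximation; `TwoTierLimiter` is a hypothetical name.

```python
import time
from collections import deque


class TwoTierLimiter:
    """Client-side guard mirroring the documented burst and window tiers."""

    def __init__(self, tiers=((10, 60.0), (60, 900.0))):
        # Each tier is (max_requests, period_seconds).
        self.tiers = tiers
        self.sent = deque()  # timestamps of past requests, oldest first

    def wait_time(self, now=None):
        """Seconds to wait until one more request fits inside every tier."""
        now = time.monotonic() if now is None else now
        longest = max(period for _, period in self.tiers)
        # Forget requests older than the longest period.
        while self.sent and now - self.sent[0] >= longest:
            self.sent.popleft()
        wait = 0.0
        for limit, period in self.tiers:
            recent = [t for t in self.sent if now - t < period]
            if len(recent) >= limit:
                # A slot frees up once the limit-th most recent request ages out.
                wait = max(wait, recent[-limit] + period - now)
        return wait

    def record(self, now=None):
        """Note that a request was just sent."""
        self.sent.append(time.monotonic() if now is None else now)
```

A caller would sleep for `wait_time()` seconds, send the request, then call `record()`. A 429 response can still occur if several clients share one key.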
Endpoints
GET /v1/health
Lightweight liveness probe. No authentication required. Returns service version and uptime.
/v1/health · No auth
{ "status": "ok", "version": "1.0.0", "uptime": 42, "timestamp": "..." }
POST /v1/diff
Compare two arrays of transcript rows. Returns row-level results with quality scores. Max payload: 5 MB · Max rows: 30,000.
/v1/diff · Auth required
Request Body
| Name | Type | Description |
|---|---|---|
| original* | array | Row objects from the baseline / original version. |
| reworked* | array | Row objects to compare against. |
| config | object | Optional algorithm overrides. See Config Options. |
| headers | string[] | Column names. Required when using 2-D array input. |
| columnMapping | object | Column index map for 2-D array input. See Column Mapping. |
Row object fields
All fields are optional except transcript. Unknown fields are passed through unchanged.
| Name | Type | Description |
|---|---|---|
| transcript* | string | The text content of the row. |
| speaker | string | Speaker name or ID. |
| start_time | number or string | Segment start time in seconds. |
| end_time | number or string | Segment end time in seconds. |
| non_speech_events | string | Annotations such as [music], [laughter]. |
| emotion | string | Emotion label. |
| language | string | Language code (e.g. "en", "ar"). |
| locale | string | Locale code (e.g. "en-US"). |
| accent | string | Accent tag. |
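The documented constraints (a required transcript, 30,000 rows, 5 MB) can be checked locally before spending a request. This is a minimal sketch for object-style rows only. It assumes the row cap applies to each array separately and that 5 MB means 5 MiB, neither of which is confirmed here; `preflight` is a hypothetical helper.

```python
import json

MAX_ROWS = 30_000
MAX_BYTES = 5 * 1024 * 1024  # assumption: "5 MB" is read as 5 MiB


def preflight(original, reworked):
    """Return a list of problems found locally; an empty list means none."""
    problems = []
    for name, rows in (("original", original), ("reworked", reworked)):
        if len(rows) > MAX_ROWS:
            problems.append(f"{name}: {len(rows)} rows exceeds {MAX_ROWS}")
        for i, row in enumerate(rows):
            # transcript is the only required row field.
            if not isinstance(row.get("transcript"), str):
                problems.append(f"{name}[{i}]: missing string 'transcript'")
    payload = json.dumps({"original": original, "reworked": reworked})
    size = len(payload.encode("utf-8"))
    if size > MAX_BYTES:
        problems.append(f"payload is {size} bytes, over {MAX_BYTES}")
    return problems
```

The server remains the authority on validation; a local check like this only catches the obvious cases early.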
Response Shape
All successful responses use this envelope:
{
"status": "success",
"requestId": "550e8400-e29b-41d4-a716-446655440000",
"timestamp": "2026-04-08T21:00:00.000Z",
"data": {
"results": [
{
"status": "MODIFIED",
"originalRow": { "transcript": "Hello world", ... },
"reworkedRow": { "transcript": "Hello there", ... },
"notes": "transcript changed"
},
{
"status": "SPLIT",
"originalRow": { "transcript": "Good morning everyone", ... },
"reworkedRows": [ { "transcript": "Good morning" }, { "transcript": "everyone" } ],
"notes": "split into 2 rows"
}
],
"scores": { "CER": 0.12, "WER": 0.18, "SER": 0.33, "cerT": 0.12, "werT": 0.18 },
"composite": { "score": 3.8, "grade": "B", "label": "Good" },
"meta": { "originalRows": 2, "reworkedRows": 3, "headers": [...] }
}
}
Diff statuses
| Status | Meaning |
|---|---|
| UNCHANGED | Row is identical in both versions. |
| MODIFIED | Row exists in both versions but content changed. |
| ADDED | Row is only present in the reworked version. |
| DELETED | Row is only present in the original version. |
| SPLIT | One original row was divided into two or more reworked rows. |
| MERGED | Two or more original rows were combined into one reworked row. |
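A quick way to summarise a batch is to count results per status. The sketch below works on a parsed success envelope; `tally_statuses` is a hypothetical helper and the sample is trimmed to the fields it reads.

```python
from collections import Counter


def tally_statuses(envelope):
    """Count results per diff status in a successful /v1/diff response."""
    return Counter(r["status"] for r in envelope["data"]["results"])


sample = {
    "status": "success",
    "data": {"results": [{"status": "MODIFIED"}, {"status": "SPLIT"}]},
}
counts = tally_statuses(sample)
# Counter returns 0 for statuses that never occurred.
structural = counts["SPLIT"] + counts["MERGED"]
```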
Scores
| Name | Type | Description |
|---|---|---|
| CER | number | Character Error Rate across all columns (0–1, lower is better). |
| WER | number | Word Error Rate across all columns (0–1). |
| SER | number | Segmentation Error Rate: proportion of rows that were split or merged. |
| cerT | number | CER computed on the transcript column only. |
| werT | number | WER computed on the transcript column only. |
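For intuition, CER and WER are conventionally defined as an edit distance divided by the length of the reference. The sketch below implements that textbook definition on a single string pair, treating the original as the reference. It is not the engine's implementation: how the service normalises text, aggregates across rows and columns, or bounds values to 0–1 is not described here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # delete r
                curr[j - 1] + 1,         # insert h
                prev[j - 1] + (r != h),  # substitute or match
            ))
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edits divided by reference length; not clamped to 1."""
    if not ref_units:
        return 0.0 if not hyp_units else 1.0
    return edit_distance(ref_units, hyp_units) / len(ref_units)


def wer(ref, hyp):
    """Word-level error rate using whitespace tokenisation."""
    return error_rate(ref.split(), hyp.split())


def cer(ref, hyp):
    """Character-level error rate."""
    return error_rate(list(ref), list(hyp))
```

For example, `wer("Hello world", "Hello there")` is 0.5 under this definition: one substituted word out of two reference words.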
Composite grade
| Name | Type | Description |
|---|---|---|
| score | number | Weighted quality score (1.0–5.0, higher is better). |
| grade | string | Letter grade: A, B, C, D, or F. |
| label | string | Human-readable label, e.g. "Excellent", "Good", "Needs Work". |
Config Options
Pass a config object in the request body to override algorithm defaults. All fields are optional.
| Name | Type | Description |
|---|---|---|
| simpleMode | boolean | Disable split and merge detection. Pure row-by-row diff. Default: false. |
| enableSplits | boolean | Enable split row detection. Default: true. |
| enableMerges | boolean | Enable merge row detection. Default: true. |
| enableCER | boolean | Compute Character Error Rate. Default: true. |
| enableWER | boolean | Compute Word Error Rate. Default: true. |
| enableSER | boolean | Compute Segmentation Error Rate. Default: true. |
| stripDiacritics | boolean | Normalise Arabic/accented characters before comparison. Default: true. |
| positionalMode | boolean | Compare rows strictly by position, skipping alignment. Default: false. |
| ignoreColNames | string[] | Column names excluded from MODIFIED detection. Default: []. |
| enableInlineDiff | boolean | Include transcriptDiff on MODIFIED rows. Set false to skip char-level diff and reduce response size. Default: true. |
| structuralTransforms | TransformRule[] | Pre-comparison find/replace rules applied to both sides before similarity scoring (max 20 rules). Default: []. |
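Because every field is optional, a client only needs to send the options it changes. The sketch below copies the defaults from the table above and rejects misspelled option names locally; `build_config` is a hypothetical helper, and the shape of a TransformRule is left out since it is not described here.

```python
DEFAULTS = {
    "simpleMode": False,
    "enableSplits": True,
    "enableMerges": True,
    "enableCER": True,
    "enableWER": True,
    "enableSER": True,
    "stripDiacritics": True,
    "positionalMode": False,
    "ignoreColNames": [],
    "enableInlineDiff": True,
    "structuralTransforms": [],
}


def build_config(**overrides):
    """Return only the overrides that differ from the documented defaults."""
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"unknown config option(s): {', '.join(unknown)}")
    return {k: v for k, v in overrides.items() if v != DEFAULTS[k]}


# A position-only comparison that ignores speaker edits.
config = build_config(positionalMode=True, ignoreColNames=["speaker"], enableWER=True)
```

The resulting dict can be placed under the `config` key of the request body.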
Column Mapping
When original / reworked are 2-D arrays (arrays of arrays) instead of objects, supply headers and/or columnMapping to tell the engine which index carries each field.
{
"original": [[0, 1, "Alice", "Hello world"]],
"headers": ["start_time", "end_time", "speaker", "transcript"],
"columnMapping": { "transcript": 3, "speaker": 2, "start_time": 0, "end_time": 1 }
}
| Name | Type | Description |
|---|---|---|
| transcript* | integer | 0-based column index of the transcript field. |
| speaker | integer | 0-based column index of the speaker field. |
| start_time | integer | 0-based column index of start time. |
| end_time | integer | 0-based column index of end time. |
| nse | integer | 0-based column index of non-speech events. |
| extraCols | integer[] | Additional column indices to include (max 20). |
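To see what a mapping means, it can help to apply it locally and inspect the resulting row objects. The sketch below does that for the scalar mapping keys. It assumes `nse` corresponds to the `non_speech_events` row field and skips `extraCols`, whose output naming is not described here; `rows_to_objects` is a hypothetical helper.

```python
# Assumption: the "nse" mapping key refers to the non_speech_events row field.
FIELD_NAMES = {"nse": "non_speech_events"}


def rows_to_objects(rows, column_mapping):
    """Turn 2-D rows into row objects using a 0-based columnMapping."""
    if "transcript" not in column_mapping:
        raise ValueError("columnMapping must include 'transcript'")
    scalar = {k: v for k, v in column_mapping.items() if k != "extraCols"}
    return [
        {FIELD_NAMES.get(key, key): row[index] for key, index in scalar.items()}
        for row in rows
    ]


mapping = {"transcript": 3, "speaker": 2, "start_time": 0, "end_time": 1}
objects = rows_to_objects([[0, 1, "Alice", "Hello world"]], mapping)
```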
Error Reference
All errors use a uniform envelope:
{
"status": "error",
"requestId": "550e8400-...",
"timestamp": "2026-04-08T21:00:00.000Z",
"error": {
"code": "VALIDATION_ERROR",
"message": "Validation failed",
"details": [{ "field": "original", "message": "\"original\" is required" }]
}
}
| HTTP | Code | Cause |
|---|---|---|
| 400 | BAD_REQUEST | Malformed JSON body |
| 401 | UNAUTHORIZED | Missing or invalid x-api-key header |
| 404 | NOT_FOUND | Unknown endpoint |
| 413 | PAYLOAD_TOO_LARGE | Request body exceeds 5 MB |
| 422 | VALIDATION_ERROR | Body failed schema validation (see details array) |
| 429 | RATE_LIMIT_EXCEEDED | Burst or window rate limit hit |
| 500 | INTERNAL_ERROR | Unexpected server or engine error |
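Since every error shares one envelope, a client can map it to a single exception type. The sketch below is one possible shape; `DiffApiError` is a hypothetical class, and the choice of which statuses count as retryable is an assumption rather than documented behaviour.

```python
class DiffApiError(Exception):
    """Wraps the uniform error envelope returned by the API."""

    def __init__(self, http_status, envelope):
        err = envelope.get("error", {})
        self.http_status = http_status
        self.code = err.get("code", "UNKNOWN")
        self.details = err.get("details", [])
        self.request_id = envelope.get("requestId")
        super().__init__(f"{http_status} {self.code}: {err.get('message', '')}")

    @property
    def retryable(self):
        # Assumption: only rate-limit and server errors are worth retrying unchanged.
        return self.http_status in (429, 500)


envelope = {
    "status": "error",
    "requestId": "550e8400-...",
    "error": {
        "code": "VALIDATION_ERROR",
        "message": "Validation failed",
        "details": [{"field": "original", "message": '"original" is required'}],
    },
}
error = DiffApiError(422, envelope)
```

Logging `error.request_id` alongside the message makes it easier to reference a specific failed call when asking for support.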
Request Tracing
Provide an x-request-id header to correlate requests across your system. Alphanumeric characters, hyphens, and underscores only, max 64 characters. The value is echoed back in the response headers.
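The format rule above can be checked on the client before a request goes out. A minimal sketch, assuming an empty value is not acceptable (the minimum length is not stated); `is_valid_request_id` is a hypothetical helper.

```python
import re

# Letters, digits, hyphens, underscores; 1 to 64 characters.
REQUEST_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def is_valid_request_id(value):
    """True if value fits the documented x-request-id format."""
    return REQUEST_ID_RE.fullmatch(value) is not None
```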
curl -H "x-request-id: job-2026-01-batch-3" \
-H "x-api-key: YOUR_API_KEY" \
-X POST https://structural-diff-engine.onrender.com/v1/diff -d '{...}'
Get API Access
The API is available to agencies and teams during the testing phase. Keys are provisioned individually. Reach out to receive your key and start integrating.