Config Parameters
Know exactly which flag to flip and why.
The default config works well for most transcripts. These parameters exist to handle specific annotation workflows: Arabic QA, positional-only comparison, metadata column exclusion, and structural detection control. Each section below shows the exact input/output difference a flag produces.
When to customize the config
Start with no config. Run a diff and inspect the results. Only reach for a config flag when you see a specific problem:
| Flag | Reach for it when |
|---|---|
| stripDiacritics | Arabic transcripts where diacritic additions inflate MODIFIED count |
| simpleMode | Pure content QA — you know the annotator made no structural changes |
| ignoreColNames | Metadata columns (confidence score, category) differ between QA layers but aren't the comparison target |
| positionalMode | Debugging unexpected alignments, or processing very large uniform datasets |
| enableSplits: false | Project guidelines prohibit splits at this annotation layer |
| enableInlineDiff: false | Large batches where only statuses and scores are needed — suppress transcript diff computation for speed |
| structuralTransforms | Rows have ID prefixes, URLs, or phone formats that vary between layers but aren't part of the transcript content |

simpleMode
By default the engine runs an 8-pass alignment algorithm that matches rows by similarity across the full transcript, even if they moved positions. simpleMode disables this: row 0 is compared to row 0, row 1 to row 1, strictly by position.
Default (simpleMode: false): the engine detects that one long segment was split into two and labels it SPLIT.
Request
{
  "original": [
    { "speaker": "Candidate", "words": "For new users we relied on content-based filtering. For new items we used metadata clustering to find similar items." }
  ],
  "reworked": [
    { "speaker": "Candidate", "words": "For new users, we relied on content-based filtering." },
    { "speaker": "Candidate", "words": "For new items, we used metadata clustering to find similar items." }
  ]
}

/* config: {} (default) */
API Result
{
  "results": [
    {
      "status": "SPLIT",
      "notes": "split into 2 rows",
      "originalRow": { "words": "For new users we relied on content-based filtering..." },
      "reworkedRows": [
        { "words": "For new users, we relied on content-based filtering." },
        { "words": "For new items, we used metadata clustering..." }
      ]
    }
  ]
}

With simpleMode: true: the engine compares row 0 to row 0 (finds a mismatch → MODIFIED) and sees an extra row in reworked (→ ADDED). The structural intent is lost, but every character change is visible.
{
  "results": [
    {
      "status": "MODIFIED",
      "notes": "words changed",
      "snapData": ["Candidate", "For new users we relied on content-based filtering..."],
      "currData": ["Candidate", "For new users, we relied on content-based filtering."],
      "transcriptDiff": [
        { "type": "EQUAL", "text": "For new users" },
        { "type": "INSERT", "text": "," },
        { "type": "EQUAL", "text": " we relied on content-based filtering." }
      ]
    },
    {
      "status": "ADDED",
      "notes": "new row in reworked",
      "currData": ["Candidate", "For new items, we used metadata clustering..."]
    }
  ]
}

Use simpleMode: true when you're confident the annotator made zero structural changes — only text corrections and punctuation. It is also useful when you want raw character diffs without any structural interpretation.
enableSplits / enableMerges
Finer-grained alternatives to simpleMode. Instead of disabling all structural detection, you can disable only one type.
enableSplits: false — SPLIT candidates are instead emitted as MODIFIED (truncated match) + ADDED (leftover rows). Use when your annotation guidelines at this layer prohibit splits, so surfacing them as individual changes is more actionable.
enableMerges: false — MERGE candidates become MODIFIED (first original row) + DELETED (absorbed originals). Use when merges are not permitted at this layer and you want each deleted row flagged explicitly.
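The SPLIT decomposition can be sketched as a post-processing step. This is an illustration only: the field names (originalRow, reworkedRows, status) are borrowed from the SPLIT example earlier, and the exact output shape the engine emits with enableSplits: false is an assumption here.

```python
def decompose_split(split_row):
    # Mirrors the documented decomposition: the first reworked row
    # pairs with the original as MODIFIED (truncated match); every
    # remaining reworked row surfaces as ADDED.
    first, *rest = split_row["reworkedRows"]
    decomposed = [{
        "status": "MODIFIED",
        "originalRow": split_row["originalRow"],
        "reworkedRow": first,
    }]
    decomposed += [{"status": "ADDED", "reworkedRow": r} for r in rest]
    return decomposed

decompose_split({
    "originalRow": {"words": "a. b."},
    "reworkedRows": [{"words": "a."}, {"words": "b."}],
})
```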
{
  "config": {
    "enableSplits": false,
    "enableMerges": true
  }
}

These flags are most useful in multi-layer QA pipelines where each layer has its own permitted operations. Disabling an operation you don't expect to see makes unexpected structural changes surface as distinct ADDED/DELETED flags instead of being silently grouped.
stripDiacritics
Before comparison, the engine normalises Arabic and accented characters by stripping diacritical marks. For Arabic this includes harakat (short vowels: fathah, dammah, kasrah), tanwin, shadda, sukun, and hamza variants (U+064B–U+065F, U+0670). For Latin text it strips combining accent characters (U+0300–U+036F). This flag is ON by default.
Common Arabic QA scenario: an annotator normalises the text per written Arabic style guides (adding harakat, normalising hamza). With the default (stripDiacritics: true), only lexical and segmentation differences are counted. Override to false when diacritical accuracy is itself a QA criterion.
Default behavior (stripDiacritics: true — no config needed): مرحبا → مرحباً is UNCHANGED because diacritical marks are stripped before comparison, making the stripped forms identical.
Request
{
  "original": [{ "speaker": "المذيع", "transcript": "مرحبا بكم في نشرة الاخبار" }],
  "reworked": [{ "speaker": "المذيع", "transcript": "مرحباً بكم في نشرة الأخبار" }]
}

/* config: {} (default — stripDiacritics: true) */
API Result
{ "status": "UNCHANGED", "notes": "high similarity match (diacritics stripped)" }

With stripDiacritics: false (override): مرحبا → مرحباً is MODIFIED because the ً mark is no longer stripped — raw character differences are flagged.
{
  "status": "MODIFIED", "notes": "transcript changed",
  "transcriptDiff": [
    { "type": "EQUAL", "text": "مرحب" },
    { "type": "DELETE", "text": "ا" },
    { "type": "INSERT", "text": "اً" },
    { "type": "EQUAL", "text": " بكم في نشرة ال" },
    { "type": "DELETE", "text": "ا" },
    { "type": "INSERT", "text": "أ" },
    { "type": "EQUAL", "text": "خبار" }
  ]
}

{ "config": { "stripDiacritics": false } }

The default (true) works for most Arabic transcript QA. Override with stripDiacritics: false only when you are explicitly verifying that an annotator correctly added or removed diacritical marks — i.e., when diacritical precision is a tracked quality criterion.
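If you want to preview the normalisation client-side, here is a minimal Python sketch. It removes only the code-point ranges the docs list; the hamza mapping is an assumption inferred from the UNCHANGED example (ا vs أ matching), not the engine's exact table.

```python
import re

# Ranges the docs say are stripped before comparison: Arabic harakat,
# tanwin, shadda, sukun (U+064B-U+065F), superscript alef (U+0670),
# and Latin combining accents (U+0300-U+036F).
_MARKS = re.compile(r"[\u064B-\u065F\u0670\u0300-\u036F]")

# ASSUMPTION: hamza-variant normalisation approximated by mapping
# alef-with-hamza forms back to bare alef.
_HAMZA = str.maketrans("أإآ", "ااا")

def strip_diacritics(text: str) -> str:
    return _MARKS.sub("", text).translate(_HAMZA)

strip_diacritics("مرحباً بكم في نشرة الأخبار")
```

With this sketch, the reworked transcript above normalises back to the original, which is why the default config reports UNCHANGED.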
positionalMode
Skips the similarity-based alignment algorithm entirely. Each original row at index N is compared to the reworked row at index N. If the arrays are different lengths, extra rows are ADDED or DELETED.
Default: if an annotator corrected a sentence and it moved from position 4 to position 6, the engine will still match them (MODIFIED). With positionalMode, row 4 in original is compared to row 4 in reworked — which may be a completely different sentence — producing a confusing MODIFIED with a large diff.
Use for debugging: run positionalMode and compare it to default results to understand which rows the alignment matched. Also useful for very uniform datasets (e.g., word-by-word alignment ground truth) where positional matching is the ground truth.
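The index-by-index rule is simple enough to emulate client-side. A minimal Python sketch, assuming rows compare with plain equality (the real engine still diffs the text of matched rows):

```python
from itertools import zip_longest

def positional_pairs(original, reworked):
    # Pair rows strictly by index; leftover rows on the longer side
    # become ADDED (reworked-only) or DELETED (original-only).
    results = []
    for orig, curr in zip_longest(original, reworked):
        if orig is None:
            results.append(("ADDED", curr))
        elif curr is None:
            results.append(("DELETED", orig))
        elif orig == curr:
            results.append(("UNCHANGED", orig))
        else:
            results.append(("MODIFIED", (orig, curr)))
    return results

positional_pairs(["a", "b"], ["a", "c", "d"])
```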
{ "config": { "positionalMode": true } }

ignoreColNames
An array of column names to exclude from MODIFIED detection. A row is only MODIFIED if a non-ignored column changed. The ignored columns are still included in the response (snapData / currData) but do not trigger MODIFIED status.
Scenario: your data has a confidence column set by the annotation tool. QA Layer 1 might record confidence: 0.88 while QA Layer 2 records confidence: 0.91 for the same utterance. Without ignoreColNames, every such row is MODIFIED even if the transcript is identical. With ignoreColNames: ["confidence"], those rows are UNCHANGED as expected.
Without ignoreColNames
Request
{
  "original": [
    { "transcript": "The patient reports mild chest pain.", "speaker": "Doctor", "confidence": 0.88, "category": "symptom" }
  ],
  "reworked": [
    { "transcript": "The patient reports mild chest pain.", "speaker": "Doctor", "confidence": 0.94, "category": "complaint" }
  ]
}

/* config: {} */
API Result
{ "status": "MODIFIED", "notes": "confidence, category changed" }

With ignoreColNames
{
  // request: { "config": { "ignoreColNames": ["confidence", "category"] } }
  "status": "UNCHANGED", "notes": "exact match (after ignoring confidence, category)"
}

Use whenever your schema includes metadata columns that change independently of transcript content: confidence scores, reviewer IDs, batch numbers, internal category tags, auto-generated timestamps.
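The column-filtering rule is straightforward to express. A client-side sketch (an approximation of the documented behaviour, not the engine's code):

```python
def is_modified(orig_row, curr_row, ignore_cols=()):
    # A row counts as MODIFIED only if a non-ignored column differs;
    # ignored columns still travel with the row, they just don't vote.
    keys = set(orig_row) | set(curr_row)
    return any(
        orig_row.get(k) != curr_row.get(k)
        for k in keys
        if k not in ignore_cols
    )

row_a = {"transcript": "same text", "confidence": 0.88}
row_b = {"transcript": "same text", "confidence": 0.94}
is_modified(row_a, row_b, ignore_cols=("confidence",))
```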
enableInlineDiff
Controls whether the engine computes a character-level inline diff for MODIFIED rows. When enabled (default), each MODIFIED row in the response includes a transcriptDiff array that you can use to render highlighted changes in your review UI. Disabling it skips the diff computation entirely.
With enableInlineDiff: false, MODIFIED rows still appear in results (status and notes are unchanged), but the transcriptDiff field is absent. Use this when you only need status counts and scores and want to reduce response payload size.
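For responses that were produced with the diff enabled, the slimmer shape is easy to derive client-side. A sketch; the only field name assumed is transcriptDiff from this section:

```python
def slim_results(results):
    # Drop the transcriptDiff field from each row, leaving status,
    # notes, and any score fields untouched (the client-side analogue
    # of requesting enableInlineDiff: false).
    return [
        {k: v for k, v in row.items() if k != "transcriptDiff"}
        for row in results
    ]

slim_results([{
    "status": "MODIFIED",
    "notes": "words changed",
    "transcriptDiff": [{"type": "INSERT", "text": ","}],
}])
```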
{ "config": { "enableInlineDiff": false } }

Each transcriptDiff segment has the shape { type: "EQUAL" | "INSERT" | "DELETE", text: string }. Reconstruct the original by joining all non-INSERT spans; reconstruct the reworked by joining all non-DELETE spans. Note: type values are UPPERCASE.
// transcriptDiff format — type is UPPERCASE, field is "text"
[
  { "type": "EQUAL", "text": "Hello " },
  { "type": "DELETE", "text": "world" },
  { "type": "INSERT", "text": "there" }
]

Disable (enableInlineDiff: false) when processing large batches where you only need CER/WER/SER scores and status counts, not the per-character diff. This reduces both server CPU and network payload. Re-enable for interactive review UIs where editors need to see exactly what changed.
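The reconstruction rule above can be expressed directly. A minimal Python sketch using the documented segment shape:

```python
def reconstruct(diff, side):
    # side="original" keeps EQUAL + DELETE spans;
    # side="reworked" keeps EQUAL + INSERT spans.
    skip = "INSERT" if side == "original" else "DELETE"
    return "".join(seg["text"] for seg in diff if seg["type"] != skip)

diff = [
    {"type": "EQUAL", "text": "Hello "},
    {"type": "DELETE", "text": "world"},
    {"type": "INSERT", "text": "there"},
]
reconstruct(diff, "original")   # -> "Hello world"
reconstruct(diff, "reworked")   # -> "Hello there"
```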
structuralTransforms
An array of find/replace rules applied to the transcript text BEFORE the similarity scoring algorithm runs. This lets the engine align rows that differ only in predictable, non-content prefixes or formats (e.g., ID tags, URL prefixes, phone number formats).
Each rule: { find: string, replace: string, isRegex: boolean }. Plain string rules do a literal find-replace. Regex rules (isRegex: true) support standard JavaScript regex syntax (case-insensitive). Up to 20 rules per request.
{
  "config": {
    "structuralTransforms": [
      { "find": "^ID-\\d+:\\s*", "replace": "", "isRegex": true },
      { "find": "https?://[^\\s]+", "replace": "[URL]", "isRegex": true }
    ]
  }
}

Use when your original and reworked data share a common schema but rows include auto-generated IDs, batch prefixes, or formatting that the annotator changed as part of their work. Without transforms, the alignment algorithm treats rows with different prefixes as entirely different — potentially producing false ADDED/DELETED pairs instead of MODIFIED.
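To preview what the engine will compare, you can apply the same rules client-side. A sketch that applies rules in order; the engine uses JavaScript regex syntax, and the two example patterns above happen to be valid in Python's re as well:

```python
import re

def apply_transforms(text, rules):
    # Apply each find/replace rule in order. Regex rules are
    # case-insensitive per the docs; plain rules are literal.
    for rule in rules:
        if rule.get("isRegex"):
            text = re.sub(rule["find"], rule["replace"], text,
                          flags=re.IGNORECASE)
        else:
            text = text.replace(rule["find"], rule["replace"])
    return text

rules = [
    {"find": r"^ID-\d+:\s*", "replace": "", "isRegex": True},
    {"find": r"https?://[^\s]+", "replace": "[URL]", "isRegex": True},
]
apply_transforms("ID-042: see https://example.com/doc", rules)
# -> "see [URL]"
```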
Expert similarity & timing thresholds
These eight numbers control the matching algorithm's sensitivity. The defaults are tuned for standard-length transcription segments (5–30 seconds, 10–60 words). Adjust them only when you've looked at the raw similarity scores and know the default thresholds produce wrong matches.
| Parameter | Type | Default | When to adjust · Effect |
|---|---|---|---|
| SIM_CONFIDENT | number (0–1) | 0.70 | Two rows this similar or closer are a definite match — committed in the high-similarity pass. Raise to require very close text matches before committing. Lower if you have very short utterances that can't achieve high similarity. |
| SIM_MODERATE | number (0–1) | 0.40 | Plausible match — accepted when timing also confirms. Lower if annotators rewrite sentences significantly while keeping the same meaning. |
| SIM_WEAK | number (0–1) | 0.20 | Tentative match — only accepted with very strong timing evidence. Lower to 0.10–0.15 for very short segments (single words, disfluencies) that can't achieve 0.20 similarity. |
| TIME_EXACT_TOL | number (s) | 0.05 | Timestamps ≤ this apart count as an exact match. Increase to 0.5–1.0 if your annotation tool rounds timestamps to whole seconds. |
| TIME_FUZZY_TOL | number (s) | 2.5 | Timestamps ≤ this apart count as a fuzzy match. Increase when annotators shift segment boundaries significantly. |
| SPLIT_COMBINED_MIN | number (0–1) | 0.35 | Minimum combined text score to accept a SPLIT detection. Raise to reduce false splits. Lower if your content has very short target segments. |
| MERGE_COMBINED_MIN | number (0–1) | 0.65 | Minimum combined text score to accept a MERGE detection. Raise to reduce false merges. Lower for datasets with many legitimate merges. |
| CHAR_DIFF_LIMIT | integer (100–50000) | 1500 | Maximum combined character length before falling back to a word-level diff. Increase for batches with very long segments (300-word utterances). Decrease to force word-level diffs for all segments and save CPU on massive batches. |
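To see how the similarity and timing thresholds interact, here is a hypothetical single-function sketch. The real engine runs an 8-pass alignment, so this is an illustration of the knobs, not the algorithm; the pass-to-threshold pairing is an assumption based on the table's descriptions.

```python
# ASSUMPTION: collapses the documented thresholds into one decision.
def match_confidence(sim, time_delta,
                     SIM_CONFIDENT=0.70, SIM_MODERATE=0.40, SIM_WEAK=0.20,
                     TIME_EXACT_TOL=0.05, TIME_FUZZY_TOL=2.5):
    if sim >= SIM_CONFIDENT:
        return "confident"   # committed in the high-similarity pass
    if sim >= SIM_MODERATE and time_delta <= TIME_FUZZY_TOL:
        return "moderate"    # plausible text, timing confirms
    if sim >= SIM_WEAK and time_delta <= TIME_EXACT_TOL:
        return "weak"        # tentative text, near-exact timing
    return "no-match"

match_confidence(0.25, 0.03)
```

Lowering SIM_WEAK to 0.15, for example, widens only the last gate, which is why it helps with single-word segments that can never score 0.20.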
{
  "config": {
    "SIM_WEAK": 0.15,
    "TIME_EXACT_TOL": 1.0,
    "SPLIT_COMBINED_MIN": 0.70
  }
}