Structural Diff API: Complete worked tutorial

A real transcript. Real edits. A real diff.

This tutorial uses a real podcast transcript with non-standard column names to show the complete integration loop: adapting the data format, calling the API, and interpreting every field in the response. The use case is a QA coordinator reviewing an annotator's edits before accepting a batch.

The scenario

AI annotation projects for transcription and translation usually have several QA layers. Each layer is a handoff point where a human reviews or edits the output of the previous layer. The coordinator needs an objective record of what changed at each handoff.

Layer 0: AI output

The platform exports an AI-generated transcript. This is the original array.

Layer 1: annotator edit

An annotator edits the transcript to follow the project style guide (punctuation, number formatting, verbatim corrections, speaker IDs, timestamps). This is the reworked array.

Layer 2: QA coordinator

The coordinator calls POST /v1/diff with both arrays. The diff report is attached to the batch before it is delivered to the client.

In this tutorial, the original is a 9-row podcast transcript about machine translation (EP101). The reworked version reflects typical annotation corrections: consistent punctuation, number formatting, one structural split, and one structural merge.

The data

The sample transcript uses the column names exported by a typical annotation platform. These do not match the API's standard field names:

text

Platform column   API field
─────────────────────────────
Talker          → speaker
BeginTime       → start_time
FinishTime      → end_time
Utterance       → transcript
NoiseTag        → non_speech_events
Mood            → emotion
Show            (passed through unchanged)
Subject         (passed through unchanged)

The API needs transcript, and optionally speaker, start_time, and end_time, to run the alignment algorithm. Unknown fields (Show, Subject) are passed through unchanged.

Adapting the column names

Remap the keys once before calling the API. You only need to map the fields the engine uses; everything else passes through automatically.

JavaScript

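// Platform column name -> API field name
const KEY_MAP = {
  Talker:     'speaker',
  BeginTime:  'start_time',
  FinishTime: 'end_time',
  Utterance:  'transcript',
  NoiseTag:   'non_speech_events',
  Mood:       'emotion',
}

// Rename mapped keys; unmapped keys (Show, Subject) pass through unchanged.
const adapt = row =>
  Object.fromEntries(
    Object.entries(row).map(([k, v]) => [KEY_MAP[k] ?? k, v])
  )

// originalData / reworkedData: the row arrays exported from the platform.
const originalAdapted = originalData.map(adapt)
const reworkedAdapted = reworkedData.map(adapt)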

Python

KEY_MAP = {
    "Talker":    "speaker",
    "BeginTime": "start_time",
    "FinishTime": "end_time",
    "Utterance": "transcript",
    "NoiseTag":  "non_speech_events",
    "Mood":      "emotion",
}

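def adapt(row):
    """Rename mapped keys; unmapped keys (Show, Subject) pass through unchanged."""
    return {KEY_MAP.get(k, k): v for k, v in row.items()}

# original_data / reworked_data: the row lists exported from the platform.
original_adapted = [adapt(r) for r in original_data]
reworked_adapted = [adapt(r) for r in reworked_data]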
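Because transcript is the one field the API needs, it can be worth checking the adapted rows before sending them. The following is a minimal sketch of such a pre-flight check. The function name find_missing_fields and its message format are hypothetical and not part of the Structural Diff API.

```python
# Hypothetical pre-flight check; not part of the Structural Diff API.
REQUIRED_FIELDS = ("transcript",)


def find_missing_fields(rows, required=REQUIRED_FIELDS):
    """Return one message per adapted row that lacks a required field."""
    problems = []
    for index, row in enumerate(rows):
        for field in required:
            if field not in row:
                problems.append(f"row {index}: missing '{field}'")
    return problems


sample = [
    # Correctly adapted row; the extra Show key is allowed to pass through.
    {"speaker": "James Park", "transcript": "Thanks Sarah", "Show": "EP101"},
    # Row still using platform column names, as if adapt() had been skipped.
    {"Talker": "Elena Rossi", "Utterance": "Same here"},
]
print(find_missing_fields(sample))
```

A row that still carries the platform key Utterance is reported, which usually means the adapter was not applied to that array.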

The request

After adapting the column names, the full POST request body looks like this (abbreviated for readability):

Original array (9 rows — AI output)

json

[
  { "speaker": "Sarah Mitchell", "start_time": 0.00,  "end_time": 4.20,  "transcript": "Welcome to The Language Lab, the podcast where we break down how AI is changing the way we communicate.",                          "non_speech_events": "[intro jingle]", "emotion": "warm"       },
  { "speaker": "Sarah Mitchell", "start_time": 4.20,  "end_time": 8.00,  "transcript": "Today we have two fantastic guests joining us to talk about machine translation and quality assurance.",                             "non_speech_events": "",               "emotion": "enthusiastic" },
  { "speaker": "James Park",     "start_time": 8.00,  "end_time": 11.50, "transcript": "Thanks Sarah glad to be here I have been looking forward to this conversation for weeks.",                                           "non_speech_events": "",               "emotion": "friendly"   },
  { "speaker": "Elena Rossi",    "start_time": 11.50, "end_time": 15.00, "transcript": "Same here this is such an important topic right now especially with how fast the field is evolving.",                                 "non_speech_events": "",               "emotion": "engaged"    },
  { "speaker": "Sarah Mitchell", "start_time": 15.00, "end_time": 19.80, "transcript": "James lets start with you. Your team recently published a paper on neural machine translation for low resource languages.",           "non_speech_events": "",               "emotion": "curious"    },
  { "speaker": "James Park",     "start_time": 19.80, "end_time": 26.50, "transcript": "Yes so our main finding was that back translation combined with careful data augmentation can boost BLEU scores by up to twelve points for languages with under fifty thousand parallel sentences.", "non_speech_events": "", "emotion": "analytical" },
  { "speaker": "James Park",     "start_time": 26.50, "end_time": 30.00, "transcript": "The trick is selecting the right seed data and not just throwing everything at the model.",                                          "non_speech_events": "",               "emotion": "technical"  },
  { "speaker": "Elena Rossi",    "start_time": 30.00, "end_time": 34.50, "transcript": "That resonates with our work at the localization lab where we focus on Arabic dialect adaptation.",                                   "non_speech_events": "",               "emotion": "thoughtful" },
  { "speaker": "Elena Rossi",    "start_time": 34.50, "end_time": 39.00, "transcript": "Standard Arabic models completely fail when you feed them Tunisian or Moroccan dialect input.",                                       "non_speech_events": "",               "emotion": "concerned"  }
]

Reworked array (9 rows — annotator post-edit)

json

[
  { "speaker": "Sarah Mitchell", "start_time": 0.00,  "end_time": 4.20,  "transcript": "Welcome to The Language Lab, the podcast where we break down how AI is changing the way we communicate.",                 "non_speech_events": "[intro jingle]", "emotion": "warm"        },
  { "speaker": "Sarah Mitchell", "start_time": 4.20,  "end_time": 8.00,  "transcript": "Today, we have two fantastic guests joining us to talk about machine translation and quality assurance.",                  "non_speech_events": "",               "emotion": "enthusiastic" },
  { "speaker": "James Park",     "start_time": 8.00,  "end_time": 11.50, "transcript": "Thanks, Sarah. Glad to be here — I've been looking forward to this conversation for weeks.",                              "non_speech_events": "",               "emotion": "friendly"    },
  { "speaker": "Elena Rossi",    "start_time": 11.50, "end_time": 15.00, "transcript": "Same here. This is such an important topic right now, especially with how fast the field is evolving.",                    "non_speech_events": "",               "emotion": "engaged"     },
  { "speaker": "Sarah Mitchell", "start_time": 15.00, "end_time": 19.80, "transcript": "James, let's start with you. Your team recently published a paper on neural machine translation for low-resource languages.", "non_speech_events": "",               "emotion": "curious"     },
  { "speaker": "James Park",     "start_time": 19.80, "end_time": 23.20, "transcript": "Yes, so our main finding was that back-translation combined with careful data augmentation can boost BLEU scores by up to 12 points.", "non_speech_events": "", "emotion": "analytical"  },
  { "speaker": "James Park",     "start_time": 23.20, "end_time": 26.50, "transcript": "This holds for languages with under 50,000 parallel sentences.",                                                           "non_speech_events": "",               "emotion": "analytical"  },
  { "speaker": "James Park",     "start_time": 26.50, "end_time": 30.00, "transcript": "The trick is selecting the right seed data and not just throwing everything at the model.",                                "non_speech_events": "",               "emotion": "technical"   },
  { "speaker": "Elena Rossi",    "start_time": 30.00, "end_time": 39.00, "transcript": "That resonates with our work at the localization lab where we focus on Arabic dialect adaptation — standard Arabic models completely fail when you feed them Tunisian or Moroccan dialect input.", "non_speech_events": "", "emotion": "thoughtful" }
]

curl -X POST https://structural-diff-engine.onrender.com/v1/diff \
  -H "Content-Type: application/json" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "x-request-id: batch-ep101-layer1-layer2-qa" \
  -d '{
    "original": <originalAdapted array>,
    "reworked": <reworkedAdapted array>
  }'

For the full set (9 original rows, and 9 reworked rows once the split and the merge cancel out), the body is about 8 KB, far below the 5 MB maximum. Add x-request-id to tie this call to the batch ID in your own system.
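The same request can be assembled in Python. This is a sketch using only the standard library. It builds the request object without sending it, and it assumes the 5 MB cap is counted in binary megabytes, which the tutorial does not specify. The helper name build_diff_request is hypothetical.

```python
import json
import urllib.request

API_URL = "https://structural-diff-engine.onrender.com/v1/diff"
# Assumption: the 5 MB limit is measured as 5 * 1024 * 1024 bytes.
MAX_BODY_BYTES = 5 * 1024 * 1024


def build_diff_request(original, reworked, api_key, request_id):
    """Build (but do not send) the POST /v1/diff request."""
    payload = {"original": original, "reworked": reworked}
    body = json.dumps(payload).encode("utf-8")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError(f"body is {len(body)} bytes, above the 5 MB limit")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,
            "x-request-id": request_id,
        },
    )


req = build_diff_request(
    original=[{"transcript": "Today we have two fantastic guests."}],
    reworked=[{"transcript": "Today, we have two fantastic guests."}],
    api_key="YOUR_API_KEY",
    request_id="batch-ep101-layer1-layer2-qa",
)
print(req.get_method(), req.full_url, len(req.data), "bytes")
```

To send it, pass req to urllib.request.urlopen and parse the response body with json.load. Keeping the build step separate makes the size check and headers easy to test without network access.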

Walking through the response

The results array contains an entry for each original row, plus trace entries for the source rows of any merge. Here is what each row returns and why:

json

{
  "status": "success",
  "requestId": "batch-ep101-layer1-layer2-qa",
  "data": {
    "results": [
      { "status": "UNCHANGED", "notes": "exact match",          ... },  // row 0 — Sarah intro
      { "status": "MODIFIED",  "notes": "transcript changed",   ... },  // row 1 — comma added
      { "status": "MODIFIED",  "notes": "transcript changed",   ... },  // row 2 — James punctuation
      { "status": "MODIFIED",  "notes": "transcript changed",   ... },  // row 3 — Elena punctuation
      { "status": "MODIFIED",  "notes": "transcript changed",   ... },  // row 4 — Sarah question
      { "status": "SPLIT",     "notes": "split into 2 rows",    ... },  // row 5 — James finding (split)
      { "status": "UNCHANGED", "notes": "exact match",          ... },  // row 6 — James trick
      { "status": "MERGED",    "notes": "merged from 2 rows",   ... },  // rows 7+8 — Elena merged
      { "status": "MERGED",    "notes": "Source row 1/2 ...",   ... },  // ← trace entry, skip in counts
      { "status": "MERGED",    "notes": "Source row 2/2 ...",   ... }   // ← trace entry, skip in counts
    ],
    "scores": {
      "CER": 0.09,
      "WER": 0.14,
      "SegER": 0.22,
      "SER": 0.67,
      "cerT": 0.09,
      "werT": 0.14
    },
    "composite": {
      "score": 3.9,
      "grade": "B",
      "label": "Good"
    },
    "meta": {
      "originalRows": 9,
      "reworkedRows": 9
    }
  }
}

UNCHANGED

Row 0: Sarah's intro

Exact match. The annotator did not touch this row.

MODIFIED

Row 1: "Today we have..."

The annotator added a comma after "Today". transcriptDiff highlights the insertion.

MODIFIED

Row 2: James's thanks

Multiple punctuation corrections, plus the contraction "I have" → "I've". The whole sentence structure was edited to match the verbatim style.

MODIFIED

Row 3: Elena's "Same here"

The annotator added a period after "Same here" and a comma before "especially". A classic verbatim punctuation standard.

MODIFIED

Row 4: Sarah's question

A comma after "James", the apostrophe restored in "let's", and a hyphen in "low-resource". Three distinct corrections in one row.

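SPLIT

Row 5: James's long finding

Original: 31 words in a single sentence. Reworked: split into two segments at a natural boundary, with "twelve" → "12" and "fifty thousand" → "50,000" as well. As a structural change, the split raises SegER.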

UNCHANGED

Row 6: James's "The trick"

Already well-formed. The annotator left it unchanged.

MERGED

Rows 7+8: Elena's two statements

The annotator merged Elena's two consecutive lines into a single segment joined by an em dash. The two original rows are absorbed.
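When tallying statuses for a batch report, the trace entries that follow a MERGED result should be left out of the counts, as the comments in the sample response note. The sketch below assumes trace entries can be recognized by notes starting with "Source row", which is how they appear in this tutorial's sample; the real payload may offer a more reliable marker. The function name count_statuses is hypothetical.

```python
from collections import Counter


def count_statuses(results):
    """Count result statuses, skipping merge trace entries.

    Assumption: a trace entry is one whose notes start with "Source row",
    as in the sample response shown in this tutorial.
    """
    return Counter(
        entry["status"]
        for entry in results
        if not entry.get("notes", "").startswith("Source row")
    )


# Status and notes fields copied from the sample response (other fields omitted).
results = [
    {"status": "UNCHANGED", "notes": "exact match"},
    {"status": "MODIFIED", "notes": "transcript changed"},
    {"status": "MODIFIED", "notes": "transcript changed"},
    {"status": "MODIFIED", "notes": "transcript changed"},
    {"status": "MODIFIED", "notes": "transcript changed"},
    {"status": "SPLIT", "notes": "split into 2 rows"},
    {"status": "UNCHANGED", "notes": "exact match"},
    {"status": "MERGED", "notes": "merged from 2 rows"},
    {"status": "MERGED", "notes": "Source row 1/2 ..."},
    {"status": "MERGED", "notes": "Source row 2/2 ..."},
]
counts = count_statuses(results)
print(dict(counts))
# {'UNCHANGED': 2, 'MODIFIED': 4, 'SPLIT': 1, 'MERGED': 1}
```

Without the filter, MERGED would be counted three times for a single merge event.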

Reading the scores

The scores object gives you a quantitative summary of the whole batch. Here is how to interpret the numbers in the context of this tutorial:

CER ≈ 0.09

About 9% of characters changed. Low, and expected: punctuation and number corrections are short character-level changes inside long segments.

WER ≈ 0.14

About 14% of words changed. This is higher than CER because a word substitution ("twelve" → "12") counts as a full word change, and structural changes touch every segment boundary.

SegER ≈ 0.22

About 22% of the original rows changed structurally (1 split + 1 merge out of 9 rows = 2/9). Typical for a first annotation pass over AI output.

SER ≈ 0.67

About 67% of comparable rows (UNCHANGED + MODIFIED) contain at least one edit: 4 MODIFIED out of 6 comparable rows. It measures how often rows are edited, independently of structural events.

Composite: B / Good (3.9)

The composite grade is B, "Good" (score ≈ 3.9). The weighted formula rewards a low edit rate and penalizes structural changes. A grade of B tells the coordinator that the annotator made real corrections (rather than just approving), while the overall volume stayed under control.

Interpreting grades in an annotation QA context: A = near-perfect AI output. B = healthy QA pass, expected corrections. C = substantial editing required. D/F = the batch may need a full re-annotation.
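The SegER and SER explanations above can be reproduced from the status counts. The formulas in this sketch are inferred from the worked numbers in this tutorial (2/9 and 4/6). They are an assumption about how the engine computes the two rates, not its documented definition, and the function name structural_rates is hypothetical.

```python
def structural_rates(counts, original_rows):
    """Recompute SegER and SER from status counts.

    Assumption: SegER = structural events / original rows, and
    SER = MODIFIED / (UNCHANGED + MODIFIED), as the worked figures suggest.
    """
    seg_er = (counts["SPLIT"] + counts["MERGED"]) / original_rows
    comparable = counts["UNCHANGED"] + counts["MODIFIED"]
    ser = counts["MODIFIED"] / comparable if comparable else 0.0
    return round(seg_er, 2), round(ser, 2)


# Counts for this tutorial's batch, with merge trace entries excluded.
counts = {"UNCHANGED": 2, "MODIFIED": 4, "SPLIT": 1, "MERGED": 1}
print(structural_rates(counts, original_rows=9))  # (0.22, 0.67)
```

Recomputing the rates locally is a cheap sanity check that a report's status counts and scores belong to the same batch.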

Complete worked tutorial · Structural Diff API · Developed by Mohamed Yaakoubi
