Leveraging AI to Categorize Sensitive Refugee Call Transcripts: A Step-by-Step Approach
Learn how we built an AI system to categorize sensitive refugee call transcripts using advanced techniques like Knowledge Distillation and multilingual model training.

Data Strategist | UN & Enterprise AI

Introduction: The Humanitarian Crisis Meets AI Innovation
In the world of humanitarian work, time is often a matter of life and death. Every day, thousands of refugees reach out for help—seeking medical assistance, cash aid, resettlement support, employment opportunities, or shelter. These requests pour into helplines, creating an overwhelming volume of transcripts that need to be categorized quickly and accurately.
But here's the catch: handling these sensitive transcripts isn't just about speed—it's about precision, privacy, and empathy. Refugee data is deeply personal, often multilingual (Arabic with mixed English), and frequently spans multiple categories like "Cash Assistance" and "Medical Help." Traditional manual methods are slow, error-prone, and don't scale well. Enter AI-powered classification systems—a game-changer in streamlining operations while safeguarding human dignity.
The Challenges We Faced
Before diving into the solution, let's acknowledge the hurdles:
- Data Privacy: Refugee data contains personally identifiable information (PII). Mishandling it could lead to breaches of trust and legal consequences.
- Language Complexity: Most transcripts are in Arabic, with occasional English phrases. This requires models capable of understanding mixed-language inputs.
- Multiclass Categorization: Many transcripts fall under multiple labels simultaneously, making binary classification approaches insufficient.
- Model Efficiency: The system needs to be lightweight, accurate, and scalable to handle real-world demands without breaking the bank.
Step 1: Preparing and Cleaning Data – Laying the Foundation
1.1 Collecting Real Transcripts
We started with 1,200 anonymized refugee call transcripts, each labeled with one or more predefined categories, including:
- Cash Assistance
- Medical Help
- Resettlement Requests
- Employment Issues
- Shelter and Housing
With more than 40 categories in total, the dataset was rich but limited in size. To build a robust model, we needed more data—but collecting additional real transcripts posed ethical and logistical challenges.
1.2 Generating Synthetic Data Using GPT-4
To augment our dataset, we turned to GPT-4 to generate 5,000 synthetic transcripts. This technique ensures data diversity while preserving privacy—a win-win scenario. Here's how we did it:
import openai
import json

openai.api_key = "YOUR_OPENAI_KEY"

def generate_synthetic_data(real_transcripts, num_samples=5000):
    """Uses GPT-4 to generate synthetic refugee transcripts."""
    prompt = f"Generate {num_samples} refugee call transcripts with categories:\n" \
             "1. Cash Assistance\n2. Medical Help\n3. Resettlement...\n\n"
    prompt += "\n".join([f"- {t['transcript']}" for t in real_transcripts[:10]])  # Use real examples for context
    response = openai.ChatCompletion.create(
        model="gpt-4o",
        messages=[{"role": "system", "content": prompt}],
        temperature=0.7
    )
    return json.loads(response["choices"][0]["message"]["content"])

# Generate & Save Synthetic Data
synthetic_data = generate_synthetic_data(real_transcripts)
with open("synthetic_transcripts.json", "w") as f:
    json.dump(synthetic_data, f)

1.3 Merging & Preprocessing Data
Once we combined real and synthetic transcripts, we cleaned the data by removing duplicates, noise, and any remaining sensitive information. The result? A pristine dataset ready for training.
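As an illustration of that cleaning pass, the sketch below deduplicates whitespace-normalized transcripts and masks simple PII patterns. It is a minimal stand-in: the `clean_transcripts` helper, the regexes, and the record layout are hypothetical, and a production pipeline would rely on a dedicated PII scrubber rather than two regexes.

```python
import re

def clean_transcripts(records):
    """Deduplicate transcripts and mask obvious PII (illustrative patterns only)."""
    seen, cleaned = set(), []
    for rec in records:
        text = re.sub(r"\s+", " ", rec["transcript"]).strip()   # normalize whitespace
        text = re.sub(r"\b\d{7,}\b", "[PHONE]", text)           # mask long digit runs
        text = re.sub(r"\b[\w.+-]+@[\w-]+\.\w+\b", "[EMAIL]", text)  # mask emails
        key = text.lower()
        if key not in seen:                                     # drop exact duplicates
            seen.add(key)
            cleaned.append({**rec, "transcript": text})
    return cleaned

merged = [
    {"transcript": "Call me at 07712345678 please", "categories": ["Cash Assistance"]},
    {"transcript": "Call me at  07712345678  please", "categories": ["Cash Assistance"]},
]
result = clean_transcripts(merged)
print(result)  # one record, phone number masked as [PHONE]
```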
Step 2: Soft Labeling Using AI – Knowledge Distillation at Work
Training a high-performing model from scratch would have been costly and time-consuming. Instead, we employed Knowledge Distillation, a powerful technique where a smaller "student" model learns from a larger "teacher" model.
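To make the distillation objective concrete, here is a minimal sketch: the student's logits are trained against the teacher's probabilistic (soft) labels with binary cross-entropy. The tensors are random stand-ins for illustration; in the real pipeline they would come from the BERT student's forward pass and GPT-4's soft labels.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
num_classes = 42

# Stand-ins: student logits (what the classifier outputs before the sigmoid)
# and teacher probabilities (the GPT-4 soft labels) for a batch of 4 transcripts.
student_logits = torch.randn(4, num_classes, requires_grad=True)
teacher_probs = torch.rand(4, num_classes)  # e.g. Cash Assistance: 0.85

# Distillation loss: binary cross-entropy against *soft* targets, so the
# student learns the teacher's uncertainty instead of hard 0/1 labels.
loss = nn.functional.binary_cross_entropy_with_logits(student_logits, teacher_probs)
loss.backward()
print(loss.item())
```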
2.1 Using GPT-4 to Generate Soft Labels
Rather than manually labeling all 6,200 transcripts (real + synthetic), we leveraged GPT-4 to assign soft labels—probabilistic predictions across multiple categories. For example, a transcript might receive scores like:
- Cash Assistance: 0.85
- Medical Help: 0.60
- Resettlement: 0.10
Here's the code snippet for generating soft labels:
def categorize_transcript(transcript):
    """Uses GPT-4 to assign soft labels (per-category confidence scores) to a transcript."""
    prompt = f"Categorize this refugee transcript:\n\n{transcript}\n\n" \
             "Return a JSON object mapping each category to a confidence score between 0 and 1."
    response = openai.ChatCompletion.create(
        model="gpt-4o",
        messages=[{"role": "system", "content": prompt}],
        temperature=0.3
    )
    return json.loads(response["choices"][0]["message"]["content"])

# Generate Soft Labels
soft_labels = [{"transcript": t, "categories": categorize_transcript(t)} for t in all_transcripts]
with open("soft_labels.json", "w") as f:
    json.dump(soft_labels, f)

Step 3: Training a Multilingual AI Model – Bridging Languages and Cultures
3.1 Selecting the Right Model
For multilingual tasks involving Arabic and English, we tested two Transformer-based models:
- distilbert-base-multilingual-cased: Lightweight but weaker on Arabic comprehension.
- bert-base-multilingual-cased: Heavier but excels in Arabic.
After rigorous testing, we chose bert-base-multilingual-cased for its superior performance in handling Arabic text.
3.2 Training the Model
We fine-tuned BERT for multi-label classification using PyTorch. Below is the architecture of our custom RefugeeCallClassifier:
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class RefugeeCallClassifier(nn.Module):
    def __init__(self, num_classes):
        super(RefugeeCallClassifier, self).__init__()
        self.bert = AutoModel.from_pretrained("bert-base-multilingual-cased")
        self.dropout = nn.Dropout(0.3)
        self.fc = nn.Linear(self.bert.config.hidden_size, num_classes)
        self.sigmoid = nn.Sigmoid()  # Multi-label classification

    def forward(self, input_ids, attention_mask):
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        x = self.dropout(outputs.last_hidden_state[:, 0, :])  # [CLS] token representation
        x = self.fc(x)
        return self.sigmoid(x)

# Initialize & Train Model
model = RefugeeCallClassifier(num_classes=42)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss_fn = nn.BCELoss()  # Binary cross-entropy for multi-label classification

Step 4: Model Deployment & Evaluation – From Lab to Real World
4.1 Evaluating Performance
Post-training, we evaluated the model on unseen transcripts. Here's how we implemented the evaluation function:
# Tokenizer matching the fine-tuned model; CATEGORIES holds the 42 label names
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model.eval()

def evaluate_model(transcript):
    """Uses trained model to predict refugee call categories."""
    inputs = tokenizer(
        transcript, padding="max_length", truncation=True, max_length=512, return_tensors="pt"
    )
    with torch.no_grad():
        outputs = model(inputs["input_ids"], inputs["attention_mask"])
    predictions = (outputs > 0.5).int().tolist()[0]
    return [CATEGORIES[i] for i, pred in enumerate(predictions) if pred == 1]
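Beyond checking individual predictions, multi-label performance on a held-out set is usually summarized with precision/recall/F1. A minimal sketch, assuming scikit-learn is available; the label matrices below are hypothetical stand-ins for real test data:

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical binary label matrices for 4 transcripts and 5 categories:
# rows = transcripts, columns = categories (1 = label applies).
y_true = np.array([[1, 0, 0, 1, 0],
                   [0, 1, 0, 0, 0],
                   [1, 1, 0, 0, 0],
                   [0, 0, 1, 0, 1]])
y_pred = np.array([[1, 0, 0, 1, 0],
                   [0, 1, 1, 0, 0],
                   [1, 0, 0, 0, 0],
                   [0, 0, 1, 0, 1]])

# Micro-averaging aggregates over all (transcript, category) decisions,
# which is the usual headline number for imbalanced multi-label data.
print("micro F1:", f1_score(y_true, y_pred, average="micro"))
print("macro F1:", f1_score(y_true, y_pred, average="macro"))
```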
# Test Model
sample_transcript = "أحتاج إلى المساعدة المالية لدفع إيجار منزلي."  # "I need financial assistance to pay my rent."
print(evaluate_model(sample_transcript))

4.2 Deploying the Model via API
Finally, we deployed the model using FastAPI, enabling seamless integration into existing workflows:
from fastapi import FastAPI

app = FastAPI()

@app.post("/predict/")
def predict_category(request: dict):
    transcript = request["transcript"]
    return {"predicted_categories": evaluate_model(transcript)}
# Run API: uvicorn app:app --reload

Conclusion: Empowering Humanity Through Technology
Our AI-driven system exemplifies how cutting-edge technology can address real-world challenges responsibly. By combining:
- Synthetic data generation for scalability,
- Knowledge Distillation for efficiency,
- Multilingual model training for inclusivity, and
- Secure API deployment for accessibility,
we've created a tool that empowers humanitarian workers to focus on what matters most: helping refugees rebuild their lives.
Call to Action
Are you passionate about leveraging AI for social good? Share your thoughts in the comments below! Let's collaborate to create impactful solutions that make the world a better place.
💡 Follow me for more AI-driven humanitarian innovations. Together, we can drive change—one line of code at a time.