AI data labeling & digital CX. Collapsed like Appen.
TELUS Digital (formerly TELUS International) provided AI data annotation and digital customer experience services, essentially supplying training data for machine learning models in the same market as Appen. When LLMs became capable of generating synthetic training data and self-supervised learning reduced the need for massive labeled datasets, demand for large-scale human labeling collapsed. The stock fell 38% in a single day in May 2024, then another 20% the next day, after management admitted that AI-based delivery caused 'complete eradication of margin yields.' A securities class action lawsuit followed. Like Appen, it is a cautionary tale of AI eating its own supply chain.
LLMs can now generate much of their own training data and are trained on smaller, targeted datasets. The massive human data labeling industry that TELUS Digital relied on is structurally dead, and the company's own AI services cannibalized its higher-margin legacy work.
Peak valuation, strong data annotation demand from AI labs
Self-supervised learning and synthetic data reduce labeling demand
Stock crashes 38% + 20% in two days; management admits margin eradication
Securities class action lawsuit filed over AI-related omissions
Stock down more than 60% from peak; company pivots desperately to 'AI-powered services'
Replace human data labeling with AI-powered annotation and synthetic data generation. Use frontier models to label, classify, and generate training data at 100x the speed and 1/10th the cost of human annotation teams.
Audit your current data labeling pipeline: volume, cost, quality, turnaround time
Create detailed annotation guidelines with 10-20 gold-standard examples
Build an LLM-powered labeling pipeline using the Claude or GPT-4 API
Validate against a human-labeled gold set — target 90%+ agreement
For training data gaps: generate synthetic examples with controlled diversity
Use Argilla for human-in-the-loop QA on edge cases
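The validation step above (90%+ agreement against a human-labeled gold set) can be sketched as follows. This is a minimal illustration, not a prescribed method: the function names and the toy spam/ham labels are assumptions, and Cohen's kappa is added here as one common chance-corrected complement to raw agreement.

```python
from collections import Counter

def percent_agreement(gold, predicted):
    """Fraction of items where the model label equals the human gold label."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equal length")
    matches = sum(g == p for g, p in zip(gold, predicted))
    return matches / len(gold)

def cohens_kappa(gold, predicted):
    """Agreement corrected for chance; 1.0 is perfect, 0.0 is chance level."""
    n = len(gold)
    observed = percent_agreement(gold, predicted)
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Probability that both sides pick the same label purely by chance.
    expected = sum(gold_counts[c] * pred_counts[c] for c in gold_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both sides used one identical label throughout
    return (observed - expected) / (1 - expected)

# Toy data (hypothetical): human gold labels vs. LLM labels for 10 items.
gold = ["spam", "ham", "spam", "ham", "spam", "ham", "ham", "spam", "ham", "ham"]
llm = ["spam", "ham", "spam", "ham", "ham", "ham", "ham", "spam", "ham", "ham"]

agreement = percent_agreement(gold, llm)
kappa = cohens_kappa(gold, llm)
meets_target = agreement >= 0.90  # the 90% target from the checklist
print(f"agreement={agreement:.2f} kappa={kappa:.2f} meets_target={meets_target}")
```

Raw agreement can look high on imbalanced data even when the model mostly predicts the majority class, which is why a chance-corrected score is worth reporting alongside it.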
You are an expert data annotator. Label the following {{data_type}} according to this schema:
{{annotation_schema}}

Examples of correct labels:
{{examples}}

Data to label:
{{data}}

Provide labels in JSON format. Include a confidence score (0-1) for each label. Flag any ambiguous cases.
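A rough sketch of how the labeling prompt above might be filled in and how its JSON output could be routed. Everything here is an assumption for illustration: the `render` and `route` helpers, the 0.8 confidence threshold, and the response shape (a list of objects with `id`, `label`, `confidence`, and `ambiguous`) are hypothetical, and the model reply is hard-coded rather than fetched from any API.

```python
import json
import re

LABEL_PROMPT = (
    "You are an expert data annotator. Label the following {{data_type}} "
    "according to this schema: {{annotation_schema}} "
    "Examples of correct labels: {{examples}} "
    "Data to label: {{data}} "
    "Provide labels in JSON format. Include a confidence score (0-1) "
    "for each label. Flag any ambiguous cases."
)

def render(template, values):
    """Replace each {{name}} placeholder; fail loudly on a missing value."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"missing template value: {key}")
        return str(values[key])
    return re.sub(r"\{\{(\w+)\}\}", substitute, template)

def route(raw_response, threshold=0.8):
    """Split parsed labels into auto-accepted and human-review queues."""
    accepted, review = [], []
    for item in json.loads(raw_response):
        low_confidence = item.get("confidence", 0.0) < threshold
        if low_confidence or item.get("ambiguous", False):
            review.append(item)  # send to human-in-the-loop QA
        else:
            accepted.append(item)
    return accepted, review

prompt = render(LABEL_PROMPT, {
    "data_type": "support tickets",
    "annotation_schema": "one of: billing, shipping, refund",
    "examples": '"Card charged twice" -> billing',
    "data": '1: "Where is my parcel?"',
})

# Hard-coded stand-in for a model reply; no API is called in this sketch.
sample_response = json.dumps([
    {"id": 1, "label": "shipping", "confidence": 0.97, "ambiguous": False},
    {"id": 2, "label": "refund", "confidence": 0.55, "ambiguous": False},
    {"id": 3, "label": "billing", "confidence": 0.91, "ambiguous": True},
])
accepted, review = route(sample_response)
print(len(accepted), "accepted;", len(review), "sent to review")
```

Self-reported confidence scores from a model are not calibrated probabilities, so the threshold is best tuned against the gold set rather than taken at face value.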
Generate {{count}} synthetic {{data_type}} examples for training a {{model_purpose}} model.

Requirements:
- Match the distribution of this real data sample: {{sample}}
- Include edge cases and rare categories
- Vary length, complexity, and style
- Ensure labels are accurate
- Output as {{format}} with columns: {{columns}}
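The "match the distribution" requirement in the prompt above can be checked after generation rather than trusted. The sketch below compares label frequencies in a real sample and a synthetic batch using total variation distance; the helper names, the toy categories, and the choice of metric are assumptions made for illustration.

```python
from collections import Counter

def label_distribution(labels):
    """Map each label to its relative frequency."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}

def total_variation(real_labels, synthetic_labels):
    """Half the summed absolute difference in label frequencies.

    0.0 means identical label mixes, 1.0 means no overlap at all.
    """
    p = label_distribution(real_labels)
    q = label_distribution(synthetic_labels)
    all_labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in all_labels)

def missing_categories(real_labels, synthetic_labels):
    """Categories present in real data but absent from the synthetic batch."""
    return sorted(set(real_labels) - set(synthetic_labels))

# Toy data (hypothetical): ticket categories in real vs. generated samples.
real = ["billing"] * 5 + ["shipping"] * 3 + ["refund"] * 2
synthetic = ["billing"] * 4 + ["shipping"] * 4 + ["refund"] * 2

tvd = total_variation(real, synthetic)
gaps = missing_categories(real, synthetic)
print(f"total variation distance: {tvd:.2f}; missing categories: {gaps}")
```

This only checks the label mix. It says nothing about whether the synthetic text itself is realistic or whether its labels are correct, which still needs spot checks by a human reviewer.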
Monitor model performance to ensure synthetic/AI-labeled data maintains quality
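One possible way to approach this monitoring is to keep spot-checking a small share of AI-generated labels against human labels and track agreement over a rolling window. The `AgreementMonitor` class, its window size, and its thresholds below are hypothetical choices for a sketch, and labeling agreement is only a proxy for downstream model performance.

```python
from collections import deque

class AgreementMonitor:
    """Rolling agreement between AI labels and human spot-check labels."""

    def __init__(self, window=50, threshold=0.90, min_samples=10):
        self.results = deque(maxlen=window)  # oldest checks fall off the end
        self.threshold = threshold
        self.min_samples = min_samples

    def record(self, model_label, human_label):
        """Store whether one spot-checked item matched the human label."""
        self.results.append(model_label == human_label)

    def agreement(self):
        """Agreement over the current window, or None if nothing is recorded."""
        if not self.results:
            return None
        return sum(self.results) / len(self.results)

    def needs_attention(self):
        """True once enough samples exist and agreement is below threshold."""
        if len(self.results) < self.min_samples:
            return False
        return self.agreement() < self.threshold

# Toy run (hypothetical data): eight matches followed by two mismatches.
monitor = AgreementMonitor(window=10, threshold=0.90, min_samples=5)
for _ in range(8):
    monitor.record("billing", "billing")
for _ in range(2):
    monitor.record("billing", "refund")

print(f"rolling agreement: {monitor.agreement():.2f}")
print("needs attention:", monitor.needs_attention())
```

A drop below the threshold would be a prompt to revisit the annotation guidelines or the prompt wording, and to recheck the trained model on a held-out, human-labeled evaluation set.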