From Annotation to Application

Building a Pipeline for Metaphor Language

Metaphors shape the way we speak about almost everything. We talk about burning ambition, heavy responsibilities, or stabbing guilt. These expressions are vivid and powerful, but they are also challenging for researchers and systems that try to analyse them.

Metaphors matter in everyday communication, and also in specific contexts such as health. Patients often describe their experiences in metaphorical terms, and those words influence how their suffering is perceived and managed. Metaphors also matter in politics, migration debates, and cultural narratives. They are everywhere, and they carry meaning.

To capture these meanings, we need more than one tool. We need a way to collect and structure human annotations, a way to test automatic detection, and a way to communicate insights clearly to wider audiences.

This is why I have been building three connected tools: Annotation App (schema-driven datasets), Metaphor Tagger (automatic detection), and Explain My Pain (human-facing reports). Together, they form a pipeline that bridges linguistics, NLP, and intercultural communication.

Step 1: Annotation App

Most annotation platforms are designed by engineers for engineers. They are good at collecting labels but rarely capture the nuances that matter in linguistic research.

The Annotation App is different because it is built from a linguistic schema.

This approach comes directly from my academic background: categories are not neutral. They shape what researchers can see and what models can learn. The app can be adapted to pain language, but also to political discourse, emotional expression, or intercultural shifts.

Workflow
         ┌──────────────────────┐
         │   Raw Text Data      │
         │ (interviews, notes,  │
         │  surveys, corpora)   │
         └─────────┬────────────┘
                   ▼
         ┌──────────────────────┐
         │  Annotation App      │
         │  - Import CSV/JSONL  │
         │  - Apply schema      │
         │  - Human annotation  │
         │  - Export dataset    │
         └─────────┬────────────┘
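
To make "schema-driven" concrete, here is a minimal sketch of how an annotation could be checked against a schema before export. It is an illustration, not the Annotation App's actual code: the `SCHEMA` values, the `Annotation` class, and the `validate` and `to_jsonl` helpers are all hypothetical names chosen for this example.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical schema: field name -> set of allowed values.
# The field names echo the ones used later in this post; the values are invented.
SCHEMA = {
    "phenomenon": {"metaphor", "literal"},
    "severity": {"mild", "moderate", "severe"},
    "register": {"colloquial", "clinical"},
}


@dataclass
class Annotation:
    text_id: str
    start: int    # character offset where the span begins
    end: int      # character offset where the span ends (exclusive)
    labels: dict  # field name -> chosen value


def validate(annotation, schema):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    if annotation.start >= annotation.end:
        problems.append("span is empty or reversed")
    for field_name, value in annotation.labels.items():
        if field_name not in schema:
            problems.append(f"unknown field: {field_name}")
        elif value not in schema[field_name]:
            problems.append(f"invalid value for {field_name}: {value}")
    return problems


def to_jsonl(annotations):
    """Serialise annotations as JSON Lines, one record per line."""
    return "\n".join(json.dumps(asdict(a), ensure_ascii=False) for a in annotations)


example = Annotation("t1", 0, 7, {"phenomenon": "metaphor", "severity": "severe"})
print(validate(example, SCHEMA))
print(to_jsonl([example]))
```

The point of the design is that the categories live in data rather than in code, so the same validation step would work unchanged for a political-discourse or emotion schema.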

Step 2: Metaphor Tagger

The Metaphor Tagger is not a generic keyword matcher. It is rooted in linguistic metaphor theory.

Pain language is one effective case study, but the same method applies to political metaphors, educational discourse, or climate communication.

Workflow
         ┌──────────────────────┐
         │  Annotation App      │
         └─────────┬────────────┘
                   │ Gold-standard annotations
                   ▼
         ┌──────────────────────┐
         │   Metaphor Tagger    │
         │  - Auto-detect spans │
         │  - Assign categories │
         └─────────┬────────────┘
                   │ System-generated annotations
                   ▼
         ┌──────────────────────────────┐
         │    Evaluation & Comparison   │
         │  - Tagger vs Human           │
         │  - Metrics: Precision,       │
         │    Recall, F1, κ             │
         └──────────────────────────────┘
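
The metrics named in the evaluation box can be sketched in a few lines. The example below assumes a token-level setup in which each token is labelled 1 (metaphorical) or 0 (not); that framing, the function names, and the toy label sequences are assumptions made for illustration, not a description of how the Metaphor Tagger is actually evaluated. Here κ is Cohen's kappa, which corrects raw agreement for the agreement expected by chance.

```python
def precision_recall_f1(gold, pred):
    """Precision, recall and F1 for the positive class (label 1)."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def cohen_kappa(a, b):
    """Cohen's kappa for two equally long label sequences."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    labels = set(a) | set(b)
    # Chance agreement: product of each rater's marginal proportion per label.
    expected = sum((a.count(label) / n) * (b.count(label) / n) for label in labels)
    if expected == 1.0:
        return 1.0  # both sequences use a single identical label
    return (observed - expected) / (1 - expected)


# Toy data, invented for this example: human labels vs tagger labels.
human = [1, 1, 0, 0, 1, 0, 0, 0]
tagger = [1, 0, 0, 0, 1, 1, 0, 0]
print(precision_recall_f1(human, tagger))
print(cohen_kappa(human, tagger))
```

On this toy data the two sequences agree on six of eight tokens, yet κ comes out well below that raw agreement rate, which is why it is worth reporting alongside precision and recall.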

Step 3: Explain My Pain

Explain My Pain turns model outputs into meaningful feedback for patients and clinicians.

This is a case study of how research tools can become accessible applications. The same idea could extend to public discourse (Explain My Politics) or climate narratives (Explain My Climate).
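
As a rough illustration of what "turning model outputs into feedback" could look like, the sketch below groups tagged spans by category and renders a short plain-text summary. Everything in it is hypothetical: the `summarise` function, the (text, category) input format, and the category names are assumptions for this example, not the actual Explain My Pain implementation.

```python
from collections import Counter


def summarise(tagged_spans):
    """Turn (span text, category) pairs into a short plain-language summary."""
    if not tagged_spans:
        return "No metaphorical expressions were detected."
    counts = Counter(category for _, category in tagged_spans)
    lines = [f"{len(tagged_spans)} metaphorical expression(s) detected."]
    # Most frequent categories first, each with the expressions that triggered it.
    for category, n in counts.most_common():
        examples = [text for text, c in tagged_spans if c == category]
        lines.append(f"- {category}: {n} ({', '.join(examples)})")
    return "\n".join(lines)


# Invented tagger output for illustration.
spans = [("burning", "fire"), ("stabbing", "sharp object"), ("searing", "fire")]
print(summarise(spans))
```

Keeping the original expressions next to each category lets a reader check the summary against their own words.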

Why all three matter together

What makes this pipeline unique is not just the technology, but the linguistics behind it. Schemas are grounded in theory. Taxonomies are derived from metaphor research. Outputs are designed with intercultural and communicative applications in mind.

Future directions

The next step is localisation. By annotating parallel English and Spanish corpora, we can explore how metaphors shift across languages and cultures. Does a fire metaphor carry the same weight in Spanish as in English? Do registers change when descriptions are translated for clinical or political contexts?

Workflow with localisation
         ┌─────────────────────────┐
         │  Raw EN/ES Data         │
         │ (parallel translations, │
         │  interviews, corpora)   │
         └───────────┬─────────────┘
                     ▼
         ┌─────────────────────────┐
         │   Annotation App        │
         │  - Schema with fields   │
         │    phenomenon, severity,│
         │    translation_shift,   │
         │    register             │
         │  - Parallel_id links    │
         └───────────┬─────────────┘
                     ▼
         ┌─────────────────────────┐
         │   Metaphor Tagger       │
         │  - Run on EN corpus     │
         │  - Run on ES corpus     │
         └───────────┬─────────────┘
                     ▼
         ┌───────────────────────────────────┐
         │   Evaluation & Intercultural      │
         │   Analysis                        │
         │  - Compare EN vs ES annotations   │
         │  - Identify translation shifts    │
         │  - Report intercultural insights  │
         └───────────────────────────────────┘
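
The comparison step can be sketched as a join on `parallel_id`. The example assumes each annotation row is a dictionary with a `parallel_id` and a `category` key; the `category` field, the `find_translation_shifts` function, and the toy rows are hypothetical and only meant to show the linking logic.

```python
def find_translation_shifts(en_rows, es_rows):
    """Pair EN and ES annotations by parallel_id and report category changes."""
    es_by_id = {row["parallel_id"]: row for row in es_rows}
    shifts = []
    for en in en_rows:
        es = es_by_id.get(en["parallel_id"])
        if es is None:
            continue  # no aligned ES segment, so nothing to compare
        if en["category"] != es["category"]:
            shifts.append({
                "parallel_id": en["parallel_id"],
                "en": en["category"],
                "es": es["category"],
            })
    return shifts


# Toy rows invented for this example; they make no claim about real data.
en_rows = [
    {"parallel_id": "p1", "category": "fire"},
    {"parallel_id": "p2", "category": "sharp object"},
    {"parallel_id": "p3", "category": "weight"},
]
es_rows = [
    {"parallel_id": "p1", "category": "fire"},
    {"parallel_id": "p2", "category": "literal"},
]
print(find_translation_shifts(en_rows, es_rows))
```

Segments without an aligned counterpart are skipped here; in practice they might deserve their own report, since a missing translation can itself be a finding.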

Conclusion

Metaphors are not only figures of speech; they are linguistic realities that shape how we experience the world. People give meaning to their experiences through metaphors, and those metaphors carry weight in health, politics, and everyday life.

With Annotation App, Metaphor Tagger, and Explain My Pain, we can capture, analyse, and share metaphors in ways that are systematic, scalable, and meaningful.

This pipeline is not only about technology. It is about making language visible, comparable, and usable across contexts and cultures.