通用

omnicaptions-translate

data/skills-content.json#omnicaptions-translate

name: omnicaptions-translate description: Use when translating captions/captions to another language. Supports bilingual output and context-aware translation. Default uses Claude native, Gemini API optional. allowed-tools: Bash(omnicaptions:*), Read, Write, Glob

Caption Translation

Default: Claude native translation (no API key needed)

Use Gemini API only when user explicitly requests it.

Default Workflow (Claude)

Read the caption file
Translate using Claude's native ability
Write output with _Claude_{lang} suffix

Gemini API (Optional)

Use CLI when user requests Gemini:

omnicaptions translate input.srt -l zh --bilingual

Output: input_Gemini_zh.srt

When to Use

Translate SRT/VTT/ASS to another language
Generate bilingual captions (original + translation)
Translate YouTube video transcripts
Need context-aware translation (not line-by-line)

When NOT to Use

Need transcription (use /omnicaptions:transcribe)
Just format conversion without translation (use /omnicaptions:convert)

Setup

pip install https://github.com/lattifai/omni-captions-skills/raw/main/packages/lattifai_captions-0.1.0.tar.gz
pip install https://github.com/lattifai/omni-captions-skills/raw/main/packages/omnicaptions-0.1.0.tar.gz

API Key

Priority: GEMINI_API_KEY env → .env file → ~/.config/omnicaptions/config.json

If not set, ask user: Please enter your Gemini API key (get from https://aistudio.google.com/apikey):

Then run with -k <key>. Key will be saved to config file automatically.

Context-Aware Translation

LLM-based translation is superior to traditional machine translation because it understands context across multiple lines:

Why Context Matters

Approach	Problem	Result
Line-by-line	No context	Robotic, disconnected translations
Batch + Context	Sees surrounding lines	Natural, coherent dialogue

How It Works

┌─────────────────────────────────────────┐
│  Batch size: 30 lines                   │
│  Context: 5 lines before/after          │
├─────────────────────────────────────────┤
│  [5 previous lines] → context           │
│  [30 current lines] → translate         │
│  [5 next lines]     → preview           │
└─────────────────────────────────────────┘

Benefits:

Speaker continuity - maintains character voice
Split sentences - handles dialogue spanning multiple lines
Idioms & culture - adapts cultural references naturally
Pronoun resolution - correct he/she/they based on context

Advanced Features

Bilingual Output

# Original + Translation (for language learning)
omnicaptions translate input.srt -l zh --bilingual

Output example:

1
00:00:01,000 --> 00:00:03,500
Welcome to the show.
欢迎来到节目。

2
00:00:03,500 --> 00:00:06,000
Thank you for having me.
感谢邀请我。

Custom Glossary (Coming Soon)

For domain-specific or branded content:

# Use glossary for consistent terminology
omnicaptions translate input.srt -l zh --glossary terms.json

Glossary format:

{
  "API": "接口",
  "Token": "令牌",
  "Machine Learning": "机器学习"
}

Benefits:

Terminology consistency - "one term, one translation"
Brand compliance - use official product names
Domain accuracy - medical, legal, technical terms

Best Practices

1. Provide Context for Better Quality

For specialized content, use custom prompts:

from omnicaptions import GeminiCaption

gc = GeminiCaption()
gc._translation_prompt = """
You are translating captions for a medical documentary.
Use formal Chinese medical terminology.
Glossary: {glossary}
"""
gc.translate("input.srt", "output.srt", "zh")

2. Choose the Right Model

Model	Best For
`gemini-3-flash-preview`	Fast, everyday content
`gemini-3-pro-preview`	Complex, nuanced content

3. Review Bilingual Output

Bilingual captions let viewers verify translation quality - ideal for:

Language learners
Quality assurance
Accessibility

CLI Usage

# Translate (auto-output to same directory)
omnicaptions translate input.srt -l zh         # → ./input_Gemini_zh.srt

# Specify output file or directory
omnicaptions translate input.srt -o output/ -l zh   # → output/input_Gemini_zh.srt
omnicaptions translate input.srt -o zh.srt -l zh    # → zh.srt

# Bilingual output (original + translation)
omnicaptions translate input.srt -l zh --bilingual

# Specify model
omnicaptions translate input.vtt -l ja -m gemini-3-pro-preview

Option	Description
`-k, --api-key`	Gemini API key (auto-prompted if missing)
`-o, --output`	Output file or directory (default: same dir as input)
`-l, --language`	Target language code (required)
`--bilingual`	Output both original and translation
`-m, --model`	Model name (default: gemini-3-flash-preview)
`-v, --verbose`	Verbose output

Language Codes

Language	Code
Chinese (Simplified)	`zh`
Chinese (Traditional)	`zh-TW`
Japanese	`ja`
Korean	`ko`
English	`en`
Spanish	`es`
French	`fr`
German	`de`

Supported Formats

All formats from lattifai-captions: SRT, VTT, ASS, TTML, JSON, Gemini MD, etc.

Common Mistakes

Mistake	Fix
No API key	Use `-k YOUR_KEY` or follow the prompt
Wrong language code	Use ISO codes: zh, ja, en, etc.
Lost formatting	ASS styles preserved; SRT basic only
Inconsistent terms	Use glossary for technical content

References

Caption LLM Translator - Context window approach
Caption Translator - Batch processing
Captions.Translate.Agent - Multi-agent workflow

Related Skills

Skill	Use When
`/omnicaptions:transcribe`	Need transcript first
`/omnicaptions:LaiCut`	Align timing before translation
`/omnicaptions:convert`	Convert format after translation
`/omnicaptions:download`	Download captions to translate

Workflow Examples

Important: Generate bilingual captions AFTER LaiCut alignment.

File naming convention - preserve language tag and processing chain:

video.en.vtt → video.en_LaiCut.json → video.en_LaiCut.srt → video.en_LaiCut_Claude_zh.srt → video.en_LaiCut_Claude_zh_Color.ass

翻译方式	后缀	示例
Claude (默认)	`_Claude_zh`	`video.en_LaiCut_Claude_zh.srt`
Gemini API	`_Gemini_zh`	`video.en_LaiCut_Gemini_zh.srt`

# 1. LaiCut 对齐 (保留词级时间)
omnicaptions LaiCut video.mp4 video.en.vtt
# → video.en_LaiCut.json

# 2. 转换为 SRT (翻译用，文件小)
omnicaptions convert video.en_LaiCut.json -o video.en_LaiCut.srt

# 3a. Claude 翻译 (默认)
# → video.en_LaiCut_Claude_zh.srt

# 3b. 或 Gemini 翻译
omnicaptions translate video.en_LaiCut.srt -l zh --bilingual
# → video.en_LaiCut_Gemini_zh.srt

# 4. 转换为带颜色的 ASS
omnicaptions convert video.en_LaiCut_Claude_zh.srt -o video.en_LaiCut_Claude_zh_Color.ass \
  --line1-color "#00FF00" --line2-color "#FFFF00"

Large JSON Files

LaiCut outputs JSON with word-level timing. For translation, convert to SRT first (much smaller):

# JSON (word-level, ~150KB) → SRT (segment-level, ~15KB)
omnicaptions convert video.en_LaiCut.json -o video.en_LaiCut.srt

Why? JSON preserves word timing for karaoke, but translation only needs segment text. SRT is 10-20x smaller.

Claude Translation Rules (Default)

Preserve format exactly - Keep all timing codes, formatting tags, style definitions
Context-aware - Consider surrounding lines for coherent dialogue
Speaker consistency - Maintain character voice and tone
Cultural adaptation - Adapt idioms and references naturally
Large files - Process in batches of 100 lines to maintain quality

Claude vs Gemini

Feature	Claude (Default)	Gemini API
API Key	None needed	Required
Invocation	Skill (Read/Write)	CLI command
Output suffix	`_Claude_{lang}`	`_Gemini_{lang}`
Best for	Most tasks	Large files, automation

SKILL.md 原文

---
name: omnicaptions-translate
description: 
---

---
name: omnicaptions-translate
description: Use when translating captions/captions to another language. Supports bilingual output and context-aware translation. Default uses Claude native, Gemini API optional.
allowed-tools: Bash(omnicaptions:*), Read, Write, Glob
---

# Caption Translation

**Default: Claude native translation** (no API key needed)

Use Gemini API only when user explicitly requests it.

## Default Workflow (Claude)

1. Read the caption file
2. Translate using Claude's native ability
3. Write output with `_Claude_{lang}` suffix

## Gemini API (Optional)

Use CLI when user requests Gemini:

```bash
omnicaptions translate input.srt -l zh --bilingual
```

Output: `input_Gemini_zh.srt`

## When to Use

- Translate SRT/VTT/ASS to another language
- Generate bilingual captions (original + translation)
- Translate YouTube video transcripts
- Need context-aware translation (not line-by-line)

## When NOT to Use

- Need transcription (use `/omnicaptions:transcribe`)
- Just format conversion without translation (use `/omnicaptions:convert`)

## Setup

```bash
pip install https://github.com/lattifai/omni-captions-skills/raw/main/packages/lattifai_captions-0.1.0.tar.gz
pip install https://github.com/lattifai/omni-captions-skills/raw/main/packages/omnicaptions-0.1.0.tar.gz
```

## API Key

Priority: `GEMINI_API_KEY` env → `.env` file → `~/.config/omnicaptions/config.json`

If not set, ask user: `Please enter your Gemini API key (get from https://aistudio.google.com/apikey):`

Then run with `-k <key>`. Key will be saved to config file automatically.

## Context-Aware Translation

LLM-based translation is superior to traditional machine translation because it understands context across multiple lines:

### Why Context Matters

| Approach | Problem | Result |
|----------|---------|--------|
| Line-by-line | No context | Robotic, disconnected translations |
| **Batch + Context** | Sees surrounding lines | Natural, coherent dialogue |

### How It Works

```
┌─────────────────────────────────────────┐
│  Batch size: 30 lines                   │
│  Context: 5 lines before/after          │
├─────────────────────────────────────────┤
│  [5 previous lines] → context           │
│  [30 current lines] → translate         │
│  [5 next lines]     → preview           │
└─────────────────────────────────────────┘
```

Benefits:
- **Speaker continuity** - maintains character voice
- **Split sentences** - handles dialogue spanning multiple lines
- **Idioms & culture** - adapts cultural references naturally
- **Pronoun resolution** - correct he/she/they based on context

## Advanced Features

### Bilingual Output

```bash
# Original + Translation (for language learning)
omnicaptions translate input.srt -l zh --bilingual
```

Output example:
```srt
1
00:00:01,000 --> 00:00:03,500
Welcome to the show.
欢迎来到节目。

2
00:00:03,500 --> 00:00:06,000
Thank you for having me.
感谢邀请我。
```

### Custom Glossary (Coming Soon)

For domain-specific or branded content:

```bash
# Use glossary for consistent terminology
omnicaptions translate input.srt -l zh --glossary terms.json
```

Glossary format:
```json
{
  "API": "接口",
  "Token": "令牌",
  "Machine Learning": "机器学习"
}
```

Benefits:
- **Terminology consistency** - "one term, one translation"
- **Brand compliance** - use official product names
- **Domain accuracy** - medical, legal, technical terms

## Best Practices

### 1. Provide Context for Better Quality

For specialized content, use custom prompts:

```python
from omnicaptions import GeminiCaption

gc = GeminiCaption()
gc._translation_prompt = """
You are translating captions for a medical documentary.
Use formal Chinese medical terminology.
Glossary: {glossary}
"""
gc.translate("input.srt", "output.srt", "zh")
```

### 2. Choose the Right Model

| Model | Best For |
|-------|----------|
| `gemini-3-flash-preview` | Fast, everyday content |
| `gemini-3-pro-preview` | Complex, nuanced content |

### 3. Review Bilingual Output

Bilingual captions let viewers verify translation quality - ideal for:
- Language learners
- Quality assurance
- Accessibility

## CLI Usage

```bash
# Translate (auto-output to same directory)
omnicaptions translate input.srt -l zh         # → ./input_Gemini_zh.srt

# Specify output file or directory
omnicaptions translate input.srt -o output/ -l zh   # → output/input_Gemini_zh.srt
omnicaptions translate input.srt -o zh.srt -l zh    # → zh.srt

# Bilingual output (original + translation)
omnicaptions translate input.srt -l zh --bilingual

# Specify model
omnicaptions translate input.vtt -l ja -m gemini-3-pro-preview
```

| Option | Description |
|--------|-------------|
| `-k, --api-key` | Gemini API key (auto-prompted if missing) |
| `-o, --output` | Output file or directory (default: same dir as input) |
| `-l, --language` | Target language code (required) |
| `--bilingual` | Output both original and translation |
| `-m, --model` | Model name (default: gemini-3-flash-preview) |
| `-v, --verbose` | Verbose output |

## Language Codes

| Language | Code |
|----------|------|
| Chinese (Simplified) | `zh` |
| Chinese (Traditional) | `zh-TW` |
| Japanese | `ja` |
| Korean | `ko` |
| English | `en` |
| Spanish | `es` |
| French | `fr` |
| German | `de` |

## Supported Formats

All formats from `lattifai-captions`: SRT, VTT, ASS, TTML, JSON, Gemini MD, etc.

## Common Mistakes

| Mistake | Fix |
|---------|-----|
| No API key | Use `-k YOUR_KEY` or follow the prompt |
| Wrong language code | Use ISO codes: zh, ja, en, etc. |
| Lost formatting | ASS styles preserved; SRT basic only |
| Inconsistent terms | Use glossary for technical content |

## References

- [Caption LLM Translator](https://github.com/yigitkonur/caption-llm-translator) - Context window approach
- [Caption Translator](https://github.com/rockbenben/caption-translator) - Batch processing
- [Captions.Translate.Agent](https://github.com/captionsdog/Captions.Translate.Agent) - Multi-agent workflow

## Related Skills

| Skill | Use When |
|-------|----------|
| `/omnicaptions:transcribe` | Need transcript first |
| `/omnicaptions:LaiCut` | Align timing before translation |
| `/omnicaptions:convert` | Convert format after translation |
| `/omnicaptions:download` | Download captions to translate |

### Workflow Examples

**Important**: Generate bilingual captions AFTER LaiCut alignment.

**File naming convention** - preserve language tag and processing chain:
```
video.en.vtt → video.en_LaiCut.json → video.en_LaiCut.srt → video.en_LaiCut_Claude_zh.srt → video.en_LaiCut_Claude_zh_Color.ass
```

| 翻译方式 | 后缀 | 示例 |
|----------|------|------|
| Claude (默认) | `_Claude_zh` | `video.en_LaiCut_Claude_zh.srt` |
| Gemini API | `_Gemini_zh` | `video.en_LaiCut_Gemini_zh.srt` |

```bash
# 1. LaiCut 对齐 (保留词级时间)
omnicaptions LaiCut video.mp4 video.en.vtt
# → video.en_LaiCut.json

# 2. 转换为 SRT (翻译用，文件小)
omnicaptions convert video.en_LaiCut.json -o video.en_LaiCut.srt

# 3a. Claude 翻译 (默认)
# → video.en_LaiCut_Claude_zh.srt

# 3b. 或 Gemini 翻译
omnicaptions translate video.en_LaiCut.srt -l zh --bilingual
# → video.en_LaiCut_Gemini_zh.srt

# 4. 转换为带颜色的 ASS
omnicaptions convert video.en_LaiCut_Claude_zh.srt -o video.en_LaiCut_Claude_zh_Color.ass \
  --line1-color "#00FF00" --line2-color "#FFFF00"
```

### Large JSON Files

LaiCut outputs JSON with word-level timing. **For translation, convert to SRT first** (much smaller):

```bash
# JSON (word-level, ~150KB) → SRT (segment-level, ~15KB)
omnicaptions convert video.en_LaiCut.json -o video.en_LaiCut.srt
```

Why? JSON preserves word timing for karaoke, but translation only needs segment text. SRT is 10-20x smaller.

## Claude Translation Rules (Default)

1. **Preserve format exactly** - Keep all timing codes, formatting tags, style definitions
2. **Context-aware** - Consider surrounding lines for coherent dialogue
3. **Speaker consistency** - Maintain character voice and tone
4. **Cultural adaptation** - Adapt idioms and references naturally
5. **Large files** - Process in batches of 100 lines to maintain quality

## Claude vs Gemini

| Feature | Claude (Default) | Gemini API |
|---------|------------------|------------|
| API Key | None needed | Required |
| Invocation | Skill (Read/Write) | CLI command |
| Output suffix | `_Claude_{lang}` | `_Gemini_{lang}` |
| Best for | Most tasks | Large files, automation |

来源： lattifai/omni-captions-skills | 协议: MIT