Generate Social Clip

This skill is available as a machine-readable YAML playbook — append .md to any docs page for raw markdown, or install the Framelane skill. Drive the calls through the authenticated action MCP at https://mcp.framelane.io/mcp. Load it into any MCP-capable agent to get the complete workflow without writing integration code.

What this skill does

Given a source video URL and a time range, orchestrates:

Trim — in_point / out_point on the video element
Crop to vertical — crop_left / crop_right center-crop for 16:9 → 9:16 (auto-computed when omitted)
Transcribe — POST /v1/tasks/transcribe with word-level timestamps
Segment — group words into caption lines (sentence breaks or pauses > 0.8s)
Style — optional opening title with difference blend motion; caption lines cycle color → box → glow
Render — POST /v1/renders at 1080×1920 using the TikTok Captions recipe as the composition template

Example user prompt

Take the section from 00:22:43 to 00:44:56, crop to vertical, add an opening title with Difference blend, then burn in word-timed captions — mix color, box, and glow styles across different fonts.

The agent converts timecodes to seconds (1363 → 2696), runs this skill, and posts the resulting RenderRequest.

Load via MCP

@framelane-skill:generate-social-clip

Inputs

Name	Required	Description
`source_url`	required	URL of the source video
`start_seconds`	required	Clip start in source seconds (convert `HH:MM:SS` first)
`end_seconds`	required	Clip end in source seconds
`opening_title_text`	optional	Display title at clip start with `difference` blend motion
`opening_title_duration`	optional	Title visibility in seconds. Default: `3`.
`crop_left`	optional	Left crop fraction (0–1). Auto-computed for `16:9` when omitted (~`0.342`).
`crop_right`	optional	Right crop fraction. Defaults to `crop_left`.
`source_aspect_ratio`	optional	Used for auto-crop. Default: `"16:9"`.

Output

Name	Description
`clip_url`	Signed URL of the rendered 1080×1920 mp4
`render_id`	Render job ID for status polling

Timecode conversion

Convert HH:MM:SS to seconds before calling the skill:

00:22:43 → 22×60 + 43 = 1363
00:44:56 → 44×60 + 56 = 2696
duration → 2696 − 1363 = 1333 seconds

Vertical crop (16:9 → 9:16)

When crop_left / crop_right are omitted, center-crop a landscape source:

keep_fraction = (9/16) / (source_width / source_height)
crop_each_side = (1 − keep_fraction) / 2

For 16:9 source: crop_left = crop_right ≈ 0.342.

Caption line rules

After transcription, the agent:

Filters words to [start_seconds, end_seconds] and offsets to composition time (0 = clip start)
Groups words into lines — split on . ? ! or a gap > 0.8s between words
Skips words inside opening_title_duration when a title is set
Creates one text element per line — all reuse "id": "t1"
Sets each line’s time to the first word’s start and duration to last_word.end − time
Cycles word_animation.style: color → box → glow → color …

Word timestamps in word_animation.words are absolute composition seconds — the same clock as each text element’s time. See Word Animation Examples.

Render composition template

Follow the TikTok Captions recipe for element shapes. Minimal skeleton:

{
  "width": 1080,
  "height": 1920,
  "duration": 1333,
  "frame_rate": 30,
  "elements": [
    {
      "type": "video",
      "id": "vid1",
      "source_url": "https://cdn-user.framelane.io/uploads/.../podcast.mp4",
      "in_point": 1363,
      "out_point": 2696,
      "crop_left": 0.342,
      "crop_right": 0.342,
      "volume": 50
    },
    {
      "type": "text",
      "id": "title",
      "text": "Albert",
      "font_family": "Alfa Slab One",
      "text_color": "#FFFFFF",
      "font_size": 260,
      "x": "50%",
      "y": "70%",
      "time": 0,
      "duration": 3,
      "motion": [{ "type": "difference", "time": 0, "duration": 3 }]
    },
    {
      "type": "text",
      "id": "t1",
      "text": "When I came to you with those calculations",
      "time": 4.496,
      "duration": 2.493,
      "word_animation": {
        "style": "color",
        "words": [
          { "text": "When", "start": 4.496, "end": 4.577 }
        ]
      }
    }
  ]
}

Each additional caption line is another text element with the same "id": "t1" and the next style in the cycle.

MCP Server

Agent Skills

Generate Social Clip

What this skill does

Example user prompt

Load via MCP

Inputs

Output

Timecode conversion

Vertical crop (16:9 → 9:16)

Caption line rules

Render composition template

​What this skill does

​Example user prompt

​Load via MCP

​Inputs

​Output

​Timecode conversion

​Vertical crop (16:9 → 9:16)

​Caption line rules

​Render composition template

What this skill does

Example user prompt

Load via MCP

Inputs

Output

Timecode conversion

Vertical crop (16:9 → 9:16)

Caption line rules

Render composition template