Riffusion API Notes

Overview

Riffusion is a music-generation API that can create songs from lyrics, themes, or existing audio. This page explains the main workflow, how the endpoints relate to each other, and when to use each creation mode.

Core Workflow

The standard workflow usually has three stages:

  1. submit a music-generation task through professional mode or inspiration mode
  2. query the result by task ID until the track is ready
  3. optionally upload audio and perform secondary creation through transform modes

Authentication

All requests require an authorization header:

Authorization: Bearer your-key
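
As a minimal sketch, the header can be built once and reused for every call. The `Content-Type` value and the `RIFFUSION_API_KEY` environment variable name are assumptions for illustration, not part of the documented API.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Return the headers to attach to every request."""
    if not api_key:
        raise ValueError("an API key is required")
    return {
        "Authorization": f"Bearer {api_key}",
        # Assumption: request bodies are sent as JSON.
        "Content-Type": "application/json",
    }


# RIFFUSION_API_KEY is a hypothetical variable name; "your-key" is the
# placeholder used in the text above.
headers = auth_headers(os.environ.get("RIFFUSION_API_KEY", "your-key"))
```

Reading the key from the environment keeps it out of source control.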

Main Endpoint Guidance

1. Music Generation

You can generate music in two ways:

  • Professional mode (/riffusion/generate): provide structured lyrics, tags, and a title. Best when you already know what you want to create.
  • Inspiration mode (/riffusion/generate/topic): provide only a theme, and the system generates lyrics and music automatically. Best for fast creative exploration.

Both modes return two task IDs. Each task corresponds to an independent generated track so you can compare alternative versions.
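
The two submission modes might be sketched as follows. Only the endpoint paths come from this page. The host in `BASE_URL`, the POST method, and the JSON field names (`lyrics`, `tags`, `title`, `topic`) are assumptions and should be checked against the provider's reference. The functions build requests without sending them.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not the real one


def _post(api_key: str, path: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",  # assumption: submissions use POST
    )


def build_generate_request(api_key: str, *, lyrics: str, tags: str,
                           title: str) -> urllib.request.Request:
    """Professional mode: structured lyrics, tags, and a title."""
    # Field names are assumed for illustration.
    body = {"lyrics": lyrics, "tags": tags, "title": title}
    return _post(api_key, "/riffusion/generate", body)


def build_topic_request(api_key: str, topic: str) -> urllib.request.Request:
    """Inspiration mode: only a theme is supplied."""
    return _post(api_key, "/riffusion/generate/topic", {"topic": topic})
```

A built request can be sent with `urllib.request.urlopen(request)`; keep both returned task IDs for the query step.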

2. Querying Results

After submission, call /riffusion/feed/{riff_id1},{riff_id2} with the returned task IDs.

Important fields:

  • status: processing state such as process or success
  • process: progress value from 0 to 100
  • generations[].audio_url: populated once audio for that generation is available
  • audio_url: direct URL of the finished track

Poll with a backoff strategy (for example, exponential backoff) until status becomes success.
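
A polling loop with exponential backoff could look like the sketch below. It assumes the response is a single dictionary with a top-level `status` field; if the API returns one entry per track, the success check needs adapting. `fetch` is a hypothetical caller-supplied function that performs the authenticated GET and returns parsed JSON, and the delay values are illustrative.

```python
import time
from typing import Callable


def feed_path(riff_ids: list[str]) -> str:
    """Join task IDs with commas, as in /riffusion/feed/{riff_id1},{riff_id2}."""
    return "/riffusion/feed/" + ",".join(riff_ids)


def poll_until_success(
    fetch: Callable[[str], dict],
    riff_ids: list[str],
    *,
    initial_delay: float = 2.0,
    max_delay: float = 30.0,
    max_attempts: int = 20,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch until status is "success", doubling the wait each time."""
    delay = initial_delay
    path = feed_path(riff_ids)
    for _ in range(max_attempts):
        result = fetch(path)
        if result.get("status") == "success":
            return result
        sleep(delay)
        delay = min(delay * 2, max_delay)  # cap the wait between attempts
    raise TimeoutError(f"tracks not ready after {max_attempts} attempts")
```

Passing `sleep` as a parameter makes the loop easy to test without real waiting.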

3. Audio Upload and Secondary Creation

Riffusion supports uploading an existing audio file and then transforming it:

  1. call /riffusion/upload with an audio file URL
  2. get the returned id and use it as audio_upload_id
  3. choose a transform mode for secondary creation
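
The upload steps above can be sketched with two small helpers. The request field name `url` is an assumption; the page only states that the endpoint takes an audio file URL and returns an `id`.

```python
from urllib.parse import urlparse


def build_upload_body(audio_url: str) -> dict[str, str]:
    """Build the body for /riffusion/upload after a basic URL check."""
    parsed = urlparse(audio_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {audio_url!r}")
    return {"url": audio_url}  # "url" is an assumed field name


def upload_id_from_response(response: dict) -> str:
    """Extract the returned id, to be used later as audio_upload_id."""
    return response["id"]
```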

4. Transform Modes

Set transform in the morph object:

  • cover: keep the original style while replacing lyrics and vocals
  • extend: continue generating from an existing song; requires crop_end_at
  • inpaint: replace part of a song; requires replace_start_at and replace_end_at
  • swap_vocals: replace only vocals and keep the instrumental
  • swap_sound: replace only the instrumental and keep the vocals

Each mode can use either a generated riff_id or an uploaded audio_upload_id as the source material.

Creative Controls

The morph object exposes fine-grained controls:

  • normalized_lyrics_strength: lyric influence (0-1)
  • normalized_sound_prompt_strength: sound-prompt influence (0-1)
  • normalized_weirdness: creative weirdness (0-1)
  • normalized_cover_strength: how faithfully cover mode follows the original track (0-1)
  • normalized_variation_strength: variation strength (0-1)

Higher values strengthen the corresponding effect; lower values weaken it.
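
Because every control shares the 0-1 range, a small check before submission can catch typos and out-of-range values. The sketch below uses the control names listed above; rejecting values locally rather than letting the API respond is a design choice, not documented behavior.

```python
CONTROL_NAMES = frozenset({
    "normalized_lyrics_strength",
    "normalized_sound_prompt_strength",
    "normalized_weirdness",
    "normalized_cover_strength",
    "normalized_variation_strength",
})


def validate_controls(controls: dict[str, float]) -> dict[str, float]:
    """Return a copy of controls after checking names and the 0-1 range."""
    for name, value in controls.items():
        if name not in CONTROL_NAMES:
            raise KeyError(f"unknown control: {name}")
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return dict(controls)


checked = validate_controls({
    "normalized_weirdness": 0.3,
    "normalized_lyrics_strength": 0.8,
})
```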

Typical Scenarios

  1. create an original song from lyrics or a theme
  2. adapt an existing song with a different style or language through cover
  3. continue or partially replace a track with extend or inpaint
  4. keep the instrumental but replace the vocals with swap_vocals
  5. keep the vocals but replace the accompaniment with swap_sound

Best Practices

  1. structured lyrics using markers like [Verse] and [Chorus] usually produce better results
  2. use exponential backoff while polling task results
  3. start with inspiration mode for rough ideas, then move to professional mode for refinement
  4. adjust normalized_* values in small increments and compare results
  5. keep generated song IDs so later transform modes can reuse them
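
Two of these practices can be illustrated with small helpers: one assembles lyrics with section markers such as [Verse] and [Chorus], and one produces a series of settings that step a single normalized_* control in small increments. Both function names are hypothetical, and the blank line between sections is a formatting assumption.

```python
def format_lyrics(sections: list[tuple[str, list[str]]]) -> str:
    """Render (marker, lines) pairs as lyrics with [Marker] headers."""
    blocks = []
    for marker, lines in sections:
        blocks.append("\n".join([f"[{marker}]", *lines]))
    return "\n\n".join(blocks)  # blank line between sections (assumed)


def sweep(name: str, start: float, stop: float, step: float) -> list[dict]:
    """List settings for one control from start to stop, inclusive."""
    count = int(round((stop - start) / step))
    # Round to avoid floating-point noise such as 0.30000000000000004.
    return [{name: round(start + i * step, 4)} for i in range(count + 1)]


lyrics = format_lyrics([
    ("Verse", ["City lights are fading"]),
    ("Chorus", ["Carry me home"]),
])
settings = sweep("normalized_weirdness", 0.3, 0.5, 0.1)
```

Each entry in `settings` can be merged into a morph object for a separate submission, so the resulting tracks can be compared side by side.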