@n-studio
Last active January 10, 2026 14:39
LLTS format specs

Letter‑Aligned Lyric Timing Specification (LLTS)

  • Format name: Letter‑Aligned Lyric Timing Specification
  • Short name: LLTS
  • File extension: .llts

1. Purpose and Scope

LLTS defines a copyright‑safe interchange format for precise lyric timing, including sub‑word (letter‑range) alignment, without containing any lyric text. The format is intended for karaoke engines, DAWs, captioning tools, and research systems that require deterministic alignment while relying on user‑supplied lyrics obtained separately.

LLTS files contain only:

  • Timing metadata
  • Numeric character ranges
  • A cryptographic hash identifying the intended lyric text

No copyrighted expression is included.


2. Design Principles

  1. No lyrical content: No letters, words, or substrings are stored.
  2. Deterministic alignment: Character ranges reference an external lyric string.
  3. Version safety: A cryptographic hash ensures the correct lyric variant is used.
  4. Irreversibility: Data cannot be used to reconstruct lyrics.
  5. Implementation neutrality: Usable across platforms and languages.

3. External Lyric Requirement

An LLTS file is valid only when paired with an external lyric string supplied by the user or host application.

The host application MUST:

  • Normalize the external lyric string using the algorithm in §6
  • Compute its hash using the algorithm declared in hash_algorithm (§5.4)
  • Compare the result to the lyrics_hash field

If the hashes do not match, the LLTS file MUST be rejected or treated as unaligned.
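The three host-side steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name lyrics_match and the normalize parameter (any callable implementing §6) are hypothetical, SHA-256 is assumed as the declared algorithm, and hashing the UTF-8 bytes of the normalized string is an assumption, since the specification does not state the byte encoding of the hash input.

```python
import hashlib
from typing import Callable


def lyrics_match(llts_doc: dict, raw_lyrics: str,
                 normalize: Callable[[str], str]) -> bool:
    """Return True if the supplied lyrics hash to the file's lyrics_hash."""
    # Step 1: normalize the external lyric string (algorithm supplied by caller).
    normalized = normalize(raw_lyrics)
    # Step 2: hash it. Assumption: the input is the UTF-8 encoding of the string.
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    # Step 3: compare with the stored value (hex case is ignored here).
    expected = str(llts_doc.get("lyrics_hash", "")).lower()
    return digest == expected
```

A host that receives False would reject the file or treat it as unaligned, as required above.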


4. Data Model Overview

LLTS is a UTF‑8 encoded text file using JSON syntax.

Top‑level object:

  • llts_version
  • track
  • lyrics_hash
  • hash_algorithm
  • normalization
  • lines
  • timing
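As a rough sketch, a host might load a file and report which top-level names are absent before doing any further validation. The names checked are those this specification uses (the list above together with the lines array of §8). The helpers load_llts and absent_fields are hypothetical, and the specification does not say which of these fields are mandatory, so the result is reported rather than treated as an error.

```python
import json

# Top-level names used by this specification.
TOP_LEVEL_FIELDS = (
    "llts_version", "track", "lyrics_hash",
    "hash_algorithm", "normalization", "lines", "timing",
)


def load_llts(text: str) -> dict:
    """Parse LLTS text and make sure the top level is a JSON object."""
    doc = json.loads(text)
    if not isinstance(doc, dict):
        raise ValueError("LLTS top level must be a JSON object")
    return doc


def absent_fields(doc: dict) -> list:
    """List the documented top-level names that the file does not contain."""
    return [name for name in TOP_LEVEL_FIELDS if name not in doc]
```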

5. Header Fields

5.1 llts_version

String. Version of the specification that the file conforms to, in MAJOR.MINOR form.

Example:

"llts_version": "1.0"

5.2 track

Object containing optional identification metadata.

Allowed fields (all OPTIONAL):

  • title (string)
  • artist (string)
  • duration_ms (integer)

These fields are informational and not used for validation.


5.3 lyrics_hash

String. Hex‑encoded cryptographic hash of the normalized external lyrics.

Example:

"lyrics_hash": "9f2c8b7e3a0d…"

5.4 hash_algorithm

String identifier of the hash function.

RECOMMENDED:

  • SHA-256

Other algorithms MAY be supported if explicitly declared.
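One way a host could resolve the declared identifier is a small lookup table that rejects anything it does not recognize. The sketch below is illustrative only: the table SUPPORTED_ALGORITHMS and the function compute_lyrics_hash are hypothetical names, only SHA-256 is mapped because it is the only algorithm this specification names, and UTF-8 is assumed as the byte encoding of the hash input.

```python
import hashlib

# Hypothetical mapping from LLTS identifiers to hashlib algorithm names.
SUPPORTED_ALGORITHMS = {"SHA-256": "sha256"}


def compute_lyrics_hash(normalized: str, hash_algorithm: str) -> str:
    """Hex-encode the digest of the normalized lyrics (UTF-8 assumed)."""
    try:
        name = SUPPORTED_ALGORITHMS[hash_algorithm]
    except KeyError:
        raise ValueError(f"unsupported hash_algorithm: {hash_algorithm!r}") from None
    return hashlib.new(name, normalized.encode("utf-8")).hexdigest()
```

With SHA-256 the result is always 64 hexadecimal digits, which gives hosts a cheap sanity check on the lyrics_hash field before any lyrics are processed.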


5.5 normalization

Object declaring the normalization rules applied prior to hashing.

This object describes the algorithm but does not include the normalized text.

Example:

"normalization": {
  "unicode": "NFC",
  "case": "lower",
  "whitespace": "collapse",
  "line_endings": "lf",
  "punctuation": "remove"
}

6. Normalization Algorithm (Normative)

The following steps MUST be applied in order to the external lyric string prior to hashing:

  1. Convert text to Unicode NFC form
  2. Convert all letters to lowercase
  3. Normalize all line endings (CR, CRLF, LF) to LF (U+000A)
  4. Collapse multiple consecutive line breaks into a single LF
  5. Replace each remaining run of whitespace characters other than LF (including tabs) with a single ASCII space (U+0020)
  6. Remove punctuation characters (Unicode General Category P*)
  7. Trim leading and trailing whitespace

The resulting normalized string:

  • Has deterministic handling of line returns
  • Preserves line boundaries via single LF characters
  • Is used only as input to the hash function

The normalized string MUST NOT be stored or transmitted.
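The seven steps can be sketched in Python as shown below. This is one reading of the algorithm, not a reference implementation: the function name normalize_lyrics is hypothetical, step 5 is interpreted as leaving LF characters untouched so that line boundaries survive, and "whitespace" is taken to mean whatever Python's regular expression engine classifies as whitespace.

```python
import re
import unicodedata


def normalize_lyrics(text: str) -> str:
    """Apply normalization steps 1-7 in the order listed."""
    text = unicodedata.normalize("NFC", text)              # 1. NFC form
    text = text.lower()                                    # 2. lowercase
    text = text.replace("\r\n", "\n").replace("\r", "\n")  # 3. CR/CRLF -> LF
    text = re.sub(r"\n+", "\n", text)                      # 4. collapse line breaks
    text = re.sub(r"[^\S\n]+", " ", text)                  # 5. other whitespace -> one space
    text = "".join(                                        # 6. drop category P* characters
        ch for ch in text if not unicodedata.category(ch).startswith("P")
    )
    return text.strip()                                    # 7. trim both ends
```

Because punctuation is removed after whitespace has been collapsed, an input such as "a - b" keeps two adjacent spaces in this sketch. Producers and consumers need to agree on such details, since any difference changes both the hash and the character indices.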


7. Timing Entries

7.1 Structure

timing is an array of timing objects. Each object represents a time‑aligned character range.

Required fields:

  • start_char
  • end_char
  • start_ms
  • end_ms

Optional fields:

  • line_id
  • voice
  • layer

Example:

{
  "start_char": 134,
  "end_char": 138,
  "start_ms": 12340,
  "end_ms": 12520,
  "line_id": 5,
  "voice": "lead"
}

7.2 Character Indexing Rules

  • Indexing is zero‑based
  • Indices refer to the normalized lyric string
  • The normalized string includes LF characters that mark line boundaries
  • Ranges are [start_char, end_char) (end exclusive)
  • Ranges MUST NOT overlap unless explicitly supported by the host application
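A timing entry's range can be resolved against the normalized lyric string with a plain slice. The sketch below assumes that indices count Unicode code points, which is how Python indexes strings; the specification does not say whether code points, UTF-16 units, or bytes are meant. The function name resolve_range is hypothetical.

```python
def resolve_range(normalized: str, entry: dict) -> str:
    """Return the characters a timing entry refers to (zero-based, end exclusive)."""
    start, end = entry["start_char"], entry["end_char"]
    if not (0 <= start <= end <= len(normalized)):
        raise ValueError(f"range [{start}, {end}) is outside the lyric string")
    return normalized[start:end]
```

For the normalized string "hello world" followed by LF and "second line", the range [6, 11) selects "world".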

7.3 Timing Rules

  • Times are expressed in milliseconds relative to audio start
  • end_ms MUST be greater than start_ms
  • Timing entries MAY be contiguous or gapped
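A playback engine typically needs to know which entries are active at a given audio position. The following sketch checks the end_ms rule and then filters by time. Treating each interval as half-open, [start_ms, end_ms), is an assumption made here so that contiguous entries do not both match at their shared boundary; the function name active_entries is hypothetical.

```python
def active_entries(timing: list, position_ms: int) -> list:
    """Return the timing entries that cover position_ms."""
    for entry in timing:
        # Timing rule: end_ms MUST be greater than start_ms.
        if entry["end_ms"] <= entry["start_ms"]:
            raise ValueError("end_ms must be greater than start_ms")
    # Assumption: intervals are half-open, so the end instant is excluded.
    return [e for e in timing if e["start_ms"] <= position_ms < e["end_ms"]]
```

Gaps between entries simply yield an empty list, which a karaoke display can render as "nothing highlighted".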

8. Line Structure

8.1 Line Table

LLTS defines lyric lines explicitly using a lines array.

Each line object defines a continuous character span in the normalized lyric string.

Required fields:

  • line_id (integer)
  • start_char
  • end_char

Optional fields:

  • voice
  • description

Example:

{
  "line_id": 5,
  "start_char": 120,
  "end_char": 160,
  "voice": "duet_a"
}

Line spans MUST align with LF boundaries in the normalized lyric string.
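The alignment requirement can be checked mechanically. The sketch below reads "align with LF boundaries" as: the span starts at index 0 or directly after an LF, ends at the end of the string or directly before an LF, and contains no LF itself. That reading is an interpretation of the rule, and the function name line_span_is_aligned is hypothetical.

```python
def line_span_is_aligned(normalized: str, line: dict) -> bool:
    """Check that a line object covers exactly one LF-delimited line."""
    start, end = line["start_char"], line["end_char"]
    if not (0 <= start <= end <= len(normalized)):
        return False
    starts_line = start == 0 or normalized[start - 1] == "\n"
    ends_line = end == len(normalized) or normalized[end] == "\n"
    # The span itself must not cross a line boundary.
    return starts_line and ends_line and "\n" not in normalized[start:end]
```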


9. Voice and Role Annotation

The optional voice field classifies who performs the lyric segment.

RECOMMENDED enumerations:

  • lead
  • duet_a
  • duet_b
  • group
  • backing
  • spoken

Custom values MAY be used if documented by the application.

The optional layer field MAY be used to express overlapping voices or harmonies.
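To render overlapping voices, a host might bucket timing entries by layer and note any voice values outside the recommended set. The sketch below is illustrative: the helper names are hypothetical, and treating a missing layer as layer 0 is an assumption, since the specification defines no default.

```python
from collections import defaultdict

RECOMMENDED_VOICES = {"lead", "duet_a", "duet_b", "group", "backing", "spoken"}


def group_by_layer(timing: list) -> dict:
    """Bucket timing entries by layer, keeping their original order."""
    layers = defaultdict(list)
    for entry in timing:
        # Assumption: entries without a layer field belong to layer 0.
        layers[entry.get("layer", 0)].append(entry)
    return dict(layers)


def custom_voices(timing: list) -> set:
    """Voice values that the application would need to document itself."""
    return {e["voice"] for e in timing if "voice" in e} - RECOMMENDED_VOICES
```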


10. Validation Rules (Normative)

An LLTS file is valid if:

  1. The file parses as valid JSON
  2. llts_version is supported
  3. lyrics_hash matches the hash of the normalized external lyrics
  4. All line spans fall within the lyric string length
  5. All timing ranges fall within their referenced line spans
  6. Timing values are monotonically non‑decreasing per layer

11. Security and Copyright Considerations

  • LLTS files contain no copyrighted text
  • Hashes are one‑way and non‑reversible
  • Character ranges are meaningless without the external lyric string
  • The format does not enable lyric reconstruction

Implementations SHOULD require users to supply lyrics obtained lawfully.


12. Extensibility

Future versions MAY add optional fields, including:

  • Phoneme class labels (non‑textual)
  • Confidence scores
  • Multiple language tracks

No extension may include lyric text unless explicitly licensed.


13. Example (Complex, Normative)

The following example demonstrates all major LLTS features:

  • Explicit line boundaries
  • Deterministic normalization assumptions
  • Multiple voices (lead, duet, backing)
  • Overlapping layers
  • Per-letter timing

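This example assumes the user supplies the correct external lyrics whose normalized form hashes to the value shown. The lyrics_hash value here is an illustrative placeholder: it is shorter than the 64 hexadecimal digits of a real SHA-256 digest.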

{
  "llts_version": "1.0",
  "track": {
    "title": "Example Song",
    "artist": "Example Artist",
    "duration_ms": 215000
  },
  "lyrics_hash": "4d7c2a9e6f3b8c1a0e9f4d2c7b1a6e5f8c9d0a1b2c3d4e5f6a7b8c9d0e1f",
  "hash_algorithm": "SHA-256",
  "normalization": {
    "unicode": "NFC",
    "case": "lower",
    "whitespace": "collapse",
    "line_endings": "lf",
    "punctuation": "remove"
  },

  "lines": [
    {
      "line_id": 0,
      "start_char": 0,
      "end_char": 22,
      "voice": "lead",
      "description": "Verse 1 – lead vocal"
    },
    {
      "line_id": 1,
      "start_char": 23,
      "end_char": 47,
      "voice": "duet_a",
      "description": "Chorus – first singer"
    },
    {
      "line_id": 2,
      "start_char": 23,
      "end_char": 47,
      "voice": "duet_b",
      "description": "Chorus – second singer"
    },
    {
      "line_id": 3,
      "start_char": 48,
      "end_char": 72,
      "voice": "backing",
      "description": "Backing vocals"
    }
  ],

  "timing": [
    {
      "start_char": 0,
      "end_char": 4,
      "start_ms": 1200,
      "end_ms": 1450,
      "line_id": 0,
      "voice": "lead",
      "layer": 0
    },
    {
      "start_char": 4,
      "end_char": 9,
      "start_ms": 1450,
      "end_ms": 1800,
      "line_id": 0,
      "voice": "lead",
      "layer": 0
    },
    {
      "start_char": 23,
      "end_char": 28,
      "start_ms": 32000,
      "end_ms": 33500,
      "line_id": 1,
      "voice": "duet_a",
      "layer": 0
    },
    {
      "start_char": 23,
      "end_char": 28,
      "start_ms": 32000,
      "end_ms": 33500,
      "line_id": 2,
      "voice": "duet_b",
      "layer": 1
    },
    {
      "start_char": 48,
      "end_char": 52,
      "start_ms": 60000,
      "end_ms": 62000,
      "line_id": 3,
      "voice": "backing",
      "layer": 0
    },
    {
      "start_char": 52,
      "end_char": 58,
      "start_ms": 62000,
      "end_ms": 65000,
      "line_id": 3,
      "voice": "backing",
      "layer": 0
    }
  ]
}

14. License

This specification is licensed under the Apache License, Version 2.0 (the "License").

You may not use this specification except in compliance with the License. You may obtain a copy of the License at:

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, this specification is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Copyright © The LLTS Contributors.
