Harmony: OpenAI's Renderer for GPT-OSS Response Format

Summary

Harmony is OpenAI's dedicated renderer for its `harmony` response format, designed for use with the `gpt-oss` open-weight models. The library defines conversation structure, reasoning output, and function-call structure for these models, and provides a single, consistent implementation for rendering conversations to token sequences and parsing them back. It offers first-class support for both Python and Rust.

Introduction

OpenAI Harmony is a dedicated library for rendering and parsing the `harmony` response format used by OpenAI's `gpt-oss` open-weight model series. The `gpt-oss` models are trained on this format, which defines conversation structure, reasoning output, and function-call structure. If you access `gpt-oss` through an API or through providers such as Hugging Face, Ollama, or vLLM, you usually won't need to handle the format directly, but developers building their own inference solutions will need Harmony to make the models operate correctly. The format is designed to mimic the OpenAI Responses API, so it will feel familiar to anyone who has used that API before.
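
To give a sense of what the format looks like on the wire, here is a rough, simplified sketch of a rendered exchange. The special tokens shown (`<|start|>`, `<|channel|>`, `<|message|>`, `<|end|>`, `<|return|>`) come from the harmony format; the exact rendering is produced by the library, not written by hand:

<|start|>user<|message|>Arrr, how be you?<|end|>
<|start|>assistant<|channel|>analysis<|message|>…reasoning output…<|end|>
<|start|>assistant<|channel|>final<|message|>Arrr, I be doin' fine!<|return|>

The analysis channel carries the model's reasoning and the final channel carries the user-facing answer. Assembling these token sequences correctly by hand is error-prone, which is what the library automates.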

Installation

Harmony offers robust support for both Python and Rust, allowing developers to integrate it seamlessly into their projects.

Python

Install the package from PyPI:

pip install openai-harmony
# or if you are using uv
uv pip install openai-harmony

For comprehensive documentation, please refer to the official Python documentation.

Rust

Add the dependency to your Cargo.toml:

[dependencies]
openai-harmony = { git = "https://github.com/openai/harmony" }

For comprehensive documentation, please refer to the official Rust documentation.

Examples

Here are examples demonstrating how to use Harmony in both Python and Rust to render and parse conversations.

Python Example

from openai_harmony import (
    load_harmony_encoding,
    HarmonyEncodingName,
    Role,
    Message,
    Conversation,
    DeveloperContent,
    SystemContent,
)
enc = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)

convo = Conversation.from_messages([
    Message.from_role_and_content(
        Role.SYSTEM,
        SystemContent.new(),
    ),
    Message.from_role_and_content(
        Role.DEVELOPER,
        DeveloperContent.new().with_instructions("Talk like a pirate!"),
    ),
    Message.from_role_and_content(Role.USER, "Arrr, how be you?"),
])

# Render the conversation into prompt tokens for the assistant to complete.
tokens = enc.render_conversation_for_completion(convo, Role.ASSISTANT)
print(tokens)

# Later, after the model responded: parse the tokens the model generated
# (its completion, without the trailing stop token) back into messages.
parsed = enc.parse_messages_from_completion_tokens(completion_tokens, role=Role.ASSISTANT)
print(parsed)
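
To show where these two calls sit in practice, here is a minimal sketch of the full round trip. The generate() function is a hypothetical stand-in for whatever inference engine you use; stop_tokens_for_assistant_actions() is the encoding's helper for retrieving the assistant's stop tokens:

prompt_tokens = enc.render_conversation_for_completion(convo, Role.ASSISTANT)
stop_tokens = enc.stop_tokens_for_assistant_actions()

# `generate` is hypothetical: sample from the model until a stop token appears.
completion_tokens = generate(prompt_tokens, stop=stop_tokens)

# Parse the completion (minus the stop token) back into structured messages.
messages = enc.parse_messages_from_completion_tokens(completion_tokens, role=Role.ASSISTANT)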

Rust Example

use openai_harmony::chat::{Message, Role, Conversation};
use openai_harmony::{HarmonyEncodingName, load_harmony_encoding};

fn main() -> anyhow::Result<()> {
    let enc = load_harmony_encoding(HarmonyEncodingName::HarmonyGptOss)?;
    let convo =
        Conversation::from_messages([Message::from_role_and_content(Role::User, "Hello there!")]);
    // Render prompt tokens for the assistant to complete; `None` uses the
    // default render config.
    let tokens = enc.render_conversation_for_completion(&convo, Role::Assistant, None)?;
    println!("{:?}", tokens);
    Ok(())
}

Why Use Harmony?

Harmony offers several compelling advantages for developers working with gpt-oss models:

  • Consistent formatting: A shared implementation for rendering and parsing keeps token sequences lossless in both directions.
  • Blazing-fast performance: The core logic is written in Rust, delivering high performance for demanding applications.
  • First-class Python support: Seamless integration with Python projects, including easy installation via pip, fully typed stubs, and 100% test parity with the Rust suite.
