Prompts

Prompt engineering is a core activity in AI engineering. Braintrust allows you to create prompts, test them out in the playground, use them in your code, update them, and track their performance over time. Our goal is to provide a world-class authoring experience in Braintrust, to integrate prompts into your code seamlessly, securely, and reliably, and to help you debug issues as they arise.

Creating a prompt

To create a prompt, visit the prompts tab in a project, and click the "+ Prompt" button. Pick a name and unique slug for your prompt. The slug is an immutable identifier that you can use to reference it in your code. As you change the prompt's name, description, or contents, its slug stays constant.


Prompts can use mustache templating syntax to refer to variables. These variables are substituted automatically in the playground and using the .build() function in your code. More on that below.
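To illustrate what the substitution does, here is a minimal sketch of mustache-style variable replacement. `renderTemplate` is a hypothetical helper written for this example, not part of the Braintrust SDK, and it only handles plain `{{name}}` tags (no sections, partials, or escaping).

```typescript
// Hypothetical helper for illustration only; not part of the Braintrust SDK.
// Replaces each {{name}} tag with the matching value from `variables`.
// Variables that are not provided render as an empty string.
function renderTemplate(
  template: string,
  variables: Record<string, string>,
): string {
  return template.replace(
    /\{\{\s*([\w.]+)\s*\}\}/g,
    (_match: string, name: string) =>
      Object.prototype.hasOwnProperty.call(variables, name)
        ? (variables[name] ?? "")
        : "",
  );
}

const rendered = renderTemplate("What is {{question}}?", { question: "1+1" });
console.log(rendered); // What is 1+1?
```

The keys of the object you pass correspond to the variable names used in the prompt template.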

Updating a prompt

Each prompt change is versioned, e.g. 5878bd218351fb8e. You can use this identifier to pin a specific version of the prompt in your code.



Testing in the playground

While developing a prompt, it can be useful to test it on real-world data in the playground. You can open a prompt in the playground, tweak it, and save a new version once you're ready.


Using prompts in your code

Executing directly

In Braintrust, a prompt is a simple function that can be invoked directly through the SDK and REST API. When invoked, prompt functions leverage the proxy to access a wide range of providers and models with managed secrets, and are automatically traced and logged to your Braintrust project. All functions are fully managed and versioned via the UI and API.

Functions are a broad concept that encompasses prompts, code snippets, HTTP endpoints, and more. When using the functions API, you can use a prompt’s slug or ID as the function’s slug or ID, respectively.

 
import { invoke } from "braintrust";
 
async function main() {
  const result = await invoke({
    projectName: "your project name",
    slug: "your prompt slug",
    input: {
      // These variables map to the template parameters in your prompt.
      question: "1+1",
    },
  });
  console.log(result);
}
 
main();

The return value, result, is a string unless the prompt makes tool calls, in which case it is the arguments of the first tool call. In TypeScript, you can assert this with the schema argument, which validates the result against a particular Zod schema:

import { invoke } from "braintrust";
import { z } from "zod";
 
async function main() {
  const result = await invoke({
    projectName: "your project name",
    slug: "your prompt slug",
    input: {
      question: "1+1",
    },
    schema: z.string()
  });
  console.log(result);
}
 
main();

Streaming

You can also stream results in an easy-to-parse format.

import { invoke } from "braintrust";
 
async function main() {
  const result = await invoke({
    projectName: "your project name",
    slug: "your prompt slug",
    input: {
      question: "1+1",
    },
    stream: true,
  });
 
  for await (const chunk of result) {
    console.log(chunk);
    // { type: "text_delta", data: "The answer "}
    // { type: "text_delta", data: "is 2"}
  }
}
 
main();
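If you only need the final text, you can concatenate the deltas as they arrive. The sketch below assumes chunks shaped like the ones in the comments above (a `type` and a `data` field). `TextChunk`, `collectText`, and `fakeStream` are hypothetical names for this example, and a stand-in generator replaces the real `invoke` call so the snippet runs on its own.

```typescript
// Chunk shape assumed from the example output above; other chunk types may exist.
type TextChunk = { type: string; data: unknown };

// Hypothetical helper: concatenates the data of every "text_delta" chunk
// and ignores any other chunk types.
async function collectText(stream: AsyncIterable<TextChunk>): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    if (chunk.type === "text_delta" && typeof chunk.data === "string") {
      text += chunk.data;
    }
  }
  return text;
}

// Stand-in for a streaming result so the sketch is self-contained.
async function* fakeStream(): AsyncGenerator<TextChunk> {
  yield { type: "text_delta", data: "The answer " };
  yield { type: "text_delta", data: "is 2" };
}

collectText(fakeStream()).then((text) => console.log(text)); // The answer is 2
```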

Vercel AI SDK

If you're using Next.js and the Vercel AI SDK, you can use the Braintrust adapter by installing the @braintrust/vercel-ai-sdk package and converting the stream to Vercel's format:

import { invoke } from "braintrust";
import { BraintrustAdapter } from "@braintrust/vercel-ai-sdk";
 
export async function POST(req: Request) {
  const stream = await invoke({
    projectName: "your project name",
    slug: "your prompt slug",
    input: await req.json(),
    stream: true,
  });
 
  return BraintrustAdapter.toAIStreamResponse(stream);
}

Logging

Any invoke requests you make will be logged using the active logging state, just like a function decorated with @traced or wrapTraced. You can also pass in the parent argument, which is a string that you can derive from span.export() while doing distributed tracing.

Fetching in code

If you'd like to run prompts directly, you can fetch them using the Braintrust SDK. The loadPrompt()/load_prompt() function loads a prompt into a simple format that you can pass along to the OpenAI client. Prompts are cached upon initial load for fast subsequent retrieval operations.

import { OpenAI } from "openai";
import { initLogger, loadPrompt, wrapOpenAI } from "braintrust";
 
const logger = initLogger({ projectName: "your project name" });
 
// wrapOpenAI will make sure the client tracks usage of the prompt.
const client = wrapOpenAI(
  new OpenAI({
    apiKey: process.env.OPENAI_API_KEY,
  }),
);
 
async function runPrompt() {
  // Replace with your project name and slug
  const prompt = await loadPrompt({
    projectName: "your project name",
    slug: "your prompt slug",
    defaults: {
      // Parameters to use if not specified
      model: "gpt-3.5-turbo",
      temperature: 0.5,
    },
  });
 
  // Render with parameters
  return client.chat.completions.create(
    prompt.build({
      question: "1+1",
    }),
  );
}

If you need to use another model provider, you can use the Braintrust proxy to access a wide range of models using the OpenAI format. You can also grab the messages and other parameters directly from the returned object to use a model library of your choice.

Pinning a specific version

To pin a specific version of a prompt, use the loadPrompt()/load_prompt() function with the version identifier.

const prompt = await loadPrompt({
  projectName: "your project name",
  slug: "your prompt slug",
  version: "5878bd218351fb8e",
});

Versioning with git

Git sync is coming soon.

You can also download prompts to your local filesystem and ensure a specific version is used via version control. In addition to removing any risk around runtime performance, this approach also allows you to review changes to prompts in pull requests.

braintrust prompts pull (<project name> [<prompt slug>]) | <prompt id>

Prompts are stored in a simple, mustache-compatible format, so you can also edit them directly in your favorite text editor.

{{!
metadata:
    origin:
        id: <system generated id>
        version: <system generated version>
    parameters:
        model: gpt-3.5-turbo
        temperature: 0.5
}}
 
{{! system }}
 
You are a calculator
 
{{! user }}
 
{{input}}
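As a rough illustration of how this layout maps to chat messages, the sketch below splits a file on its single-word role comments such as `{{! system }}`. `splitPromptFile` and `Message` are hypothetical names, and the parsing rules are an assumption based only on the example above; the CLI's actual parser may behave differently.

```typescript
type Message = { role: string; content: string };

// Hypothetical parser for illustration; the real parsing rules may differ.
// Text before the first role marker (the metadata comment) is ignored.
function splitPromptFile(source: string): Message[] {
  const messages: Message[] = [];
  // Role markers are mustache comments containing a single word.
  const marker = /\{\{!\s*(\w+)\s*\}\}/g;
  let current: { role: string; start: number } | undefined;
  let match: RegExpExecArray | null;
  while ((match = marker.exec(source)) !== null) {
    if (current) {
      const content = source.slice(current.start, match.index).trim();
      messages.push({ role: current.role, content });
    }
    current = { role: match[1] ?? "", start: marker.lastIndex };
  }
  if (current) {
    messages.push({ role: current.role, content: source.slice(current.start).trim() });
  }
  return messages;
}

const sample = [
  "{{!",
  "metadata:",
  "    parameters:",
  "        model: gpt-3.5-turbo",
  "}}",
  "{{! system }}",
  "You are a calculator",
  "{{! user }}",
  "{{input}}",
].join("\n");

console.log(splitPromptFile(sample));
```

The `{{input}}` tag is left untouched here; variable substitution is a separate step.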

If you update the plaintext file, you can push it back to Braintrust using the push command.

braintrust prompts push <path to prompt file>

To use a local prompt in your code, you can just import it directly.

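import { OpenAI } from "openai";
import { wrapOpenAI } from "braintrust";
import calculator from "./calculator.prompt";

async function runPrompt() {
  // wrapOpenAI will make sure the client tracks usage of the prompt.
  const client = wrapOpenAI(
    new OpenAI({
      apiKey: process.env.OPENAI_API_KEY,
    }),
  );

  // Build with parameters. The key matches the {{input}} variable
  // in the prompt file shown above.
  return client.chat.completions.create(
    calculator.build({
      input: "1+1",
    }),
  );
}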

Deployment strategies

It is often useful to use different versions of a prompt in different environments. For example, you might want to use the latest version locally and in staging, but pin a specific version in production. This is simple to set up by conditionally passing a version to loadPrompt()/load_prompt() based on the environment.

const prompt = await loadPrompt({
  projectName: "your project name",
  slug: "your prompt slug",
  version:
    process.env.NODE_ENV === "production" ? "5878bd218351fb8e" : undefined,
});
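If several environments should be pinned, the conditional can be factored into a small function. This is a minimal sketch; `resolvePromptVersion` and its `pinnedEnvs` parameter are hypothetical names introduced for this example, not part of the SDK.

```typescript
// Hypothetical helper: returns the pinned version only in the listed
// environments. `undefined` stands for "no version passed", which the
// example above uses to get the latest version.
function resolvePromptVersion(
  env: string | undefined,
  pinnedVersion: string,
  pinnedEnvs: readonly string[] = ["production"],
): string | undefined {
  return env !== undefined && pinnedEnvs.includes(env)
    ? pinnedVersion
    : undefined;
}

console.log(resolvePromptVersion("production", "5878bd218351fb8e")); // 5878bd218351fb8e
console.log(resolvePromptVersion("development", "5878bd218351fb8e")); // undefined
```

In application code you would pass `process.env.NODE_ENV` as `env` and hand the result to the `version` option of `loadPrompt()`.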

Chat vs. completion format

In Python, prompt.build() returns a dictionary with chat or completion parameters, depending on the prompt type. In TypeScript, however, prompt.build() accepts an additional parameter (flavor) to specify the format. This allows prompt.build() to be used in a more type-safe manner. When you specify a flavor, the SDK also validates that the parameters are correct for that format.

const chatParams = prompt.build(
  {
    question: "1+1",
  },
  {
    // This is the default
    flavor: "chat",
  },
);
 
const completionParams = prompt.build(
  {
    input: "1+1",
  },
  {
    // Pass "completion" to get completion-shaped parameters
    flavor: "completion",
  },
);
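To show why a flavor argument helps with type safety, here is a minimal sketch of how such a parameter can select the return type through function overloads. `buildParams`, `ChatParams`, and `CompletionParams` are hypothetical, simplified names for illustration and do not reflect the SDK's actual types or implementation.

```typescript
// Hypothetical, simplified parameter shapes for illustration only.
type ChatParams = {
  model: string;
  messages: { role: string; content: string }[];
};
type CompletionParams = { model: string; prompt: string };

// Overloads: the literal flavor value determines the static return type.
function buildParams(text: string, flavor?: "chat"): ChatParams;
function buildParams(text: string, flavor: "completion"): CompletionParams;
function buildParams(
  text: string,
  flavor: "chat" | "completion" = "chat",
): ChatParams | CompletionParams {
  const model = "gpt-3.5-turbo";
  return flavor === "chat"
    ? { model, messages: [{ role: "user", content: text }] }
    : { model, prompt: text };
}

const chat = buildParams("1+1"); // typed as ChatParams
const completion = buildParams("1+1", "completion"); // typed as CompletionParams
console.log(chat.messages.length, completion.prompt);
```

With this pattern, accessing `completion.messages` is a compile-time error instead of a runtime surprise.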

Opening from traces

When you use a prompt in your code, Braintrust automatically links spans to the prompt used to generate them. This allows you to click to open a span in the playground, and see the prompt that generated it alongside the input variables. You can even test and save a new version of the prompt directly from the playground.


This workflow lets you debug, iterate on, and publish changes to your prompts directly within Braintrust. And because Braintrust allows you to load the latest prompt, a specific version, or even a version-controlled artifact, you have a lot of control over how these updates propagate into your production systems.

Using the API

The full lifecycle of prompts -- creating, retrieving, modifying, etc. -- can be managed through the REST API. See the API docs for more details.