LLM observability installation (beta)


🚧 Note: LLM observability is currently in beta. To access it, enable the feature preview in your PostHog account.

We are keen to gather as much feedback as possible, so if you try this out, please let us know. You can email peter@posthog.com, send feedback via the in-app support panel, or use one of our other support options.

LLM observability gives you x-ray vision into your LLM applications. Here's what you can track:

  • Every conversation (inputs, outputs, and tokens) 🗣️
  • Model performance (cost, latency and error rates) 🤖
  • Full traces for when you need to go detective mode 🔍
  • How much each chat/user/organization is costing you 💰

The best part? All this data gets sent as regular PostHog events, where you can slice, dice, and analyze it in dashboards, insights, and alerts. And because we charge the same as regular PostHog events, it's roughly 10x cheaper than other LLM observability tools.

Observability installation

Setting up observability starts with installing the PostHog SDK for your language. LLM observability works best with our Python and Node SDKs.

pip install posthog

Note: You can use LLM observability with any of our SDKs; however, you will need to capture the data manually via the capture method. See the event schema in the manual capture section below.

The rest of the setup depends on the LLM platform you're using. These SDKs do not proxy your calls; they only fire off an asynchronous call to PostHog in the background to send the data.

Start by installing the OpenAI SDK:

pip install openai

In the spot where you initialize the OpenAI SDK, import PostHog and our OpenAI wrapper, initialize PostHog with your project API key and host (from your project settings), and pass it to our OpenAI wrapper.

from posthog.ai.openai import OpenAI
import posthog

posthog.project_api_key = "<ph_project_api_key>"
posthog.host = "https://us.i.posthog.com"

client = OpenAI(
    api_key="your_openai_api_key",
    posthog_client=posthog
)

Note: This also works with the AsyncOpenAI client.
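
As a rough sketch (assuming the posthog.ai.openai module exposes an AsyncOpenAI wrapper alongside OpenAI, as the note above implies), the async setup mirrors the synchronous one:

Python
from posthog.ai.openai import AsyncOpenAI
import posthog

posthog.project_api_key = "<ph_project_api_key>"
posthog.host = "https://us.i.posthog.com"

# The wrapped client is used like the regular AsyncOpenAI client,
# with completion calls awaited inside an async function.
client = AsyncOpenAI(
    api_key="your_openai_api_key",
    posthog_client=posthog
)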

Now, when you use the OpenAI SDK, it automatically captures many properties into PostHog, including $ai_input, $ai_input_tokens, $ai_latency, $ai_model, $ai_model_parameters, $ai_output_choices, and $ai_output_tokens.

You can also capture or modify additional properties with the distinct ID, trace ID, properties, groups, and privacy mode parameters.

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Tell me a fun fact about hedgehogs"}
    ],
    posthog_distinct_id="user_123",  # optional
    posthog_trace_id="trace_123",  # optional
    posthog_properties={"conversation_id": "abc123", "paid": True},  # optional
    posthog_groups={"company": "company_id_in_your_db"},  # optional
    posthog_privacy_mode=False  # optional
)

print(response.choices[0].message.content)

Notes:

  • This also works with responses where stream=True (see the sketch after this list).
  • If you want to capture LLM events anonymously, don't pass a distinct ID to the request. See our docs on anonymous vs identified events to learn more.
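
For example, a streamed call looks roughly like this. The posthog_* parameters are the same optional ones shown above, and the chunk handling follows the standard OpenAI streaming interface:

Python
stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "user", "content": "Tell me a fun fact about hedgehogs"}
    ],
    stream=True,
    posthog_distinct_id="user_123",  # optional
    posthog_trace_id="trace_123"  # optional
)

for chunk in stream:
    # Print tokens as they arrive; some chunks (e.g. a final usage chunk) carry no content.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)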

Embeddings

PostHog can also capture embedding generations as $ai_embedding events. Just make sure to use the same posthog.ai.openai client to do so:

Python
response = client.embeddings.create(
    input="The quick brown fox",
    model="text-embedding-3-small",
    posthog_distinct_id="user_123",  # optional
    posthog_trace_id="trace_123",  # optional
    posthog_properties={"key": "value"},  # optional
    posthog_groups={"company": "company_id_in_your_db"},  # optional
    posthog_privacy_mode=False  # optional
)

Privacy mode

To avoid storing potentially sensitive prompt and completion data, you can enable privacy mode. This excludes the $ai_input and $ai_output_choices properties from being captured.

You can do this globally by setting the privacy_mode config option in the SDK:

import posthog
posthog.project_api_key = "<ph_project_api_key>"
posthog.host = "https://us.i.posthog.com"
posthog.privacy_mode = True

You can also enable it at the request level by setting the posthog_privacy_mode parameter to True in the request. The exact setup depends on the LLM platform you're using:

client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[...],
    posthog_privacy_mode=True
)

Manual capture

If you're using a different SDK or the API, you can manually capture the data by calling the capture method.

A generation is a single call to an LLM.

Event Name: $ai_generation

Property | Description
$ai_trace_id | The trace ID (a UUID to group AI events), e.g. conversation_id
$ai_model | The model used, e.g. gpt-3.5-turbo
$ai_provider | The LLM provider
$ai_input | List of messages, e.g. [{"role": "user", "content": "Tell me a fun fact about hedgehogs"}]
$ai_input_tokens | The number of tokens in the input (often found in response.usage)
$ai_output_choices | List of output choices, e.g. [{"role": "assistant", "content": "Hedgehogs are small mammals with spines on their back."}]
$ai_output_tokens | The number of tokens in the output (often found in response.usage)
$ai_latency | The latency of the LLM call (ms)
$ai_http_status | The HTTP status code of the response
$ai_base_url | The base URL of the LLM provider
$ai_is_error | Boolean to indicate if the request was an error
$ai_error | The error message or object

Example

Terminal
curl -X POST "https://us.i.posthog.com/capture/" \
     -H "Content-Type: application/json" \
     -d '{
         "api_key": "<ph_project_api_key>",
         "event": "$ai_generation",
         "properties": {
             "distinct_id": "distinct_id_of_your_user",
             "$ai_trace_id": "trace_id",
             "$ai_model": "gpt-3.5-turbo",
             "$ai_provider": "openai",
             "$ai_input": "[{\"role\": \"user\", \"content\": \"Tell me a fun fact about hedgehogs\"}]",
             "$ai_input_tokens": 100,
             "$ai_output_choices": "[{\"role\": \"assistant\", \"content\": \"Hedgehogs are small mammals with spines on their back.\"}]",
             "$ai_output_tokens": 100,
             "$ai_latency": 100,
             "$ai_http_status": 200,
             "$ai_base_url": "https://api.openai.com/v1"
         },
         "timestamp": "2025-01-30T12:00:00Z"
     }'
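
If you're already using the PostHog Python SDK, you can send the same event with its capture method. The following is a minimal sketch; the capture argument order has changed between SDK versions, so the keyword-argument form below is an assumption worth checking against your installed version.

Python
from posthog import Posthog

# Initialize the client with your project API key and host.
posthog = Posthog("<ph_project_api_key>", host="https://us.i.posthog.com")

# Send a single $ai_generation event with the same properties as the curl example.
posthog.capture(
    distinct_id="distinct_id_of_your_user",
    event="$ai_generation",
    properties={
        "$ai_trace_id": "trace_id",
        "$ai_model": "gpt-3.5-turbo",
        "$ai_provider": "openai",
        "$ai_input": [{"role": "user", "content": "Tell me a fun fact about hedgehogs"}],
        "$ai_input_tokens": 100,
        "$ai_output_choices": [{"role": "assistant", "content": "Hedgehogs are small mammals with spines on their back."}],
        "$ai_output_tokens": 100,
        "$ai_latency": 100,
        "$ai_http_status": 200,
        "$ai_base_url": "https://api.openai.com/v1"
    }
)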
