Get started with LangSmith
LangSmith is a platform for building production-grade LLM applications. It lets you closely monitor and evaluate your application, so you can ship quickly and with confidence. LangChain's open source frameworks, langchain and langgraph, work seamlessly with LangSmith but are not required: LangSmith works on its own!
1. Install LangSmith
- Python
pip install -U langsmith openai
- TypeScript
yarn add langsmith openai
2. Create an API key
To create an API key, head to the Settings page, then click Create API Key.
3. Set up your environment
- Shell
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=<your-api-key>
# The below examples use the OpenAI API, though it's not necessary in general
export OPENAI_API_KEY=<your-openai-api-key>
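Before running the examples, it can help to confirm that the variables above are actually visible to your Python process. The following is a minimal sketch: the variable names come from the shell snippet above, while the helper name `missing_env_vars` is hypothetical and not part of the LangSmith SDK.

```python
import os

# Variable names taken from the shell snippet above.
REQUIRED_VARS = ("LANGCHAIN_TRACING_V2", "LANGCHAIN_API_KEY")


def missing_env_vars(env, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


missing = missing_env_vars(os.environ)
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("Environment looks ready for tracing.")
```

Passing the environment in as an argument keeps the check easy to test with a plain dictionary.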
4. Log your first trace
We provide multiple ways to log traces to LangSmith. Below, we'll highlight how to use traceable(). See more on the Annotate code for tracing page.
- Python
import openai
from langsmith import wrappers, traceable

# Auto-trace LLM calls in-context
client = wrappers.wrap_openai(openai.Client())

@traceable  # Auto-trace this function
def pipeline(user_input: str):
    result = client.chat.completions.create(
        messages=[{"role": "user", "content": user_input}],
        model="gpt-4o-mini"
    )
    return result.choices[0].message.content

pipeline("Hello, world!")
# Out: Hello there! How can I assist you today?
- TypeScript
import { OpenAI } from "openai";
import { traceable } from "langsmith/traceable";
import { wrapOpenAI } from "langsmith/wrappers";

// Auto-trace LLM calls in-context
const client = wrapOpenAI(new OpenAI());

// Auto-trace this function
const pipeline = traceable(async (user_input: string) => {
  const result = await client.chat.completions.create({
    messages: [{ role: "user", content: user_input }],
    model: "gpt-4o-mini",
  });
  return result.choices[0].message.content;
});

await pipeline("Hello, world!");
// Out: Hello there! How can I assist you today?
- View a sample output trace.
- Learn more about tracing in the observability tutorials, conceptual guide and how-to guides.
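To build intuition for what a tracing decorator does, here is a toy stand-in written in plain Python. It is not LangSmith's implementation and does not send anything to LangSmith; `toy_traceable` and `RUN_LOG` are hypothetical names. It only illustrates the general idea of wrapping a function so that each call's inputs, output, and duration are recorded.

```python
import functools
import time

# Hypothetical in-memory log; a real tracer would send run data to a service.
RUN_LOG = []


def toy_traceable(func):
    """Record the name, inputs, output, and duration of each call to `func`."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        output = func(*args, **kwargs)
        RUN_LOG.append(
            {
                "name": func.__name__,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": output,
                "seconds": time.perf_counter() - start,
            }
        )
        return output

    return wrapper


@toy_traceable
def shout(text: str) -> str:
    return text.upper() + "!"


shout("hello")
print(RUN_LOG[0]["name"], RUN_LOG[0]["output"])
```

The wrapped function behaves exactly as before for its callers; the recording happens as a side effect, which is why decorating an existing function is usually a low-risk change.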
5. Run your first evaluation
Evaluation requires a system to test, data to serve as test cases, and optionally evaluators to grade the results. Here we define a simple exact-match evaluator that checks whether the application's response equals the reference response.
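The three ingredients named above (a system, test data, and evaluators) fit together roughly as follows. This is a plain-Python sketch of the idea, not LangSmith's implementation: `run_toy_evaluation`, `toy_app`, and the in-memory `EXAMPLES` list are hypothetical stand-ins, and nothing here is logged to LangSmith.

```python
# Toy test cases: each pairs inputs with the expected (reference) outputs.
EXAMPLES = [
    {
        "inputs": {"postfix": "to LangSmith"},
        "outputs": {"response": "Welcome to LangSmith"},
    },
    {
        "inputs": {"postfix": "to Evaluations in LangSmith"},
        "outputs": {"response": "Welcome to Evaluations in LangSmith"},
    },
]


def toy_app(inputs: dict) -> dict:
    """The system under test."""
    return {"response": "Welcome " + inputs["postfix"]}


def exact_match(outputs: dict, reference_outputs: dict) -> bool:
    """Evaluator: grade one result against its reference."""
    return outputs["response"] == reference_outputs["response"]


def run_toy_evaluation(target, examples, evaluators) -> dict:
    """Run `target` on every example and average each evaluator's score."""
    if not examples:
        return {}
    totals = {evaluator.__name__: 0.0 for evaluator in evaluators}
    for example in examples:
        outputs = target(example["inputs"])
        for evaluator in evaluators:
            totals[evaluator.__name__] += float(evaluator(outputs, example["outputs"]))
    return {name: total / len(examples) for name, total in totals.items()}


scores = run_toy_evaluation(toy_app, EXAMPLES, [exact_match])
print(scores)  # {'exact_match': 1.0}
```

The hosted evaluation additionally stores each run as an experiment so results can be compared over time, which this sketch leaves out.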
- Python
from langsmith import Client, traceable

client = Client()

# Define dataset: these are your test cases
dataset = client.create_dataset(
    "Sample Dataset",
    description="A sample dataset in LangSmith.",
)
client.create_examples(
    inputs=[
        {"postfix": "to LangSmith"},
        {"postfix": "to Evaluations in LangSmith"},
    ],
    outputs=[
        {"response": "Welcome to LangSmith"},
        {"response": "Welcome to Evaluations in LangSmith"},
    ],
    dataset_id=dataset.id,
)

# Define an interface to your application (tracing optional)
@traceable
def dummy_app(inputs: dict) -> dict:
    return {"response": "Welcome " + inputs["postfix"]}

# Define your evaluator(s)
def exact_match(outputs: dict, reference_outputs: dict) -> bool:
    return outputs["response"] == reference_outputs["response"]

# Run the evaluation
experiment_results = client.evaluate(
    dummy_app,  # Your AI system goes here
    data=dataset,  # The data to predict and grade over
    evaluators=[exact_match],  # The evaluators to score the results
    experiment_prefix="sample-experiment",  # The name of the experiment
    metadata={"version": "1.0.0", "revision_id": "beta"},  # Metadata about the experiment
    max_concurrency=4,  # Add concurrency.
)

# Analyze the results via the UI or programmatically
# If you have 'pandas' installed you can view the results as a
# pandas DataFrame by uncommenting below:
# experiment_results.to_pandas()
- TypeScript
import { Client } from "langsmith";
import { EvaluationResult, evaluate } from "langsmith/evaluation";

const client = new Client();

// Define dataset: these are your test cases
const datasetName = "Sample Dataset";
const dataset = await client.createDataset(datasetName, {
  description: "A sample dataset in LangSmith.",
});
await client.createExamples({
  inputs: [
    { postfix: "to LangSmith" },
    { postfix: "to Evaluations in LangSmith" },
  ],
  outputs: [
    { response: "Welcome to LangSmith" },
    { response: "Welcome to Evaluations in LangSmith" },
  ],
  datasetId: dataset.id,
});

// Define your evaluator(s)
const exactMatch = async ({ outputs, referenceOutputs }: {
  outputs?: Record<string, any>;
  referenceOutputs?: Record<string, any>;
}): Promise<EvaluationResult> => {
  return {
    key: "exact_match",
    score: outputs?.response === referenceOutputs?.response,
  };
};

// Run the evaluation
const experimentResults = await evaluate(
  (inputs: { postfix: string }) => ({ response: `Welcome ${inputs.postfix}` }),
  {
    data: datasetName,
    evaluators: [exactMatch],
    metadata: { version: "1.0.0", revision_id: "beta" },
    maxConcurrency: 4,
  }
);
- Click the link printed out by your evaluation run to access the LangSmith experiments UI, and explore the results of your evaluation.
- Learn more about evaluation in the tutorials, conceptual guide, and how-to guides.
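Exact string equality is strict: a stray space or a difference in capitalization counts as a failure. As a sketch of a more forgiving alternative, the evaluator below follows the same `(outputs, reference_outputs)` signature as `exact_match` in the Python example above. The name `normalized_match` is hypothetical, and passing it in the `evaluators` list alongside `exact_match` is an assumption based on that shared signature.

```python
def normalized_match(outputs: dict, reference_outputs: dict) -> bool:
    """Compare responses after lowercasing and collapsing whitespace."""

    def normalize(text: str) -> str:
        # split() with no arguments drops leading, trailing, and repeated whitespace.
        return " ".join(text.lower().split())

    return normalize(outputs["response"]) == normalize(reference_outputs["response"])


print(
    normalized_match(
        {"response": "  welcome   to LangSmith "},
        {"response": "Welcome to LangSmith"},
    )
)  # True
```

Running a strict and a lenient evaluator side by side can show whether failures come from formatting differences or from genuinely different answers.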