
Your First API Call

The Claude API is how your code talks to Claude directly. No browser, no chat interface — your program sends text and gets text back. This is the foundation of everything in the Building section.


Why Use the API vs Claude.ai or Claude Code?

| Tool | Best For |
| --- | --- |
| Claude.ai | Interactive conversations, one-off tasks |
| Claude Code | Building projects with AI assistance |
| The API | Embedding Claude in your own apps, automating at scale, shipping to users |

The API is for when you want to build a product, automate a workflow that runs without you, or embed AI into something you’re creating for others. If you’re doing any of those things, the API is the right tool.


Go to console.anthropic.com and sign up. You’ll need to verify your email and add a payment method before you can make API calls. Check console.anthropic.com for any introductory credits — you pay for what you use beyond that.

  1. Log in to console.anthropic.com
  2. Click API Keys in the left sidebar
  3. Click Create Key
  4. Give it a name (e.g., “my-first-project”)
  5. Copy the key immediately — you won’t be able to see it again

!!! warning "Keep your API key secret"
    Your API key is like a password. Anyone who has it can make API calls that charge your account. Never put it directly in your code. Never commit it to a git repository. Store it in an environment variable.

```bash
# Add to your shell profile (~/.bashrc, ~/.zshrc) or a .env file
export ANTHROPIC_API_KEY="sk-ant-..."
```

You pay per token — a token is roughly 3-4 characters of text. Every API call uses input tokens (what you send) and output tokens (what Claude responds with).

| Model | Input (per million tokens) | Output (per million tokens) | Best For |
| --- | --- | --- | --- |
| Haiku 4.5 | ~$0.80 | ~$4 | High-volume tasks, speed matters |
| Sonnet 4.5 | ~$3 | ~$15 | Most everyday use cases |
| Opus 4.6 | ~$15 | ~$75 | Complex reasoning, highest quality |

Check anthropic.com/pricing for current rates — prices change.

For context: a typical API call (asking Claude to summarize a short article) might use 500 input tokens and 300 output tokens. At Sonnet pricing, that’s about $0.006 — less than a cent. Costs only become meaningful at scale or with very long inputs.
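
For quick budgeting, that arithmetic can be wrapped in a tiny helper. This is a sketch, not part of the SDK, using the approximate Sonnet rates from the table above:

```python
def estimate_cost(input_tokens, output_tokens, input_per_mtok, output_per_mtok):
    """Rough dollar cost of one API call at the given per-million-token rates."""
    return (input_tokens * input_per_mtok + output_tokens * output_per_mtok) / 1_000_000

# The summarization example: 500 input + 300 output tokens at ~$3 / ~$15
print(estimate_cost(500, 300, 3.0, 15.0))  # → 0.006
```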

Which model to use:

  • Haiku — classification, routing decisions, simple data extraction, anything where you’re making many calls and speed matters
  • Sonnet — the daily driver. Most tasks. Good balance of quality and cost.
  • Opus — hard reasoning problems, complex analysis, situations where you’d rather pay more and get it right

Your First API Call in Python

```bash
pip install anthropic
```

```python
import anthropic

client = anthropic.Anthropic()  # Reads ANTHROPIC_API_KEY from environment

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "What are three good questions to ask in a job interview?"}
    ]
)

print(message.content[0].text)
```

Line by line:

```python
client = anthropic.Anthropic()
```

Creates the API client. It automatically reads your ANTHROPIC_API_KEY environment variable. No need to pass it explicitly.

```python
message = client.messages.create(
    model="claude-sonnet-4-5",
```

Calls the Messages API. model specifies which Claude to use.

```python
    max_tokens=1024,
```

Sets the maximum length of Claude’s response in tokens. Claude will stop generating before this limit even if it hasn’t finished — so set it high enough for your expected output. 1024 is a good starting point for most tasks.

```python
    messages=[
        {"role": "user", "content": "..."}
    ]
```

The conversation history. Each message has a role (user or assistant) and content. For a simple one-shot call, you just send one user message.

```python
print(message.content[0].text)
```

Accesses the response text. message.content is a list because a response can contain multiple content blocks (text, tool calls, and so on), so you index into [0] to get the first block, which for a plain text request is usually the only one.


Your First API Call in JavaScript/TypeScript

```bash
npm install @anthropic-ai/sdk
```

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // Reads ANTHROPIC_API_KEY from environment

const message = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 1024,
  messages: [
    {
      role: "user",
      content: "What are three good questions to ask in a job interview?",
    },
  ],
});

console.log(message.content[0].text);
```

The structure is identical to Python — same API, same response format, just JavaScript syntax.


Every API call sends a list of messages representing the conversation so far. Roles are user (human input) or assistant (Claude’s previous responses).

```python
messages=[
    {"role": "user", "content": "What's the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    {"role": "user", "content": "What's the population?"}
]
```

To build a multi-turn conversation, you keep the full history and append each new message. Claude uses the history to maintain context.
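
That append-and-resend loop can be sketched as a small helper. The `ask` function here is illustrative, not part of the SDK; the client is passed in explicitly:

```python
def ask(client, history, user_text):
    """Append the user turn, send the full history, and record Claude's
    reply so the next call keeps context."""
    history.append({"role": "user", "content": user_text})
    message = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=history,
    )
    reply = message.content[0].text
    history.append({"role": "assistant", "content": reply})
    return reply

# history = []
# client = anthropic.Anthropic()
# ask(client, history, "What's the capital of France?")
# ask(client, history, "What's the population?")  # Claude sees the first exchange
```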

System prompt — instructions that apply to the entire conversation, before the user says anything:

```python
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    system="You are a concise assistant. Respond in bullet points only. Never use more than 5 bullets.",
    messages=[{"role": "user", "content": "Explain machine learning."}]
)
```

max_tokens caps the output length. If you set it too low, Claude’s response gets cut off mid-sentence. Rules of thumb:

  • Short factual answers: 256–512
  • Summaries, emails, short documents: 1024–2048
  • Long-form content, code: 4096+
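
You can also detect a cut-off response programmatically: the API sets the response's `stop_reason` to `"max_tokens"` when the cap was hit. A minimal sketch (the `is_truncated` helper is illustrative, not part of the SDK):

```python
def is_truncated(message) -> bool:
    """True if generation stopped because it hit max_tokens, i.e. the reply is cut off."""
    return message.stop_reason == "max_tokens"

# message = client.messages.create(..., max_tokens=256, ...)
# if is_truncated(message):
#     ...retry with a larger max_tokens, or tighten the prompt
```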

Temperature controls how creative vs. predictable the output is. Range is 0–1.

| Temperature | Behavior |
| --- | --- |
| 0 | Near-deterministic — same input, almost always the same output |
| 0.3 | Focused, consistent — good for factual tasks |
| 0.7 | Balanced — good for most things |
| 1.0 | Creative, varied (the API default) — good for brainstorming |

```python
message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    temperature=0,  # Near-deterministic
    messages=[{"role": "user", "content": "Extract the date from this invoice: ..."}]
)
```

Use low temperature when you need consistent, predictable output (data extraction, classification). Use higher temperature for creative tasks.

By default, the API waits until Claude finishes generating and returns everything at once. Streaming sends chunks as they’re generated — useful for chat interfaces where you want the text to appear word by word.

```python
with client.messages.stream(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a short story about a lighthouse keeper."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```

Same cost, better user experience for interactive applications.


You can give Claude a set of tools — functions you define — and Claude will decide when to use them.

The pattern: you define a tool (its name, description, and parameters), include it in your API call, and if Claude decides it needs to use the tool, it returns a structured tool call instead of a text response. Your code executes the actual function and sends the result back to Claude, which then continues.

This is how you connect Claude to real-world data. Examples: a weather lookup tool, a database query tool, a calculator, a web search.

```python
tools = [{
    "name": "get_stock_price",
    "description": "Get the current stock price for a ticker symbol",
    "input_schema": {
        "type": "object",
        "properties": {
            "ticker": {
                "type": "string",
                "description": "Stock ticker symbol, e.g. AAPL"
            }
        },
        "required": ["ticker"]
    }
}]

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's Apple's current stock price?"}]
)
```

If Claude decides to use the tool, message.stop_reason will be "tool_use" and message.content will contain the tool call parameters. Your code then runs the actual lookup and continues the conversation.
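
That round trip can be sketched as follows. Everything here is illustrative: `get_stock_price` is a stub you would replace with a real data source, and the helper names are not part of the SDK. Tool results go back as a `tool_result` content block in a `user`-role message:

```python
def get_stock_price(ticker: str) -> float:
    # Stub: a real implementation would query a market-data API.
    return 123.45

def run_tool(block):
    """Dispatch a tool_use content block to the matching local function."""
    if block.name == "get_stock_price":
        return get_stock_price(**block.input)
    raise ValueError(f"unknown tool: {block.name}")

def tool_result_message(block, result) -> dict:
    """Package a tool's return value as the tool_result message the API expects."""
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": str(result),
        }],
    }

# When message.stop_reason == "tool_use":
#   1. messages.append({"role": "assistant", "content": message.content})
#   2. for each tool_use block: messages.append(tool_result_message(block, run_tool(block)))
#   3. call client.messages.create() again with the extended messages list
```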

Tool use is an advanced topic — the Anthropic documentation on tool use has complete examples.