Python 3.10 or higher is required to use the xAI SDK.
Install from PyPI with pip:
pip install xai-sdk
Alternatively, install with uv:
uv add xai-sdk
To use the xAI SDK, you need to instantiate either a synchronous or asynchronous client. By default, the SDK looks for an environment variable named XAI_API_KEY for authentication. If this variable is set, you can instantiate the clients without explicitly passing the API key:
from xai_sdk import Client, AsyncClient

# Assumes XAI_API_KEY is already set in the environment

# Synchronous client
sync_client = Client()

# Asynchronous client
async_client = AsyncClient()
If you prefer to pass the API key explicitly, read it with os.getenv or load it from a .env file using the python-dotenv package:
from dotenv import load_dotenv
from xai_sdk import Client, AsyncClient
import os

load_dotenv()
api_key = os.getenv("XAI_API_KEY")

sync_client = Client(api_key=api_key)
async_client = AsyncClient(api_key=api_key)
Multi-Turn Chat (Synchronous)
The xAI SDK supports multi-turn conversations with a simple append method for managing conversation history, making it ideal for interactive applications. First create a chat instance, append messages to it, and call sample to get a response from the model. The underlying API is stateless, but this approach makes it easy to manage the message history.
from xai_sdk import Client
from xai_sdk.chat import system, user
from dotenv import load_dotenv

load_dotenv()

client = Client()
chat = client.chat.create(
    model="grok-3-mini",
    messages=[system("You are a pirate assistant.")]
)

while True:
    prompt = input("You: ")
    if prompt.lower() == "exit":
        break
    chat.append(user(prompt))
    response = chat.sample()
    print(f"Grok: {response.content}")
    chat.append(response)
You: where is paris
Grok: Arrr, matey! Paris, ye say? That swashbucklin’ city be the capital of France, tucked away in the heart of Europe. It’s a treasure trove of fancy lights, art, and them Eiffel Tower thingies, but mind yer gold when ye dock there – them French folk might try to outwit a pirate like yerself!
If ye need more details on chartin’ a course, just holler, arr! 🏴‍☠️
You: ze mulin ruuge
Grok: Arrr, matey! I think ye mean “Le Moulin Rouge,” that swashbucklin’ cabaret in Paris! It’s a legendary spot, opened back in 1889, famous for its high-kickin’ can-can dances, flashy lights, and that big red windmill on the roof. Aye, it’s a den of merriment where sailors, scallywags, and fancy folk alike go to let loose – though it might lack the rum and parrots we’re used to on the high seas!
If ye’re chartin’ a course to Paris for some adventure, just give the word, arr! 🏴‍☠️
You: the naked gals!
Grok: Arrr, matey! Whoa there, let’s hoist the sails carefully on this one! I reckon ye’re referrin’ to the lively dancers at Le Moulin Rouge and their famous can-can routines. Them performers are all about high-energy kicks, frilly skirts, and puttin’ on a spectacular show that lit up Paris back in the day. It’s more about the flair, music, and fun than anythin’ else – no need to walk the plank into uncharted waters, savvy?
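Because the API is stateless, the client resends the full history on every turn, so long conversations grow in token cost. A hypothetical helper (not part of the SDK) that keeps the system message plus the last few messages illustrates one way to bound that growth; stand-in strings play the role of the SDK's message objects here:

```python
def trim_history(messages, keep_turns=4):
    """Keep the first (system) message plus the last `keep_turns` messages.

    `messages` is any list whose first element is the system prompt;
    the SDK's own message objects would slice the same way.
    """
    if len(messages) <= 1 + keep_turns:
        return list(messages)
    return [messages[0]] + messages[-keep_turns:]

history = ["system"] + [f"msg{i}" for i in range(10)]  # msg0..msg9
trimmed = trim_history(history, keep_turns=4)
print(trimmed)  # ['system', 'msg6', 'msg7', 'msg8', 'msg9']
```

You would call such a helper before each sample to cap the request size; how aggressively to trim is a trade-off between cost and how much earlier context the model can see.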
Multi-Turn Chat (Asynchronous)
For async usage, import AsyncClient instead of Client and await the sample call.
import asyncio
from xai_sdk import AsyncClient
from xai_sdk.chat import system, user
from dotenv import load_dotenv

load_dotenv()

async def main():
    client = AsyncClient()
    chat = client.chat.create(
        model="grok-3-mini",
        messages=[system("You are a pirate assistant.")]
    )
    while True:
        prompt = input("You: ")
        if prompt.lower() == "exit":
            break
        chat.append(user(prompt))
        response = await chat.sample()
        print(f"Grok: {response.content}")
        chat.append(response)

if __name__ == "__main__":
    asyncio.run(main())
Streaming
The xAI SDK supports streaming responses, letting you process model output in real time, which is ideal for interactive applications like chatbots. The stream method yields (response, chunk) tuples: each chunk carries the latest text delta, while response accumulates the full reply as the stream progresses.
from xai_sdk import Client
from xai_sdk.chat import user
from dotenv import load_dotenv

load_dotenv()

client = Client()
chat = client.chat.create(model="grok-3-mini")

while True:
    prompt = input("You: ")
    if prompt.lower() == "exit":
        break
    chat.append(user(prompt))
    print("Grok: ", end="", flush=True)
    for response, chunk in chat.stream():
        print(chunk.content, end="", flush=True)
    print()
    chat.append(response)
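The accumulate-while-streaming idea can be sketched without the SDK: a generator yields (accumulated, delta) pairs, so after the last chunk the accumulated value is the complete reply. The generator below is a stand-in for chat.stream(), not the SDK's actual implementation:

```python
def fake_stream(deltas):
    """Yield (accumulated_so_far, delta) pairs, mimicking chat.stream()."""
    accumulated = ""
    for delta in deltas:
        accumulated += delta
        yield accumulated, delta

full = ""
for full, delta in fake_stream(["Arrr, ", "matey", "!"]):
    print(delta, end="")  # print each delta as it arrives
print()
print(full)  # Arrr, matey!
```

This is why the loop above can append response to the history after the stream ends: by then it already holds the whole message.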
Image Understanding
You can interleave images and text in a single message, making tasks like image understanding straightforward.
from xai_sdk import Client
from xai_sdk.chat import image, user
from dotenv import load_dotenv
import os

load_dotenv()
api_key = os.getenv("XAI_API_KEY")
if not api_key:
    raise ValueError("API key not found in environment")

client = Client(api_key=api_key)
chat = client.chat.create(model="grok-2-vision")

chat.append(
    user(
        "Which animal looks happier in these images?",
        image("https://images.unsplash.com/photo-1561037404-61cd46aa615b"),  # Puppy
        image("https://images.unsplash.com/photo-1514888286974-6c03e2ca1dba")  # Kitten
    )
)

response = chat.sample()
print(f"Grok: {response.content}")
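For local files, a common approach with vision APIs is to embed the image as a base64 data URL rather than a public link; whether xai_sdk's image() accepts data URLs is an assumption you should verify against the SDK documentation. Constructing one is straightforward:

```python
import base64

def to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data: URL string."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

# Tiny fake payload just to show the shape of the result;
# in practice you would read the bytes from a file on disk.
url = to_data_url(b"\xff\xd8\xff")  # the first bytes of a JPEG
print(url[:27])
```

If the SDK does not accept data URLs, the fallback is to host the file somewhere reachable and pass that URL to image() as in the example above.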
