Modes of Interaction

The toolkit provides five primary methods for interacting with LLMs, each suited to different use cases.

chat() - Text Responses

Use chat() for simple text-based responses:

response = await ait.chat(
    template="Write a haiku about {{ topic }}",
    topic="mountains"
)

print(response.content)  # Plain text response
print(response.completion)  # Raw OpenAI completion object

Use when:

  • You need unstructured text output
  • The response format is flexible
  • You're doing creative or open-ended tasks

asend() - Structured Responses

Use asend() for type-safe, structured outputs:

from pydantic import BaseModel

class Analysis(BaseModel):
    sentiment: str
    confidence: float

response = await ait.asend(
    response_model=Analysis,
    template="Analyze sentiment: {{ text }}",
    text="This product is amazing!"
)

print(response.content.sentiment)    # Type-safe access
print(response.content.confidence)   # IDE autocomplete works

Use when:

  • You need predictable, structured data
  • Type safety matters
  • You're building APIs or data pipelines
  • You want IDE autocomplete and type checking
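Conceptually, structured output amounts to parsing the model's JSON reply into a typed object. A dependency-free sketch of that idea, using a stdlib dataclass in place of Pydantic:

```python
import json
from dataclasses import dataclass

@dataclass
class Analysis:
    sentiment: str
    confidence: float

# What an LLM might return as raw text for a structured request.
raw_reply = '{"sentiment": "positive", "confidence": 0.97}'

# Parse the JSON and build a typed object with attribute access.
analysis = Analysis(**json.loads(raw_reply))
print(analysis.sentiment)   # positive
print(analysis.confidence)  # 0.97
```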

stream() - Streaming Responses

Use stream() for real-time text generation:

async for chunk in ait.stream(
    template="Write a story about {{ topic }}",
    topic="time travel"
):
    print(chunk.content, end="", flush=True)

Use when:

  • Building interactive UIs
  • Showing progress to users
  • Reducing perceived latency
  • Processing long-form content incrementally
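The chunk-by-chunk pattern above is ordinary async iteration. A self-contained sketch with a stand-in generator (no network calls; `fake_stream` is hypothetical):

```python
import asyncio

async def fake_stream(text: str, size: int = 8):
    # Yield the text in small chunks, as a streaming API would.
    for i in range(0, len(text), size):
        yield text[i:i + size]
        await asyncio.sleep(0)  # yield control, as real I/O would

async def main() -> str:
    parts = []
    async for chunk in fake_stream("Once upon a time, a traveler..."):
        print(chunk, end="", flush=True)  # render incrementally
        parts.append(chunk)
    print()
    return "".join(parts)

full = asyncio.run(main())
```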

run_task() - Validated Execution

Use run_task() for high-reliability structured outputs with validation:

result = await ait.run_task(
    template="Extract key points from: {{ article }}",
    response_model=Summary,
    kwargs=dict(article=article_text),
    config=SingleShotValidationConfig(
        issues=["All main points are captured"]
    )
)

Use when:

  • Output correctness is critical
  • You need automatic retries
  • Quality assurance is important
  • Building production systems
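Conceptually, validated execution wraps a generation step in a check-and-retry loop. A minimal sketch of that pattern (illustrative only, not the toolkit's actual retry logic):

```python
def run_with_validation(generate, validate, max_attempts: int = 3):
    # Call generate() until validate() accepts the result or attempts run out.
    last_error = None
    for attempt in range(1, max_attempts + 1):
        result = generate()
        ok, reason = validate(result)
        if ok:
            return result
        last_error = f"attempt {attempt}: {reason}"
    raise ValueError(f"validation failed after {max_attempts} attempts ({last_error})")

# Toy example: a generator that fails once, then succeeds on retry.
calls = iter(["", "three key points"])
result = run_with_validation(
    generate=lambda: next(calls),
    validate=lambda r: (bool(r), "empty output"),
)
print(result)  # three key points
```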

See Running Tasks for details on validation.

Embeddings

Generate vector embeddings for semantic search or similarity:

vector = await ait.embed("Machine learning is fascinating")
# Returns a list[float] embedding vector

Use when:

  • Building semantic search
  • Doing similarity comparisons
  • Creating recommendation systems
  • Clustering or classification tasks
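Similarity between embedding vectors is typically measured with cosine similarity. A stdlib sketch over toy vectors (real embeddings have hundreds of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine of the angle between two vectors: dot product over norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

v1 = [0.1, 0.9, 0.2]  # toy 3-dimensional "embeddings"
v2 = [0.1, 0.8, 0.3]
print(round(cosine_similarity(v1, v2), 3))  # close to 1.0 for similar texts
```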

Comparison

Method       Output Type   Validation   Streaming   Use Case
chat()       Text          No           No          Simple text generation
asend()      Structured    No           No          Type-safe data extraction
stream()     Text          No           Yes         Interactive UIs
run_task()   Structured    Yes          No          Production workflows
embed()      Vector        N/A          No          Semantic operations

Alternative Models

When using asend(), you can distribute requests across alternative models for load balancing:

ait = PyAIToolkit(
    main_model_config=LLMConfig(model="gpt-4"),
    alternative_models_configs=[
        LLMConfig(model="gpt-4"),
        LLMConfig(model="claude-3-sonnet")
    ]
)

# Randomly selects from alternative_models_configs
response = await ait.asend(response_model=MyModel, ...)

Note that chat() and stream() always use the main model; only asend() randomly selects from the alternative models when they are provided.
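The selection behavior is equivalent to a random choice over the configured list, falling back to the main model when no alternatives exist. A sketch of that logic (illustrative only; `pick_model` is a hypothetical helper, not the toolkit's API):

```python
import random

def pick_model(main_model: str, alternatives: list[str]) -> str:
    # asend()-style selection: random pick among alternatives if any are
    # configured, otherwise the main model (as chat() and stream() always use).
    return random.choice(alternatives) if alternatives else main_model

print(pick_model("gpt-4", ["gpt-4", "claude-3-sonnet"]))  # one of the two
print(pick_model("gpt-4", []))  # gpt-4
```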