Modes of Interaction

The toolkit provides five primary methods for interacting with LLMs, each suited to different use cases.

chat() - Text Responses

Use chat() for simple text-based responses:

response = await ait.chat(
    template="Write a haiku about {{ topic }}",
    topic="mountains"
)

print(response.content)  # Plain text response
print(response.completion)  # Raw OpenAI completion object

Use when:

  • You need unstructured text output
  • The response format is flexible
  • You're doing creative or open-ended tasks

asend() - Structured Responses

Use asend() for type-safe, structured outputs:

from pydantic import BaseModel

class Analysis(BaseModel):
    sentiment: str
    confidence: float

response = await ait.asend(
    response_model=Analysis,
    template="Analyze sentiment: {{ text }}",
    text="This product is amazing!"
)

print(response.content.sentiment)    # Type-safe access
print(response.content.confidence)   # IDE autocomplete works

Use when:

  • You need predictable, structured data
  • Type safety matters
  • You're building APIs or data pipelines
  • You want IDE autocomplete and type checking
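Conceptually, structured output amounts to parsing the model's JSON reply into a typed object. A dependency-free sketch of that idea, using a stdlib dataclass in place of Pydantic:

```python
import json
from dataclasses import dataclass

@dataclass
class Analysis:
    sentiment: str
    confidence: float

# What an LLM might return as raw text for a structured request.
raw_reply = '{"sentiment": "positive", "confidence": 0.97}'

# Parse the JSON and build a typed object with attribute access.
analysis = Analysis(**json.loads(raw_reply))
print(analysis.sentiment)   # positive
print(analysis.confidence)  # 0.97
```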

stream() - Streaming Responses

Use stream() for real-time text generation:

async for chunk in ait.stream(
    template="Write a story about {{ topic }}",
    topic="time travel"
):
    print(chunk.content, end="", flush=True)

Use when:

  • Building interactive UIs
  • Showing progress to users
  • Reducing perceived latency
  • Processing long-form content incrementally
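The chunk-by-chunk pattern above is ordinary async iteration. A self-contained sketch with a stand-in generator (no network calls; `fake_stream` is hypothetical):

```python
import asyncio

async def fake_stream(text: str, size: int = 8):
    # Yield the text in small chunks, as a streaming API would.
    for i in range(0, len(text), size):
        yield text[i:i + size]
        await asyncio.sleep(0)  # yield control, as real I/O would

async def main() -> str:
    parts = []
    async for chunk in fake_stream("Once upon a time, a traveler..."):
        print(chunk, end="", flush=True)  # render incrementally
        parts.append(chunk)
    print()
    return "".join(parts)

full = asyncio.run(main())
```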

run_task() - Validated Execution

Use run_task() for high-reliability structured outputs with validation:

result = await ait.run_task(
    template="Extract key points from: {{ article }}",
    response_model=Summary,
    kwargs=dict(article=article_text),
    config=SingleShotValidationConfig(
        issues=["All main points are captured"]
    )
)

Use when:

  • Output correctness is critical
  • You need automatic retries
  • Quality assurance is important
  • Building production systems
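Conceptually, validated execution wraps a generation step in a check-and-retry loop. A minimal sketch of that pattern (illustrative only, not the toolkit's actual retry logic):

```python
def run_with_validation(generate, validate, max_attempts: int = 3):
    # Call generate() until validate() accepts the result or attempts run out.
    last_error = None
    for attempt in range(1, max_attempts + 1):
        result = generate()
        ok, reason = validate(result)
        if ok:
            return result
        last_error = f"attempt {attempt}: {reason}"
    raise ValueError(f"validation failed after {max_attempts} attempts ({last_error})")

# Toy example: a generator that fails once, then succeeds on retry.
calls = iter(["", "three key points"])
result = run_with_validation(
    generate=lambda: next(calls),
    validate=lambda r: (bool(r), "empty output"),
)
print(result)  # three key points
```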

See Running Tasks for details on validation.

Embeddings

Generate vector embeddings for semantic search or similarity:

vector = await ait.embed("Machine learning is fascinating")
# Returns a list[float] embedding vector

Use when:

  • Building semantic search
  • Doing similarity comparisons
  • Creating recommendation systems
  • Clustering or classification tasks
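Similarity between embedding vectors is typically measured with cosine similarity. A stdlib sketch over toy vectors (real embeddings have hundreds of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine of the angle between two vectors: dot product over norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

v1 = [0.1, 0.9, 0.2]  # toy 3-dimensional "embeddings"
v2 = [0.1, 0.8, 0.3]
print(round(cosine_similarity(v1, v2), 3))  # close to 1.0 for similar texts
```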

Comparison

Method       Output Type   Validation   Streaming   Use Case
chat()       Text          No           No          Simple text generation
asend()      Structured    No           No          Type-safe data extraction
stream()     Text          No           Yes         Interactive UIs
run_task()   Structured    Yes          No          Production workflows
embed()      Vector        N/A          No          Semantic operations

Alternative Models

When using asend(), you can distribute requests across alternative models for load balancing:

ait = PyAIToolkit(
    main_model_config=LLMConfig(model="gpt-4"),
    alternative_models_configs=[
        LLMConfig(model="gpt-4"),
        LLMConfig(model="claude-3-sonnet")
    ]
)

# Randomly selects from alternative_models_configs
response = await ait.asend(response_model=MyModel, ...)

Note that chat() and stream() always use the main model; only asend() randomly selects from the alternative models when they are provided.
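The selection behavior is equivalent to a random choice over the configured list, falling back to the main model when no alternatives exist. A sketch of that logic (illustrative only; `pick_model` is a hypothetical helper, not the toolkit's API):

```python
import random

def pick_model(main_model: str, alternatives: list[str]) -> str:
    # asend()-style selection: random pick among alternatives if any are
    # configured, otherwise the main model (as chat() and stream() always use).
    return random.choice(alternatives) if alternatives else main_model

print(pick_model("gpt-4", ["gpt-4", "claude-3-sonnet"]))  # one of the two
print(pick_model("gpt-4", []))  # gpt-4
```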