# API Integration Patterns
## Choosing Your Approach
| Approach | When to Use | Trade-offs |
|---|---|---|
| API (Claude, OpenAI) | Most applications, especially starting out | Easy, reliable, but vendor dependency |
| Self-hosted (vLLM, TGI) | Data privacy, high volume, customization | Full control, but operational overhead |
| Hybrid | Route by complexity/sensitivity | Best of both, but more complexity |
## API Best Practices

### Streaming
Always use streaming for user-facing applications; showing tokens as they arrive dramatically reduces perceived latency:
```python
import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-opus-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain RAG"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```
### Retry Logic
- Implement exponential backoff for rate limits (429) and server errors (500+)
- Set reasonable timeouts (30-120s depending on task complexity)
- Use circuit breakers for sustained failures
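The first two bullets can be sketched as a small retry helper. This is a minimal sketch, not any SDK's built-in mechanism: `TransientAPIError` and `call_with_backoff` are illustrative names, and the status-code handling assumes your client surfaces HTTP codes on its exceptions.

```python
import random
import time


class TransientAPIError(Exception):
    """Illustrative error type carrying an HTTP status code."""

    def __init__(self, status_code):
        super().__init__(f"API returned {status_code}")
        self.status_code = status_code


def call_with_backoff(call, max_retries=5, base_delay=1.0, max_delay=60.0):
    """Retry `call` on rate limits (429) and server errors (500+),
    using exponential backoff with jitter."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except TransientAPIError as err:
            retryable = err.status_code == 429 or err.status_code >= 500
            if not retryable or attempt == max_retries:
                raise
            # Exponential backoff: base, 2x, 4x, ... capped at max_delay,
            # with random jitter to avoid synchronized retry storms.
            delay = min(base_delay * 2 ** attempt, max_delay)
            time.sleep(delay * random.uniform(0.5, 1.0))
```

A circuit breaker would wrap this one level up: after N consecutive exhausted retries, stop calling the backend entirely for a cooldown period instead of backing off per request.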
### Batching
For non-real-time workloads, use batch APIs (available from Anthropic and OpenAI) for 50% cost savings.
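A batch submission is essentially a list of independent requests, each tagged with a client-chosen ID so results can be matched back to inputs. A minimal sketch of building one follows; the request shape loosely mirrors Anthropic's Message Batches API, but check the current provider docs before relying on these field names:

```python
def build_batch_requests(prompts, model="claude-opus-4-6", max_tokens=1024):
    """Turn a list of prompts into batch request entries, each with a
    custom_id used to match results back to the original inputs."""
    return [
        {
            "custom_id": f"request-{i}",
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        for i, prompt in enumerate(prompts)
    ]


# Submission then goes through the provider SDK, e.g. (not run here):
# batch = client.messages.batches.create(requests=build_batch_requests(prompts))
```

Results arrive asynchronously (typically within 24 hours), which is why this only fits non-real-time workloads.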
## Architecture Pattern: LLM Gateway
```text
Client → API Gateway → LLM Router → Claude API
           │                      → OpenAI API (fallback)
           │                      → Local Model (sensitive data)
           ├→ Cache Layer (Redis)
           ├→ Rate Limiter
           └→ Usage Tracker
```
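The router's core decision can be sketched as a pure function over request attributes. This is a minimal sketch with illustrative thresholds and backend labels (none of these names come from a real library), assuming a separate heuristic has already scored the request:

```python
from dataclasses import dataclass


@dataclass
class LLMRequest:
    prompt: str
    sensitive: bool = False   # e.g. contains PII or regulated data
    complexity: float = 0.5   # 0.0 (trivial) .. 1.0 (hard), from a heuristic


def route(request, primary_healthy=True):
    """Pick a backend: sensitive data stays local, the fallback absorbs
    primary outages, and cheap requests go to a cheap model."""
    if request.sensitive:
        return "local"        # sensitive data never leaves the network
    if not primary_healthy:
        return "openai"       # fallback when the primary circuit is open
    if request.complexity < 0.3:
        return "claude-haiku" # smaller, cheaper model for trivial requests
    return "claude-opus"
```

Keeping the routing decision a pure function makes it trivially testable and easy to evolve as pricing, models, or policies change.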
## 🌼 Daisy+ in Action: Layered API Architecture
Daisy+ uses a layered API architecture: Odoo's XML-RPC for internal operations, a FastAPI REST layer for external integrations (with Redis caching and model-level invalidation), and MCP for AI-native interactions. API keys protect all external endpoints, and the cache-aside pattern keeps responses fast without stale data. This three-layer approach means every type of client — human, traditional software, or AI agent — has an optimized integration path.
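The cache-aside pattern with model-level invalidation can be sketched generically; a dict stands in for Redis here, and the function and key names are illustrative, not Daisy+'s actual code:

```python
cache = {}  # stands in for Redis


def get_product(product_id, fetch_from_db):
    """Cache-aside read: try the cache first, fall back to the database,
    and populate the cache on a miss."""
    key = f"product:{product_id}"
    if key in cache:
        return cache[key]
    value = fetch_from_db(product_id)
    cache[key] = value
    return value


def invalidate_product(product_id):
    """Model-level invalidation: on write, drop the cached entry so the
    next read repopulates it from the source of truth."""
    cache.pop(f"product:{product_id}", None)
```

Invalidating on write rather than updating the cache in place is what keeps responses fast without serving stale data: the cache never holds a value the database didn't just produce.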