Notes & Resources
Keep track of your AI/LLM experiments, findings, and resources.
GPT-4o Test Results
Tested GPT-4o on complex reasoning tasks. Results show strong performance on multi-step problems, with response times of ~1-2 s for typical queries. It follows system prompts reliably. Also tested temperature settings from 0 to 1.
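For reference on what the 0-to-1 sweep is varying: temperature is the standard scaling factor applied to the model's output logits before sampling. This is a generic sketch of that mechanism, not any provider's exact sampler.

```python
# Generic illustration of sampling temperature: logits are divided
# by T before the softmax, so low T sharpens the distribution
# (T -> 0 approaches greedy argmax) and high T flattens it.
import math

def softmax_with_temperature(logits, temperature):
    if temperature == 0:
        # Greedy limit: all probability mass on the largest logit.
        probs = [0.0] * len(logits)
        probs[logits.index(max(logits))] = 1.0
        return probs
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, with logits [2.0, 1.0, 0.5], T=0.2 concentrates nearly all probability on the first option, while T=1.0 leaves meaningful probability on the others.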
RAG Pipeline Notes
Key findings from RAG testing:
- Chunk size matters: 256-512 tokens works well.
- top-k=3 gives good context without too much noise.
- A cosine similarity threshold of 0.7+ gives relevant results.
- Long documents need sliding-window chunking.
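The findings above can be sketched end-to-end. This is a minimal illustration of sliding-window chunking plus top-k retrieval with a similarity cutoff; the bag-of-words "embedding" is a toy stand-in for a real embedding model, and the chunk sizes here are in words rather than tokens.

```python
# Sliding-window chunking + top-k cosine-similarity retrieval.
# The embed() function is a toy bag-of-words vectorizer standing in
# for a real embedding model.
import math
from collections import Counter

def sliding_window_chunks(words, size=256, overlap=64):
    """Split a word list into overlapping chunks so text near a
    chunk boundary still appears intact in at least one chunk."""
    step = size - overlap
    return [words[i:i + size]
            for i in range(0, max(len(words) - overlap, 1), step)]

def embed(words):
    return Counter(words)  # toy stand-in for a real embedding model

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=3, threshold=0.7):
    q = embed(query.split())
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda s: s[0], reverse=True)
    # Keep the top-k hits, dropping anything below the threshold.
    return [(s, c) for s, c in scored[:k] if s >= threshold]
```

With a real embedding model swapped in for embed(), the same top-k/threshold filtering applies unchanged.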
Model Comparison: Claude vs GPT
Claude 3.5 Sonnet vs GPT-4o comparison:
- Claude tends to be more verbose in explanations.
- GPT-4o is better at code generation tasks.
- Both perform similarly on summarization.
- Claude has a larger context window (200k vs 128k tokens).
- Pricing: similar cost per token.