Usage
This section provides detailed examples of using chat2llms to compare responses from large language models (LLMs), covering basic library usage and the command-line interface (CLI).
Basic Comparison
Compare responses from two LLM clients (Gemini and DeepSeek) for a simple prompt:
from chat2llms.analyzer import AnswerAnalyzer
from chat2llms.model_response import OpenAIResponse, GeminiResponse
from chat2llms.base_client import BaseClient
# Initialize clients
gemini = BaseClient("gemini")
deepseek = BaseClient("deepseek")
# Create responses
question = "What is 2 + 2?"
gemini_response = GeminiResponse(gemini)
deepseek_response = OpenAIResponse(deepseek)
# Analyze differences
analyzer = AnswerAnalyzer(gemini_response, deepseek_response, question)
print(f"Text Similarity: {analyzer.compute_similarity():.2f}")
print(f"Semantic Similarity: {analyzer.compute_semantic_similarity():.2f}")
print(analyzer.highlight_differences())
Output:
Text Similarity: 0.09
Semantic Similarity: 0.77
Response 1 (gemini-1.5-pro):
2 + 2 = 4
Response 2 (deepseek-reasoner):
The sum of 2 and 2 is calculated as follows:
**Step 1:** Start with the number 2.
**Step 2:** Add 2 to it.
**Step 3:** Combining the quantities results in 4.
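The low text similarity alongside the high semantic similarity reflects that the two answers phrase the same fact very differently. As a rough illustration of character-level text similarity (this uses the standard library's difflib, not necessarily what chat2llms uses internally):

```python
from difflib import SequenceMatcher

# The two responses from the example above.
resp1 = "2 + 2 = 4"
resp2 = (
    "The sum of 2 and 2 is calculated as follows:\n"
    "**Step 1:** Start with the number 2.\n"
    "**Step 2:** Add 2 to it.\n"
    "**Step 3:** Combining the quantities results in 4."
)

# SequenceMatcher.ratio() returns 2*M / T, where M is the number of
# matching characters and T is the combined length of both strings,
# so a short answer against a long one scores low even when the
# meaning is identical.
ratio = SequenceMatcher(None, resp1, resp2).ratio()
print(f"Text Similarity: {ratio:.2f}")  # low, despite identical meaning
```

Semantic similarity, by contrast, is computed over meaning rather than characters, which is why it stays high here.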
Command-Line Interface
chat2llms also provides a command-line interface (CLI) for quick comparisons.
After installing the package (see Installation), run:
chat2llms --model1 openai --model2 gemini --prompt "Solve 2 + 2"
Output:
=== Prompt ===
Solve 2 + 2
=== OPENAI Response ===
2 + 2 = 4
=== GEMINI Response ===
2 + 2 = 4
=== Text Similarity ===
0.9473684210526315
=== Semantic Similarity ===
0.9281893463830239
=== Highlight of Differences ===
Response 1 (gpt-3.5-turbo):
2 + 2 = 4
Response 2 (gemini-1.5-pro):
2 + 2 = 4
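For scripted comparisons, the CLI can also be driven from Python. A minimal sketch, using only the flags shown above (the `compare` helper is hypothetical, not part of chat2llms):

```python
import shutil
import subprocess

def compare(prompt, model1="openai", model2="gemini"):
    """Run the chat2llms CLI and return its stdout, or None if it is not installed."""
    # Skip gracefully when the CLI is not on PATH.
    if shutil.which("chat2llms") is None:
        return None
    result = subprocess.run(
        ["chat2llms", "--model1", model1, "--model2", model2, "--prompt", prompt],
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits with a non-zero status
    )
    return result.stdout

output = compare("Solve 2 + 2")
if output is not None:
    print(output)
```

This is handy for batching many prompts or logging comparisons from a larger script.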