# Base Models (≤3B Parameters)

Our base models are optimized for efficiency and speed, making them well suited to rapid prototyping and development. At up to 3B parameters, these models deliver fast responses while maintaining good output quality.
## Available Models
### TinyLlama 1.1B Chat

- Parameters: 1.1B
- Context Window: 2048 tokens
- Provider: TinyLlama
- License: Apache 2.0
- Key Features:
  - Fast inference
  - Efficient architecture
  - Optimized for chat
  - Low resource requirements
### Phi-2

- Parameters: 2.7B
- Context Window: 2048 tokens
- Provider: Microsoft
- License: MIT
- Key Features:
  - Strong code generation
  - Technical understanding
  - Educational content
  - Research applications
### Gemma-2B

- Parameters: 2B
- Context Window: 8192 tokens
- Provider: Google
- License: Apache 2.0
- Key Features:
  - Built from the same research as Google's Gemini models
  - Balanced performance
  - Instruction following
  - General capabilities
## Configuration & Setup

### OpenAI SDK Setup
```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your-neuredge-key',
  baseURL: 'https://api.neuredge.dev/v1/'
});
```
### Native SDK Setup

```javascript
import { Neuredge } from '@neuredge/sdk';

const neuredge = new Neuredge({
  apiKey: 'your-api-key'
});
```
## Real-World Applications & Examples

### 1. Rapid Prototyping & Development

#### Chat Bot Prototype
```javascript
const response = await openai.chat.completions.create({
  model: '@cf/tinyllama/tinyllama-1.1b-chat-v1.0',
  messages: [
    {
      role: 'system',
      content: 'You are a friendly AI assistant helping customers with product inquiries.'
    },
    {
      role: 'user',
      content: 'Do you have this shirt in blue color?'
    }
  ],
  temperature: 0.7,
  max_tokens: 150
});
```
Use Cases:
- Quick MVP development
- Testing dialogue flows
- UI/UX prototypes
- Concept validation
- Fast iteration cycles
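When prototyping quickly, it helps to degrade gracefully if a model call fails. The sketch below is a hypothetical helper (not part of any SDK): `createFn` stands in for `openai.chat.completions.create`, and `models` is an ordered list of model IDs to try.

```javascript
// Hypothetical helper: try each model in order, returning the first success.
// createFn stands in for openai.chat.completions.create (an assumption).
async function completeWithFallback(createFn, models, params) {
  let lastError;
  for (const model of models) {
    try {
      return await createFn({ ...params, model });
    } catch (err) {
      lastError = err; // remember the failure and try the next model
    }
  }
  throw lastError;
}
```

For example, you could try TinyLlama first for speed and fall back to Phi-2 if the request fails.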
### 2. Educational Applications

#### Learning Assistant
```javascript
const response = await openai.chat.completions.create({
  model: '@cf/microsoft/phi-2',
  messages: [
    {
      role: 'system',
      content: 'You are an educational assistant helping students understand concepts.'
    },
    {
      role: 'user',
      content: 'Explain how photosynthesis works in simple terms.'
    }
  ],
  temperature: 0.5,
  max_tokens: 200
});
```
Use Cases:
- Concept explanations
- Study guides
- Quiz generation
- Interactive learning
- Homework assistance
### 3. Code Generation & Review

#### Code Assistant
```javascript
const response = await openai.chat.completions.create({
  model: '@cf/microsoft/phi-2',
  messages: [
    {
      role: 'system',
      content: 'You are a coding assistant. Focus on clean, well-documented code.'
    },
    {
      role: 'user',
      content: `Write a JavaScript function that:
1. Takes an array of numbers
2. Removes duplicates
3. Sorts in descending order
4. Returns the top 3 values`
    }
  ],
  temperature: 0.3,
  max_tokens: 300
});
```
Use Cases:
- Quick code snippets
- Code reviews
- Documentation
- Testing scripts
- Learning exercises
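Model replies to code requests often wrap the code in a Markdown fence alongside explanatory prose. A minimal sketch (a hypothetical helper, not part of any SDK) for pulling out just the code:

```javascript
// Hypothetical helper: extract the first fenced code block from a model reply,
// falling back to the whole reply if no fence is found.
function extractCodeBlock(reply) {
  const match = reply.match(/```(?:\w+)?\n([\s\S]*?)```/);
  return match ? match[1].trim() : reply.trim();
}
```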
### 4. Content Summarization

#### Quick Summary Generator
```javascript
// articleText holds the source text to summarize
const response = await openai.chat.completions.create({
  model: '@cf/google/gemma-2b-it',
  messages: [
    {
      role: 'system',
      content: 'Create concise summaries while retaining key information.'
    },
    {
      role: 'user',
      content: `Summarize this article in 2-3 sentences:
${articleText}`
    }
  ],
  temperature: 0.4,
  max_tokens: 100
});
```
Use Cases:
- News summaries
- Document preview
- Content briefs
- Meeting notes
- Quick updates
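The 2048-token context windows above limit how much article text fits in one request. A common workaround is to split long input into chunks and summarize each chunk. The helper below is a rough sketch: the 4-characters-per-token ratio is only a heuristic assumption (use a real tokenizer for accuracy), and `chunkText` is not part of any SDK.

```javascript
// Hypothetical helper: split text into pieces that should fit a small context
// window. Assumes ~4 characters per token, which is only a rough heuristic.
function chunkText(text, maxTokens = 1500, charsPerToken = 4) {
  const maxChars = maxTokens * charsPerToken;
  const chunks = [];
  for (let i = 0; i < text.length; i += maxChars) {
    chunks.push(text.slice(i, i + maxChars));
  }
  return chunks;
}
```

You could then summarize each chunk separately and, if needed, summarize the concatenated summaries in a final pass.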
## Integration Examples

### Express.js Chat API
```javascript
import express from 'express';
import OpenAI from 'openai';

const app = express();
app.use(express.json());

const openai = new OpenAI({
  apiKey: process.env.NEUREDGE_API_KEY,
  baseURL: 'https://api.neuredge.dev/v1/'
});

app.post('/chat', async (req, res) => {
  try {
    const { message } = req.body;
    const response = await openai.chat.completions.create({
      model: '@cf/tinyllama/tinyllama-1.1b-chat-v1.0',
      messages: [{ role: 'user', content: message }]
    });
    res.json({ reply: response.choices[0].message.content });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

app.listen(3000);
```
### React Component Generator

```javascript
import OpenAI from 'openai';

// Note: NEXT_PUBLIC_* variables are bundled into client-side code; to keep
// your API key private, call this from a server-side route instead.
const openai = new OpenAI({
  apiKey: process.env.NEXT_PUBLIC_NEUREDGE_KEY,
  baseURL: 'https://api.neuredge.dev/v1/'
});

export async function generateComponent(spec) {
  const response = await openai.chat.completions.create({
    model: '@cf/microsoft/phi-2',
    messages: [
      {
        role: 'system',
        content: 'Generate React components using modern best practices.'
      },
      {
        role: 'user',
        content: `Create a React component for: ${spec}`
      }
    ],
    temperature: 0.3
  });
  return response.choices[0].message.content;
}
```
## Best Practices

- **Performance Optimization**
  - Keep prompts concise
  - Use appropriate max_tokens
  - Implement caching
  - Batch similar requests
- **Quality Control**
  - Validate outputs
  - Handle edge cases
  - Provide fallbacks
  - Monitor response quality
- **Resource Management**
  - Track token usage
  - Implement rate limiting
  - Use efficient batching
  - Cache common responses
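The caching advice above can be sketched as a small in-memory wrapper. This is an illustration under assumptions: `fetcher` stands in for your actual completion call, and a plain `Map` keyed by prompt with a time-to-live is the simplest possible policy (production code would also bound the cache size).

```javascript
// Minimal in-memory response cache with a time-to-live (TTL).
// fetcher stands in for a real completion call (an assumption).
const cache = new Map();

async function cachedCompletion(prompt, fetcher, ttlMs = 60_000) {
  const hit = cache.get(prompt);
  if (hit && Date.now() < hit.expires) return hit.value; // fresh cache hit
  const value = await fetcher(prompt);
  cache.set(prompt, { value, expires: Date.now() + ttlMs });
  return value;
}
```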
## Token Management

| Plan | Monthly Token Quota |
|---|---|
| Free Tier | 300K tokens |
| $29 Plan | 3M tokens |
| $49 Plan | 4.5M tokens |
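To stay within your plan's quota, you can tally the `usage.total_tokens` field that chat completion responses report. The tracker below is a hypothetical sketch; the quota numbers come from the table above.

```javascript
// Hypothetical helper: running tally of token usage against a monthly quota.
function makeUsageTracker(monthlyQuota) {
  let used = 0;
  return {
    // record the usage object from a completion response
    record(usage) { used += usage?.total_tokens ?? 0; },
    used() { return used; },
    remaining() { return Math.max(0, monthlyQuota - used); },
  };
}
```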
## When to Use

**✅ Ideal For:**

- Rapid prototyping
- Development testing
- Simple interactions
- Quick responses
- Learning projects

**❌ Consider Alternatives When:**

- Complex reasoning is needed
- High accuracy is required
- Long context is needed
- Deploying to production
## Getting Started

To begin using our base models: