Base Models (≤3B Parameters)

Our base models are optimized for speed and efficiency, making them well suited to rapid prototyping and development. At 3B parameters or fewer, they deliver fast responses while maintaining solid output quality.

Available Models

TinyLlama 1.1B Chat

  • Parameters: 1.1B
  • Context Window: 2048 tokens
  • Provider: TinyLlama
  • License: Apache 2.0
  • Key Features:
    • Fast inference
    • Efficient architecture
    • Optimized for chat
    • Low resource requirements

Phi-2

  • Parameters: 2.7B
  • Context Window: 2048 tokens
  • Provider: Microsoft
  • License: MIT License
  • Key Features:
    • Strong code generation
    • Technical understanding
    • Educational content
    • Research applications

Gemma-2B

  • Parameters: 2B
  • Context Window: 8192 tokens
  • Provider: Google
  • License: Apache 2.0
  • Key Features:
    • Built on Google's Gemini research and technology
    • Balanced performance
    • Instruction following
    • General capabilities

Configuration & Setup

OpenAI SDK Setup

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your-neuredge-key',
  baseURL: 'https://api.neuredge.dev/v1/'
});

Native SDK Setup

import { Neuredge } from '@neuredge/sdk';

const neuredge = new Neuredge({
  apiKey: 'your-api-key'
});

Real-World Applications & Examples

1. Rapid Prototyping & Development

Chat Bot Prototype

const response = await openai.chat.completions.create({
  model: '@cf/tinyllama/tinyllama-1.1b-chat-v1.0',
  messages: [
    {
      role: 'system',
      content: 'You are a friendly AI assistant helping customers with product inquiries.'
    },
    {
      role: 'user',
      content: 'Do you have this shirt in blue color?'
    }
  ],
  temperature: 0.7,
  max_tokens: 150
});

Use Cases:

  • Quick MVP development
  • Testing dialogue flows
  • UI/UX prototypes
  • Concept validation
  • Fast iteration cycles

2. Educational Applications

Learning Assistant

const response = await openai.chat.completions.create({
  model: '@cf/microsoft/phi-2',
  messages: [
    {
      role: 'system',
      content: 'You are an educational assistant helping students understand concepts.'
    },
    {
      role: 'user',
      content: 'Explain how photosynthesis works in simple terms.'
    }
  ],
  temperature: 0.5,
  max_tokens: 200
});

Use Cases:

  • Concept explanations
  • Study guides
  • Quiz generation
  • Interactive learning
  • Homework assistance

3. Code Generation & Review

Code Assistant

const response = await openai.chat.completions.create({
  model: '@cf/microsoft/phi-2',
  messages: [
    {
      role: 'system',
      content: 'You are a coding assistant. Focus on clean, well-documented code.'
    },
    {
      role: 'user',
      content: `Write a JavaScript function that:
1. Takes an array of numbers
2. Removes duplicates
3. Sorts in descending order
4. Returns the top 3 values`
    }
  ],
  temperature: 0.3,
  max_tokens: 300
});

Use Cases:

  • Quick code snippets
  • Code reviews
  • Documentation
  • Testing scripts
  • Learning exercises

4. Content Summarization

Quick Summary Generator

// articleText holds the text to summarize (defined elsewhere in your application)
const response = await openai.chat.completions.create({
  model: '@cf/google/gemma-2b-it',
  messages: [
    {
      role: 'system',
      content: 'Create concise summaries while retaining key information.'
    },
    {
      role: 'user',
      content: `Summarize this article in 2-3 sentences:
${articleText}`
    }
  ],
  temperature: 0.4,
  max_tokens: 100
});

Use Cases:

  • News summaries
  • Document preview
  • Content briefs
  • Meeting notes
  • Quick updates

Integration Examples

Express.js Chat API

import express from 'express';
import OpenAI from 'openai';

const app = express();
app.use(express.json());

const openai = new OpenAI({
  apiKey: process.env.NEUREDGE_API_KEY,
  baseURL: 'https://api.neuredge.dev/v1/'
});

app.post('/chat', async (req, res) => {
  try {
    const { message } = req.body;
    const response = await openai.chat.completions.create({
      model: '@cf/tinyllama/tinyllama-1.1b-chat-v1.0',
      messages: [
        {
          role: 'user',
          content: message
        }
      ]
    });

    res.json({ reply: response.choices[0].message.content });
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

app.listen(3000); // start the server (port is illustrative)
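
For reference, here is how a client might call the /chat endpoint above. This is a minimal sketch; it assumes the Express app is listening on port 3000 (the port in the example is illustrative) and uses the same request shape the route expects.

// Example client call to the /chat endpoint above.
const res = await fetch('http://localhost:3000/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ message: 'Do you have this shirt in blue color?' })
});
const { reply } = await res.json();
console.log(reply);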

React Component Generator

import OpenAI from 'openai';

const openai = new OpenAI({
  // Note: NEXT_PUBLIC_ variables are exposed to the browser in Next.js;
  // for production, keep the key on the server (e.g. in an API route) instead.
  apiKey: process.env.NEXT_PUBLIC_NEUREDGE_KEY,
  baseURL: 'https://api.neuredge.dev/v1/'
});

export async function generateComponent(spec) {
  const response = await openai.chat.completions.create({
    model: '@cf/microsoft/phi-2',
    messages: [
      {
        role: 'system',
        content: 'Generate React components using modern best practices.'
      },
      {
        role: 'user',
        content: `Create a React component for: ${spec}`
      }
    ],
    temperature: 0.3
  });

  return response.choices[0].message.content;
}

Best Practices

  1. Performance Optimization

    • Keep prompts concise
    • Use appropriate max_tokens
    • Implement caching (see the sketch after this list)
    • Batch similar requests
  2. Quality Control

    • Validate outputs
    • Handle edge cases
    • Provide fallbacks
    • Monitor response quality
  3. Resource Management

    • Track token usage
    • Implement rate limiting
    • Use efficient batching
    • Cache common responses
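
As a concrete illustration of the caching and fallback points above, here is a minimal sketch of a request wrapper. It assumes the openai client from the setup section; the in-memory cache, the empty-output check, and the fallback message are illustrative placeholders to adapt to your application.

// Minimal sketch: cache identical prompts and fall back when output fails a basic check.
// Assumes the `openai` client configured earlier; cache and validation are illustrative.
const cache = new Map();

async function cachedChat(prompt, model = '@cf/tinyllama/tinyllama-1.1b-chat-v1.0') {
  const key = `${model}:${prompt}`;
  if (cache.has(key)) return cache.get(key); // reuse previous answers for identical prompts

  const response = await openai.chat.completions.create({
    model,
    messages: [{ role: 'user', content: prompt }],
    max_tokens: 150 // keep responses (and token usage) bounded
  });

  const reply = response.choices[0]?.message?.content ?? '';
  const result = reply.trim().length > 0
    ? reply
    : 'Sorry, I could not generate a response.'; // simple fallback for empty output

  cache.set(key, result);
  return result;
}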

Token Management

Plan         Monthly Token Quota
Free Tier    300K tokens
$29 Plan     3M tokens
$49 Plan     4.5M tokens
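
To stay within a plan's quota, you can tally the usage reported with each completion. The sketch below assumes responses include an OpenAI-style usage object (prompt_tokens, completion_tokens, total_tokens); the quota constant is simply the Free Tier figure from the table above.

// Minimal sketch: track cumulative token usage against a monthly quota.
// Assumes an OpenAI-style `usage` object on the response; adjust the quota to your plan.
const MONTHLY_QUOTA = 300_000; // Free Tier
let tokensUsedThisMonth = 0;

async function trackedChat(messages) {
  const response = await openai.chat.completions.create({
    model: '@cf/google/gemma-2b-it',
    messages
  });

  tokensUsedThisMonth += response.usage?.total_tokens ?? 0;
  if (tokensUsedThisMonth > MONTHLY_QUOTA * 0.9) {
    console.warn(`Approaching monthly token quota: ${tokensUsedThisMonth} tokens used`);
  }

  return response.choices[0].message.content;
}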

When to Use

Ideal For:

  • Rapid prototyping
  • Development testing
  • Simple interactions
  • Quick responses
  • Learning projects

Consider Alternatives When:

  • Complex reasoning needed
  • High accuracy required
  • Long context needed
  • Production deployment

Getting Started

To begin using our base models:

  1. Get a Neuredge API key.
  2. Configure the OpenAI-compatible client or the native @neuredge/sdk as shown in Configuration & Setup.
  3. Choose a model ID from the list above (for example, @cf/tinyllama/tinyllama-1.1b-chat-v1.0) and start sending requests.