Overview

Our text generation API provides state-of-the-art language models for a wide range of applications. With OpenAI-compatible endpoints and multiple model sizes, you can choose the right model for your specific needs.

OpenAI Compatibility

Our text generation models are fully compatible with OpenAI's Chat Completions API, so migrating from OpenAI's GPT models is as simple as updating the base URL and model name:

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your-neuredge-key',
  baseURL: 'https://api.neuredge.dev/v1/'
});

const completion = await openai.chat.completions.create({
  model: '@cf/meta/llama-3.1-70b-instruct', // Use any Neuredge model
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'What is quantum computing?' }
  ]
});

Available Models

| Category | Parameters | Models | Use Case | Performance |
|---|---|---|---|---|
| Base Models | ≤3B | TinyLlama, Phi-2, Gemma-2B | Rapid prototyping, development | Good |
| Small Models | 3.1B-8B | Llama-3.1-8B, Mistral-7B, DeepSeek Coder | Production applications | Better |
| Medium Models | 8.1B-20B | Qwen-14B, DeepSeek Math | Specialized tasks | Great |
| XLarge Models | 40B+ | Llama-3.1-70B | Maximum performance | Best |

Model Selection Guide

  • Base Models (≤3B)

    • Ideal for: Development, testing, prototyping
    • Best for: Quick iterations, simple tasks
    • Trade-offs: Lower accuracy for faster speed
  • Small Models (3.1B-8B)

    • Ideal for: Production applications
    • Best for: Most business use cases
    • Trade-offs: Balanced performance and speed
  • Medium Models (8.1B-20B)

    • Ideal for: Specialized tasks
    • Best for: Domain-specific applications
    • Trade-offs: Better quality with moderate speed
  • XLarge Models (40B+)

    • Ideal for: Maximum performance needs
    • Best for: Complex reasoning, research
    • Trade-offs: Highest quality but slower
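The guide above can be folded into a small helper that picks a model tier from a rough task profile. This is an illustrative sketch, not part of any SDK: the `pickModel` function and the `MODEL_TIERS` map are hypothetical, and only the two model IDs already shown on this page are real — the base and medium entries are placeholders to fill in with the IDs you use.

```javascript
// Hypothetical tier map. Replace the placeholder entries with the
// base/medium model IDs you actually use; the small/xlarge IDs are
// the ones shown elsewhere on this page.
const MODEL_TIERS = {
  base: '<your-base-model-id>',     // e.g. a ≤3B model such as Gemma-2B
  small: '@cf/meta/llama-3.1-8b-instruct',
  medium: '<your-medium-model-id>', // e.g. a 14B model
  xlarge: '@cf/meta/llama-3.1-70b-instruct'
};

// Pick a tier from a rough task profile, following the guide above:
// complex reasoning -> xlarge, domain-specific -> medium,
// production -> small, everything else -> base.
function pickModel({ stage = 'development', task = 'general' } = {}) {
  if (task === 'reasoning' || task === 'research') return MODEL_TIERS.xlarge;
  if (task === 'math' || task === 'domain') return MODEL_TIERS.medium;
  if (stage === 'production') return MODEL_TIERS.small;
  return MODEL_TIERS.base;
}
```

Centralizing the choice in one function makes it easy to change tiers later without hunting for hard-coded model IDs.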

Core Capabilities

1. Chat Completions

const response = await openai.chat.completions.create({
  model: '@cf/meta/llama-3.1-8b-instruct',
  messages: [
    {
      role: 'system',
      content: 'You are a knowledgeable assistant specializing in technology.'
    },
    {
      role: 'user',
      content: 'Explain how blockchain works in simple terms.'
    }
  ],
  temperature: 0.7,
  max_tokens: 500
});

2. Streaming Responses

const stream = await openai.chat.completions.create({
  model: '@cf/meta/llama-3.1-8b-instruct',
  messages: [{
    role: 'user',
    content: 'Write a story about space exploration'
  }],
  stream: true
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

3. Function Calling (Available on select models)

const response = await openai.chat.completions.create({
  model: '@cf/meta/llama-3.1-70b-instruct',
  messages: [{
    role: 'user',
    content: 'What\'s the weather like in London?'
  }],
  functions: [{
    name: 'get_weather',
    description: 'Get current weather for a location',
    parameters: {
      type: 'object',
      properties: {
        location: {
          type: 'string',
          description: 'City name'
        }
      },
      required: ['location']
    }
  }]
});
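When the model decides to call a function, the assistant message carries a `function_call` with the function name and JSON-encoded arguments; your code runs the function locally and sends the result back as a `function`-role message. A minimal dispatch sketch — the local `get_weather` implementation and the `handleFunctionCall` helper are hypothetical stand-ins, not part of the API:

```javascript
// Local implementations keyed by the function names declared in the request.
const functionImpls = {
  // Hypothetical stand-in; a real version would call a weather service.
  get_weather: ({ location }) => ({ location, tempC: 17, conditions: 'cloudy' })
};

// Given the assistant message from response.choices[0].message, run the
// requested function and build the follow-up message to send back.
function handleFunctionCall(message) {
  const call = message.function_call;
  if (!call) return null; // ordinary text reply, nothing to dispatch
  const args = JSON.parse(call.arguments || '{}');
  const result = functionImpls[call.name](args);
  return { role: 'function', name: call.name, content: JSON.stringify(result) };
}
```

Append the returned message to the conversation and call `chat.completions.create` again so the model can phrase the final answer from the function result.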

Integration Examples

Express.js Chat API

import express from 'express';
import OpenAI from 'openai';

const app = express();
app.use(express.json());

const openai = new OpenAI({
  apiKey: process.env.NEUREDGE_API_KEY,
  baseURL: 'https://api.neuredge.dev/v1/'
});

app.post('/chat', async (req, res) => {
  try {
    const { messages, stream = false } = req.body;

    if (stream) {
      const completionStream = await openai.chat.completions.create({
        model: '@cf/meta/llama-3.1-8b-instruct',
        messages,
        stream: true
      });

      res.setHeader('Content-Type', 'text/event-stream');
      for await (const chunk of completionStream) {
        res.write(`data: ${JSON.stringify(chunk)}\n\n`);
      }
      res.end();
    } else {
      const completion = await openai.chat.completions.create({
        model: '@cf/meta/llama-3.1-8b-instruct',
        messages
      });

      res.json(completion);
    }
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});
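On the client side, the `text/event-stream` body produced by the endpoint above arrives as `data: {json}` frames separated by blank lines. A small parser (illustrative, not part of any SDK — `extractDeltas` is a hypothetical helper) pulls the delta text out of a buffered chunk of that stream:

```javascript
// Parse a buffer of SSE frames into the concatenated delta text.
// Each frame looks like:
//   data: {"choices":[{"delta":{"content":"hi"}}]}
function extractDeltas(sseText) {
  return sseText
    .split('\n\n')                                    // frames are blank-line separated
    .filter((frame) => frame.startsWith('data: '))    // ignore comments/empty frames
    .map((frame) => JSON.parse(frame.slice(6)))       // strip the "data: " prefix
    .map((chunk) => chunk.choices?.[0]?.delta?.content ?? '')
    .join('');
}
```

In a real client you would feed this from `fetch` via a `ReadableStream` reader, taking care to buffer partial frames that span chunk boundaries.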

Next.js API Route with Streaming

// pages/api/chat.js — streaming a Response requires the Edge runtime
import { OpenAIStream, StreamingTextResponse } from 'ai';
import OpenAI from 'openai';

export const config = { runtime: 'edge' };

const openai = new OpenAI({
  apiKey: process.env.NEUREDGE_API_KEY,
  baseURL: 'https://api.neuredge.dev/v1/'
});

export default async function handler(req) {
  const { messages } = await req.json();

  const response = await openai.chat.completions.create({
    model: '@cf/meta/llama-3.1-8b-instruct',
    messages,
    stream: true
  });

  const stream = OpenAIStream(response);
  return new StreamingTextResponse(stream);
}

Best Practices

  1. Model Selection

    • Choose based on task complexity
    • Consider latency requirements
    • Balance cost and performance
    • Test different sizes
  2. Prompt Engineering

    • Be specific and clear
    • Use system messages effectively
    • Include examples when needed
    • Consider temperature setting
  3. Performance Optimization

    • Implement streaming for long responses
    • Cache common responses
    • Use appropriate max_tokens
    • Monitor token usage
  4. Error Handling

    • Implement retry logic
    • Handle rate limits
    • Provide fallbacks
    • Monitor responses
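The retry-logic practice can be sketched as a small wrapper with exponential backoff. This is a minimal sketch under arbitrary defaults — `withRetry` is a hypothetical helper, and a production version would also inspect status codes so that only transient errors (e.g. rate limits) are retried:

```javascript
// Retry an async call with exponential backoff: 500ms, 1s, 2s, ...
async function withRetry(fn, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      const delay = baseDelayMs * 2 ** i; // double the wait each attempt
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError; // all attempts failed
}

// Usage: wrap the completion call.
// const completion = await withRetry(() =>
//   openai.chat.completions.create({ model, messages })
// );
```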

Token Management

| Plan | Monthly Token Quota (Base Models) |
|---|---|
| Free Tier | 300K tokens |
| $29 Plan | 3M tokens |
| $49 Plan | 4.5M tokens |

Note: Token quotas vary by model size. See individual model pages for details.
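As a back-of-envelope check, a token quota translates into a monthly request budget once you know your average tokens per request (prompt plus completion). The `requestBudget` helper below is illustrative arithmetic, not an official calculator:

```javascript
// Rough monthly request budget from a token quota.
function requestBudget(quotaTokens, avgTokensPerRequest) {
  return Math.floor(quotaTokens / avgTokensPerRequest);
}

// e.g. the free tier's 300K tokens at ~600 tokens per request
// (prompt + completion) allows roughly 500 requests per month.
```

Tracking the `usage` field returned on each completion lets you refine the average over time instead of guessing.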

When to Use Each Model Size

Base Models (≤3B)

Ideal For:

  • Development and testing
  • Quick prototypes
  • Simple interactions
  • Learning and exploration

Small Models (3.1B-8B)

Ideal For:

  • Production applications
  • Customer support
  • Content generation
  • General business use

Medium Models (8.1B-20B)

Ideal For:

  • Specialized tasks
  • Technical content
  • Mathematical computations
  • Domain expertise

XLarge Models (40B+)

Ideal For:

  • Complex reasoning
  • Research applications
  • Professional content
  • Maximum accuracy needs

Getting Started

  1. Install the SDK:

npm install @neuredge/sdk
# or
pip install neuredge-sdk

  2. Initialize the client:

// Using OpenAI SDK
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your-neuredge-key',
  baseURL: 'https://api.neuredge.dev/v1/'
});

// Or using native SDK
import { Neuredge } from '@neuredge/sdk';

const neuredge = new Neuredge({
  apiKey: 'your-api-key'
});

Learn more: