## Overview
Text embeddings convert text into dense numerical vectors that capture semantic meaning. These vectors enable powerful applications like semantic search, content recommendations, and document analysis.
## OpenAI Compatibility

Our embedding models are compatible with OpenAI's embeddings API, so you can use them as a drop-in replacement for models such as `text-embedding-ada-002`. Simply update the base URL and model name:
```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your-neuredge-key',
  baseURL: 'https://api.neuredge.dev/v1/'
});

const embedding = await openai.embeddings.create({
  model: '@cf/baai/bge-base-en-v1.5', // Use any Neuredge embedding model
  input: 'Your text here'
});
```
## Available Models

| Model | Dimensions | Use Case | Performance | Speed |
|---|---|---|---|---|
| BGE Small | 384 | Fast retrieval, mobile/edge | Good | Fastest |
| BGE Base | 768 | General purpose, production | Better | Fast |
| BGE Large | 1024 | Research, legal, medical | Best | Moderate |
### Model Selection Guide

- **BGE Small (384d)**
  - Ideal for: mobile apps, edge devices, high-throughput systems
  - Best for: basic similarity search, quick prototypes
  - Trade-offs: lower semantic resolution in exchange for speed
- **BGE Base (768d)**
  - Ideal for: production systems, general applications
  - Best for: most business use cases
  - Trade-offs: balanced quality and speed
- **BGE Large (1024d)**
  - Ideal for: research, legal analysis, medical applications
  - Best for: maximum semantic accuracy
  - Trade-offs: higher resource usage for better quality
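For use in code, the guide above can be captured in a small lookup keyed by model ID. Only `@cf/baai/bge-base-en-v1.5` appears elsewhere in this guide; the small and large IDs below are assumed to follow the same naming pattern, so verify them against the model catalog:

```javascript
// Assumed model IDs; only the base model ID is confirmed by this guide.
const EMBEDDING_MODELS = {
  '@cf/baai/bge-small-en-v1.5': { dimensions: 384, tier: 'small' },
  '@cf/baai/bge-base-en-v1.5': { dimensions: 768, tier: 'base' },
  '@cf/baai/bge-large-en-v1.5': { dimensions: 1024, tier: 'large' },
};

// Pick the smallest (fastest) model that meets a minimum dimension requirement.
function pickModel(minDimensions) {
  const candidates = Object.entries(EMBEDDING_MODELS)
    .filter(([, info]) => info.dimensions >= minDimensions)
    .sort((a, b) => a[1].dimensions - b[1].dimensions);
  return candidates.length ? candidates[0][0] : null;
}
```

Remember that the dimension you pick must match the `vector(n)` column size in your database schema.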
## Common Applications

### 1. Semantic Search
```javascript
// Generate embedding for search query
const response = await openai.embeddings.create({
  model: '@cf/baai/bge-base-en-v1.5',
  input: searchQuery
});

// Search using vector similarity
const results = await db.query(`
  SELECT id, content,
         1 - (embedding <=> $1) AS similarity
  FROM documents
  WHERE 1 - (embedding <=> $1) > 0.7
  ORDER BY similarity DESC
  LIMIT 5
`, [response.data[0].embedding]);
```
### 2. Content Recommendations
```javascript
// Generate embeddings for user preferences
const userEmbedding = await openai.embeddings.create({
  model: '@cf/baai/bge-base-en-v1.5',
  input: userPreferences
});

// Find similar content
const recommendations = await db.query(`
  SELECT id, title,
         1 - (embedding <=> $1) AS relevance
  FROM content
  WHERE category = ANY($2)
  ORDER BY relevance DESC
  LIMIT 10
`, [userEmbedding.data[0].embedding, userCategories]);
```
### 3. Duplicate Detection
```javascript
async function findDuplicates(texts, threshold = 0.95) {
  const response = await openai.embeddings.create({
    model: '@cf/baai/bge-base-en-v1.5',
    input: texts
  });

  const embeddings = response.data;
  const duplicates = [];

  // Compare every pair of embeddings (O(n^2); fine for small batches)
  for (let i = 0; i < embeddings.length; i++) {
    for (let j = i + 1; j < embeddings.length; j++) {
      const similarity = cosineSimilarity(
        embeddings[i].embedding,
        embeddings[j].embedding
      );
      if (similarity > threshold) {
        duplicates.push([i, j]);
      }
    }
  }

  return duplicates;
}
```
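The `findDuplicates` function above relies on a `cosineSimilarity` helper that is not provided by any SDK shown here; a minimal implementation looks like this:

```javascript
// Cosine similarity between two equal-length numeric vectors:
// dot(a, b) / (|a| * |b|). Returns a value in [-1, 1].
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```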
## Vector Database Integration

### PostgreSQL (pgvector)
```javascript
import { Pool } from 'pg';
import pgvector from 'pgvector/pg';

// Setup
const pool = new Pool();
await pool.query('CREATE EXTENSION IF NOT EXISTS vector');

// Create table
await pool.query(`
  CREATE TABLE IF NOT EXISTS documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(768)
  );
`);

// Create index
await pool.query(`
  CREATE INDEX ON documents
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
`);
```
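When inserting rows, pgvector expects the embedding in its text literal format, e.g. `[0.1,0.2,0.3]`. The `pgvector` npm package provides helpers for this; a hand-rolled sketch is shown here just to make the format explicit:

```javascript
// Format a numeric array as a pgvector text literal, e.g. '[0.1,0.2,0.3]'.
// A sketch for illustration; prefer the pgvector package's own conversion helpers.
function toVectorLiteral(embedding) {
  return '[' + embedding.join(',') + ']';
}

// Hypothetical usage with the pool and an embedding from the API:
// await pool.query(
//   'INSERT INTO documents (content, embedding) VALUES ($1, $2)',
//   [text, toVectorLiteral(embedding)]
// );
```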
### Supabase
```javascript
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(
  process.env.SUPABASE_URL,
  process.env.SUPABASE_KEY
);

// Store embeddings
await supabase
  .from('documents')
  .insert({
    content: text,
    embedding: embedding
  });

// Search
const { data, error } = await supabase.rpc(
  'match_documents',
  {
    query_embedding: searchEmbedding,
    match_threshold: 0.7,
    match_count: 5
  }
);
```
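The `match_documents` RPC is not built into Supabase; you define it yourself as a Postgres function. A sketch, assuming a `documents` table with a 768-dimension `embedding` column (adjust names and dimensions to your schema):

```sql
-- Hypothetical matching function for the RPC call above.
create or replace function match_documents(
  query_embedding vector(768),
  match_threshold float,
  match_count int
)
returns table (id bigint, content text, similarity float)
language sql stable
as $$
  select d.id, d.content,
         1 - (d.embedding <=> query_embedding) as similarity
  from documents d
  where 1 - (d.embedding <=> query_embedding) > match_threshold
  order by similarity desc
  limit match_count;
$$;
```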
## Best Practices

- **Text Preprocessing**
  - Clean and normalize text
  - Remove irrelevant information
  - Handle special characters
  - Consider text length
- **Vector Operations**
  - Use cosine similarity
  - Normalize vectors
  - Use appropriate indexes
  - Implement caching
- **Performance Optimization**
  - Batch embedding requests
  - Use the appropriate model size
  - Implement rate limiting
  - Monitor usage
- **Application Design**
  - Choose the right vector database
  - Plan for scaling
  - Consider hybrid search
  - Implement fallbacks
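Two of the practices above (use cosine similarity, normalize vectors) combine nicely: once vectors are scaled to unit length, cosine similarity reduces to a plain dot product, which is cheaper at query time. A minimal normalization sketch:

```javascript
// Scale a vector to unit length so cosine similarity becomes a dot product.
// Returns a copy; a zero vector is returned unchanged to avoid division by zero.
function normalize(vector) {
  const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) return vector.slice();
  return vector.map((x) => x / norm);
}
```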
## Quotas and Limits

| Plan | Monthly Token Quota |
|---|---|
| Free Tier | 300K tokens |
| $29 Plan | 3M tokens |
| $49 Plan | 4.5M tokens |
### Rate Limits
- Batch size: 128 inputs per request
- Max tokens per input: 8192
- Concurrent requests: Based on plan
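To respect the 128-inputs-per-request batch limit, larger workloads can be split client-side before calling the API. A sketch of the chunking logic (the embedding call in the usage note is the same `openai.embeddings.create` shown earlier):

```javascript
// Split an array of inputs into batches no larger than the per-request limit.
function chunkInputs(inputs, batchSize = 128) {
  const batches = [];
  for (let i = 0; i < inputs.length; i += batchSize) {
    batches.push(inputs.slice(i, i + batchSize));
  }
  return batches;
}

// Usage sketch: embed each batch and collect the vectors in order.
// const allEmbeddings = [];
// for (const batch of chunkInputs(texts)) {
//   const res = await openai.embeddings.create({
//     model: '@cf/baai/bge-base-en-v1.5',
//     input: batch
//   });
//   allEmbeddings.push(...res.data.map((d) => d.embedding));
// }
```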
## Error Handling
```javascript
try {
  const response = await openai.embeddings.create({
    model: '@cf/baai/bge-base-en-v1.5',
    input: texts
  });
} catch (error) {
  if (error.code === 'token_limit_exceeded') {
    // Handle token limit
  } else if (error.code === 'rate_limit_exceeded') {
    // Implement backoff strategy
  } else {
    // Handle other errors
  }
}
```
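For `rate_limit_exceeded` in particular, a simple exponential backoff wrapper is a common pattern. A sketch (the `withBackoff` name and defaults are illustrative, not part of any SDK):

```javascript
// Retry an async operation with exponential backoff on rate-limit errors.
// Other errors are rethrown immediately.
async function withBackoff(operation, maxRetries = 3, baseDelayMs = 500) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (error.code !== 'rate_limit_exceeded' || attempt >= maxRetries) {
        throw error;
      }
      const delay = baseDelayMs * 2 ** attempt; // 500ms, 1s, 2s, ...
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Usage sketch:
// const response = await withBackoff(() =>
//   openai.embeddings.create({ model: '@cf/baai/bge-base-en-v1.5', input: texts })
// );
```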
## Getting Started

1. Install the SDK:

   ```bash
   npm install @neuredge/sdk
   # or
   pip install neuredge-sdk
   ```

2. Set up your environment:

   ```bash
   NEUREDGE_API_KEY=your-api-key
   ```

3. Initialize the client:

   ```javascript
   import { Neuredge } from '@neuredge/sdk';

   const neuredge = new Neuredge({
     apiKey: process.env.NEUREDGE_API_KEY
   });
   ```