Overview

Text embeddings convert text into dense numerical vectors that capture semantic meaning. These vectors enable powerful applications like semantic search, content recommendations, and document analysis.

OpenAI Compatibility

Our embedding models are compatible with OpenAI's embeddings API, allowing you to use them as a drop-in replacement for models like text-embedding-ada-002. Simply update the base URL and model name:

```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your-neuredge-key',
  baseURL: 'https://api.neuredge.dev/v1/'
});

const embedding = await openai.embeddings.create({
  model: '@cf/baai/bge-base-en-v1.5', // Use any Neuredge embedding model
  input: 'Your text here'
});
```
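
The response follows OpenAI's schema: the vector itself is the array of floats at embedding.data[0].embedding, which the examples below rely on.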

Available Models

| Model | Dimensions | Use Case | Performance | Speed |
| --- | --- | --- | --- | --- |
| BGE Small | 384 | Fast retrieval, Mobile/Edge | Good | Fastest |
| BGE Base | 768 | General purpose, Production | Better | Fast |
| BGE Large | 1024 | Research, Legal, Medical | Best | Moderate |

Model Selection Guide

  • BGE Small (384d)

    • Ideal for: Mobile apps, Edge devices, High-throughput systems
    • Best for: Basic similarity search, Quick prototypes
    • Trade-offs: Lower semantic resolution for better speed
  • BGE Base (768d)

    • Ideal for: Production systems, General applications
    • Best for: Most business use cases
    • Trade-offs: Balanced performance and speed
  • BGE Large (1024d)

    • Ideal for: Research, Legal analysis, Medical applications
    • Best for: Maximum semantic accuracy
    • Trade-offs: Higher resource usage for better quality
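
A practical consequence of the dimensions column: your vector storage schema must match the model you choose. A small lookup makes this explicit (only the BGE Base ID appears on this page; the small and large IDs are assumed to follow the same naming pattern):

```javascript
// Model ID → embedding dimensions (from the table above).
// The small/large IDs are assumptions based on the base model's naming.
const EMBEDDING_DIMENSIONS = {
  '@cf/baai/bge-small-en-v1.5': 384,
  '@cf/baai/bge-base-en-v1.5': 768,
  '@cf/baai/bge-large-en-v1.5': 1024
};
```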

Common Applications

1. Semantic Search

```javascript
// Generate an embedding for the search query
const response = await openai.embeddings.create({
  model: '@cf/baai/bge-base-en-v1.5',
  input: searchQuery
});

// Search by vector similarity (<=> is pgvector's cosine distance operator)
const results = await db.query(`
  SELECT id, content,
         1 - (embedding <=> $1) AS similarity
  FROM documents
  WHERE 1 - (embedding <=> $1) > 0.7
  ORDER BY similarity DESC
  LIMIT 5
`, [response.data[0].embedding]);
```

2. Content Recommendations

```javascript
// Generate an embedding for the user's preferences
const userEmbedding = await openai.embeddings.create({
  model: '@cf/baai/bge-base-en-v1.5',
  input: userPreferences
});

// Find similar content within the user's categories
const recommendations = await db.query(`
  SELECT id, title,
         1 - (embedding <=> $1) AS relevance
  FROM content
  WHERE category = ANY($2)
  ORDER BY relevance DESC
  LIMIT 10
`, [userEmbedding.data[0].embedding, userCategories]);
```

3. Duplicate Detection

```javascript
async function findDuplicates(texts, threshold = 0.95) {
  const response = await openai.embeddings.create({
    model: '@cf/baai/bge-base-en-v1.5',
    input: texts
  });

  const embeddings = response.data;
  const duplicates = [];

  // Compare every pair; O(n²), fine for modest batch sizes
  for (let i = 0; i < embeddings.length; i++) {
    for (let j = i + 1; j < embeddings.length; j++) {
      const similarity = cosineSimilarity(
        embeddings[i].embedding,
        embeddings[j].embedding
      );
      if (similarity > threshold) {
        duplicates.push([i, j]);
      }
    }
  }

  return duplicates;
}
```
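
The function above calls a cosineSimilarity helper that is not defined on this page; a minimal implementation:

```javascript
// Cosine similarity of two equal-length vectors:
// dot(a, b) / (|a| * |b|), in [-1, 1]
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```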

Vector Database Integration

PostgreSQL (pgvector)

```javascript
import { Pool } from 'pg';
import pgvector from 'pgvector/pg';

// Setup
const pool = new Pool();
await pool.query('CREATE EXTENSION IF NOT EXISTS vector');

// Create table (768 dimensions matches BGE Base)
await pool.query(`
  CREATE TABLE IF NOT EXISTS documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(768)
  );
`);

// Create an approximate (IVFFlat) index for cosine distance
await pool.query(`
  CREATE INDEX ON documents
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
`);
```
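
To round out the setup, a sketch of inserting a row (text and embedding are assumed to come from an earlier embeddings call; pgvector accepts the '[x,y,...]' text format, which JSON.stringify produces for a numeric array):

```javascript
// Store a document and its embedding
await pool.query(
  'INSERT INTO documents (content, embedding) VALUES ($1, $2)',
  [text, JSON.stringify(embedding)]
);
```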

Supabase

```javascript
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(
  process.env.SUPABASE_URL,
  process.env.SUPABASE_KEY
);

// Store embeddings
await supabase
  .from('documents')
  .insert({
    content: text,
    embedding: embedding
  });

// Search via a stored function (see the sketch below)
const { data, error } = await supabase.rpc(
  'match_documents',
  {
    query_embedding: searchEmbedding,
    match_threshold: 0.7,
    match_count: 5
  }
);
```
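
The match_documents call refers to a Postgres function you define yourself, for example in Supabase's SQL editor. A minimal sketch, assuming the documents table from the pgvector section:

```sql
create or replace function match_documents (
  query_embedding vector(768),
  match_threshold float,
  match_count int
)
returns table (id int, content text, similarity float)
language sql stable
as $$
  select documents.id, documents.content,
         1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where 1 - (documents.embedding <=> query_embedding) > match_threshold
  order by similarity desc
  limit match_count;
$$;
```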

Best Practices

  1. Text Preprocessing

    • Clean and normalize text
    • Remove irrelevant information
    • Handle special characters
    • Consider text length
  2. Vector Operations

    • Use cosine similarity
    • Normalize vectors (see the sketch after this list)
    • Use appropriate indexes
    • Implement caching
  3. Performance Optimization

    • Batch embedding requests
    • Use appropriate model size
    • Implement rate limiting
    • Monitor usage
  4. Application Design

    • Choose right vector DB
    • Plan for scaling
    • Consider hybrid search
    • Implement fallbacks
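
For the normalization point, a minimal helper: once vectors are unit length, the dot product equals cosine similarity, which can simplify downstream comparisons.

```javascript
// Scale a vector to unit length; the dot product of unit vectors
// equals their cosine similarity
function normalize(v) {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return v.map(x => x / norm);
}
```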

Quotas and Limits

| Plan | Monthly Token Quota |
| --- | --- |
| Free Tier | 300K tokens |
| $29 Plan | 3M tokens |
| $49 Plan | 4.5M tokens |

Rate Limits

  • Batch size: 128 inputs per request
  • Max tokens per input: 8192
  • Concurrent requests: Based on plan
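
Given the 128-input batch limit, embedding a large corpus means chunking it across requests. A minimal sketch (embedAll is a hypothetical helper, not part of the API):

```javascript
// Embed any number of texts by batching at the per-request limit
async function embedAll(texts, batchSize = 128) {
  const vectors = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    const response = await openai.embeddings.create({
      model: '@cf/baai/bge-base-en-v1.5',
      input: texts.slice(i, i + batchSize)
    });
    vectors.push(...response.data.map(d => d.embedding));
  }
  return vectors;
}
```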

Error Handling

```javascript
try {
  const response = await openai.embeddings.create({
    model: '@cf/baai/bge-base-en-v1.5',
    input: texts
  });
} catch (error) {
  if (error.code === 'token_limit_exceeded') {
    // Handle token limit, e.g. split or truncate the input
  } else if (error.code === 'rate_limit_exceeded') {
    // Implement backoff strategy (see the sketch below)
  } else {
    // Handle other errors
  }
}
```
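
For the rate-limit branch, a sketch of exponential backoff (withBackoff is a hypothetical helper; the error code matches the example above):

```javascript
// Retry an async call with exponential backoff on rate-limit errors
async function withBackoff(fn, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (error.code !== 'rate_limit_exceeded' || attempt === maxRetries - 1) {
        throw error;
      }
      // Wait 1s, 2s, 4s, ... before retrying
      await new Promise(resolve => setTimeout(resolve, 2 ** attempt * 1000));
    }
  }
}

const response = await withBackoff(() =>
  openai.embeddings.create({ model: '@cf/baai/bge-base-en-v1.5', input: texts })
);
```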

Getting Started

  1. Install the SDK:

```bash
npm install @neuredge/sdk
# or
pip install neuredge-sdk
```

  2. Set up your environment:

```bash
NEUREDGE_API_KEY=your-api-key
```

  3. Initialize the client:

```javascript
import { Neuredge } from '@neuredge/sdk';

const neuredge = new Neuredge({
  apiKey: process.env.NEUREDGE_API_KEY
});
```
