## Overview
Text embeddings convert text into dense numerical vectors that capture semantic meaning. These vectors enable powerful applications like semantic search, content recommendations, and document analysis.
## OpenAI Compatibility

Our embedding models are compatible with OpenAI's embeddings API, so you can use them as a drop-in replacement for models such as `text-embedding-ada-002`. Simply update the base URL and model name:
```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: 'your-neuredge-key',
  baseURL: 'https://api.neuredge.dev/v1/'
});

const embedding = await openai.embeddings.create({
  model: '@cf/baai/bge-base-en-v1.5', // Use any Neuredge embedding model
  input: 'Your text here'
});
```
## Available Models

| Model | Dimensions | Use Case | Performance | Speed |
|---|---|---|---|---|
| BGE Small | 384 | Fast retrieval, mobile/edge | Good | Fastest |
| BGE Base | 768 | General purpose, production | Better | Fast |
| BGE Large | 1024 | Research, legal, medical | Best | Moderate |
### Model Selection Guide

- **BGE Small (384d)**
  - Ideal for: mobile apps, edge devices, high-throughput systems
  - Best for: basic similarity search, quick prototypes
  - Trade-offs: lower semantic resolution in exchange for speed
- **BGE Base (768d)**
  - Ideal for: production systems, general applications
  - Best for: most business use cases
  - Trade-offs: balanced quality and speed
- **BGE Large (1024d)**
  - Ideal for: research, legal analysis, medical applications
  - Best for: maximum semantic accuracy
  - Trade-offs: higher resource usage for better quality
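For use in code, the guide above can be captured in a small lookup keyed by model ID. Only `@cf/baai/bge-base-en-v1.5` appears elsewhere in this guide; the small and large IDs below are assumed to follow the same naming pattern, so verify them against the model catalog:

```javascript
// Assumed model IDs; only the base model ID is confirmed by this guide.
const EMBEDDING_MODELS = {
  '@cf/baai/bge-small-en-v1.5': { dimensions: 384, tier: 'small' },
  '@cf/baai/bge-base-en-v1.5': { dimensions: 768, tier: 'base' },
  '@cf/baai/bge-large-en-v1.5': { dimensions: 1024, tier: 'large' },
};

// Pick the smallest (fastest) model that meets a minimum dimension requirement.
function pickModel(minDimensions) {
  const candidates = Object.entries(EMBEDDING_MODELS)
    .filter(([, info]) => info.dimensions >= minDimensions)
    .sort((a, b) => a[1].dimensions - b[1].dimensions);
  return candidates.length ? candidates[0][0] : null;
}
```

Remember that the dimension you pick must match the `vector(n)` column size in your database schema.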
## Common Applications

### 1. Semantic Search
```javascript
// Generate embedding for search query
const response = await openai.embeddings.create({
  model: '@cf/baai/bge-base-en-v1.5',
  input: searchQuery
});

// Search using vector similarity
const results = await db.query(`
  SELECT id, content,
         1 - (embedding <=> $1) AS similarity
  FROM documents
  WHERE 1 - (embedding <=> $1) > 0.7
  ORDER BY similarity DESC
  LIMIT 5
`, [response.data[0].embedding]);
```
### 2. Content Recommendations
```javascript
// Generate embeddings for user preferences
const userEmbedding = await openai.embeddings.create({
  model: '@cf/baai/bge-base-en-v1.5',
  input: userPreferences
});

// Find similar content
const recommendations = await db.query(`
  SELECT id, title,
         1 - (embedding <=> $1) AS relevance
  FROM content
  WHERE category = ANY($2)
  ORDER BY relevance DESC
  LIMIT 10
`, [userEmbedding.data[0].embedding, userCategories]);
```
### 3. Duplicate Detection
```javascript
async function findDuplicates(texts, threshold = 0.95) {
  const response = await openai.embeddings.create({
    model: '@cf/baai/bge-base-en-v1.5',
    input: texts
  });

  const embeddings = response.data;
  const duplicates = [];

  // Compare every pair of embeddings (O(n^2); fine for small batches)
  for (let i = 0; i < embeddings.length; i++) {
    for (let j = i + 1; j < embeddings.length; j++) {
      const similarity = cosineSimilarity(
        embeddings[i].embedding,
        embeddings[j].embedding
      );
      if (similarity > threshold) {
        duplicates.push([i, j]);
      }
    }
  }

  return duplicates;
}
```
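The `findDuplicates` function above relies on a `cosineSimilarity` helper that is not provided by any SDK shown here; a minimal implementation looks like this:

```javascript
// Cosine similarity between two equal-length numeric vectors:
// dot(a, b) / (|a| * |b|). Returns a value in [-1, 1].
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```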
## Vector Database Integration

### PostgreSQL (pgvector)
```javascript
import { Pool } from 'pg';
import pgvector from 'pgvector/pg';

// Setup
const pool = new Pool();
await pool.query('CREATE EXTENSION IF NOT EXISTS vector');

// Create table
await pool.query(`
  CREATE TABLE IF NOT EXISTS documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(768)
  );
`);

// Create index
await pool.query(`
  CREATE INDEX ON documents
  USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
`);
```
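When inserting rows, pgvector expects the embedding in its text literal format, e.g. `[0.1,0.2,0.3]`. The `pgvector` npm package provides helpers for this; a hand-rolled sketch is shown here just to make the format explicit:

```javascript
// Format a numeric array as a pgvector text literal, e.g. '[0.1,0.2,0.3]'.
// A sketch for illustration; prefer the pgvector package's own conversion helpers.
function toVectorLiteral(embedding) {
  return '[' + embedding.join(',') + ']';
}

// Hypothetical usage with the pool and an embedding from the API:
// await pool.query(
//   'INSERT INTO documents (content, embedding) VALUES ($1, $2)',
//   [text, toVectorLiteral(embedding)]
// );
```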
### Supabase
```javascript
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(
  process.env.SUPABASE_URL,
  process.env.SUPABASE_KEY
);

// Store embeddings
await supabase
  .from('documents')
  .insert({
    content: text,
    embedding: embedding
  });

// Search
const { data, error } = await supabase.rpc(
  'match_documents',
  {
    query_embedding: searchEmbedding,
    match_threshold: 0.7,
    match_count: 5
  }
);
```
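The `match_documents` RPC is not built into Supabase; you define it yourself as a Postgres function. A sketch, assuming a `documents` table with a 768-dimension `embedding` column (adjust names and dimensions to your schema):

```sql
-- Hypothetical matching function for the RPC call above.
create or replace function match_documents(
  query_embedding vector(768),
  match_threshold float,
  match_count int
)
returns table (id bigint, content text, similarity float)
language sql stable
as $$
  select d.id, d.content,
         1 - (d.embedding <=> query_embedding) as similarity
  from documents d
  where 1 - (d.embedding <=> query_embedding) > match_threshold
  order by similarity desc
  limit match_count;
$$;
```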
## Best Practices

- **Text Preprocessing**
  - Clean and normalize text
  - Remove irrelevant information
  - Handle special characters
  - Consider text length
- **Vector Operations**
  - Use cosine similarity
  - Normalize vectors
  - Use appropriate indexes
  - Implement caching
- **Performance Optimization**
  - Batch embedding requests
  - Use the appropriate model size
  - Implement rate limiting
  - Monitor usage
- **Application Design**
  - Choose the right vector database
  - Plan for scaling
  - Consider hybrid search
  - Implement fallbacks
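Two of the practices above (use cosine similarity, normalize vectors) combine nicely: once vectors are scaled to unit length, cosine similarity reduces to a plain dot product, which is cheaper at query time. A minimal normalization sketch:

```javascript
// Scale a vector to unit length so cosine similarity becomes a dot product.
// Returns a copy; a zero vector is returned unchanged to avoid division by zero.
function normalize(vector) {
  const norm = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) return vector.slice();
  return vector.map((x) => x / norm);
}
```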
## Quotas and Limits

| Plan | Monthly Token Quota |
|---|---|
| Free Tier | 300K tokens |
| $29 Plan | 3M tokens |
| $49 Plan | 4.5M tokens |
### Rate Limits
- Batch size: 128 inputs per request
- Max tokens per input: 8192
- Concurrent requests: Based on plan
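To respect the 128-inputs-per-request batch limit, larger workloads can be split client-side before calling the API. A sketch of the chunking logic (the embedding call in the usage note is the same `openai.embeddings.create` shown earlier):

```javascript
// Split an array of inputs into batches no larger than the per-request limit.
function chunkInputs(inputs, batchSize = 128) {
  const batches = [];
  for (let i = 0; i < inputs.length; i += batchSize) {
    batches.push(inputs.slice(i, i + batchSize));
  }
  return batches;
}

// Usage sketch: embed each batch and collect the vectors in order.
// const allEmbeddings = [];
// for (const batch of chunkInputs(texts)) {
//   const res = await openai.embeddings.create({
//     model: '@cf/baai/bge-base-en-v1.5',
//     input: batch
//   });
//   allEmbeddings.push(...res.data.map((d) => d.embedding));
// }
```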
## Error Handling
```javascript
try {
  const response = await openai.embeddings.create({
    model: '@cf/baai/bge-base-en-v1.5',
    input: texts
  });
} catch (error) {
  if (error.code === 'token_limit_exceeded') {
    // Handle token limit
  } else if (error.code === 'rate_limit_exceeded') {
    // Implement backoff strategy
  } else {
    // Handle other errors
  }
}
```
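For `rate_limit_exceeded` in particular, a simple exponential backoff wrapper is a common pattern. A sketch (the `withBackoff` name and defaults are illustrative, not part of any SDK):

```javascript
// Retry an async operation with exponential backoff on rate-limit errors.
// Other errors are rethrown immediately.
async function withBackoff(operation, maxRetries = 3, baseDelayMs = 500) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (error.code !== 'rate_limit_exceeded' || attempt >= maxRetries) {
        throw error;
      }
      const delay = baseDelayMs * 2 ** attempt; // 500ms, 1s, 2s, ...
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Usage sketch:
// const response = await withBackoff(() =>
//   openai.embeddings.create({ model: '@cf/baai/bge-base-en-v1.5', input: texts })
// );
```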
## Getting Started

1. Install the SDK:

   ```bash
   npm install @neuredge/sdk
   # or
   pip install neuredge-sdk
   ```

2. Set up your environment:

   ```bash
   NEUREDGE_API_KEY=your-api-key
   ```

3. Initialize the client:

   ```javascript
   import { Neuredge } from '@neuredge/sdk';

   const neuredge = new Neuredge({
     apiKey: process.env.NEUREDGE_API_KEY
   });
   ```