Architecture
Complete AI database architecture with vector search, ML inference, and RAG pipeline
NeuronDB Architecture
Vector Engine
High-performance ANN search with HNSW and IVF indexing, supporting multiple distance metrics (L2, Cosine, Inner Product), quantization (FP16/INT8/Binary), and SIMD-optimized operations. Pure C implementation with 158 source files.
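To make the quantization levels concrete, here is a minimal sketch of how FP16, INT8, and binary encodings shrink a float32 vector. It is an illustration only, not NeuronDB's C implementation; the function names and the symmetric INT8 scaling scheme are assumptions chosen for clarity.

```python
import struct

def quantize_fp16(vec):
    """Store each component as an IEEE 754 half-precision float (2 bytes)."""
    return struct.pack(f"<{len(vec)}e", *vec)

def quantize_int8(vec):
    """Symmetric scalar quantization to signed 8-bit integers (1 byte each).

    Returns the packed codes plus the scale needed to dequantize them.
    """
    scale = max(abs(x) for x in vec) / 127.0 or 1.0  # avoid a zero scale
    codes = [round(x / scale) for x in vec]
    return struct.pack(f"<{len(vec)}b", *codes), scale

def quantize_binary(vec):
    """Keep only the sign of each component (1 bit each), packed into bytes."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits.to_bytes((len(vec) + 7) // 8, "big")
```

For an 8-dimensional vector, float32 storage takes 32 bytes, FP16 takes 16 (2x), INT8 takes 8 (4x), and binary takes 1 (32x), which matches the 2x-32x range quoted for quantization on this page.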
ML Engine
52 ML algorithms implemented in pure C: Random Forest, XGBoost, LightGBM, CatBoost, SVM, KNN, Decision Trees, Naive Bayes, Neural Networks, K-means, DBSCAN, GMM, PCA, and more. ONNX runtime integration for model inference.
Embedding Engine
Text embeddings via embed_text() and embed_text_batch() functions. Multimodal support (CLIP, ImageBind). Hugging Face integration. LLM router and runtime with caching. Batch generation with GPU acceleration.
GPU Accelerator
Full GPU support: CUDA (NVIDIA), ROCm (AMD), Metal (Apple Silicon). GPU-accelerated distance calculations, ML inference, and batch processing. Automatic GPU detection with CPU fallback. Native C/C++ implementation.
Background Workers (4)
neuranq (async job queue), neuranmon (auto-tuner), neurandefrag (index maintenance), neuranllm (LLM processor). All tenant-aware with QPS/cost budgets, crash recovery, and SKIP LOCKED processing.
Advanced Features
Hybrid search (vector + FTS), reranking (cross-encoder, LLM, MMR, RRF), complete RAG pipeline, sparse vectors (SPLADE, ColBERT), query planner with cost estimation, and intelligent caching.
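As an illustration of RRF (Reciprocal Rank Fusion), the sketch below merges a vector-search ranking with a full-text ranking. It shows the general technique rather than NeuronDB's internal code; the function name and the constant `k=60` (a commonly used default) are assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists that contain it,
    where rank starts at 1. Higher fused scores rank first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

A document that appears near the top of both lists beats one that tops only a single list, which is why RRF works well without any score normalization between the vector and text sides.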
Why NeuronDB
Vector Search & Indexing
5 production-grade vector types: vector (float32), vectorp (packed), vecmap (sparse map), vgraph (graph-based), rtext (retrieval text). HNSW and IVF indexing with automatic tuning. Multiple distance metrics: L2 (Euclidean), Cosine, Inner Product, Manhattan, Hamming, Jaccard. Product Quantization (PQ) and Optimized PQ (OPQ) for 2x-32x compression.
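For readers who want the definitions behind the metric names, the following sketch spells out the six listed metrics in plain Python. These are the textbook formulas, written for readability; they are not the SIMD-optimized C routines the extension ships, and the function names are illustrative.

```python
import math

def l2(a, b):
    """Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def inner_product(a, b):
    """Dot product; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 minus the cosine of the angle between a and b (non-zero vectors)."""
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    return 1.0 - inner_product(a, b) / norm

def manhattan(a, b):
    """Sum of absolute component differences (L1)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def hamming(a, b):
    """Number of positions at which the two sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def jaccard_distance(a, b):
    """1 minus |A ∩ B| / |A ∪ B| for two sets."""
    a, b = set(a), set(b)
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)
```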
ML & Embeddings
52 ML algorithms implemented in pure C: Random Forest, XGBoost, LightGBM, CatBoost, Linear/Logistic Regression, Ridge, Lasso, SVM, KNN, Naive Bayes, Decision Trees, Neural Networks, Deep Learning. Built-in embedding generation with caching. ONNX runtime integration. Batch processing with GPU acceleration. Model catalog and versioning.
Hybrid Search & Retrieval
Combine vector similarity with full-text search (BM25). Weighted scoring (for example, 70% vector + 30% text). Multi-vector documents. Faceted search with category filters. Temporal decay for time-sensitive relevance. Suited to real-world search workloads where keyword matches and semantic matches both matter.
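The weighted scoring and temporal decay described above can be sketched as follows. This is a hypothetical illustration: the function name, the exponential half-life form of the decay, and the assumption that both input scores are already normalized to the range 0 to 1 are not taken from NeuronDB itself.

```python
def hybrid_score(vector_sim, text_score, age_days=0.0,
                 half_life_days=None, w_vector=0.7, w_text=0.3):
    """Blend a vector similarity and a text relevance score.

    Both inputs are assumed to be normalized to [0, 1] beforehand (raw BM25
    scores are unbounded, so they need rescaling first). If half_life_days
    is given, the blended score halves every half_life_days of document age.
    """
    base = w_vector * vector_sim + w_text * text_score
    if half_life_days is None:
        return base
    return base * 0.5 ** (age_days / half_life_days)
```

With the default 70/30 weights, a document with vector similarity 0.8 and text score 0.5 blends to about 0.71; with a 30-day half-life, the same document at 30 days old scores about half of that.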
Advanced Reranking
Cross-encoder neural reranking for precision improvement. LLM-powered scoring (GPT-4, Claude). ColBERT late interaction models. MMR (Maximal Marginal Relevance) for diversity. Ensemble strategies combining multiple rankers. Sub-10ms latency.
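To show what MMR does in practice, here is a minimal greedy sketch. It illustrates the standard algorithm and makes no claim about NeuronDB's own implementation; the function signature, the `lam` trade-off parameter name, and the dictionary/callable inputs are assumptions.

```python
def mmr(query_sim, doc_sim, k, lam=0.5):
    """Greedy Maximal Marginal Relevance selection.

    query_sim: dict mapping doc id -> similarity to the query.
    doc_sim:   callable (doc_a, doc_b) -> similarity between two documents.
    lam:       1.0 ranks purely by relevance, 0.0 purely by diversity.
    Returns up to k document ids in selection order.
    """
    candidates = list(query_sim)
    selected = []
    while candidates and len(selected) < k:
        def mmr_score(doc):
            # Penalize documents similar to anything already selected.
            redundancy = max((doc_sim(doc, s) for s in selected), default=0.0)
            return lam * query_sim[doc] - (1.0 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lam=0.5`, a near-duplicate of the top hit is passed over in favor of a less relevant but different document; with `lam=1.0` the output is the plain relevance ranking.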
Complete RAG Pipeline
End-to-end Retrieval-Augmented Generation in PostgreSQL. Document chunking and processing. Semantic retrieval with reranking. LLM integration for answer generation. Context management. Guardrails for content safety. Production-ready RAG in SQL.
Background Workers
4 production workers: neuranq (async job queue executor with SKIP LOCKED, retries, poison handling, batch processing), neuranmon (live query auto-tuner for search params, cache rotation, recall@k tracking), neurandefrag (automatic index maintenance, compaction, tombstone pruning, rebuild scheduling), neuranllm (LLM job processing with crash recovery). All tenant-aware with QPS/cost budgets.
ML Analytics Suite
Comprehensive analytics: K-means, Mini-batch K-means, DBSCAN, GMM, Hierarchical clustering (all GPU-accelerated). Dimensionality reduction: PCA, PCA Whitening, OPQ. Outlier detection: Z-score, Modified Z-score, IQR, Isolation Forest. Quality metrics: Davies-Bouldin Index, Recall@K, Precision@K, F1@K, MRR. Drift detection with temporal monitoring. Topic discovery and modeling.
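The retrieval quality metrics named above have simple definitions, sketched here in Python for reference. The function names and argument shapes are illustrative assumptions, not NeuronDB's SQL API.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of all relevant documents found in the top k results."""
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant) if relevant else 0.0

def precision_at_k(retrieved, relevant, k):
    """Fraction of the top k results that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    return len(set(retrieved[:k]) & set(relevant)) / k

def f1_at_k(retrieved, relevant, k):
    """Harmonic mean of precision@k and recall@k."""
    p = precision_at_k(retrieved, relevant, k)
    r = recall_at_k(retrieved, relevant, k)
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

def mean_reciprocal_rank(queries):
    """Average of 1/rank of the first relevant hit over (retrieved, relevant) pairs."""
    total = 0.0
    for retrieved, relevant in queries:
        for rank, doc_id in enumerate(retrieved, start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(queries) if queries else 0.0
```

Recall@K is the metric most often used to judge an approximate index against exact search: the exact top-k results play the role of `relevant`, and the ANN results play the role of `retrieved`.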
GPU Acceleration
Full GPU support: CUDA (NVIDIA), ROCm (AMD), Metal (Apple Silicon). GPU-accelerated ML algorithms: Random Forest, XGBoost, LightGBM, Linear/Logistic Regression, SVM, KNN, Decision Trees, Naive Bayes, GMM, K-means. Batch distance computation (100x speedup). Automatic GPU detection with CPU fallback. Multi-stream compute overlap. Production-ready with memory management.
Performance & Optimization
SIMD-optimized distance calculations (AVX2, AVX-512, NEON). Intelligent query planning with cost estimates. ANN buffer cache for hot centroids. WAL compression with delta encoding. Parallel kNN execution. Predictive prefetching. Sub-millisecond searches on millions of vectors.
Enterprise Security
Vector encryption (AES-GCM via OpenSSL). Differential privacy for embeddings. Row-level security (RLS) integration. Multi-tenant isolation. HMAC-SHA256 signed results. Audit logging with tamper detection. Usage metering and governance policies. GDPR-compliant data handling.
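As a rough picture of what HMAC-SHA256 signed results make possible, the sketch below signs a result set and verifies it on the client side. The JSON serialization, function names, and key handling are assumptions made for illustration; the extension's actual wire format is not described on this page.

```python
import hashlib
import hmac
import json

def sign_result(rows, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over a canonical JSON encoding of rows."""
    payload = json.dumps(rows, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify_result(rows, key: bytes, signature: str) -> bool:
    """Check the tag in constant time; False means the rows were altered."""
    return hmac.compare_digest(sign_result(rows, key), signature)
```

Any change to a row, including a reordered result list, produces a different tag, so a consumer holding the shared key can detect tampering between the database and the application.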
Monitoring & Observability
pg_stat_neurondb view with real-time metrics. Worker heartbeats and watchdog. Query latency histograms. Cache hit rate tracking. Recall@K monitoring. Model cost accounting. Prometheus exporter ready. Structured JSON logging with neurondb: prefix.
PostgreSQL Native Architecture
Pure C implementation that follows PostgreSQL coding standards throughout. 144 source files + 64 headers, zero compiler warnings. PGXS build system. 473 SQL functions/types/operators. Shared memory for caching. WAL integration for durability. SPI for safe operations. Background worker framework. Standard extension with no required external dependencies (ONNX is optional) and no core modifications.
Production Capabilities
Comprehensive AI database features built for enterprise production workloads
| Capability | Description | Performance | Production Ready |
|---|---|---|---|
| Vector Search | HNSW indexing, multiple distance metrics, quantization | Sub-millisecond on millions | ✓ |
| ML Inference | ONNX runtime, batch processing, embedding generation | High-throughput batch ops | ✓ |
| Hybrid Search | Vector + FTS, multi-vector, faceted, temporal | Optimized query planning | ✓ |
| Reranking | Cross-encoder, LLM, ColBERT, ensemble | GPU-accelerated support | ✓ |
| Background Workers | Queue executor, auto-tuner, index maintenance | Non-blocking async ops | ✓ |
| RAG Pipeline | Complete in-database RAG with document processing | End-to-end optimization | ✓ |
| ML Analytics | Clustering (K-means, DBSCAN, GMM), PCA, outlier detection, quality metrics, drift detection | GPU-accelerated algorithms | ✓ |
| GPU Acceleration | CUDA (NVIDIA), ROCm (AMD), Metal (Apple), 100x speedup on batch ops | Auto-detection with CPU fallback | ✓ |
| Performance Optimization | SIMD (AVX2/AVX-512/NEON), intelligent query planning, ANN cache, WAL compression | Predictive prefetching | ✓ |
| Enterprise Security | Vector encryption (AES-GCM), differential privacy, RLS integration, multi-tenant isolation | GDPR-compliant | ✓ |
| Monitoring & Observability | pg_stat_neurondb view, worker heartbeats, latency histograms, Prometheus exporter | Real-time metrics | ✓ |
| PostgreSQL Native | Pure C implementation, 473 SQL functions, zero external dependencies, WAL integration | Zero core modifications | ✓ |
NeuronDB vs. Alternatives
Comprehensive comparison of NeuronDB with other PostgreSQL AI and vector extensions
| Feature | NeuronDB | pgvector | pgvectorscale | pgai | PostgresML |
|---|---|---|---|---|---|
| Vector Indexing | HNSW + IVF | HNSW + IVF | StreamingDiskANN | Uses pgvector | pgvector-based |
| ML Inference | ONNX (C++) | None | None | API calls | Python ML libs |
| Embedding Generation | In-database (ONNX) | External | External | External API | In-database (Transformers) |
| Hybrid Search | Native (Vector+FTS) | Manual | Manual | Manual | Manual |
| Reranking | Cross-encoder, LLM, ColBERT, MMR | None | None | None | None |
| ML Algorithms | 52 algorithms: RF, XGBoost, LightGBM, CatBoost, SVM, KNN, DT, NB, NN, K-means, DBSCAN, GMM, PCA, etc. | None | None | None | XGBoost, LightGBM, sklearn suite, Linear/Logistic |
| Background Workers | 4 workers: neuranq, neuranmon, neurandefrag, neuranllm | None | None | None | None |
| RAG Pipeline | Complete In-DB | None | None | Partial (API) | Partial (Python) |
| Quantization | FP16, INT8, Binary (2x-32x) | Binary only | Binary only | None | None |
| Implementation | Pure C | Pure C | Pure C | Rust + SQL | Python + C |
| Training Models | Fine-tuning (roadmap) | None | None | None | Full training (sklearn, XGBoost, etc.) |
| Auto-Tuning | neuranmon worker | None | None | None | None |
| GPU Support | CUDA + ROCm + Metal (native C/C++) | None | None | None | CUDA (via Python) |
| PostgreSQL Versions | 16, 17, 18 | 12-18 | 15-18 | 16-18 | 14-16 |
| License | PostgreSQL | PostgreSQL | Timescale License | PostgreSQL | PostgreSQL |
| Vector Types | 5 types: vector, vectorp, vecmap, vgraph, rtext | 1 type: vector | 1 type: vector | Uses pgvector | Uses pgvector |
| Distance Metrics | 10+ metrics: L2, Cosine, Inner Product, Manhattan, Hamming, Jaccard, etc. | 3 metrics: L2, Cosine, Inner Product | 3 metrics: L2, Cosine, Inner Product | Uses pgvector | Uses pgvector |
| SQL Functions | 473 functions | ~20 functions | ~30 functions | ~15 functions | ~50 functions |
| Index Maintenance | Auto (neurandefrag worker) | Manual | Manual | Manual | Manual |
| Performance (QPS) | 100K+ (with GPU) | 10K-50K | 50K-100K | Limited (API overhead) | 5K-20K (Python overhead) |
| Memory Efficiency | Optimized (PQ/OPQ compression) | Standard | Disk-based (low memory) | Standard | High (Python models) |
| Multi-tenancy | Native (tenant-aware workers) | None | None | None | None |
| Security | Row-level security, encryption, audit logs | PostgreSQL RLS | PostgreSQL RLS | PostgreSQL RLS | PostgreSQL RLS |
| Monitoring | pg_stat_neurondb, Prometheus, Grafana | Basic | Basic | Basic | Limited |
| Documentation | Comprehensive (473 functions documented) | Good | Moderate | Moderate | Good |
| Community Support | Active (pgElephant) | Very Active | Moderate (Timescale) | Growing | Active |
| Production Readiness | Enterprise-ready | Production-ready | Beta | Early stage | Production-ready |
| Dependencies | Zero (pure C, optional ONNX) | Zero (pure C) | Zero (pure C) | Rust runtime | Python + ML libraries |
| Batch Processing | Native (neuranq worker) | Manual | Manual | Limited | Native (Python) |
| Model Catalog | Built-in (versioning, A/B testing) | None | None | None | Basic |
| Cost Efficiency | High (in-DB, no API costs) | High (in-DB) | High (disk-based) | Low (API costs) | Moderate (Python overhead) |
Add AI Capabilities to PostgreSQL
Install NeuronDB. Build semantic search, RAG applications, and ML features in your PostgreSQL infrastructure.