Self Learning Postgres DB

Pricing

from $0.30 / 1,000 results

Self-learning vector database with GNN-powered index optimization. Features: vector search, RAG queries, embeddings, clustering, deduplication, batch ops, and data import/export. Scales with Raft consensus.

Developer

Reuven Cohen

Maintained by Community

Self-Learning Postgres DB - Vector Database for AI Agents

A distributed vector database that truly learns. Store embeddings, query with semantic search, and let the index improve itself through TRM (Tiny Recursive Models), SONA (Self-Optimizing Neural Architecture), and Graph Neural Networks.

Apify Actor | PostgreSQL 17 | License: MIT

Key AI Features

| Feature | Description |
| --- | --- |
| TRM | 7M-parameter recursive reasoning (83% on GSM8K) |
| SONA | 3-tier learning (Instant/Background/Deep) |
| EWC++ | Anti-forgetting protection (λ=2000) |
| GNN | Graph Neural Network index optimization |
| Trajectory Tracking | Learn from query patterns |

Features

30+ Operations for complete vector database management:

  • Semantic Search - Find documents by meaning, not just keywords
  • Batch Operations - Insert and search thousands of documents efficiently
  • Hybrid Search - Combine vector similarity with keyword matching
  • RAG Support - Built-in Retrieval-Augmented Generation queries
  • Self-Learning - GNN training for index optimization
  • Clustering - K-means document clustering
  • Deduplication - Find and remove duplicate content
  • Export/Import - JSON and CSV data migration

Zero Setup Required:

  • Embedded PostgreSQL with ruvector extension
  • Local AI embeddings (no OpenAI API key needed)
  • Automatic table and index creation

Quick Start (30 Seconds)

Full Demo

{
  "action": "full_workflow",
  "query": "How does machine learning work?",
  "documents": [
    {"content": "Machine learning is AI that learns patterns from data.", "metadata": {"category": "AI"}},
    {"content": "PostgreSQL is a powerful relational database.", "metadata": {"category": "Database"}},
    {"content": "Neural networks consist of layers of nodes.", "metadata": {"category": "AI"}},
    {"content": "Vector databases store embeddings for similarity search.", "metadata": {"category": "Database"}}
  ]
}

Result: Documents ranked by semantic relevance to your query.
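
To make "ranked by semantic relevance" concrete, here is a minimal sketch of the ranking step, assuming cosine similarity over embedding vectors. The three-dimensional vectors and the function names are invented for illustration; the Actor's real embeddings have 384 or 768 dimensions, and its internal ranking code is not shown in this README.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, docs):
    """Return (score, content) pairs, most similar first.

    docs is a list of (content, vector) pairs.
    """
    scored = [(cosine_similarity(query_vec, vec), content) for content, vec in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy embeddings (assumed values, not produced by a real model).
toy_docs = [
    ("Machine learning is AI that learns patterns from data.", [0.9, 0.1, 0.0]),
    ("PostgreSQL is a powerful relational database.", [0.0, 0.2, 0.9]),
    ("Neural networks consist of layers of nodes.", [0.7, 0.3, 0.1]),
]
toy_query = [1.0, 0.0, 0.0]

ranking = rank_documents(toy_query, toy_docs)
for score, content in ranking:
    print(f"{score:.3f}  {content}")
```

With these toy vectors the machine learning sentence ranks first and the PostgreSQL sentence last, mirroring what the demo input would be expected to return for an AI-related query.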


All 38 Actions

Document Operations

| Action | Description |
| --- | --- |
| insert | Add documents with auto-generated embeddings |
| batch_insert | Efficiently insert large document sets |
| get | Retrieve single document by ID |
| list | List documents with filtering |
| update | Modify existing document content/metadata |
| delete | Remove documents by ID, IDs, or filter |
| upsert | Insert or update (smart merge) |

Search Operations

| Action | Description |
| --- | --- |
| search | Semantic similarity search |
| batch_search | Multiple queries in one call |
| hybrid_search | Vector + BM25 keyword combined |
| multi_query_search | Aggregate results from multiple queries |
| mmr_search | Maximal Marginal Relevance (diverse results) |
| graph_search | Graph-based similarity traversal |
| range_search | All results within distance threshold |
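
Of the search actions above, Maximal Marginal Relevance is the least self-explanatory, so a small sketch may help. It shows the generic MMR selection loop: each step picks the candidate that best trades relevance to the query against similarity to results already chosen. The function names and the `lambda_mult` parameter are assumptions for illustration; the README does not document how `mmr_search` is implemented or tuned.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mmr_select(query_vec, candidates, top_k, lambda_mult=0.5):
    """Greedy MMR over (doc_id, vector) pairs.

    lambda_mult=1.0 ranks purely by relevance; lower values favour diversity.
    """
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < top_k:
        best, best_score = None, -math.inf
        for doc_id, vec in remaining:
            relevance = cosine(query_vec, vec)
            # Highest similarity to anything already selected (0 if nothing is).
            redundancy = max((cosine(vec, s_vec) for _, s_vec in selected), default=0.0)
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best, best_score = (doc_id, vec), score
        selected.append(best)
        remaining.remove(best)
    return [doc_id for doc_id, _ in selected]


query = [1.0, 0.0]
docs = [("a", [1.0, 0.0]), ("b", [0.99, 0.14]), ("c", [0.6, 0.8])]

diverse = mmr_select(query, docs, top_k=2, lambda_mult=0.3)
relevance_only = mmr_select(query, docs, top_k=2, lambda_mult=1.0)
```

Here "b" is nearly identical to "a", so the diversity-weighted call skips it in favour of "c", while the relevance-only call keeps both near-duplicates.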

Table Operations

| Action | Description |
| --- | --- |
| create_table | Create new vector collection |
| drop_table | Delete collection |
| list_tables | Show all vector tables |
| table_stats | Collection statistics and metrics |
| create_index | Add HNSW or IVFFlat index |
| reindex | Rebuild indexes |

Self-Learning / GNN / SONA

| Action | Description |
| --- | --- |
| train_gnn | Train Graph Neural Network on data |
| optimize_index | Auto-tune HNSW parameters |
| analyze_patterns | Analyze data distribution |
| sona_learn | Trigger TRM/SONA background learning cycle |
| sona_status | Check SONA learning status and capabilities |

Clustering & Deduplication

| Action | Description |
| --- | --- |
| cluster | K-means document clustering |
| find_duplicates | Detect similar document pairs |
| deduplicate | Remove duplicate documents |

Data Operations

| Action | Description |
| --- | --- |
| export | Export to JSON or CSV |
| import | Import from JSON data |

AI / RAG

| Action | Description |
| --- | --- |
| rag_query | Build RAG context from search results |
| summarize | Document statistics and previews |

Utility

| Action | Description |
| --- | --- |
| ping | Test database connection |
| version | Get version and feature info |
| embedding_models | List available models |
| generate_embedding | Create embeddings without storing |
| similarity | Compare similarity of two texts |

Use Cases

1. AI Agent Memory

{
  "action": "insert",
  "tableName": "agent_memory",
  "documents": [
    {"content": "User prefers dark mode", "metadata": {"user_id": "123", "type": "preference"}},
    {"content": "User asked about Python tutorials", "metadata": {"user_id": "123", "type": "history"}}
  ]
}

Retrieve memories:

{
  "action": "search",
  "tableName": "agent_memory",
  "query": "What does this user like?",
  "filter": "metadata->>'user_id' = '123'"
}

2. RAG Pipeline

{
  "action": "rag_query",
  "query": "How do I return a product?",
  "topK": 5,
  "ragMaxTokens": 2000
}

Returns context ready to feed to your LLM.
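
As a rough illustration of what "building context" under a `ragMaxTokens` budget can look like, the sketch below concatenates the highest-scoring results until the budget is used up. The `content` and `score` field names and the whitespace-based token estimate are assumptions; the Actor's actual output schema and tokenizer are not documented here.

```python
def build_rag_context(results, max_tokens):
    """Join result texts, best score first, without exceeding max_tokens.

    results: list of dicts with "content" (str) and "score" (float).
    Tokens are approximated by whitespace-separated words.
    """
    context_parts = []
    used = 0
    for item in sorted(results, key=lambda r: r["score"], reverse=True):
        cost = len(item["content"].split())  # crude token estimate
        if used + cost > max_tokens:
            break  # stop at the first result that no longer fits
        context_parts.append(item["content"])
        used += cost
    return "\n\n".join(context_parts)


sample_results = [
    {"content": "Returns are accepted within 30 days.", "score": 0.91},
    {"content": "Refunds go to the original payment method.", "score": 0.84},
    {"content": "Our office is closed on public holidays.", "score": 0.42},
]

context = build_rag_context(sample_results, max_tokens=13)
prompt = f"Answer using only this context:\n\n{context}\n\nQuestion: How do I return a product?"
```

The sample texts are placeholders. With a budget of 13 words, the first two passages fit (6 + 7 words) and the third is dropped.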

3. Batch Document Processing

{
  "action": "batch_insert",
  "batchSize": 100,
  "documents": [
    // ... thousands of documents
  ]
}
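
If a document set is too large to send comfortably in a single input, one option is to split it on the client side into several `batch_insert` inputs. This is a hypothetical helper, not part of the Actor: it only builds input dictionaries shaped like the example above, using the documented `tableName` and `batchSize` parameters.

```python
def chunked(items, size):
    """Yield consecutive slices of items, each with at most size elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_inputs(documents, batch_size=100, table_name="documents"):
    """Build one batch_insert input per chunk of documents."""
    return [
        {
            "action": "batch_insert",
            "tableName": table_name,
            "batchSize": batch_size,
            "documents": chunk,
        }
        for chunk in chunked(documents, batch_size)
    ]


# 250 placeholder documents become three inputs of 100, 100, and 50 documents.
docs = [{"content": f"Document {i}", "metadata": {"n": i}} for i in range(250)]
inputs = build_batch_inputs(docs, batch_size=100)
```

Each resulting dictionary can be passed as `run_input` to a separate Actor run. Whether that is cheaper than one large run depends on the per-event pricing listed later in this README.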

4. Find & Remove Duplicates

{
  "action": "find_duplicates",
  "similarityThreshold": 0.95
}

Then remove them, using the same threshold:

{
  "action": "deduplicate",
  "similarityThreshold": 0.95
}
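
The idea behind `similarityThreshold` can be sketched as a pairwise comparison: any two documents whose embeddings are at least that similar are reported as duplicates. The brute-force loop and cosine measure below are assumptions chosen for clarity; the Actor presumably uses its vector index rather than comparing every pair.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def find_duplicate_pairs(vectors, threshold=0.95):
    """Return (id_a, id_b, similarity) for every pair at or above threshold.

    vectors: dict mapping document id -> embedding vector.
    """
    pairs = []
    for (id_a, vec_a), (id_b, vec_b) in combinations(vectors.items(), 2):
        similarity = cosine(vec_a, vec_b)
        if similarity >= threshold:
            pairs.append((id_a, id_b, similarity))
    return pairs


# Toy embeddings: "a" and "b" point almost the same way, "c" does not.
toy_vectors = {"a": [1.0, 0.0], "b": [0.99, 0.05], "c": [0.0, 1.0]}
duplicates = find_duplicate_pairs(toy_vectors, threshold=0.95)
```

Lowering the threshold catches looser paraphrases at the cost of more false positives, so it is worth inspecting the `find_duplicates` output before running `deduplicate`.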

5. Document Clustering

{
  "action": "cluster",
  "numClusters": 10,
  "clusteringAlgorithm": "kmeans"
}
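
For readers who have not used k-means before, the sketch below shows the basic assign-then-update loop (Lloyd's algorithm) on toy two-dimensional points. It seeds the centroids with the first `k` points to stay deterministic, which is a simplification: production implementations usually use random or k-means++ initialization, and the Actor's own implementation is not described in this README.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Return (assignments, centroids) after a fixed number of iterations."""
    centroids = [list(p) for p in points[:k]]  # naive deterministic seeding
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, point in enumerate(points):
            assignments[i] = min(
                range(k), key=lambda c: squared_distance(point, centroids[c])
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments, centroids


toy_points = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
labels, centers = kmeans(toy_points, k=2)
```

The two points near the origin end up in one cluster and the two points near (5, 5) in the other.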

6. Index Optimization

{
  "action": "optimize_index",
  "enableLearning": true
}

7. SONA Self-Learning

Check learning status:

{
  "action": "sona_status"
}

Trigger learning cycle:

{
  "action": "sona_learn",
  "ewcLambda": 2000,
  "patternThreshold": 0.7
}
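
The role of `ewcLambda` can be illustrated with the standard Elastic Weight Consolidation penalty. This is the textbook formulation, given on the assumption that the Actor's EWC++ variant keeps the same overall shape; the README does not spell out its exact loss.

```latex
\mathcal{L}(\theta) = \mathcal{L}_{\text{new}}(\theta)
  + \frac{\lambda}{2} \sum_{i} F_i \left(\theta_i - \theta_i^{*}\right)^2
```

Here $\theta^{*}$ are the parameters after the previous learning cycle, $F_i$ is an estimate of how important parameter $i$ was for earlier data (the diagonal of the Fisher information), and $\lambda$ corresponds to `ewcLambda`. A larger $\lambda$ makes the model more resistant to overwriting what it has already learned.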

Parameters Reference

Core Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| action | string | search | Operation to perform |
| connectionString | string | embedded | PostgreSQL URL for persistence |
| tableName | string | documents | Table/collection name |

Search Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| query | string | - | Natural language search query |
| queryVector | array | - | Pre-computed embedding vector |
| topK | integer | 10 | Number of results |
| distanceMetric | string | cosine | cosine, l2, inner_product, manhattan |
| filter | string | - | SQL WHERE clause |
| minScore | number | 0 | Minimum similarity score (0-1) |
| maxDistance | number | - | Maximum distance threshold |
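
The four `distanceMetric` options correspond to well-known vector measures. The sketch below gives their usual definitions so the differences are easy to see. The function names are illustrative, and how the Actor converts a distance into the 0-1 score used by `minScore` is not documented here.

```python
import math


def l2_distance(a, b):
    """Euclidean (straight-line) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def inner_product(a, b):
    """Dot product; larger means more similar (not a true distance)."""
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 minus cosine similarity; ignores vector length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


# Lookup keyed by the distanceMetric values listed in the table above.
METRICS = {
    "cosine": cosine_distance,
    "l2": l2_distance,
    "inner_product": inner_product,
    "manhattan": manhattan_distance,
}

u, v = [0.0, 0.0, 1.0], [3.0, 4.0, 1.0]
for name, fn in METRICS.items():
    print(name, round(fn(u, v), 4))
```

Cosine is the usual choice for text embeddings because it compares direction only, while L2 and Manhattan are also sensitive to vector magnitude.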

Embedding Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| embeddingModel | string | all-MiniLM-L6-v2 | AI embedding model |
| generateEmbeddings | boolean | true | Auto-generate embeddings |
| dimensions | integer | 384 | Vector dimensions |

Index Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| indexType | string | hnsw | hnsw, ivfflat, none |
| hnswM | integer | 16 | HNSW max connections |
| hnswEfConstruction | integer | 64 | HNSW build quality |
| hnswEfSearch | integer | 100 | HNSW search quality |
| ivfLists | integer | 100 | IVFFlat partitions |
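
For orientation, the sketch below shows how these index parameters would map onto SQL when the target database uses pgvector (listed under External PostgreSQL as a compatibility mode). The statements follow pgvector's documented index syntax, but the column name `embedding`, the helper itself, and the assumption that the Actor issues equivalent statements are all illustrative. The syntax used by the ruvector extension may differ, and the manhattan metric is left out of this sketch.

```python
import re

# pgvector operator classes for the metrics covered by this sketch.
OPCLASS = {
    "cosine": "vector_cosine_ops",
    "l2": "vector_l2_ops",
    "inner_product": "vector_ip_ops",
}


def index_sql(table, index_type="hnsw", metric="cosine",
              m=16, ef_construction=64, ef_search=100, lists=100):
    """Return the pgvector SQL statements for the given index settings."""
    # Table names cannot be bound as query parameters, so validate strictly.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError(f"unsafe table name: {table!r}")
    opclass = OPCLASS[metric]
    if index_type == "hnsw":
        return [
            f"CREATE INDEX ON {table} USING hnsw (embedding {opclass}) "
            f"WITH (m = {int(m)}, ef_construction = {int(ef_construction)});",
            # ef_search is a query-time setting, not an index option.
            f"SET hnsw.ef_search = {int(ef_search)};",
        ]
    if index_type == "ivfflat":
        return [
            f"CREATE INDEX ON {table} USING ivfflat (embedding {opclass}) "
            f"WITH (lists = {int(lists)});"
        ]
    if index_type == "none":
        return []
    raise ValueError(f"unknown index type: {index_type!r}")


for statement in index_sql("documents"):
    print(statement)
```

Raising `m` and `ef_construction` generally improves recall at the cost of build time and memory, and raising `ef_search` trades query speed for recall.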

GNN Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| enableLearning | boolean | false | Enable self-learning |
| learningRate | number | 0.01 | GNN learning rate |
| gnnLayers | integer | 2 | GNN layer count |
| trainEpochs | integer | 10 | Training epochs |

SONA / TRM Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sonaEnabled | boolean | true | Enable TRM/SONA self-learning |
| ewcLambda | number | 2000 | EWC++ anti-forgetting strength |
| patternThreshold | number | 0.7 | Pattern recognition confidence |
| maxTrajectories | integer | 100 | Max trajectory steps to track |
| sonaLearningTiers | array | ["instant", "background"] | Learning tiers to enable |

Clustering Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| numClusters | integer | 10 | K-means clusters |
| similarityThreshold | number | 0.95 | Duplicate detection threshold |

Embedding Models

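| Model | Dimensions | Speed | Quality | Best For |
| --- | --- | --- | --- | --- |
| all-MiniLM-L6-v2 | 384 | Fast | Good | Prototyping |
| bge-small-en-v1.5 | 384 | Fast | Excellent | Production |
| bge-base-en-v1.5 | 768 | Medium | Better | High accuracy |
| nomic-embed-text-v1 | 768 | Medium | Good | Long documents (8K) |
| gte-small | 384 | Fast | Good | General use |
| e5-small-v2 | 384 | Fast | Good | Multilingual |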

Persistent Storage

Hybrid Persistence Architecture

┌────────────────────────────────────────────────────────────┐
│                         Actor Run                          │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐  │
│  │ Key-Value    │───▶│ Embedded     │───▶│ Key-Value    │  │
│  │ Store (load) │    │ PostgreSQL   │    │ Store (save) │  │
│  └──────────────┘    └──────────────┘    └──────────────┘  │
│       START                WORK                END         │
└────────────────────────────────────────────────────────────┘

Flow:

  1. On Start → Load documents from Key-Value Store into embedded PostgreSQL
  2. During Run → Full vector search capabilities (HNSW, cosine, etc.)
  3. On End → Export documents back to Key-Value Store
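
The load, work, save cycle above can be sketched in a few lines. In this illustration a plain dictionary stands in for the Apify Key-Value Store and a Python list stands in for the embedded PostgreSQL table; the key name `documents` and the helper function are hypothetical, not the Actor's actual code.

```python
import json


def run_with_persistence(kv_store, key, work):
    """Simulate one Actor run: load state, do the work, save state."""
    # On Start: load previously saved documents (empty list on first run).
    documents = json.loads(kv_store.get(key, "[]"))
    # During Run: the work callback reads and modifies the in-memory data.
    work(documents)
    # On End: write the updated documents back for the next run.
    kv_store[key] = json.dumps(documents)
    return documents


fake_store = {}  # stand-in for a Key-Value Store shared between runs

run_with_persistence(fake_store, "documents",
                     lambda docs: docs.append({"content": "first run"}))
second = run_with_persistence(fake_store, "documents",
                              lambda docs: docs.append({"content": "second run"}))
```

The second run sees the document written by the first, which is the behaviour the hybrid persistence flow is meant to provide. The trade-off is that every run pays the load and save cost, which is why the comparison below lists a slower cold start for this option.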

Storage Options Comparison

| Feature | External PostgreSQL | Apify Key-Value Store |
| --- | --- | --- |
| Setup required | Yes | No |
| Cost | Separate billing | Included in Apify |
| Max size | Unlimited | ~9GB per store |
| Cold start | Fast | Slower (load data) |
| Best for | Large/production | Small-medium datasets |

External PostgreSQL

For persistent storage in an external database, pass its connection string in the input:

{
  "connectionString": "postgresql://user:password@host:5432/database",
  "action": "search",
  "query": "Your query"
}

Supported:

  • PostgreSQL 14+ with ruvector extension
  • PostgreSQL with pgvector (compatibility mode)
  • Supabase, Neon, AWS RDS, etc.

API Integration

Python

from apify_client import ApifyClient

client = ApifyClient("your-api-token")

# Start the Actor and wait for the run to finish
run = client.actor("ruv/self-learning-postgres-db").call(run_input={
    "action": "search",
    "query": "machine learning basics",
    "topK": 5,
})

# Fetch the results from the run's default dataset
results = client.dataset(run["defaultDatasetId"]).list_items().items

JavaScript

import { ApifyClient } from 'apify-client';

const client = new ApifyClient({ token: 'your-api-token' });

// Start the Actor and wait for the run to finish
const run = await client.actor('ruv/self-learning-postgres-db').call({
  action: 'search',
  query: 'machine learning basics',
  topK: 5,
});

// listItems() resolves to a paginated list; the records are in `items`
const { items: results } = await client.dataset(run.defaultDatasetId).listItems();

cURL

curl -X POST "https://api.apify.com/v2/acts/ruv~self-learning-postgres-db/runs" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "action": "search",
    "query": "machine learning",
    "topK": 10
  }'

Performance

Built on PostgreSQL 17.7 with AVX-512 SIMD acceleration:

| Dataset Size | Search Time | Accuracy |
| --- | --- | --- |
| 10,000 docs | ~0.3ms | 99%+ |
| 100,000 docs | ~0.5ms | 99%+ |
| 1,000,000 docs | ~1.2ms | 98%+ |

Pricing (Apify Pay-per-event)

Core Events

| Event | Price | Description |
| --- | --- | --- |
| Actor Start | $0.001 | Per GB memory used |
| Document Insert | $0.001 | Per document stored |
| Vector Search | $0.001 | Per search query |
| Result | $0.0005 | Per result returned |

Advanced Operations

| Event | Price | Description |
| --- | --- | --- |
| Batch Operation | $0.002 | Per batch insert/search |
| RAG Query | $0.002 | Per RAG context build |
| GNN Training | $0.01 | Per training session |
| Clustering | $0.005 | Per cluster operation |
| Deduplication | $0.003 | Per dedupe run |
| Data Export | $0.002 | Per export |
| Data Import | $0.002 | Per import |
| Table Operation | $0.001 | Create/drop table |
| Index Operation | $0.002 | Create/optimize index |
| Similarity Check | $0.001 | Per comparison |
| Embedding Generation | $0.001 | Per embedding |

Volume Discounts:

  • Bronze: 14% off results
  • Silver: 26% off results
  • Gold: 40% off results
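
To see how these prices add up, here is a small cost estimator for a search-only run. It assumes that such a run is billed for exactly three events (Actor Start per GB, Vector Search, and Result) and that the volume discount applies only to the Result event, as the "off results" wording suggests. The function and the `none` tier are illustrative, so check your Apify invoice for the events actually charged.

```python
from decimal import Decimal

# Prices copied from the Core Events table (USD per event).
PRICES = {
    "actor_start_per_gb": Decimal("0.001"),
    "vector_search": Decimal("0.001"),
    "result": Decimal("0.0005"),
}

# Discount on the Result event per tier; "none" is an assumed label
# for accounts without a volume discount.
RESULT_DISCOUNT = {
    "none": Decimal("0"),
    "bronze": Decimal("0.14"),
    "silver": Decimal("0.26"),
    "gold": Decimal("0.40"),
}


def estimate_search_cost(memory_gb, searches, results, tier="none"):
    """Estimated USD cost of a run that only performs searches."""
    result_price = PRICES["result"] * (1 - RESULT_DISCOUNT[tier])
    return (
        PRICES["actor_start_per_gb"] * Decimal(str(memory_gb))
        + PRICES["vector_search"] * searches
        + result_price * results
    )


one_search = estimate_search_cost(memory_gb=1, searches=1, results=10)
gold_per_thousand = estimate_search_cost(0, 0, 1000, tier="gold")
print(one_search, gold_per_thousand)
```

One search returning 10 results on 1 GB comes to $0.001 + $0.001 + 10 × $0.0005 = $0.007. At the Gold tier, 1,000 results cost 1,000 × $0.0005 × 0.60 = $0.30, which matches the "from $0.30 / 1,000 results" headline price.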

Development

Local Testing

# Start ruvector-postgres
docker run -d --name ruvector-pg -e POSTGRES_PASSWORD=secret -p 5432:5432 ruvnet/ruvector-postgres:latest
# Run tests
DATABASE_URL="postgresql://postgres:secret@localhost:5432/postgres" npm test

Deployment

# Set your API token in root .env
echo "APIFY_API_TOKEN=your_token" >> ../../../.env
# Deploy
npm run deploy


Built with RuVector - High-performance vector search with TRM/SONA self-learning for the AI era.