Skelf-Research / open source

The search engine that thinks before it searches.

slorg first has an LLM draft an answer and a knowledge graph, uses that graph to extract search keywords, queries the open web through SearxNG, fetches each candidate page, then scores each result for relevance before it shows you anything. Same six steps every time. No magic.


What "thinks before it searches" actually means

Six fixed steps. No agentic loop.

slorg is not an agent. It does not decide whether to search; it always searches. It does not pick tools at runtime; the pipeline is the same on every query. What it does instead is spend its first round-trip on the LLM, not on a search engine, so that the keywords sent to the web are derived from a structured plan rather than from the raw user prompt.

Step 1: Draft answer. The LLM writes a first-pass answer from training data alone. No web access yet.

Step 2: Knowledge graph. Entities and relationships are extracted from the draft answer into a typed graph.

Step 3: Keyword extraction. Search terms are derived from the graph nodes, not from the user prompt directly.

Step 4: Multi-engine search. SearxNG queries Google, Bing, Yahoo, and DuckDuckGo. Default limit: 20 results.

Step 5: Content fetch. Each candidate URL is fetched and its readable content extracted.

Step 6: Relevance scoring. The LLM scores each result 0–1 against the original query. Top N returned.
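
The six steps above can be sketched as one linear function. This is an illustration of the control flow only, not slorg's actual source: `runPipeline`, the `steps` object, and every function name on it are hypothetical, and each step is injected so the sketch runs without an LLM or a network. The returned field names follow the library example in the quick start.

```javascript
// Hypothetical sketch of the fixed six-step flow. None of these names are
// slorg internals; every step is passed in by the caller.
async function runPipeline(query, steps, topN = 5) {
  const answer = await steps.draftAnswer(query);        // step 1: draft from the LLM alone
  const graph = await steps.buildGraph(answer);         // step 2: graph from the draft
  const keywords = await steps.extractKeywords(graph);  // step 3: keywords from the graph, not the prompt
  const candidates = await steps.search(keywords);      // step 4: multi-engine search

  // Step 5: fetch readable content for every candidate URL.
  const pages = await Promise.all(
    candidates.map(async (c) => ({ ...c, content: await steps.fetchContent(c.url) }))
  );

  // Step 6: score each page against the ORIGINAL query, then keep the top N.
  const scored = await Promise.all(
    pages.map(async (p) => ({ ...p, score: await steps.score(query, p) }))
  );
  const results = scored.sort((a, b) => b.score - a.score).slice(0, topN);

  return { answer, knowledgeGraph: graph, keywords, results };
}
```

There is no branch anywhere in the function: no step can decide to skip the search or call a different tool, which is the sense in which the pipeline is fixed rather than agentic.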

Honest caveat: The "thinks before" is structural, not metacognitive. Step 1's draft can carry the model's biases into step 3's keywords. If the LLM is wrong about the topic, the retrieved set will be wrong in a correlated way. slorg trades blind keyword search for plan-conditioned search. It does not eliminate hallucination; it just makes the plan visible.

Who it is for

Research teams who want their search legible

You can read the knowledge graph, see the extracted keywords, and inspect the per-result relevance score. The pipeline is six functions, not a black box.

People building RAG over the open web

slorg returns ranked, scored, content-extracted results as a JSON object. The results[] array is shaped for ingestion into a vector store or as in-context grounding for a downstream LLM call.
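
As a rough sketch of that ingestion step, the helper below turns a search response into documents ready for embedding. The per-result field names (`url`, `title`, `content`, `score`) are assumptions made for illustration, as are `toDocuments` and the 0.5 cutoff; check the API reference for the real shape before relying on them.

```javascript
// Hypothetical helper: map a slorg-style response to vector-store documents.
// ASSUMED result fields: url, title, content, score (not confirmed by the docs).
function toDocuments(searchOutput, minScore = 0.5) {
  return searchOutput.results
    // Drop results with no extracted text or a relevance score below the cutoff.
    .filter((r) => typeof r.content === 'string' && r.content.length > 0 && r.score >= minScore)
    .map((r) => ({
      id: r.url,                 // the URL doubles as a stable document id
      text: r.content,           // the text to embed or paste in as grounding
      metadata: { title: r.title ?? null, score: r.score },
    }));
}
```

Filtering on the score before embedding keeps low-relevance pages out of the index, at the cost of trusting the LLM's scoring pass.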

Perplexity users who want more control

Self-hosted. Bring your own OpenAI key (or any OpenAI-compatible endpoint via OPENAI_BASE_URL). Pick your engines. Pick your model.

Internal-doc searchers after a model swap

By default slorg targets the open web. The relevance-scoring layer is source-agnostic; the README documents the public web backends, and the architecture page is where to look before pointing it at private indexes.

Quick start

Install, set a key, ask.

npm install -g lorg
export OPENAI_API_KEY=sk-your-key-here

lorg "What causes the northern lights?"

Or as a Node library:

import LorgSearch from 'lorg';

const lorg = new LorgSearch(process.env.OPENAI_API_KEY);

const results = await lorg.search('What is quantum computing?', {
  model: 'gpt-4o-mini',
  maxResults: 5
});

console.log(results.answer);         // draft answer (step 1)
console.log(results.knowledgeGraph); // entities + relationships (step 2)
console.log(results.keywords);       // extracted search terms (step 3)
console.log(results.results);        // ranked web results with score (step 6)
console.log(results.tokenCount);     // OpenAI tokens used

Or as a REST service: lorg server -p 3000 exposes POST /search. Full API reference at docs.skelfresearch.com/slorg.
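
A minimal sketch of calling that endpoint from Node follows. Only the `POST /search` route comes from the text above; the JSON body shape (`query` plus the same options the library accepts) and the helper name `buildSearchRequest` are assumptions, so confirm them against the API reference.

```javascript
// Hypothetical helper: build the URL and fetch options for POST /search.
// ASSUMED body shape: { query, ...options }; verify against the API reference.
function buildSearchRequest(baseUrl, query, options = {}) {
  return {
    url: new URL('/search', baseUrl).toString(),
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ query, ...options }),
    },
  };
}

// Usage, inside an async function, with the server from `lorg server -p 3000`:
//   const { url, init } = buildSearchRequest('http://localhost:3000',
//     'What causes the northern lights?', { maxResults: 5 });
//   const data = await (await fetch(url, init)).json();
```

Node 18 and later ship a global `fetch`, so the returned pair can be passed to it directly.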

Configuration

Five env vars. That is the whole surface.

Variable	Default	Purpose
OPENAI_API_KEY	—	Required. Authenticates the LLM calls in steps 1, 3, and 6.
OPENAI_BASE_URL	OpenAI	Point at any OpenAI-compatible endpoint (local llama.cpp, Together, Groq, etc.).
LORG_DEFAULT_MODEL	gpt-4o-mini	One model for all three LLM steps. No per-step override yet.
LORG_SEARCH_ENGINES	google,bing,yahoo,duckduckgo	Comma list passed to SearxNG.
LORG_SEARCH_LIMIT	20	Max URLs returned by SearxNG before content-fetch and scoring.

Source: project README. If you need a knob that isn't on this list, it isn't wired up — open an issue.
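
For wrapper scripts that want the same defaults, the table can be mirrored in a few lines. This is a sketch, not slorg's own parsing code: `loadConfig` and the returned property names are hypothetical, and the handling of malformed values (falling back to 20 for a non-numeric limit) is a choice made here.

```javascript
// Hypothetical loader mirroring the five documented variables and defaults.
// slorg's real parsing may differ; this only reflects the table above.
function loadConfig(env = process.env) {
  if (!env.OPENAI_API_KEY) {
    throw new Error('OPENAI_API_KEY is required');
  }
  const limit = Number.parseInt(env.LORG_SEARCH_LIMIT ?? '20', 10);
  return {
    apiKey: env.OPENAI_API_KEY,
    baseUrl: env.OPENAI_BASE_URL ?? null, // null means the default OpenAI endpoint
    model: env.LORG_DEFAULT_MODEL ?? 'gpt-4o-mini',
    engines: (env.LORG_SEARCH_ENGINES ?? 'google,bing,yahoo,duckduckgo')
      .split(',')
      .map((name) => name.trim())
      .filter(Boolean),
    searchLimit: Number.isNaN(limit) ? 20 : limit, // assumption: fall back on bad input
  };
}
```

Passing the environment in as an argument keeps the function easy to exercise with a plain object instead of the real `process.env`.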

Go deeper

Nov 12, 2025

Source-aware retrieval: knowing what your engine knows

Most search engines hide their backends behind a single answer. slorg shows you which engines returned which URLs, what content was actually extracted, and how the LLM scored each one, because retrieval without source awareness is just confident guessing.

Oct 9, 2025

Planning the search: where the latency budget really goes

A breakdown of the three LLM calls and N content fetches that make up a slorg query, what each one costs, and which knobs actually move the needle.

Sep 14, 2025

Most agentic search throws queries at Google and prays

Why the dominant pattern in LLM-mediated search is functionally a wrapper around a single search call — and what changes when you make the plan the first artifact, not the last.

All posts →

How slorg compares

Narrow, honest comparisons against tools that overlap on one axis or another.

slorg vs. Perplexity

Both produce an answer plus ranked sources. One is a hosted product; the other is six functions you can read.

slorg vs. SearxNG + your own LLM glue

If you already run SearxNG and a model, what slorg adds is the plan: a graph, extracted keywords, and a scoring pass — in that order.

All comparisons →
