
slorg vs. SearxNG + your own LLM glue

If your stack already has a SearxNG instance and a function that calls GPT, you can build something slorg-shaped in an afternoon. The question is whether you should — and what's specifically in the six-step pipeline that isn't in the obvious roll-your-own.

The fair version of the comparison

The honest baseline is not "no AI search" — it's the very common pattern where a team has a SearxNG container running, a small wrapper that calls the OpenAI API, and a glue function that does roughly: take the user query, send it to SearxNG, take the top-N results, stuff them into a prompt with the original question, ask the model for an answer. Fewer than a hundred lines of code. Works for a lot of things.
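For concreteness, the baseline could look roughly like the sketch below. It assumes a SearxNG instance with the JSON output format enabled and the OpenAI chat completions endpoint; the helper names (`buildPrompt`, `searchSearx`, `answer`), the option names, and the prompt wording are illustrative, not taken from any particular codebase. The HTTP function is passed in so the glue can be exercised without a network.

```typescript
// Baseline glue: search first, then one LLM call over the top-N snippets.
// Assumption: SearxNG serves JSON at /search?format=json (must be enabled
// in its settings). All names below are illustrative.

interface SearxResult {
  url: string;
  title: string;
  content?: string; // snippet text; may be missing for some engines
}

// Minimal subset of the fetch API that this sketch relies on.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface GlueOptions {
  searxUrl: string; // e.g. "http://localhost:8080"
  apiKey: string;
  model: string;
  topN?: number;
}

// Stuff the top-N results into a single prompt with the original question.
function buildPrompt(query: string, results: SearxResult[], topN = 5): string {
  const sources = results
    .slice(0, topN)
    .map((r, i) => `[${i + 1}] ${r.title}\n${r.url}\n${r.content ?? ""}`)
    .join("\n\n");
  return `Answer the question using only the sources below.\n\nQuestion: ${query}\n\nSources:\n${sources}`;
}

// Send the raw user query to SearxNG, verbatim.
async function searchSearx(http: FetchLike, baseUrl: string, query: string): Promise<SearxResult[]> {
  const res = await http(`${baseUrl}/search?q=${encodeURIComponent(query)}&format=json`);
  if (!res.ok) throw new Error(`SearxNG returned ${res.status}`);
  const body = (await res.json()) as { results?: SearxResult[] };
  return body.results ?? [];
}

// One search, one LLM call, plain string out.
async function answer(http: FetchLike, opts: GlueOptions, query: string): Promise<string> {
  const results = await searchSearx(http, opts.searxUrl, query);
  const res = await http("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${opts.apiKey}` },
    body: JSON.stringify({
      model: opts.model,
      messages: [{ role: "user", content: buildPrompt(query, results, opts.topN) }],
    }),
  });
  if (!res.ok) throw new Error(`LLM API returned ${res.status}`);
  const body = (await res.json()) as { choices?: { message?: { content?: string } }[] };
  return body.choices?.[0]?.message?.content ?? "";
}
```

In a Node 18+ runtime you would pass the global `fetch` as the first argument. Note what is absent: no planning step, no per-result scoring, and the return value is a bare string.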

slorg is what you get when you decide that pipeline is missing a planning step, decide the planning step is worth a round-trip, and commit to a fixed six-step shape so the failure modes are at least predictable.

Side-by-side

Dimension	slorg	SearxNG + your own glue
What runs first	LLM draft + knowledge graph. The keywords going to SearxNG are derived from the graph, not from the raw user prompt.	Typically SearxNG runs first on the raw user prompt. The LLM only sees the search results, not the question's structure.
Number of LLM calls per search	Three (draft+graph, keyword extraction, per-result scoring) plus the content fetch sweep.	Usually one (the answer synthesis). Sometimes zero, if you just want ranked results.
Keyword strategy	Keywords come from a structured graph. If the question is comparative ("X vs. Y"), the graph carries both arms and the keywords reflect that.	Whatever the user typed gets sent verbatim to SearxNG. Quality of retrieval is bounded by quality of user phrasing.
Per-result relevance scoring	Yes. Each retrieved page is fetched, content-extracted, and scored 0–1 by the LLM against the original query. Sort by score.	Optional. You'd have to add this layer yourself. Most glue scripts skip it and trust the SearxNG ranking.
Result shape	Structured object: draft answer, graph, keywords, scored results with extracted content, token count. Same shape from CLI, library, and REST endpoints.	Whatever your glue function returns. Usually a plain string answer; you can add structure but you have to choose to.
Configuration surface	Five env vars: API key, base URL, model, engines, search limit. Documented in the README.	Whatever you wired up. Usually undocumented until you have to onboard someone.
SearxNG dependency	The library calls SearxNG internally; you can point it at your own instance or use a public one.	You run SearxNG yourself; your glue talks to it directly. Full control.
Latency	Slow: ~5–15 s for default config. The plan-first design adds cost.	Fast: usually 1–3 s, dominated by the single LLM call and the search.
Cost per query	Higher: three LLM stages, including a scoring pass over content-extracted pages.	Lower: one LLM call over snippets.
Maintenance burden	External dependency; updates ship as `npm:lorg` releases.	You own all the code. Pro: nothing surprises you. Con: nothing surprises you, including the bugs you wrote.
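To make the "Keyword strategy" row concrete, here is a minimal sketch of deriving search queries from a graph instead of from the raw prompt. The graph shape (`GraphNode`, `GraphEdge`, `QuestionGraph`) and the `keywordsFromGraph` function are hypothetical; the source does not specify slorg's actual graph format or its extraction logic, which is LLM-driven rather than rule-based as shown here.

```typescript
// Hypothetical graph shape, for illustration only.
interface GraphNode {
  id: string;
  label: string;
}

interface GraphEdge {
  from: string;
  to: string;
  relation: string; // e.g. "vs"
}

interface QuestionGraph {
  nodes: GraphNode[];
  edges: GraphEdge[];
}

// One query per node plus one combined query per edge, so a comparative
// question produces a search for each arm and one for the comparison itself.
function keywordsFromGraph(graph: QuestionGraph): string[] {
  const labelById = new Map(graph.nodes.map((n) => [n.id, n.label] as const));
  const queries = graph.nodes.map((n) => n.label);
  for (const edge of graph.edges) {
    const from = labelById.get(edge.from);
    const to = labelById.get(edge.to);
    // Skip edges that point at nodes missing from the graph.
    if (from !== undefined && to !== undefined) {
      queries.push(`${from} ${edge.relation} ${to}`);
    }
  }
  // De-duplicate while keeping first-seen order.
  return Array.from(new Set(queries));
}
```

For a question like "X vs. Y", a graph with nodes X and Y and a single "vs" edge yields three queries: one per arm and one for the pair. The verbatim approach in the right-hand column would send only the user's original phrasing.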

What slorg gives you that the obvious glue does not

Three things, specifically.

1. The planning step is committed. When the keyword extraction reads from a structured graph rather than from the raw prompt, the retrieval is conditioned on the question's shape, not its phrasing. Comparative questions retrieve both arms. Multi-aspect questions retrieve each aspect. It is not magic — it is just that the model has been forced to commit to a structure before it picks search terms.

2. Relevance scoring is a separate pass. SearxNG's ranking is keyword-based. slorg's scoring at step 6 is question-aware: the model sees the original query and the extracted page content and produces a number. Sorting by that number sometimes reorders the list dramatically. You can build this yourself; most teams don't.

3. The intermediate artifacts are first-class. The draft answer, the graph, the keywords, the per-result scores — all are fields on the response. If you're wiring this into a research dashboard, the dashboard can display the graph and the scores. If you're logging for audit, you have a complete trace of what happened.
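If you want the separate scoring pass from point 2 in your own glue, it can be sketched as below. This is not slorg's implementation: `rerank`, `clamp01`, and the `Scorer` callback are illustrative names, and the callback stands in for whatever LLM call you use to rate a page against the original query.

```typescript
interface FetchedPage {
  url: string;
  content: string; // extracted page text
}

interface ScoredPage extends FetchedPage {
  score: number; // relevance in [0, 1]
}

// Stand-in for an LLM call that rates a page against the original query.
type Scorer = (query: string, page: FetchedPage) => Promise<number>;

// Models sometimes return out-of-range or non-numeric values; normalise them.
function clamp01(x: number): number {
  return Number.isFinite(x) ? Math.min(1, Math.max(0, x)) : 0;
}

// Score every page against the original query, then sort best-first.
async function rerank(query: string, pages: FetchedPage[], scorePage: Scorer): Promise<ScoredPage[]> {
  const scored = await Promise.all(
    pages.map(async (page) => ({ ...page, score: clamp01(await scorePage(query, page)) })),
  );
  return scored.sort((a, b) => b.score - a.score);
}
```

The scorer sees the original query, not the derived keywords, which is what makes the pass question-aware. The cost is one scoring call per fetched page, so the search limit directly bounds the bill.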

When the roll-your-own is the right answer

You need sub-second perceived latency. slorg's plan-first design is structurally slower; no tuning gets you below a few seconds.
You only need ranked URLs, not a structured retrieval object. The 50-line glue script is fine.
You have a strong opinion about the prompt or the synthesis step and want it under your direct control.
You don't think an npm dependency is worth adding for a six-step pipeline you could write yourself.

When slorg is the right answer

You want the intermediate artifacts as part of the response, not as logging side-effects you bolted on.
You want a fixed contract: same six steps, same response shape, regardless of who on the team is calling it.
You don't want to maintain prompt-engineering scripts in five different services.
You want the cost model to be predictable: a fixed set of LLM stages per query, with token usage surfaced as tokenCount.
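As an illustration of what a fixed response shape buys you, here is a sketch of a trace type and an audit-log helper built on it. Only `tokenCount` is a name that appears in this comparison; every other field name, the `SearchTrace` type, and `toAuditLine` are assumptions made for the example and may not match slorg's actual response.

```typescript
interface ScoredResult {
  url: string;
  title: string;
  content: string; // extracted page text
  score: number; // relevance in [0, 1]
}

// Assumed field names, apart from tokenCount.
interface SearchTrace {
  draft: string; // the model's draft answer
  graph: unknown; // knowledge graph; shape left open in this sketch
  keywords: string[];
  results: ScoredResult[];
  tokenCount: number;
}

// Because every caller gets the same shape, one helper can serve every
// service that needs an audit trail.
function toAuditLine(query: string, trace: SearchTrace, at: Date = new Date()): string {
  return JSON.stringify({
    at: at.toISOString(),
    query,
    keywords: trace.keywords,
    tokenCount: trace.tokenCount,
    // Log URLs and scores only; page content is usually too large for a log line.
    results: trace.results.map((r) => ({ url: r.url, score: r.score })),
  });
}
```

With a hand-rolled glue function, the equivalent trace exists only if someone remembered to log each stage.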

The honest version

slorg is opinionated glue. If you agree with the opinion — that the plan is worth a round-trip, that scoring should be a separate pass, that intermediate artifacts should be visible — using slorg saves you the work of writing your own. If you disagree with the opinion or only need one of the three properties above, your own SearxNG-plus-LLM script is smaller, faster, and yours.

Either way, please run SearxNG. The world needs more search infrastructure that isn't a single vendor's API.

Background reading

Most agentic search throws queries at Google and prays — the case for plan-first search.
Planning the search: where the latency budget really goes — the cost breakdown.
SearxNG — the meta-search engine slorg uses; also worth running on its own.