slorg vs. SearxNG + your own LLM glue
If your stack already has a SearxNG instance and a function that calls GPT, you can build something slorg-shaped in an afternoon. The question is whether you should, and what specifically the six-step pipeline contains that the obvious roll-your-own does not.
The fair version of the comparison
The honest baseline is not "no AI search". It is the very common pattern where a team has a SearxNG container running, a small wrapper that calls the OpenAI API, and a glue function that does roughly this: take the user query, send it to SearxNG, take the top-N results, stuff them into a prompt with the original question, and ask the model for an answer. That is fewer than a hundred lines of code, and it works for a lot of things.
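The baseline described above can be sketched roughly as follows. It assumes a SearxNG instance with the JSON output format enabled in its settings and the OpenAI chat completions endpoint. The config fields, function names, and prompt wording are placeholders chosen for illustration, not a recommended implementation.

```typescript
// Baseline "SearxNG + glue" sketch. All names and defaults are illustrative.

interface SearxResult {
  url: string;
  title: string;
  content?: string; // snippet text, when the upstream engine supplies one
}

// Minimal fetch shape so the sketch does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<any> }>;

interface GlueConfig {
  searxBaseUrl: string; // e.g. "http://localhost:8080" (assumed local instance)
  openaiApiKey: string;
  model: string; // whichever chat model you already use
  topN: number; // how many results to stuff into the prompt
}

// SearxNG only answers with JSON if the "json" format is enabled in its settings.
function buildSearchUrl(baseUrl: string, query: string): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/search?q=${encodeURIComponent(query)}&format=json`;
}

// Stuff the top-N snippets into a single prompt next to the original question.
function buildPrompt(question: string, results: SearxResult[], topN: number): string {
  const sources = results
    .slice(0, topN)
    .map((r, i) => `[${i + 1}] ${r.title}\n${r.url}\n${r.content ?? ""}`)
    .join("\n\n");
  return `Answer the question using only the sources below.\n\nSources:\n${sources}\n\nQuestion: ${question}`;
}

async function answerWithGlue(
  question: string,
  cfg: GlueConfig,
  fetchFn: FetchLike,
): Promise<string> {
  // 1. The raw user query goes straight to SearxNG.
  const searchRes = await fetchFn(buildSearchUrl(cfg.searxBaseUrl, question));
  if (!searchRes.ok) throw new Error(`SearxNG returned ${searchRes.status}`);
  const results: SearxResult[] = (await searchRes.json()).results ?? [];

  // 2. One LLM call over the snippets produces the answer.
  const llmRes = await fetchFn("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${cfg.openaiApiKey}`,
    },
    body: JSON.stringify({
      model: cfg.model,
      messages: [{ role: "user", content: buildPrompt(question, results, cfg.topN) }],
    }),
  });
  if (!llmRes.ok) throw new Error(`LLM call returned ${llmRes.status}`);
  return (await llmRes.json()).choices[0].message.content;
}
```

In Node 18 or later you can pass the global `fetch` as `fetchFn`. Note that the search query is the user's text verbatim, which is exactly the property the rest of this comparison turns on.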
slorg is what you get when you decide that pipeline is missing a planning step, decide the planning step is worth a round-trip, and commit to a fixed six-step shape so the failure modes are at least predictable.
Side-by-side
| Dimension | slorg | SearxNG + your own glue |
|---|---|---|
| What runs first | LLM draft + knowledge graph. The keywords going to SearxNG are derived from the graph, not from the raw user prompt. | Typically SearxNG runs first on the raw user prompt. The LLM only sees the search results, not the question's structure. |
| Number of LLM calls per search | Three (draft+graph, keyword extraction, per-result scoring) plus the content fetch sweep. | Usually one (the answer synthesis). Sometimes zero, if you just want ranked results. |
| Keyword strategy | Keywords come from a structured graph. If the question is comparative ("X vs. Y"), the graph carries both arms and the keywords reflect that. | Whatever the user typed gets sent verbatim to SearxNG. Quality of retrieval is bounded by quality of user phrasing. |
| Per-result relevance scoring | Yes. Each retrieved page is fetched, content-extracted, and scored 0–1 by the LLM against the original query. Sort by score. | Optional. You'd have to add this layer yourself. Most glue scripts skip it and trust the SearxNG ranking. |
| Result shape | Structured object: draft answer, graph, keywords, scored results with extracted content, token count. Same shape from CLI, library, and REST endpoints. | Whatever your glue function returns. Usually a plain string answer; you can add structure but you have to choose to. |
| Configuration surface | Five env vars: API key, base URL, model, engines, search limit. Documented in the README. | Whatever you wired up. Usually undocumented until you have to onboard someone. |
| SearxNG dependency | The library calls SearxNG internally; you can point it at your own instance or use a public one. | You run SearxNG yourself; your glue talks to it directly. Full control. |
| Latency | Slow: ~5–15 s for default config. The plan-first design adds cost. | Fast: usually 1–3 s, dominated by the single LLM call and the search. |
| Cost per query | Higher: three LLM calls, including a scoring pass over content-extracted pages. | Lower: one LLM call over snippets. |
| Maintenance burden | External dependency; updates ship as npm:lorg releases. | You own all the code. Pro: nothing surprises you. Con: nothing surprises you, including the bugs you wrote. |
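
To make the "Result shape" and "Per-result relevance scoring" rows concrete, here is one way the structured object could be typed. Only `tokenCount` is a name taken from this page. Every other field name, and both helper functions, are assumptions made for illustration and may not match slorg's actual API.

```typescript
// Hypothetical response shape. Only `tokenCount` is named in the text;
// all other identifiers are placeholders.
interface GraphNode {
  id: string;
  label: string;
}

interface GraphEdge {
  from: string;
  to: string;
  relation: string;
}

interface ScoredResult {
  url: string;
  title: string;
  content: string; // extracted page text
  score: number; // 0 to 1, relevance against the original query
}

interface PlanFirstResponse {
  draft: string; // the draft answer
  graph: { nodes: GraphNode[]; edges: GraphEdge[] };
  keywords: string[];
  results: ScoredResult[];
  tokenCount: number;
}

// Sort by score, highest first, without mutating the input array.
function rankByScore(results: ScoredResult[]): ScoredResult[] {
  return [...results].sort((a, b) => b.score - a.score);
}

// Keep at most `limit` results whose score is at least `minScore`.
function topRelevant(
  response: PlanFirstResponse,
  minScore: number,
  limit: number,
): ScoredResult[] {
  return rankByScore(response.results)
    .filter((r) => r.score >= minScore)
    .slice(0, limit);
}
```

If you add a scoring layer to your own glue, a shape along these lines is what you would end up defining yourself. The point of the table row is that someone has to choose it.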
What slorg gives you that the obvious glue does not
Three things, specifically.
1. The planning step is committed. When keyword extraction reads from a structured graph rather than from the raw prompt, retrieval is conditioned on the question's shape, not its phrasing. Comparative questions retrieve both arms. Multi-aspect questions retrieve each aspect. It is not magic: the model has simply been forced to commit to a structure before it picks search terms.
2. Relevance scoring is a separate pass. SearxNG's ranking is keyword-based. slorg's scoring at step 6 is question-aware: the model sees the original query and the extracted page content and produces a number. Sorting by that number sometimes reorders the list dramatically. You can build this yourself; most teams don't.
3. The intermediate artifacts are first-class. The draft answer, the graph, the keywords, the per-result scores — all are fields on the response. If you're wiring this into a research dashboard, the dashboard can display the graph and the scores. If you're logging for audit, you have a complete trace of what happened.
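The first point is easier to see with a toy example. slorg performs keyword extraction with an LLM call. The deterministic function below is only a stand-in that shows why keywords derived from a graph differ from the raw prompt. The graph shape, the `compared_with` relation name, and the function name are all hypothetical.

```typescript
// Toy stand-in for graph-conditioned keyword extraction.
// The real step is an LLM call; this only illustrates the idea.
interface PlanNode {
  id: string;
  label: string;
  kind: "entity" | "aspect";
}

interface PlanEdge {
  from: string;
  to: string;
  relation: string; // "compared_with" is an assumed relation name
}

interface PlanGraph {
  nodes: PlanNode[];
  edges: PlanEdge[];
}

function keywordsFromGraph(graph: PlanGraph): string[] {
  const entities = graph.nodes.filter((n) => n.kind === "entity").map((n) => n.label);
  const aspects = graph.nodes.filter((n) => n.kind === "aspect").map((n) => n.label);
  const queries: string[] = [];

  // One query per (entity, aspect) pair, so each arm of a comparison
  // and each aspect of a multi-aspect question is retrieved on its own.
  for (const entity of entities) {
    if (aspects.length === 0) queries.push(entity);
    for (const aspect of aspects) queries.push(`${entity} ${aspect}`);
  }

  // For explicit comparisons, also add a joint query for the pair.
  const labelById = new Map<string, string>();
  for (const node of graph.nodes) labelById.set(node.id, node.label);
  for (const edge of graph.edges) {
    if (edge.relation !== "compared_with") continue;
    const left = labelById.get(edge.from);
    const right = labelById.get(edge.to);
    if (left && right) queries.push(`${left} vs ${right}`);
  }

  // De-duplicate while preserving order.
  return Array.from(new Set(queries));
}

// "Postgres vs SQLite for write-heavy workloads" as a graph:
const exampleGraph: PlanGraph = {
  nodes: [
    { id: "pg", label: "Postgres", kind: "entity" },
    { id: "sq", label: "SQLite", kind: "entity" },
    { id: "wh", label: "write-heavy workloads", kind: "aspect" },
  ],
  edges: [{ from: "pg", to: "sq", relation: "compared_with" }],
};

const exampleKeywords = keywordsFromGraph(exampleGraph);
// ["Postgres write-heavy workloads", "SQLite write-heavy workloads", "Postgres vs SQLite"]
```

A raw-prompt search would send the single string the user typed. Here each arm gets its own query, which is the behavior the first point describes.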
When the roll-your-own is the right answer
- You need sub-second perceived latency. slorg's plan-first design is structurally slower; no tuning gets you below a few seconds.
- You only need ranked URLs, not a structured retrieval object. The 50-line glue script is fine.
- You have a strong opinion about the prompt or the synthesis step and want it under your direct control.
- Adding an npm dependency for a six-step pipeline you could write yourself isn't worth it.
When slorg is the right answer
- You want the intermediate artifacts as part of the response, not as logging side-effects you bolted on.
- You want a fixed contract: same six steps, same response shape, regardless of who on the team is calling it.
- You don't want to maintain prompt-engineering scripts in five different services.
- You want the cost model to be predictable: a known number of LLM calls per query, surfaced as tokenCount.
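As a rough illustration of the last point, a per-query token count makes spend easy to aggregate. The sketch below assumes `tokenCount` is the total number of tokens used across all LLM calls for one query, and it uses a single blended price per million tokens. Both are assumptions: real pricing usually separates input and output tokens, and the function names are made up for this example.

```typescript
// Assumption: tokenCount is the total tokens across all LLM calls for one query.
interface HasTokenCount {
  tokenCount: number;
}

// Blended price: one rate for all tokens. Real providers usually price
// input and output tokens differently, so treat this as an estimate.
function estimateCostUsd(tokenCount: number, usdPerMillionTokens: number): number {
  return (tokenCount / 1e6) * usdPerMillionTokens;
}

function summarizeSpend(responses: HasTokenCount[], usdPerMillionTokens: number) {
  const totalTokens = responses.reduce((sum, r) => sum + r.tokenCount, 0);
  return {
    queries: responses.length,
    totalTokens,
    totalUsd: estimateCostUsd(totalTokens, usdPerMillionTokens),
    avgTokensPerQuery: responses.length === 0 ? 0 : totalTokens / responses.length,
  };
}
```

With a roll-your-own script you can compute the same numbers, but only if you remembered to record token usage from each call in the first place.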
The honest version
slorg is opinionated glue. If you agree with the opinion — that the plan is worth a round-trip, that scoring should be a separate pass, that intermediate artifacts should be visible — using slorg saves you the work of writing your own. If you disagree with the opinion or only need one of the three properties above, your own SearxNG-plus-LLM script is smaller, faster, and yours.
Either way, please run SearxNG. The world needs more search infrastructure that isn't a single vendor's API.
Background reading
- Most agentic search throws queries at Google and prays — the case for plan-first search.
- Planning the search: where the latency budget really goes — the cost breakdown.
- SearxNG — the meta-search engine slorg uses; also worth running on its own.