Commit Graph

2042 Commits

Author SHA1 Message Date
Bnyro 51b6fd4f23 [del] karmasearch: remove engine (cloudflared) (#6213)
The engine has been using very aggressive Cloudflare blocking for
a while now, whether or not a normal browser like Firefox
is used.

Closes: https://github.com/searxng/searxng/issues/5976
2026-06-07 06:49:09 +02:00
Markus Heiser 0429198415 [mod] swisscows WEB: ignore video results from the first page
On the first page of the WEB search, there are, among other things, sections for
videos and news.  The video results from these sections should not be used as
results in the WEB search of SearXNG.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-06 18:04:19 +02:00
Markus Heiser e7cf57e9ae [mod] swisscows engines: add language / region support
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-06 18:04:19 +02:00
Bnyro ed369ac0ec [feat] engines: add support for swisscows general 2026-06-06 18:04:19 +02:00
Bnyro 94bdbb5c63 [feat] engines: add support for swisscows videos 2026-06-06 18:04:19 +02:00
Bnyro 465b5229c6 [feat] engines: add swisscows news engine 2026-06-06 18:04:19 +02:00
Bnyro cbf97fd262 [feat] engines: add swisscows images engine
The implementation is basically a 1:1 port of the reverse-engineered
swisscows JavaScript code. (It was obfuscated, so I've restructured it
and replaced obfuscated variable names like "a", "o", "i" with idiomatic ones.)

```js
/*
e: "/v5/images/search"
t: {
	itemsCount: "50"
	locale: "de-DE"
	offset: "50"
	query: "test"
	spellcheck: "true"
}
*/
// HASH library used: https://github.com/h2non/jshashes
function generateNonceAndSignature(queryParams, urlPath) {
  // urlPath = "/v5/images/search"
  // sort keys alphabetically and join to query string
  let queryStringSorted = '?' + U().stringify(queryParams, {
    arrayFormat: 'repeat',
    allowDots: !0
  }).split('&').map(e => {
    let[key, value] = e.split('=');
    return [key, decodeURIComponent(value)]
  }).sort((e, t) => e[0].localeCompare(t[0])).map(e => e.join('=')).join('&');

  // Caesar shift (13 by default) that also swaps the case of every letter;
  // digits and symbols pass through unchanged
  function caesarShift(str, offset = 13) {
    const alphabet = 'abcdefghijklmnopqrstuvwxyz';
    let result = [];
    for (let a = 0; a < str.length; a++) {
      let c = str[a],
        alphabetIndex = alphabet.indexOf(c.toLowerCase());
      if (-1 !== alphabetIndex) {
        alphabetIndex += offset;
        while (alphabetIndex >= alphabet.length) alphabetIndex -= alphabet.length;
        c = c === c.toUpperCase() ? alphabet[alphabetIndex] : alphabet[alphabetIndex].toUpperCase()
      }
      result.push(c)
    }
    return result.join('')
  }
  const r = new (sha256Instance()).SHA256;
  const random = randomString(32);
  const randomShifted = caesarShift(random);
  let to_hash = [urlPath, queryStringSorted, randomShifted].join('');
  let signature = r.b64(to_hash);
  signature = signature.replace(/=/g, '').replace(/\+/g, '-').replace(/\//g, '_');
  return {
    nonce: random,
    signature: signature
  }
}

function randomString(length) {
  let t = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~',
  n = '';
  for (let r = 0; r < length; r++) n += t.charAt(Math.floor(Math.random() * t.length));
  return n
}
```
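
For comparison, the signing logic above might look roughly like this in Python. This is a minimal sketch, not the engine's actual code: the function names and the optional `nonce` argument are hypothetical, the query string is built from raw (undecoded) values sorted by key, and the hash library is assumed to UTF-8 encode its input before hashing.

```python
import base64
import codecs
import hashlib
import secrets
import string

# Same alphabet as randomString() in the JavaScript above.
NONCE_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-._~"


def random_string(length: int) -> str:
    """Random nonce; `secrets` is used here instead of Math.random()."""
    return "".join(secrets.choice(NONCE_ALPHABET) for _ in range(length))


def caesar_shift(text: str) -> str:
    """ROT13 followed by a case swap, like caesarShift() with offset 13."""
    return codecs.encode(text, "rot13").swapcase()


def sign_request(url_path: str, params: dict, nonce: str = "") -> dict:
    """Return the nonce and URL-safe, unpadded base64 SHA-256 signature."""
    nonce = nonce or random_string(32)
    # Assumption: plain key order matches localeCompare for these keys.
    query = "?" + "&".join(f"{key}={value}" for key, value in sorted(params.items()))
    to_hash = url_path + query + caesar_shift(nonce)
    digest = hashlib.sha256(to_hash.encode("utf-8")).digest()
    signature = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return {"nonce": nonce, "signature": signature}
```

Passing a fixed `nonce` makes the result reproducible, which helps when comparing against values captured from the browser.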
2026-06-06 18:04:19 +02:00
Bnyro 26fa181b84 [feat] gmx: detect captchas 2026-06-05 08:07:30 +02:00
Bnyro 0f35ef7cd6 [feat] json engine: add option to not send page num on first page 2026-06-05 08:04:49 +02:00
Bnyro b1ae576b2d [fix] xpath engine: add missing send_page_num_on_first_page docstring 2026-06-05 08:04:49 +02:00
Bnyro 5bae05514b [feat] engines: add zapmeta general search engine 2026-06-03 22:38:59 +02:00
Bnyro 253dc86c10 [fix] duckduckgo: image requests get blocked 2026-06-03 22:37:13 +02:00
Bnyro 3066bc19eb [fix] public domain image archive: fails to extract API url 2026-06-03 22:35:21 +02:00
Austin-Olacsi e964708c00 [fix] bilibili engine: fix Referer and add Accept HTTP header (#6189) 2026-06-02 06:06:31 +02:00
Bnyro 7159b8aed3 [feat] marginalia: add support for pagination 2026-05-31 12:54:53 +02:00
Bnyro 246f5a5499 [mod] svgrepo: remove engine
- SVGRepo uses Cloudflare for every session, no matter
if you're opening it in a browser or not
2026-05-31 12:54:32 +02:00
Bnyro 780ee32564 [fix] pexels: fix engine crashes with SearxEngineAccessDeniedException 2026-05-29 22:03:22 +02:00
Bnyro 0037d43d87 [fix] aol: disable http2 to prevent request fingerprinting (#6149) 2026-05-26 12:15:35 +02:00
Bnyro f5be39e245 [mod] podcastindex: remove engine (#6140)
PodcastIndex.org started using a Proof-of-Work JavaScript
challenge whose results are sent as `X-Pow-*` request headers.
Although it is technically possible to re-implement the
PoW challenge in Python, it's likely impossible to maintain
because

- the actual Proof of Work logic might change very often

- the whole idea of the Proof of Work challenge is to use
  a "big" amount of resources (about 1s on my PC); so executing the challenge
  would almost block all other work on the SearXNG instance

At first glance, the challenge looks very similar to what
Anubis does, because it also uses SHA256 hashes.
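
To illustrate why such a challenge is costly by design, here is a generic SHA256 proof-of-work loop. It is only a sketch under assumptions: the challenge string, the "leading zero hex digits" rule and the difficulty are hypothetical and are not taken from PodcastIndex or Anubis.

```python
import hashlib
import time


def solve_pow(challenge: str, difficulty: int) -> int:
    """Return the first nonce whose SHA-256 hex digest starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while not hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest().startswith(prefix):
        nonce += 1
    return nonce


# Each extra zero digit multiplies the expected work by 16.
start = time.perf_counter()
nonce = solve_pow("example-challenge", 4)
elapsed = time.perf_counter() - start
print(f"nonce={nonce} found in {elapsed:.3f}s")
```

The loop is CPU-bound, so running it inside a request handler would occupy a worker for the whole duration of the search for a valid nonce.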
2026-05-26 11:53:20 +02:00
Bnyro 1574939441 [fix] json, xpath engine: rename safe_search_support option to safesearch (#6143) 2026-05-26 11:38:07 +02:00
Markus Heiser 3db8b424a8 [mod] engine flaticon: migrate from LegacyResult to Image (#6142)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-05-26 09:19:17 +02:00
Bnyro cb4b70ac50 [fix] qwant news: results don't have any descriptions (#6135)
BTW: fix some typecast issues
2026-05-25 18:04:14 +02:00
Markus Heiser e29e861e2c [fix] bing engines - geoblocking in China (#6134)
In regions like China, the domain must be adjusted to avoid a redirect.

- https://github.com/searxng/searxng/issues/5243
- https://github.com/searxng/searxng/pull/5324
- https://github.com/searxng/searxng/pull/6133

Suggested / tested by @hubutui in https://github.com/searxng/searxng/pull/6133#issuecomment-4534637069

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-05-25 17:05:08 +02:00
Markus Heiser 89b89a88fe [mod] engine: MyMemory Translated - typification and html to text (#6132)
The implementation is normalized, type annotations are applied, and the results
are freed from the HTML markup (which is partially present).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-05-25 16:38:06 +02:00
Bnyro 46071a011a [mod] qwant: remove web lite and improve request spoofing (#6127)
- https://lite.qwant.com seems to be dead.
- The request parameters were changed to match the ones from the Qwant website.
- Qwant is now set to inactive by default due to its strict rate-limits
2026-05-25 15:46:40 +02:00
Bnyro b0d8af96bf [feat] engines: add flaticon icons engine (#6122) 2026-05-25 13:41:44 +02:00
Markus Heiser dd27fce3b7 [unbloat] drop meaningless field `number_of_results_xpath` from results (#6130)
In the result list, ``number_of_results`` indicates the number of hits in the
index; it does not indicate how many results are in the answer.

In the past, search engines such as google or ddg had an indication on the first
page of a search term of how many hits there were for this term in total in
their index.

This info was added up in SearXNG and delivered under ``number_of_results``.
Nowadays the search engines no longer indicate how many hits there are in the
index and so this field in SearXNG is also superfluous.

- https://github.com/searxng/searxng/issues/2457#issuecomment-2566181574
- https://github.com/searxng/searxng/issues/2987
- https://github.com/searxng/searxng/issues/5034

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-05-25 12:43:02 +02:00
Markus Heiser efc305b7f9 [mod] normalize variable name for the max number of results per request (#6131)

In the past, we have used different names for the variable that specifies the
maximum number of hits in the outgoing request.

- ``page_size``
- ``number_of_results``
- ``nb_per_page``

Since *page_size* is the most accurate term and is also used in the XPath
engines, all other engines are adjusted accordingly in this patch. The
documentation is updated to match.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-05-25 12:41:31 +02:00
Bnyro 323ce76004 [fix] startpage: all requests get blocked with CAPTCHA
Changes:
- Setting the "abp" query parameter causes instant blocks; it's no longer
used at Startpage
- The safesearch map changed for both the request form and the cookies. As
we were sending invalid values, that also made it easier to detect us
2026-05-23 09:43:17 +02:00
Bnyro dfc2da707b [fix] mojeek: access denied because of wrong request parameters 2026-05-23 09:43:03 +02:00
Bnyro fc90c5b09c [fix] yep: api path changed 2026-05-22 21:38:26 +02:00
Markus Heiser d3deacc6d4 [mod] engine fyyd: typing added, no functional change (#6103)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-05-21 12:06:55 +02:00
Bnyro d8f74af3d1 [mod] engine 500px: calc cursor instead of relying on pageInfo (#6091) 2026-05-21 07:31:13 +02:00
Bnyro 24b1a1b6a8 [feat] engines: add 500px.com engine (#6091) 2026-05-21 07:31:13 +02:00
Bnyro d7e8b7cd18 [feat] engines: add cara.app engine (#6092) 2026-05-17 18:39:47 +02:00
Markus Heiser f26e450778 [fix] engine: google-news - Google pushed a frontend update (#5984)
Around March 9 - 10, 2026, Google pushed a frontend update to Google News that
completely changed the HTML structure of search results.

This is a complete overhaul of the Google News engine.

- The real URL is encoded in the "jslog" attribute.
  @SeriousConcept1134: the attribute is a base64 encoded JSON
- CEID list is updated
- Type annotations were extended
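
Decoding a base64 encoded JSON value like the one mentioned for "jslog" could look roughly like this. This is a hedged sketch: it assumes the base64 part has already been isolated from the attribute, the function name is hypothetical, and the payload layout (a nested structure containing the URL as a string) is an assumption rather than Google's documented format.

```python
import base64
import json


def url_from_jslog(encoded: str):
    """Return the first http(s) string found in a base64 encoded JSON payload."""
    padded = encoded + "=" * (-len(encoded) % 4)  # restore stripped padding
    data = json.loads(base64.urlsafe_b64decode(padded))
    stack = [data]
    while stack:  # depth-first walk, in document order
        item = stack.pop()
        if isinstance(item, str) and item.startswith(("http://", "https://")):
            return item
        if isinstance(item, list):
            stack.extend(reversed(item))
        elif isinstance(item, dict):
            stack.extend(reversed(list(item.values())))
    return None
```

Returning `None` instead of raising lets the caller skip a single result whose payload holds no URL.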

Related:

- https://github.com/searxng/searxng/issues/5852#issuecomment-4254438184
- https://github.com/searxng/searxng/issues/5852#issuecomment-4265598833

Closes: https://github.com/searxng/searxng/issues/5852
Suggested-by: SeriousConcept1134

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-05-17 15:27:00 +02:00
Bnyro de49d27846 [fix] yandex images: crashes when parsing images without fallback source (#6084) 2026-05-16 15:53:23 +02:00
Bnyro 16a7537bfd [chore] engines: remove ask.com (service was discontinued) (#6083)
Source: https://www.ask.com/
2026-05-16 15:36:47 +02:00
Markus Heiser afafca93f3 [fix] engine wikidata - fails to initialize with HTTP 403 (#6081)
In order not to be blocked any further, the WIKIDATA_PROPERTIES are cached, which
drastically reduces the number of WD-SQL requests.
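
The caching idea can be sketched with a small time-based cache. This is only an illustration: the class name, the `fetch` callback and the one-day TTL are hypothetical and do not reflect how SearXNG stores the properties.

```python
import time
from typing import Callable, Dict, Optional


class PropertyCache:
    """Keep the result of an expensive fetch and refresh it after `ttl` seconds."""

    def __init__(self, fetch: Callable[[], Dict[str, str]], ttl: float = 86400.0):
        self._fetch = fetch
        self._ttl = ttl
        self._value: Optional[Dict[str, str]] = None
        self._expires = 0.0

    def get(self) -> Dict[str, str]:
        now = time.monotonic()
        if self._value is None or now >= self._expires:
            self._value = self._fetch()  # the only place a remote query happens
            self._expires = now + self._ttl
        return self._value
```

With such a cache, repeated calls to `get()` within the TTL trigger a single upstream query.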

BTW: improve type hints

Closes: https://github.com/searxng/searxng/issues/6051

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-05-15 16:21:47 +02:00
Arnaud Jeannin 790683bbd7 [fix] google: improve CAPTCHA detection (#5922)
- Detect HTTP 302 responses (Google redirecting to /sorry/index
  without the HTTP client following the redirect)
- Detect short HTML responses (<2000 bytes) containing "/sorry/"
  links (meta-refresh or JS redirect variants)

Instances with rotating IPs can set `suspended_times.SearxEngineCaptcha` to
0 in the search settings [1]; the next request will then typically use a
different outgoing IP when rotating proxies are configured.

[1] https://docs.searxng.org/admin/settings/settings_search.html
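
The two checks listed above can be sketched as a single predicate. This is a simplified illustration, not the engine's code: the function name and its parameters are hypothetical, and checking the `Location` value for "/sorry/" is an assumption about how the 302 case is recognised.

```python
def looks_like_captcha(status: int, location: str, body: str) -> bool:
    """Heuristic CAPTCHA check based on the redirect target and a short body."""
    # Case 1: a redirect to the /sorry/ page that the client did not follow.
    if status == 302 and "/sorry/" in location:
        return True
    # Case 2: a short HTML page (<2000 bytes) that links to /sorry/.
    return len(body.encode("utf-8")) < 2000 and "/sorry/" in body
```

The size limit keeps regular result pages that merely mention "/sorry/" somewhere in their markup from being flagged.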
2026-05-15 09:25:13 +02:00
Bnyro 6cee4b8947 [feat] yep: add support for selecting search language (#6075) 2026-05-15 08:37:11 +02:00
Tommaso Colella 849e17e431 [fix] 360search: improve empty results set management and increase engine timeout (#6058) 2026-05-09 08:35:21 +02:00
Markus Heiser 130cea600d [fix] Startpage engine fails when date field is string (not integer: TypeError) (#6053)
To avoid aborting with an error, type and value errors are caught; the
publishDate then cannot be determined, but the result item is kept.
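
A minimal sketch of that error handling, assuming the date normally arrives as a Unix timestamp (the function name is hypothetical):

```python
from datetime import datetime, timezone


def parse_published_date(raw):
    """Return a datetime for a numeric timestamp, or None if it cannot be parsed."""
    try:
        return datetime.fromtimestamp(raw, tz=timezone.utc)
    except (TypeError, ValueError):
        # e.g. the field is a string: keep the result, just without a date
        return None
```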

[1] https://github.com/searxng/searxng/pull/5980/changes#r3091479655

Replaces the PRs:

- https://github.com/searxng/searxng/pull/5980
- https://github.com/searxng/searxng/pull/6006

Closes: https://github.com/searxng/searxng/issues/5979

Suggested-by: @Bnyro [1]

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-05-07 14:56:14 +02:00
Fabian Freund a480560371 [fix] wikidata: crashes when querying due to missing escaping of quotation marks 2026-05-06 21:13:27 +02:00
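
As an illustration of the kind of escaping involved, the following sketch escapes backslashes and double quotes before a user query is placed inside a double-quoted string literal. The function name is hypothetical and the actual patch may escape differently.

```python
def escape_string_literal(text: str) -> str:
    """Escape backslashes first, then double quotes, for a "..." literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')
```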
Markus Heiser 36bcd6b551 [fix] engine: wikidata - improvement of typing (#5993)
The type checker in my IDE gave up after more than 500 errors. After this
patch there are still 125 complaints; however, it's an improvement and a better
starting point.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-05-06 08:39:55 +02:00
Markus Heiser 8fabaf86b6 [fix] engine: wikidata - initialization fails with KeyError (#5993)
The response to QUERY_PROPERTY_NAMES has changed; entries without the `name`
field are now also returned.
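
Guarding against the missing field could look like the sketch below. The row layout (an `item` and a `name` key, each holding a `value`) is an assumption for illustration, as is the function name.

```python
def collect_property_names(bindings):
    """Map item values to names, skipping rows that have no `name` field."""
    names = {}
    for row in bindings:
        name = row.get("name")
        if name is None:
            continue  # previously row["name"] raised KeyError here
        names[row["item"]["value"]] = name["value"]
    return names
```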

Closes: https://github.com/searxng/searxng/issues/5982

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-05-06 08:39:55 +02:00
Bnyro d501b0420a [mod] yep: fix engine due to new API layout (#6048)
Apparently, YEP no longer supports images and news search.

Also, the naming of the query parameters changed a bit.

Closes: https://github.com/searxng/searxng/issues/6047
2026-05-06 08:17:19 +02:00
Sai Asish Y 0ac5254b8e [fix] mwmbl: crash if there's no result description available 2026-05-05 21:40:20 +02:00
Bnyro 74f1ca203f [fix] gmx: crash when there are no related query suggestions 2026-04-22 09:56:52 +02:00
Bnyro 8579974f5e [feat] engines: add GMX search engine (#5967)
Notes:
- Safesearch doesn't seem to work properly?
- In theory multiple languages are supported, but even in the web UI, they don't work properly
- Possibly, we could cache the request hashes (h query parameter), I'm not sure if it ever changes
2026-04-17 07:00:21 +02:00