72 Commits

Author SHA1 Message Date
dependabot[bot] 75c1b1dade [upd] web-client (simple): Bump less (#6289)
Bumps the minor group in /client/simple with 1 update: [less](https://github.com/less/less.js).


Updates `less` from 4.6.4 to 4.6.6
- [Release notes](https://github.com/less/less.js/releases)
- [Changelog](https://github.com/less/less.js/blob/master/CHANGELOG.md)
- [Commits](https://github.com/less/less.js/commits/v4.6.6)

---
updated-dependencies:
- dependency-name: less
  dependency-version: 4.6.6
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-22 08:03:15 +02:00
Bnyro 097ab64c70 [del] aol: remove engine (eol) (#6299) 2026-06-22 07:32:23 +02:00
dependabot[bot] 0e9f513efc [upd] pypi: Bump the minor group with 5 updates (#6291)
Bumps the minor group with 5 updates:

| Package | From | To |
| --- | --- | --- |
| [certifi](https://github.com/certifi/python-certifi) | `2026.5.20` | `2026.6.17` |
| [pylint](https://github.com/pylint-dev/pylint) | `4.0.5` | `4.0.6` |
| [selenium](https://github.com/SeleniumHQ/Selenium) | `4.44.0` | `4.45.0` |
| [sphinxcontrib-programoutput](https://github.com/OpenNTI/sphinxcontrib-programoutput) | `0.19` | `0.20` |
| [basedpyright](https://github.com/detachhead/basedpyright) | `1.39.7` | `1.39.8` |
2026-06-22 07:30:41 +02:00
Bnyro fd42d4fda1 [fix] chatnoir: don't re-use/cache session keys
They're invalidated very quickly, so even caching them for
60 seconds results in a lot of unauthorized access errors.
2026-06-20 21:52:14 +02:00
dependabot[bot] 5c38d2feab [upd] web-client (simple): Bump @types/node in /client/simple (#6290)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 25.9.3 to 26.0.0.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-version: 26.0.0
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-19 16:58:47 +02:00
dependabot[bot] 38b678c493 [upd] github-actions: Bump actions/checkout from 6.0.3 to 7.0.0 (#6288)
Bumps [actions/checkout](https://github.com/actions/checkout) from 6.0.3 to 7.0.0.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/df4cb1c069e1874edd31b4311f1884172cec0e10...9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 7.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-19 16:58:27 +02:00
github-actions[bot] fe1848673f [l10n] update translations from Weblate (#6293)
0f1c1d570 - 2026-06-18 - lugged9922 <lugged9922@noreply.codeberg.org>
81d208307 - 2026-06-18 - Raithlin <raithlin@noreply.codeberg.org>
bf09069e8 - 2026-06-17 - return42 <return42@noreply.codeberg.org>
c010ba929 - 2026-06-17 - return42 <return42@noreply.codeberg.org>
f92ba4e98 - 2026-06-17 - M Alif fadlan <maliffadlan@gmail.com>
442e504e2 - 2026-06-17 - return42 <return42@noreply.codeberg.org>
e2ffb2275 - 2026-06-17 - return42 <return42@noreply.codeberg.org>
cc26d0794 - 2026-06-17 - return42 <return42@noreply.codeberg.org>
9639f4e84 - 2026-06-17 - return42 <return42@noreply.codeberg.org>
63059d4e7 - 2026-06-15 - AndersNordh <andersnordh@noreply.codeberg.org>
460c5260f - 2026-06-15 - kratos <makesocialfoss32@keemail.me>
b212184d9 - 2026-06-16 - ghose <ghose@noreply.codeberg.org>
c9ac8e6d7 - 2026-06-15 - AndersNordh <andersnordh@noreply.codeberg.org>
cc1f5ab59 - 2026-06-15 - Fjuro <fjuro@noreply.codeberg.org>
84f985a9f - 2026-06-14 - Outbreak2096 <outbreak2096@noreply.codeberg.org>
bdb7e25bc - 2026-06-13 - SomeTr <sometr@noreply.codeberg.org>
c3eac4c37 - 2026-06-14 - Stephan-P <stephan-p@noreply.codeberg.org>
d94ab494b - 2026-06-13 - Priit Jõerüüt <jrtcdbrg@noreply.codeberg.org>
3387bab27 - 2026-06-13 - gallegonovato <gallegonovato@noreply.codeberg.org>
2026-06-19 15:11:48 +02:00
Bnyro 8b10095e8a [fix] settings.yml: explicitly set category for xpath engines (ayo, gabanza, zapmeta, abcnyheter) (#6282) 2026-06-19 09:10:27 +02:00
Jayant Sharma b5ef7ec8f3 [fix] calculator: move math.parse inside try-catch (#6278) (#6280)
* [fix] calculator: move math.parse inside try-catch (#6278)

* build static

---------

Co-authored-by: Ivan Gabaldon <igabaldon@inetol.net>
2026-06-18 17:36:47 +02:00
Bnyro bd73cc09ea [feat] engines: add support for search.ch/web (Swiss) 2026-06-18 14:02:52 +02:00
Butui Hu 4dfdc822cf [fix] engines: chinaso: handle empty upstream results gracefully (#6266)
Signed-off-by: Hu Butui <hot123tea123@gmail.com>
2026-06-17 19:36:22 +02:00
Ivan Gabaldon 502c820a25 [fix] container: setup minimal (#6268)
Start minimal, use defaults, and extend later on. The templates are no longer
checked for changes, which was confusing and annoying after a while.

See: https://github.com/searxng/searxng/issues/6261#issuecomment-4716008282
2026-06-16 15:32:47 +02:00
Markus Heiser 4fb49b4498 [chore] add DeprecationWarning for obsolete engine.about.language property (#6265)
The old property should still be supported for a transitional period; the
reasons for this and the further procedure are discussed in [1].

[1] https://github.com/searxng/searxng/issues/6261
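
A minimal sketch of what such a transitional shim could look like. The helper
name `migrate_about_language` and the plain-dict engine config are assumptions
made for illustration; the actual SearXNG implementation may differ.

```python
import warnings


def migrate_about_language(engine_cfg: dict) -> dict:
    """Move a legacy ``about.language`` value to ``language``.

    Hypothetical helper, not the SearXNG API.
    """
    about = engine_cfg.get("about", {})
    if "language" in about:
        warnings.warn(
            "engine.about.language is obsolete, use engine.language instead",
            DeprecationWarning,
            stacklevel=2,
        )
        legacy = about.pop("language")
        # An explicitly configured engine.language wins over the legacy value.
        engine_cfg.setdefault("language", legacy)
    return engine_cfg
```

Emitting a `DeprecationWarning` rather than raising keeps old configurations
working during the transition while still making the obsolete property visible.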

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-16 10:31:21 +02:00
Markus Heiser cf1410af8d [fix] set language_support for engines with languages in traits (#6258)
In the past, the engine option ``language_support`` was not consistently
maintained; with this patch, a ValueError is now thrown if an engine has
languages in its traits but language_support is not set to True.
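
The check described above could be sketched roughly as follows. The function
name `check_language_support` and its arguments are hypothetical and only
illustrate the rule; they are not taken from the SearXNG code base.

```python
def check_language_support(name: str, trait_languages: dict, language_support: bool) -> None:
    """Raise if an engine has languages in its traits but language_support is off.

    Hypothetical helper illustrating the consistency rule.
    """
    if trait_languages and not language_support:
        raise ValueError(
            f"engine {name!r}: traits define languages, but language_support is not True"
        )
```

Failing loudly at load time means an inconsistent engine definition is caught
during startup instead of silently ignoring the user's language selection.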

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-15 10:52:00 +02:00
Markus Heiser 6c9dcd4242 [chore] complete and normalize the attributes of engine objects (#6258)
Drop outdated engine attributes: supported_languages, language_aliases

Complete, normalize and document the type definitions for the engine-module and
engine-class.

For the ``engine.about`` section of the configuration, a type check is performed
based on structure ``searx.enginelib.EngineAbout``.

The property ``engine.about.language`` no longer exists; existing values have
been migrated to ``engine.language``.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-15 10:52:00 +02:00
Bnyro b3e08f2a44 [feat] engines: add searchzee engine (general, news)
The results seem to be from Brave (i.e. they are exactly
the same). But it doesn't have any strict rate-limits,
so that's nice.

The news category supports time ranges, but apart from that, unfortunately it
doesn't support any advanced features like safesearch or languages.
2026-06-14 09:59:39 +02:00
Bnyro a857041afc [feat] engines: add support for search.ayo.de 2026-06-14 09:32:58 +02:00
Bnyro 31a8a22aa6 [feat] engines: add German tonline engine (general, news, images, videos) (#6250)
T-Online_ is a German news portal.

It gets its web results from Google, image results from Flickr and video results
from YouTube.

For images and videos, it additionally returns results from its
news catalog. However, for pagination we have to specify the result
type (e.g. either videos from YouTube or from T-Online), so we use
flickr/youtube there instead of tonline because the tonline results
are usually irrelevant.
2026-06-14 08:46:07 +02:00
Bnyro a29cda858c [feat] engines: add luxxle (general, news, images, videos)
Add support for https://luxxle.com

Localization is not yet supported because it doesn't seem to work on their
website either: no matter which language I select, it only returns English web
results.
2026-06-13 20:39:31 +02:00
Bnyro 2e10a2f614 [feat] engines: add rawweb engine (foss, hand-indexed blogs) (#6234)
RawWeb is a search engine for personal websites / blog posts.
It has its own index and the personal websites were selected
by hand. Results are quite good for what it is imo. [^1]

[^1]: https://github.com/0x2E/RawWeb.org
2026-06-13 19:09:58 +02:00
Bnyro 2100eb04e1 [feat] engines: add reloado engine (general, german) (#6233)
- adds support for https://reloado.com (german)
- as it has its own index, the results are hit or miss and mostly German, 
  but still worth integrating imo
2026-06-13 19:06:18 +02:00
Bnyro c58391d673 [feat] engines: add fastbot engine (general) (#6232)
- adds support for https://fastbot.de
- the results are really fast and mostly in English (even though it's a German
  engine)
2026-06-13 19:04:39 +02:00
Bnyro c3284c8238 [chore] make data.traits (#6211) 2026-06-13 18:37:57 +02:00
Bnyro 290d3e0c6a [feat] engines: add privacywall engine (#6211)
- add https://privacywall.org support
- the engine seems to use the Bing index, but not 100% sure
- it claims to be privacy friendly, but it's not really by itself [1]

[1]: https://discuss.privacyguides.net/t/how-is-privacy-wall-search-engine/29486
2026-06-13 18:37:57 +02:00
Bnyro 0608dfa4d1 [feat] autocomplete: add privacywall autocompleter (#6211) 2026-06-13 18:37:57 +02:00
Bnyro 1184b3212f [feat] engines: add podchaser podcast engine (#6202)
- add podchaser podcast engine
- the motivation is that podcastindex had to be removed, see #6140
2026-06-13 18:04:21 +02:00
Bnyro 65e0e4c069 [feat] engines: add vuhuv engine (#6196) 2026-06-13 17:52:43 +02:00
Bnyro d14fa1f6e2 [chore] data: add resulthunter engine traits 2026-06-13 17:21:52 +02:00
Bnyro 2d248704fa [feat] engines: add resulthunter 2026-06-13 17:21:52 +02:00
Markus Heiser 3096b1218f [mod] add type definitions for engine's "about" section (#6231)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-13 17:05:59 +02:00
Bnyro 82a8a90230 [feat] engines: add abcnyheter engine (general, norway) (#6231)
Add support for https://startsiden.abcnyheter.no, a Norwegian search engine
that probably uses Google or Bing? idk it also returns English results, but
e.g. ``test`` returns mostly results from Norway.
2026-06-13 17:05:59 +02:00
Bnyro e3d4fbe570 [feat] engines: add s1search general engine (#6186)
S1Search provides various different search services, which all seem
to be somewhat based on Google and Yahoo. The site looks kinda suspicious,
but the results are fine.

You can find a list of their engines by using a subdomain finder like
https://web-toolbox.dev/en/tools/subdomain-lookup and search for `s1search.co`.
2026-06-13 14:18:04 +02:00
Bnyro 031747f29e [feat] engines: add chatnoir general engine (#6183)
Chatnoir is an open source search engine developed by universities, based on
CommonCrawl (and others).  It's commented out by default - we don't want to
overload the universities with bot traffic that targets SearXNG (sad truth why
we can't have nice things anymore)
2026-06-13 13:52:01 +02:00
Markus Heiser e3bd7f5df1 [mod] image results: add list of alternative formats (#6153)
* [mod] template images.html: reformatted for readability (no func change)

In preparation for upcoming changes, the template is being reformatted for
better readability; no functional changes are being made.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>

* [mod] image results: add list of alternative formats

To test alternatives formats apply patch from below, query ``!flaticon bmw`` and
open the detail view for the image.

    diff --git a/searx/engines/flaticon.py b/searx/engines/flaticon.py
    index 06b6a8e25..d88388705 100644
    --- a/searx/engines/flaticon.py
    +++ b/searx/engines/flaticon.py
    @@ -8,7 +8,7 @@ from urllib.parse import urlencode

     import typing as t

    -from searx.result_types import EngineResults
    +from searx.result_types import EngineResults, ImageRef

     if t.TYPE_CHECKING:
         from searx.extended_types import SXNG_Response
    @@ -61,6 +61,14 @@ def response(resp: "SXNG_Response"):
                     thumbnail_src=_fix_url(result["png"]),
                     img_src=_fix_url(result["png512"]),
                     author=result["team_name"],
    +                formats=[
    +                    ImageRef(label="PNG 100x100", url="https://example.org/test.png", subtype="png"),
    +                    ImageRef(label="SVG", url="https://example.org/test.svg", subtype="svg+xml"),
    +                    ImageRef(url="https://example.org/test.jpg", subtype="jpeg"),
    +                    ImageRef(url="https://example.org/test.bmp", subtype="bmp"),
    +                    ImageRef(url="https://example.org/test.ico", subtype="x-icon"),
    +                    ImageRef(url="https://example.org/test.tif", subtype="tiff"),
    +                ],
                 )
             )

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>

---------

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-13 13:28:05 +02:00
Bnyro b48205b384 [fix] tiger: crashes on empty result (#6251)
e.g. when searching for "!tiger pottering github", it crashes.
not really sure why - the problem is that the HTML doesn't
really use descriptive classes or ids, only Tailwind,
so it's very hard to select only the results HTML.
2026-06-13 09:37:43 +02:00
Bnyro 8522638b00 [fix] duckduckgo web: result title contains html (#6253) 2026-06-13 09:35:14 +02:00
dependabot[bot] ab81c77533 [upd] pypi: Bump the minor group with 2 updates (#6247)
Bumps the minor group with 2 updates: [granian](https://github.com/emmett-framework/granian) and [basedpyright](https://github.com/detachhead/basedpyright).


Updates `granian` from 2.7.5 to 2.7.6
- [Release notes](https://github.com/emmett-framework/granian/releases)
- [Commits](https://github.com/emmett-framework/granian/compare/v2.7.5...v2.7.6)

Updates `basedpyright` from 1.39.6 to 1.39.7
- [Release notes](https://github.com/detachhead/basedpyright/releases)
- [Commits](https://github.com/detachhead/basedpyright/compare/v1.39.6...v1.39.7)

---
updated-dependencies:
- dependency-name: granian
  dependency-version: 2.7.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: minor
- dependency-name: basedpyright
  dependency-version: 1.39.7
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-12 22:42:26 +02:00
dependabot[bot] cc196f2a5b [upd] web-client (simple): Bump the minor group across 1 directory with 4 updates (#6249)
Bumps the minor group with 4 updates in the /client/simple directory: [@biomejs/biome](https://github.com/biomejs/biome/tree/HEAD/packages/@biomejs/biome), [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node), [sharp](https://github.com/lovell/sharp) and [stylelint](https://github.com/stylelint/stylelint).

Updates `@biomejs/biome` from 2.4.16 to 2.5.0
- [Release notes](https://github.com/biomejs/biome/releases)
- [Changelog](https://github.com/biomejs/biome/blob/main/packages/@biomejs/biome/CHANGELOG.md)
- [Commits](https://github.com/biomejs/biome/commits/@biomejs/biome@2.5.0/packages/@biomejs/biome)

Updates `@types/node` from 25.9.1 to 25.9.3
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

Updates `sharp` from 0.34.5 to 0.35.1
- [Release notes](https://github.com/lovell/sharp/releases)
- [Commits](https://github.com/lovell/sharp/compare/v0.34.5...v0.35.1)

Updates `stylelint` from 17.12.0 to 17.13.0
- [Release notes](https://github.com/stylelint/stylelint/releases)
- [Changelog](https://github.com/stylelint/stylelint/blob/main/CHANGELOG.md)
- [Commits](https://github.com/stylelint/stylelint/compare/17.12.0...17.13.0)

---
updated-dependencies:
- dependency-name: "@biomejs/biome"
  dependency-version: 2.5.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: minor
- dependency-name: "@types/node"
  dependency-version: 25.9.3
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: minor
- dependency-name: sharp
  dependency-version: 0.35.1
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: minor
- dependency-name: stylelint
  dependency-version: 17.13.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-12 20:40:51 +02:00
dependabot[bot] dd3022d680 [upd] web-client (simple): Bump sort-package-json in /client/simple (#6246)
Bumps [sort-package-json](https://github.com/keithamus/sort-package-json) from 3.6.1 to 4.0.0.
- [Release notes](https://github.com/keithamus/sort-package-json/releases)
- [Commits](https://github.com/keithamus/sort-package-json/compare/v3.6.1...v4.0.0)

---
updated-dependencies:
- dependency-name: sort-package-json
  dependency-version: 4.0.0
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-12 19:51:22 +02:00
Bnyro de8a3de15a [feat] engines: add support for Kagi (requires API key) 2026-06-12 14:48:47 +02:00
Bnyro 4dd0bf4867 [fix] fireball: all results are shown in general category 2026-06-11 17:30:46 +02:00
Bnyro 1957876dd6 [feat] engines: add dogpile (general, news, images, videos)
Add support for the Dogpile search engine, found at:

https://seirdy.one/posts/2021/03/10/search-engines-with-own-indexes/

It seems to use the same index as startpage because results are similar and they
share the ``qadf`` (Safe-Search) request parameter.
2026-06-11 16:09:13 +02:00
Bnyro ab13451086 [mod] odysee: move format_duration helper into utils.py 2026-06-11 16:09:13 +02:00
Bnyro a1490676e3 [mod] fireball: small fixup from code review (#6240)
Co-authored-by: Markus Heiser <markus.heiser@darmarIT.de>
2026-06-11 12:09:57 +02:00
Bnyro 3a382cb3f3 [chore] helix config: enable pylint and use black via pylsp 2026-06-11 11:03:38 +02:00
Ivan Gabaldon 9d9d605b15 [fix] ci: use install buildhost script (#6105) 2026-06-11 08:23:37 +02:00
Bnyro de03f4eb11 [feat] engines: add fireball engine (general, news, videos) 2026-06-10 21:00:49 +02:00
Markus Heiser 00f7c68a6f [chore] drop emacs' obsolete .dir-locals template (#6236)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-10 17:38:19 +02:00
Bnyro 41c98b3b41 [chore] devops: add languages config for helix editor
The default Helix configuration for Python is different,
so the pylint warnings aren't shown and the formatter
re-formats files by accident when you edit an existing file.

Therefore, this commit adds `python` language configuration
to ease developing SearXNG with Helix Editor [^1].

[^1]: https://helix-editor.com
2026-06-10 17:38:01 +02:00
Bnyro f4c63c8eb0 [feat] engines: add duckduckgo web engine as alternative to html.duckduckgo.com
html.duckduckgo.com captchas all my IPs very fast. I figured out that using
duckduckgo.com works even if html.duckduckgo.com is captcha-ed, hence adding
support for duckduckgo.com's general web search here.

This implementation fetches the link to the first API page
(i.e. ``links.duckduckgo.com/d.js?...``) from duckduckgo.com and uses the ``n``
parameter of the API to fetch all subsequent pages.

This also means that it's not possible to immediately search for the third
page - the first and the second page would need to be loaded first.

The reason why we can't just normally use the `vqd` value is that the API URLs
require an additional parameter `dp` which seems generated at server-side, so we
can't build it ourselves and must scrape it from the HTML pages.
2026-06-10 16:49:56 +02:00
Markus Heiser 26801e92af [fix] sqlitedb: create DB Schema (DDL) during app initialization (hardening) (#6187)
The initialization of the DB schema ("base schema") has so far been done on
demand, which causes race conditions with competing threads and processes.

The DDL statements for creating the "base schema" are now executed as part of
the initialization of the app.

Further improvements were made to harden the database applications:

- Wikidata & Radio-Browser engine perform their initialization only once (so far
  the initialization was carried out in each thread/process).

- If multiple processes try to set DB's WAL mode when opening the DB at the same
  time, this usually leads to another race condition, which is now also caught.
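
The two hardening steps above (create the schema once at startup, tolerate a
race when switching to WAL mode) can be sketched with the standard `sqlite3`
module. The table definition, the function names and the retry parameters are
assumptions for illustration, not the code from this commit.

```python
import sqlite3
import time

# Hypothetical "base schema"; IF NOT EXISTS makes the DDL safe to run repeatedly.
DDL = """
CREATE TABLE IF NOT EXISTS properties (
    name  TEXT PRIMARY KEY,
    value TEXT
)
"""


def enable_wal(con: sqlite3.Connection, retries: int = 5, delay: float = 0.1) -> None:
    """Switch the DB to WAL mode, retrying if another process holds the lock."""
    for attempt in range(retries):
        try:
            con.execute("PRAGMA journal_mode=WAL")
            return
        except sqlite3.OperationalError:
            # Typically "database is locked" while a competing process is
            # changing the journal mode at the same moment.
            if attempt == retries - 1:
                raise
            time.sleep(delay)


def init_db(path: str) -> None:
    """Run once during app initialization, before worker threads start."""
    con = sqlite3.connect(path)
    try:
        enable_wal(con)
        with con:  # commits on success, rolls back on error
            con.execute(DDL)
    finally:
        con.close()
```

Calling `init_db()` from the application's startup path, rather than lazily on
first access, is what removes the window in which two workers both try to
create the schema.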

Related:

- https://github.com/searxng/searxng/issues/6181#issuecomment-4586705

Closes: #6181

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-10 15:48:49 +02:00
Bnyro f3fab143be [feat] engines: add tiger.ch engine
Add support for https://tiger.ch (general, news)

It is disabled and inactive by default because it's just a metasearch engine
like SearXNG is, so it's mostly useful for bypassing rate-limits on other
engines (it has its own German index, but it's not that great). In theory it
supports different locales, but I was too lazy to implement that (I only need
German and English results anyways, which are returned by default...)
2026-06-08 13:35:13 +02:00
Bnyro 72a827ae93 [fix] yep: send Sec-Fetch headers to bypass "access denied" (#6223)
Avoids yep's botblocking by sending Sec-Fetch-* headers (as the browser does).
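
As an illustration, these are the Fetch Metadata headers a browser typically
sends for a top-level navigation. Which values yep actually checks is not
stated here, so the values and the helper `with_sec_fetch` below are
assumptions.

```python
# Values a browser typically sends when the user opens a URL directly
# (assumed here, not taken from the yep engine).
SEC_FETCH_HEADERS = {
    "Sec-Fetch-Dest": "document",
    "Sec-Fetch-Mode": "navigate",
    "Sec-Fetch-Site": "none",
    "Sec-Fetch-User": "?1",
}


def with_sec_fetch(headers: dict) -> dict:
    """Return a copy of ``headers`` with Sec-Fetch-* defaults filled in.

    Hypothetical helper; headers passed by the caller take precedence.
    """
    merged = dict(SEC_FETCH_HEADERS)
    merged.update(headers)
    return merged
```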
2026-06-08 10:55:17 +02:00
Bnyro 6ca9d3784c [feat] engines: add seek-ninja general engine (#6217)
Add support for https://seek.ninja (general)

It's very slow because the engine uses Server-sent events, that incrementally
send data in their HTTP response [1].

I.e. we wait for the end of the response (7+ seconds), even though the results
data arrives within a few seconds -> it's very slow, because SearXNG wants to
get the full response body before it calls the `response(resp)` method

We could use httpx-sse [2], but I'm not sure how to integrate this into SearXNG
and if it's worth it

[1] https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/
[2] https://github.com/florimondmanca/httpx-sse
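
To illustrate what incremental consumption would involve, here is a minimal,
stdlib-only sketch of a Server-sent events parser that yields each event's
data as soon as its terminating blank line arrives. The function name
`iter_sse_data` is hypothetical, and how to wire such a parser into SearXNG's
request handling remains the open question mentioned above.

```python
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the ``data`` payload of each event from a stream of SSE lines."""
    data: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield "\n".join(data)
                data = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            data.append(value)
        # other fields (event, id, retry) are ignored in this sketch
```

With a streaming HTTP client, the line iterator of the response body could be
passed in directly, so results could be processed before the server closes the
connection.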
2026-06-08 07:09:06 +02:00
Bnyro 63f264220b [feat] engines: add heexy engine (general, images) (#6218) 2026-06-08 05:54:35 +02:00
Austin-Olacsi 41fcf0be4b [fix] aol engine uses wikidata id for C++ (#6221) 2026-06-08 05:32:26 +02:00
Bnyro 86903a2c66 [fix] flaticon: crash if result tag has no name (#6219) 2026-06-07 14:16:44 +02:00
Markus Heiser 70de3cc561 Revert "[fix] no such table during engine init (#6185)" (#6215)
This reverts commit 9d49a9f344.
2026-06-07 09:23:35 +02:00
Bnyro 51b6fd4f23 [del] karmasearch: remove engine (cloudflared) (#6213)
The engine is using very aggressive Cloudflare blocking for
a while now, no matter if using a normal browser like Firefox
or not.

Closes: https://github.com/searxng/searxng/issues/5976
2026-06-07 06:49:09 +02:00
Brock Vojkovic 9d49a9f344 [fix] no such table during engine init (#6185) 2026-06-07 06:04:12 +02:00
Bnyro e260a732c8 [fix] online engine processor: Accept-Language header doesn't get sent for 'all' language 2026-06-06 18:24:16 +02:00
Markus Heiser 0429198415 [mod] swisscows WEB: ignore video results from the first page
On the first page of the WEB search, there are, among other things, sections for
videos and news.  The video results from these sections should not be used as
results in the WEB search of SearXNG.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-06 18:04:19 +02:00
Markus Heiser e7cf57e9ae [mod] swisscows engines: add language / region support
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-06 18:04:19 +02:00
Bnyro ed369ac0ec [feat] engines: add support for swisscows general 2026-06-06 18:04:19 +02:00
Bnyro 94bdbb5c63 [feat] engines: add support for swisscows videos 2026-06-06 18:04:19 +02:00
Bnyro 465b5229c6 [feat] engines: add swisscows news engine 2026-06-06 18:04:19 +02:00
Bnyro cbf97fd262 [feat] engines: add swisscows images engine
The implementation is basically a 1:1 port of the reverse engineered
swisscows JavaScript code. (it's been obfuscated, so I've restructured it
and made the variable names idiomatic instead of obfuscated var names like "a", "o", "i")

```js
/*
e: "/v5/images/search"
t: {
	itemsCount: "50"
	locale: "de-DE"
	offset: "50"
	query: "test"
	spellcheck: "true"
}
*/
// HASH library used: https://github.com/h2non/jshashes
function generateNonceAndSignature(queryParams, urlPath) {
  // urlPath = "/v5/images/search"
  // sort keys alphabetically and join to query string
  let queryStringSorted = '?' + U().stringify(queryParams, {
    arrayFormat: 'repeat',
    allowDots: !0
  }).split('&').map(e => {
    let[key, value] = e.split('=');
    return [key, decodeURIComponent(value)]
  }).sort((e, t) => e[0].localeCompare(t[0])).map(e => e.join('=')).join('&');

  function caesarShift(str, offset = 13) {
      const alphabet = 'abcdefghijklmnopqrstuvwxyz';
      let result = [];
      for (let a = 0; a < str.length; a++) {
        let c = str[a],
        alphabetIndex = alphabet.indexOf(c.toLowerCase());
        if ( - 1 !== alphabetIndex) {
          alphabetIndex += offset;
          while (alphabetIndex >= alphabet.length) alphabetIndex -= alphabet.length;
          c = c === c.toUpperCase() ? alphabet[alphabetIndex] : alphabet[alphabetIndex].toUpperCase()
        }
        result.push(c)
      }
      return result.join('')
    }
  const r = new (sha256Instance()).SHA256;
  const random = randomString(32);
  const randomShifted = caesarShift(random);
  let to_hash = [urlPath, queryStringSorted, randomShifted].join('');
  let signature = r.b64(to_hash);
  signature = signature.replace(/=/g, '').replace(/\+/g, '-').replace(/\//g, '_');
  return {
    nonce: random,
    signature: signature
  }
}

function randomString(length) {
  let t = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~',
  n = '';
  for (let r = 0; r < length; r++) n += t.charAt(Math.floor(Math.random() * t.length));
  return n
}
```
2026-06-06 18:04:19 +02:00
dependabot[bot] 37187dc2d8 [upd] web-client (simple): Bump the minor group across 1 directory with 5 updates (#6169)
Bumps the minor group with 5 updates in the /client/simple directory:

| Package | From | To |
| --- | --- | --- |
| [@biomejs/biome](https://github.com/biomejs/biome/tree/HEAD/packages/@biomejs/biome) | `2.4.15` | `2.4.16` |
| [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) | `25.8.0` | `25.9.1` |
| [edge.js](https://github.com/edge-js/edge) | `6.5.0` | `6.5.1` |
| [stylelint](https://github.com/stylelint/stylelint) | `17.11.1` | `17.12.0` |
| [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) | `8.0.13` | `8.0.16` |

Updates `@biomejs/biome` from 2.4.15 to 2.4.16
- [Release notes](https://github.com/biomejs/biome/releases)
- [Changelog](https://github.com/biomejs/biome/blob/main/packages/@biomejs/biome/CHANGELOG.md)
- [Commits](https://github.com/biomejs/biome/commits/@biomejs/biome@2.4.16/packages/@biomejs/biome)

Updates `@types/node` from 25.8.0 to 25.9.1
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

Updates `edge.js` from 6.5.0 to 6.5.1
- [Release notes](https://github.com/edge-js/edge/releases)
- [Changelog](https://github.com/edge-js/edge/blob/6.x/CHANGELOG.md)
- [Commits](https://github.com/edge-js/edge/compare/v6.5.0...v6.5.1)

Updates `stylelint` from 17.11.1 to 17.12.0
- [Release notes](https://github.com/stylelint/stylelint/releases)
- [Changelog](https://github.com/stylelint/stylelint/blob/main/CHANGELOG.md)
- [Commits](https://github.com/stylelint/stylelint/compare/17.11.1...17.12.0)

Updates `vite` from 8.0.13 to 8.0.16
- [Release notes](https://github.com/vitejs/vite/releases)
- [Changelog](https://github.com/vitejs/vite/blob/main/packages/vite/CHANGELOG.md)
- [Commits](https://github.com/vitejs/vite/commits/v8.0.16/packages/vite)

---
updated-dependencies:
- dependency-name: "@biomejs/biome"
  dependency-version: 2.4.16
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: minor
- dependency-name: "@types/node"
  dependency-version: 25.9.1
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: minor
- dependency-name: edge.js
  dependency-version: 6.5.1
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: minor
- dependency-name: stylelint
  dependency-version: 17.12.0
  dependency-type: direct:development
  update-type: version-update:semver-minor
  dependency-group: minor
- dependency-name: vite
  dependency-version: 8.0.16
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-05 16:26:27 +02:00
dependabot[bot] 2f049cb037 [upd] github-actions: Bump actions/checkout from 6.0.2 to 6.0.3 (#6204)
Bumps [actions/checkout](https://github.com/actions/checkout) from 6.0.2 to 6.0.3.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/de0fac2e4500dabe0009e67214ff5f5447ce83dd...df4cb1c069e1874edd31b4311f1884172cec0e10)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 6.0.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-05 16:17:01 +02:00
dependabot[bot] eb39bc0dc1 [upd] github-actions: Bump github/codeql-action from 4.36.0 to 4.36.2 (#6203)
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.36.0 to 4.36.2.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](https://github.com/github/codeql-action/compare/7211b7c8077ea37d8641b6271f6a365a22a5fbfa...8aad20d150bbac5944a9f9d289da16a4b0d87c1e)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.36.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-05 16:16:35 +02:00
dependabot[bot] 007a4e2155 [upd] pypi: Bump typer from 0.26.3 to 0.26.7 in the minor group (#6205)
Bumps the minor group with 1 update: [typer](https://github.com/fastapi/typer).


Updates `typer` from 0.26.3 to 0.26.7
- [Release notes](https://github.com/fastapi/typer/releases)
- [Changelog](https://github.com/fastapi/typer/blob/master/docs/release-notes.md)
- [Commits](https://github.com/fastapi/typer/compare/0.26.3...0.26.7)
2026-06-05 11:54:28 +02:00
github-actions[bot] 13ce187e64 [l10n] update translations from Weblate (#6206)
19b2047a9 - 2026-05-30 - daemul72 <daemul72@noreply.codeberg.org>
2026-06-05 11:52:35 +02:00
260 changed files with 6180 additions and 2851 deletions
-163
@@ -1,163 +0,0 @@
;;; .dir-locals.el
;;
;; Per-Directory Local Variables:
;; https://www.gnu.org/software/emacs/manual/html_node/emacs/Directory-Variables.html
;;
;; For full fledge developer tools install emacs packages:
;;
;; M-x package-install ...
;;
;; magit gitconfig
;; nvm lsp-mode lsp-pyright lsp-eslint
;; pyvenv pylint pip-requirements
;; jinja2-mode
;; json-mode
;; company company-jedi company-quickhelp company-shell
;; realgud
;; sphinx-doc markdown-mode graphviz-dot-mode
;; apache-mode nginx-mode
;;
;; To setup a developer environment, build target::
;;
;; $ make node.env.dev pyenv.install
;;
;; Some buffer locals are referencing the project environment:
;;
;; - prj-root --> <repo>/
;; - nvm-dir --> <repo>/.nvm
;; - python-environment-directory --> <repo>/local
;; - python-environment-default-root-name --> py3
;; - python-shell-virtualenv-root --> <repo>/local/py3
;; When this variable is set with the path of the virtualenv to use,
;; `process-environment' and `exec-path' get proper values in order to run
;; shells inside the specified virtualenv, example::
;; (setq python-shell-virtualenv-root "/path/to/env/")
;; - python-shell-interpreter --> <repo>/local/py3/bin/python
;;
;; Python development:
;;
;; Jedi, flycheck & other python stuff should use the 'python-shell-interpreter'
;; from the local py3 environment.
;;
((nil
. ((fill-column . 80)
(indent-tabs-mode . nil)
(eval . (progn
(add-to-list 'auto-mode-alist '("\\.html\\'" . jinja2-mode))
;; project root folder is where the `.dir-locals.el' is located
(setq-local prj-root
(locate-dominating-file default-directory ".dir-locals.el"))
(setq-local python-environment-directory
(expand-file-name "./local" prj-root))
;; to get in use of NVM environment, install https://github.com/rejeep/nvm.el
(setq-local nvm-dir (expand-file-name "./.nvm" prj-root))
;; use nodejs from the (local) NVM environment (see nvm-dir)
(nvm-use-for-buffer)
(ignore-errors (require 'lsp))
(setq-local lsp-server-install-dir (car (cdr nvm-current-version)))
(setq-local lsp-enable-file-watchers nil)
;; use 'py3' environment as default
(setq-local python-environment-default-root-name
"py3")
(setq-local python-shell-virtualenv-root
(expand-file-name
python-environment-default-root-name python-environment-directory))
(setq-local python-shell-interpreter
(expand-file-name
"bin/python" python-shell-virtualenv-root))))))
(makefile-gmake-mode
. ((indent-tabs-mode . t)))
(yaml-mode
. ((eval . (progn
;; flycheck should use the local py3 environment
(setq-local flycheck-yaml-yamllint-executable
(expand-file-name "bin/yamllint" python-shell-virtualenv-root))
(setq-local flycheck-yamllintrc
(expand-file-name ".yamllint.yml" prj-root))
(flycheck-checker . yaml-yamllint)))))
(json-mode
. ((eval . (progn
(setq-local js-indent-level 4)
(flycheck-checker . json-python-json)))))
(js-mode
. ((eval . (progn
(ignore-errors (require 'lsp-eslint))
(setq-local js-indent-level 2)
;; flycheck should use the eslint checker from developer tools
(setq-local flycheck-javascript-eslint-executable
(expand-file-name "node_modules/.bin/eslint" prj-root))
;; (flycheck-mode)
(if (featurep 'lsp-eslint)
(lsp))
))))
(python-mode
. ((eval . (progn
(ignore-errors (require 'jedi-core))
(ignore-errors (require 'lsp-pyright))
(ignore-errors (sphinx-doc-mode))
(setq-local python-environment-virtualenv
(list (expand-file-name "bin/virtualenv" python-shell-virtualenv-root)
;;"--system-site-packages"
"--quiet"))
(setq-local pylint-command
(expand-file-name "bin/pylint" python-shell-virtualenv-root))
(if (featurep 'lsp-pyright)
(lsp))
;; pylint will find the '.pylintrc' file next to the CWD
;; https://pylint.readthedocs.io/en/latest/user_guide/run.html#command-line-options
(setq-local flycheck-pylintrc
".pylintrc")
;; flycheck & other python stuff should use the local py3 environment
(setq-local flycheck-python-pylint-executable
python-shell-interpreter)
;; use 'M-x jedi:show-setup-info' and 'M-x epc:controller' to inspect jedi server
;; https://tkf.github.io/emacs-jedi/latest/#jedi:environment-root -- You
;; can specify a full path instead of a name (relative path). In that case,
;; python-environment-directory is ignored and Python virtual environment
;; is created at the specified path.
(setq-local jedi:environment-root
python-shell-virtualenv-root)
;; https://tkf.github.io/emacs-jedi/latest/#jedi:server-command
(setq-local jedi:server-command
(list python-shell-interpreter
jedi:server-script))
;; jedi:environment-virtualenv --> see above 'python-environment-virtualenv'
;; is set buffer local! No need to setup jedi:environment-virtualenv:
;;
;; Virtualenv command to use. A list of string. If it is nil,
;; python-environment-virtualenv is used instead. You must set non-nil
;; value to jedi:environment-root in order to make this setting work.
;;
;; https://tkf.github.io/emacs-jedi/latest/#jedi:environment-virtualenv
;;
;; (setq-local jedi:environment-virtualenv
;; (list (expand-file-name "bin/virtualenv" python-shell-virtualenv-root)
;; "--python"
;; "/usr/bin/python3.4"
;; ))
))))
)
+1
@@ -1,5 +1,6 @@
  *
+ !container/*.template.*
  !container/entrypoint.sh
  !searx/**
  !requirements*.txt
+3 -3
@@ -78,7 +78,7 @@ jobs:
  python-version: "${{ env.PYTHON_VERSION }}"
  - name: Checkout
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+ uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
  with:
  persist-credentials: "false"
  fetch-depth: "0"
@@ -141,7 +141,7 @@ jobs:
  steps:
  - name: Checkout
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+ uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
  with:
  persist-credentials: "false"
@@ -175,7 +175,7 @@ jobs:
  steps:
  - name: Checkout
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+ uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
  with:
  persist-credentials: "false"
+1 -1
@@ -46,7 +46,7 @@ jobs:
  python-version: "${{ env.PYTHON_VERSION }}"
  - name: Checkout
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+ uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
  with:
  persist-credentials: "false"
+5 -2
@@ -37,7 +37,7 @@ jobs:
  python-version: "${{ env.PYTHON_VERSION }}"
  - name: Checkout
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+ uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
  with:
  persist-credentials: "false"
  fetch-depth: "0"
@@ -50,11 +50,14 @@ jobs:
  python-${{ env.PYTHON_VERSION }}-${{ runner.arch }}-
  path: "./local/"
+ - name: Setup dependencies
+ run: sudo ./utils/searxng.sh install buildhost
  - name: Setup venv
  run: make V=1 install
  - name: Build documentation
- run: make V=1 docs.clean docs.html
+ run: make V=1 docs.html
  - if: github.ref_name == 'master'
  name: Release
+2 -2
@@ -39,7 +39,7 @@ jobs:
  python-version: "${{ matrix.python-version }}"
  - name: Checkout
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+ uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
  with:
  persist-credentials: "false"
@@ -67,7 +67,7 @@ jobs:
  python-version: "${{ env.PYTHON_VERSION }}"
  - name: Checkout
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+ uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
  with:
  persist-credentials: "false"
+2 -2
@@ -40,7 +40,7 @@ jobs:
  python-version: "${{ env.PYTHON_VERSION }}"
  - name: Checkout
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+ uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
  with:
  token: "${{ secrets.WEBLATE_GITHUB_TOKEN }}"
  fetch-depth: "0"
@@ -88,7 +88,7 @@ jobs:
  python-version: "${{ env.PYTHON_VERSION }}"
  - name: Checkout
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+ uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
  with:
  token: "${{ secrets.WEBLATE_GITHUB_TOKEN }}"
  fetch-depth: "0"
+2 -2
@@ -24,7 +24,7 @@ jobs:
  steps:
  - name: Checkout
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
+ uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
  with:
  persist-credentials: "false"
@@ -41,6 +41,6 @@ jobs:
  write-comment: "false"
  - name: Upload SARIFs
- uses: github/codeql-action/upload-sarif@7211b7c8077ea37d8641b6271f6a365a22a5fbfa # v4.36.0
+ uses: github/codeql-action/upload-sarif@8aad20d150bbac5944a9f9d289da16a4b0d87c1e # v4.36.2
  with:
  sarif_file: "./scout.sarif"
+11
@@ -0,0 +1,11 @@
[[language]]
name = "python"
language-servers = ["basedpyright", "pylsp"]
auto-format = true
[language-server.pylsp.config.pylsp]
plugins.pylint.enabled = true
plugins.isort.enabled = true
plugins.black.enabled = true
plugins.black.skip_string_normalization = true
plugins.black.line_length = 120
+15 -29
@@ -2,12 +2,12 @@
  "$schema": "./node_modules/@biomejs/biome/configuration_schema.json",
  "files": {
  "ignoreUnknown": true,
- "includes": ["**", "!node_modules"]
+ "includes": ["**", "!node_modules", "!src/brand", "!src/svg"]
  },
  "assist": {
  "enabled": true,
  "actions": {
- "recommended": true,
+ "preset": "recommended",
  "source": {
  "useSortedAttributes": "on",
  "useSortedProperties": "on"
@@ -27,12 +27,14 @@
  "linter": {
  "enabled": true,
  "rules": {
- "recommended": true,
+ "preset": "recommended",
  "complexity": {
  "noForEach": "error",
  "noImplicitCoercions": "error",
+ "noRedundantDefaultExport": "error",
  "noUselessCatchBinding": "error",
  "noUselessUndefined": "error",
+ "useArrayFind": "error",
  "useSimplifiedLogicExpression": "error"
  },
  "correctness": {
@@ -42,25 +44,11 @@
  "useSingleJsDocAsterisk": "error"
  },
  "nursery": {
- "noContinue": "warn",
- "noEqualsToNull": "warn",
  "noFloatingPromises": "warn",
- "noForIn": "warn",
- "noIncrementDecrement": "warn",
  "noMisusedPromises": "warn",
- "noMultiAssign": "warn",
- "noMultiStr": "warn",
- "noNestedPromises": "warn",
- "noParametersOnlyUsedInRecursion": "warn",
- "noRedundantDefaultExport": "warn",
- "noReturnAssign": "warn",
- "noUselessReturn": "off",
  "useAwaitThenable": "off",
- "useConsistentEnumValueType": "warn",
- "useDestructuring": "warn",
  "useExhaustiveSwitchCases": "warn",
  "useExplicitType": "off",
- "useFind": "warn",
  "useRegexpExec": "warn"
  },
  "performance": {
@@ -75,23 +63,15 @@
  "noCommonJs": "error",
  "noEnum": "error",
  "noImplicitBoolean": "error",
+ "noIncrementDecrement": "error",
  "noInferrableTypes": "error",
+ "noMultiAssign": "error",
+ "noMultilineString": "error",
  "noNamespace": "error",
  "noNegationElse": "error",
  "noNestedTernary": "error",
  "noParameterAssign": "error",
  "noParameterProperties": "error",
- "noRestrictedTypes": {
- "level": "error",
- "options": {
- "types": {
- "Element": {
- "message": "Element is too generic",
- "use": "HTMLElement"
- }
- }
- }
- },
  "noSubstr": "error",
  "noUnusedTemplateLiteral": "error",
  "noUselessElse": "error",
@@ -107,6 +87,7 @@
  }
  },
  "useConsistentBuiltinInstantiation": "error",
+ "useConsistentEnumValueType": "error",
  "useConsistentMemberAccessibility": {
  "level": "error",
  "options": {
@@ -126,6 +107,7 @@
  }
  },
  "useDefaultSwitchClause": "error",
+ "useDestructuring": "error",
  "useExplicitLengthCheck": "error",
  "useForOf": "error",
  "useGroupedAccessorPairs": "error",
@@ -142,13 +124,17 @@
  "useUnifiedTypeSignatures": "error"
  },
  "suspicious": {
- "noAlert": "error",
  "noBitwiseOperators": "error",
  "noConstantBinaryExpressions": "error",
  "noDeprecatedImports": "error",
  "noEmptyBlockStatements": "error",
+ "noEqualsToNull": "error",
  "noEvolvingTypes": "error",
+ "noForIn": "error",
  "noImportCycles": "error",
+ "noNestedPromises": "error",
+ "noParametersOnlyUsedInRecursion": "error",
+ "noReturnAssign": "error",
  "noUnassignedVariables": "error",
  "noVar": "error",
  "useNumberToFixedDigitsArgument": "error",
+379 -354
File diff suppressed because it is too large
+8 -8
@@ -29,21 +29,21 @@
  "swiped-events": "1.2.0"
  },
  "devDependencies": {
- "@biomejs/biome": "2.4.15",
- "@types/node": "^25.8.0",
+ "@biomejs/biome": "2.5.0",
+ "@types/node": "^26.0.0",
  "browserslist": "^4.28.2",
  "browserslist-to-esbuild": "^2.1.1",
- "edge.js": "^6.5.0",
- "less": "^4.6.4",
+ "edge.js": "^6.5.1",
+ "less": "^4.6.6",
  "mathjs": "^15.2.0",
- "sharp": "~0.34.5",
- "sort-package-json": "^3.6.1",
- "stylelint": "^17.11.1",
+ "sharp": "~0.35.1",
+ "sort-package-json": "^4.0.0",
+ "stylelint": "^17.13.0",
  "stylelint-config-standard-less": "^4.1.0",
  "stylelint-prettier": "^5.0.3",
  "svgo": "^4.0.1",
  "typescript": "~6.0.3",
- "vite": "^8.0.13",
+ "vite": "^8.0.16",
  "vite-bundle-analyzer": "^1.3.8"
  }
  }
+1 -1
@@ -77,9 +77,9 @@ export default class Calculator extends Plugin {
  protected async run(): Promise<string | undefined> {
  const searchInput = getElement<HTMLInputElement>("q");
- const node = Calculator.math.parse(searchInput.value);
  try {
+ const node = Calculator.math.parse(searchInput.value);
  return `${node.toString()} = ${node.evaluate()}`;
  } catch {
  // not a compatible math expression
+1 -4
@@ -21,8 +21,6 @@ RUN --mount=type=cache,id=uv,target=/root/.cache/uv set -eux -o pipefail; \
  COPY --exclude=./searx/version_frozen.py ./searx/ ./searx/
- ARG TIMESTAMP_SETTINGS="0"
  RUN set -eux -o pipefail; \
  python -m compileall -q -f -j 0 --invalidation-mode=unchecked-hash ./searx/; \
  find ./searx/static/ -type f \
@@ -30,5 +28,4 @@ RUN set -eux -o pipefail; \
  -exec gzip -9 -k {} + \
  -exec brotli -9 -k {} + \
  -exec gzip --test {}.gz + \
- -exec brotli --test {}.br +; \
- touch -c --date="@$TIMESTAMP_SETTINGS" ./searx/settings.yml
+ -exec brotli --test {}.br +
+9 -30
@@ -77,43 +77,23 @@ volume_handler() {
  setup_ownership "$target" "directory"
  }
- # Handle configuration file updates
- config_handler() {
- local target="$1"
- local template="$2"
- local new_template_target="$target.new"
- # Create/Update the configuration file
- if [ -f "$target" ]; then
- setup_ownership "$target" "file"
- if [ "$template" -nt "$target" ]; then
- cp -pfT "$template" "$new_template_target"
- cat <<EOF
- ...
- ... INFORMATION
- ... Update available for "$target"
- ... It is recommended to update the configuration file to ensure proper functionality
- ...
- ... New version placed at "$new_template_target"
- ... Please review and merge changes
- ...
- EOF
- fi
- else
+ setup() {
+ local template_settings="/usr/local/searxng/settings.template.yml"
+ local target_settings="$__SEARXNG_CONFIG_PATH/settings.yml"
+ if [ ! -f "$target_settings" ]; then
  cat <<EOF
  ...
  ... INFORMATION
- ... "$target" does not exist, creating from template...
+ ... "$target_settings" does not exist, creating from template...
  ...
  EOF
- cp -pfT "$template" "$target"
- sed -i "s/ultrasecretkey/$(head -c 24 /dev/urandom | base64 | tr -dc 'a-zA-Z0-9')/g" "$target"
+ cp -pfT "$template_settings" "$target_settings"
+ sed -i "s/ultrasecretkey/$(head -c 24 /dev/urandom | base64 | tr -dc 'a-zA-Z0-9')/g" "$target_settings"
  fi
- check_file "$target"
+ check_file "$target_settings"
  }
  cat <<EOF
@@ -124,8 +104,7 @@ EOF
  volume_handler "$__SEARXNG_CONFIG_PATH"
  volume_handler "$__SEARXNG_DATA_PATH"
- # Check for files
- config_handler "$__SEARXNG_SETTINGS_PATH" "/usr/local/searxng/searx/settings.yml"
+ setup
  # root only features
  if [ "$(id -u)" -eq 0 ]; then
+8
@@ -0,0 +1,8 @@
# Read the documentation before extending the defaults:
# https://docs.searxng.org/admin/settings/
use_default_settings: true
server:
secret_key: "ultrasecretkey"
image_proxy: true
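
The entrypoint change above reduces first-start handling to one rule: if `settings.yml` is missing, copy the template and swap the `ultrasecretkey` placeholder for a random value; an existing file is never touched. A minimal Python sketch of that rule follows. It is only an analogue of the shell logic, not code from this change: the names `ensure_settings` and `demo` are hypothetical, `secrets.token_hex` is a stand-in for the `head | base64 | tr` pipeline, and the file-mode preservation done by `cp -p` is not reproduced.

```python
from __future__ import annotations

import secrets
import tempfile
from pathlib import Path

PLACEHOLDER = "ultrasecretkey"  # placeholder string taken from the template above


def ensure_settings(template: Path, target: Path) -> bool:
    """Create target from template if it is missing; return True when created."""
    if target.exists():
        return False  # an existing settings file is left untouched
    text = template.read_text(encoding="utf-8")
    # token_hex(24) yields 48 hex characters. The shell script draws from a
    # different alphabet (base64 filtered to a-zA-Z0-9), so this is an analogue.
    target.write_text(text.replace(PLACEHOLDER, secrets.token_hex(24)), encoding="utf-8")
    return True


def demo() -> tuple[bool, bool, str]:
    """Run the rule twice in a temporary directory (hypothetical paths)."""
    with tempfile.TemporaryDirectory() as tmp:
        template = Path(tmp) / "settings.template.yml"
        target = Path(tmp) / "settings.yml"
        template.write_text('server:\n  secret_key: "ultrasecretkey"\n', encoding="utf-8")
        first = ensure_settings(template, target)   # file is created
        second = ensure_settings(template, target)  # file already exists
        return first, second, target.read_text(encoding="utf-8")
```

Calling `demo()` shows the two cases side by side: the first call creates the file with a fresh key, the second call leaves it alone.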
+1
@@ -43,6 +43,7 @@
  - ``google``
  - ``mwmbl``
  - ``naver``
+ - ``privacywall``
  - ``quark``
  - ``qwant``
  - ``seznam``
-8
@@ -1,8 +0,0 @@
.. _aol engine:
===
AOL
===
.. automodule:: searx.engines.aol
:members:
+9
@@ -0,0 +1,9 @@
.. _kagi engines:
============
Kagi Engines
============
.. automodule:: searx.engines.kagi
:members:
-8
@@ -1,8 +0,0 @@
.. _karmasearch engine:
===========
Karmasearch
===========
.. automodule:: searx.engines.karmasearch
:members:
+1 -1
@@ -87,7 +87,7 @@ Parameters
  ``autocomplete`` : default from :ref:`settings search`
  [ ``google``, ``dbpedia``, ``duckduckgo``, ``mwmbl``, ``startpage``,
- ``wikipedia``, ``swisscows``, ``qwant`` ]
+ ``privacywall``, ``wikipedia``, ``swisscows``, ``qwant`` ]
  Service which completes words as you type.
+2 -2
@@ -58,8 +58,8 @@ Configured Engines
  {% for mod in engines %}
  * - `{{mod.name}} <{{mod.about and mod.about.website}}>`_
- {%- if mod.about and mod.about.language %}
- ({{mod.about.language | upper}})
+ {%- if mod.language %}
+ ({{mod.language | upper}})
  {%- endif %}
  - ``!{{mod.shortcut}}``
  - {%- if 'searx.engines.' + mod.__name__ in documented_modules %}
+5 -5
@@ -2,16 +2,16 @@ mock==5.2.0
  nose2[coverage_plugin]==0.16.0
  cov-core==1.15.0
  black==25.9.0
- pylint==4.0.5
+ pylint==4.0.6
  splinter==0.21.0
- selenium==4.44.0
+ selenium==4.45.0
  Sphinx==8.2.3;python_version <= "3.11"
  Sphinx==9.1.0; python_version > "3.11"
  sphinx-issues==6.0.0
  sphinx-jinja==2.0.2
  sphinx-tabs==3.5.0
  furo==2025.12.19
- sphinxcontrib-programoutput==0.19
+ sphinxcontrib-programoutput==0.20
  sphinx-autobuild==2025.8.25
  sphinx-notfound-page==1.1.0
  myst-parser==5.0.0
@@ -23,6 +23,6 @@ coloredlogs==15.0.1
  docutils>=0.21.2;python_version <= "3.11"
  docutils>=0.22.4; python_version > "3.11"
  parameterized==0.9.0
- granian[reload]==2.7.5
- basedpyright==1.39.6
+ granian[reload]==2.7.6
+ basedpyright==1.39.8
  types-lxml==2026.2.16
+2 -2
@@ -1,2 +1,2 @@
- granian==2.7.5
- granian[pname]==2.7.5
+ granian==2.7.6
+ granian[pname]==2.7.6
+2 -2
@@ -1,4 +1,4 @@
- certifi==2026.5.20
+ certifi==2026.6.17
  babel==2.18.0
  flask-babel==4.0.0
  flask==3.1.3
@@ -13,7 +13,7 @@ sniffio==1.3.1
  valkey==6.1.1
  markdown-it-py==4.2.0
  msgspec==0.21.1
- typer==0.26.3
+ typer==0.26.7
  isodate==0.7.2
  whitenoise==6.12.0
  typing-extensions==4.15.0
+18
@@ -179,6 +179,23 @@ def naver(query: str, _sxng_locale: str) -> list[str]:
  return results
+ def privacywall(query: str, sxng_locale: str) -> list[str]:
+ # Privacywall search autocompleter
+ country = None
+ if "-" in sxng_locale:
+ country = sxng_locale.split("-")[1]
+ args = {'q': query, 'cc': country}
+ url = f"https://www.privacywall.org/search/secure/suggestions.php?{urlencode(args)}"
+ response = get(url)
+ if not response.ok:
+ return []
+ data: list[list[str]] = response.json()
+ return data[1]
  def qihu360search(query: str, _sxng_locale: str) -> list[str]:
  # 360Search search autocompleter
  url = f"https://sug.so.360.cn/suggest?{urlencode({'format': 'json', 'word': query})}"
@@ -361,6 +378,7 @@ backends: dict[str, t.Callable[[str, str], list[str]]] = {
  'google': google_complete,
  'mwmbl': mwmbl,
  'naver': naver,
+ 'privacywall': privacywall,
  'quark': quark,
  'qwant': qwant,
  'seznam': seznam,
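
The new `privacywall` backend derives a country code from the locale and reads the suggestions from the second element of the JSON response. The sketch below isolates those two steps as pure functions so they can be exercised without a network call. It is an illustration under assumptions, not the code in this change: the helper names are hypothetical, and the `[query, [suggestions]]` payload shape is inferred only from `data[1]` in the diff. One deliberate difference: `urllib.parse.urlencode` renders a `None` value as the literal string `None`, so this sketch leaves `cc` out when the locale has no region. Whether the endpoint cares about that is not known from the source.

```python
from __future__ import annotations

from urllib.parse import urlencode

# Endpoint copied from the diff above; everything else here is illustrative.
SUGGEST_URL = "https://www.privacywall.org/search/secure/suggestions.php"


def country_from_locale(sxng_locale: str) -> str | None:
    """Return the part after the first '-' (e.g. 'CH' for 'de-CH'), else None."""
    if "-" in sxng_locale:
        return sxng_locale.split("-")[1]
    return None


def build_suggest_url(query: str, sxng_locale: str) -> str:
    """Build the request URL, omitting 'cc' when no region is known."""
    args = {"q": query}
    country = country_from_locale(sxng_locale)
    if country is not None:
        args["cc"] = country
    return f"{SUGGEST_URL}?{urlencode(args)}"


def parse_suggestions(data: object) -> list[str]:
    """Extract suggestions from an assumed [query, [suggestion, ...]] payload."""
    if isinstance(data, list) and len(data) > 1 and isinstance(data[1], list):
        return [str(item) for item in data[1]]
    return []  # unexpected shape: fall back to no suggestions
```

Validating the payload shape before indexing trades a few lines for not raising on an unexpected response, which may or may not matter for this endpoint.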
+8 -5
@@ -444,12 +444,10 @@ class ExpireCacheSQLite(sqlitedb.SQLiteAppl, ExpireCache):
  def get(self, key: str, default: typing.Any = None, ctx: str | None = None) -> typing.Any:
  """Get value of ``key`` from table given by argument ``ctx``. If
  ``ctx`` argument is ``None`` (the default), a table name is generated
- from the :py:obj:`ExpireCacheCfg.name`. If ``key`` not exists (in
- table), the ``default`` value is returned.
+ from the :py:obj:`ExpireCacheCfg.name`. If ``key`` not exists in
+ the table or the table not exists, the ``default`` value is returned.
  """
  table = ctx
- self.maintenance()
  if not table:
  table = self.normalize_name(self.cfg.name)
@@ -457,6 +455,9 @@ class ExpireCacheSQLite(sqlitedb.SQLiteAppl, ExpireCache):
  if table not in self.table_names:
  return default
+ # Before values are taken from the table, a maintenance interval may
+ # need to be carried out.
+ self.maintenance()
  sql = f"SELECT value FROM {table} WHERE key = ?"
  row = self.DB.execute(sql, (key,)).fetchone()
  if row is None:
@@ -469,12 +470,14 @@ class ExpireCacheSQLite(sqlitedb.SQLiteAppl, ExpireCache):
  If ``ctx`` argument is ``None`` (the default), a table name is
  generated from the :py:obj:`ExpireCacheCfg.name`."""
  table = ctx
- self.maintenance()
  if not table:
  table = self.normalize_name(self.cfg.name)
  if table in self.table_names:
+ # Before values are taken from the table, a maintenance interval may
+ # need to be carried out.
+ self.maintenance()
  for row in self.DB.execute(f"SELECT key, value FROM {table}"):
  yield row[0], self.deserialize(row[1])
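
The cache change above moves `self.maintenance()` behind the table-existence check, so a lookup against a table that does not exist returns the default without triggering maintenance. A small self-contained sketch of that ordering follows, using an in-memory SQLite database. `TinyExpireCache` and its `maintenance_runs` counter are hypothetical stand-ins written for this illustration; the real class does more (expiry, serialization, name normalization), none of which is modeled here.

```python
import sqlite3
from typing import Any


class TinyExpireCache:
    """Hypothetical stand-in for the cache class in the diff; not SearXNG code."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.maintenance_runs = 0

    @property
    def table_names(self) -> list:
        rows = self.db.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        return [row[0] for row in rows]

    def maintenance(self) -> None:
        # A real implementation would purge expired rows; here we only count calls.
        self.maintenance_runs += 1

    def set(self, table: str, key: str, value: str) -> None:
        self.db.execute(f"CREATE TABLE IF NOT EXISTS {table} (key TEXT PRIMARY KEY, value TEXT)")
        self.db.execute(f"INSERT OR REPLACE INTO {table} VALUES (?, ?)", (key, value))

    def get(self, table: str, key: str, default: Any = None) -> Any:
        # Existence check first: a missing table means there is nothing to maintain.
        if table not in self.table_names:
            return default
        # Only now, right before reading, is maintenance given a chance to run.
        self.maintenance()
        row = self.db.execute(f"SELECT value FROM {table} WHERE key = ?", (key,)).fetchone()
        return default if row is None else row[0]
```

With this ordering, `get("missing", "k", "fallback")` on a fresh instance leaves `maintenance_runs` at zero, while a lookup on an existing table increments it once.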
+466 -181
@@ -5740,186 +5740,6 @@
"zu-ZA": "ZA"
}
},
"karmasearch": {
"all_locale": null,
"custom": {},
"data_type": "traits_v1",
"languages": {},
"regions": {
"da-DK": "da-DK",
"de-AT": "de-AT",
"de-CH": "de-CH",
"de-DE": "de-DE",
"en-AU": "en-AU",
"en-CA": "en-CA",
"en-GB": "en-GB",
"en-ID": "en-ID",
"en-IN": "en-IN",
"en-MY": "en-MY",
"en-NZ": "en-NZ",
"en-PH": "en-PH",
"en-US": "en-US",
"en-ZA": "en-ZA",
"es-AR": "es-AR",
"es-CL": "es-CL",
"es-ES": "es-ES",
"es-MX": "es-MX",
"es-US": "es-US",
"fi-FI": "fi-FI",
"fr-BE": "fr-BE",
"fr-CA": "fr-CA",
"fr-CH": "fr-CH",
"fr-FR": "fr-FR",
"it-IT": "it-IT",
"ja-JP": "ja-JP",
"ko-KR": "ko-KR",
"nl-BE": "nl-BE",
"nl-NL": "nl-NL",
"pl-PL": "pl-PL",
"pt-BR": "pt-BR",
"ru-RU": "ru-RU",
"sv-SE": "sv-SE",
"tr-TR": "tr-TR",
"zh-CN": "zh-CN",
"zh-HK": "zh-HK",
"zh-TW": "zh-TW"
}
},
"karmasearch images": {
"all_locale": null,
"custom": {},
"data_type": "traits_v1",
"languages": {},
"regions": {
"da-DK": "da-DK",
"de-AT": "de-AT",
"de-CH": "de-CH",
"de-DE": "de-DE",
"en-AU": "en-AU",
"en-CA": "en-CA",
"en-GB": "en-GB",
"en-ID": "en-ID",
"en-IN": "en-IN",
"en-MY": "en-MY",
"en-NZ": "en-NZ",
"en-PH": "en-PH",
"en-US": "en-US",
"en-ZA": "en-ZA",
"es-AR": "es-AR",
"es-CL": "es-CL",
"es-ES": "es-ES",
"es-MX": "es-MX",
"es-US": "es-US",
"fi-FI": "fi-FI",
"fr-BE": "fr-BE",
"fr-CA": "fr-CA",
"fr-CH": "fr-CH",
"fr-FR": "fr-FR",
"it-IT": "it-IT",
"ja-JP": "ja-JP",
"ko-KR": "ko-KR",
"nl-BE": "nl-BE",
"nl-NL": "nl-NL",
"pl-PL": "pl-PL",
"pt-BR": "pt-BR",
"ru-RU": "ru-RU",
"sv-SE": "sv-SE",
"tr-TR": "tr-TR",
"zh-CN": "zh-CN",
"zh-HK": "zh-HK",
"zh-TW": "zh-TW"
}
},
"karmasearch news": {
"all_locale": null,
"custom": {},
"data_type": "traits_v1",
"languages": {},
"regions": {
"da-DK": "da-DK",
"de-AT": "de-AT",
"de-CH": "de-CH",
"de-DE": "de-DE",
"en-AU": "en-AU",
"en-CA": "en-CA",
"en-GB": "en-GB",
"en-ID": "en-ID",
"en-IN": "en-IN",
"en-MY": "en-MY",
"en-NZ": "en-NZ",
"en-PH": "en-PH",
"en-US": "en-US",
"en-ZA": "en-ZA",
"es-AR": "es-AR",
"es-CL": "es-CL",
"es-ES": "es-ES",
"es-MX": "es-MX",
"es-US": "es-US",
"fi-FI": "fi-FI",
"fr-BE": "fr-BE",
"fr-CA": "fr-CA",
"fr-CH": "fr-CH",
"fr-FR": "fr-FR",
"it-IT": "it-IT",
"ja-JP": "ja-JP",
"ko-KR": "ko-KR",
"nl-BE": "nl-BE",
"nl-NL": "nl-NL",
"pl-PL": "pl-PL",
"pt-BR": "pt-BR",
"ru-RU": "ru-RU",
"sv-SE": "sv-SE",
"tr-TR": "tr-TR",
"zh-CN": "zh-CN",
"zh-HK": "zh-HK",
"zh-TW": "zh-TW"
}
},
"karmasearch videos": {
"all_locale": null,
"custom": {},
"data_type": "traits_v1",
"languages": {},
"regions": {
"da-DK": "da-DK",
"de-AT": "de-AT",
"de-CH": "de-CH",
"de-DE": "de-DE",
"en-AU": "en-AU",
"en-CA": "en-CA",
"en-GB": "en-GB",
"en-ID": "en-ID",
"en-IN": "en-IN",
"en-MY": "en-MY",
"en-NZ": "en-NZ",
"en-PH": "en-PH",
"en-US": "en-US",
"en-ZA": "en-ZA",
"es-AR": "es-AR",
"es-CL": "es-CL",
"es-ES": "es-ES",
"es-MX": "es-MX",
"es-US": "es-US",
"fi-FI": "fi-FI",
"fr-BE": "fr-BE",
"fr-CA": "fr-CA",
"fr-CH": "fr-CH",
"fr-FR": "fr-FR",
"it-IT": "it-IT",
"ja-JP": "ja-JP",
"ko-KR": "ko-KR",
"nl-BE": "nl-BE",
"nl-NL": "nl-NL",
"pl-PL": "pl-PL",
"pt-BR": "pt-BR",
"ru-RU": "ru-RU",
"sv-SE": "sv-SE",
"tr-TR": "tr-TR",
"zh-CN": "zh-CN",
"zh-HK": "zh-HK",
"zh-TW": "zh-TW"
}
},
"mojeek": {
"all_locale": null,
"custom": {
@@ -6814,6 +6634,255 @@
},
"regions": {}
},
"privacywall": {
"all_locale": null,
"custom": {},
"data_type": "traits_v1",
"languages": {},
"regions": {
"bg-BG": "BG",
"cs-CZ": "CZ",
"da-DK": "DK",
"de-AT": "AT",
"de-BE": "BE",
"de-CH": "CH",
"de-DE": "DE",
"de-LI": "LI",
"de-LU": "LU",
"el-CY": "CY",
"el-GR": "GR",
"en-AU": "AU",
"en-CA": "CA",
"en-GB": "GB",
"en-HK": "HK",
"en-IE": "IE",
"en-IN": "IN",
"en-MT": "MT",
"en-NZ": "NZ",
"en-PH": "PH",
"en-SG": "SG",
"en-US": "US",
"es-AR": "AR",
"es-CL": "CL",
"es-CO": "CO",
"es-ES": "ES",
"es-MX": "MX",
"es-PE": "PE",
"es-VE": "VE",
"et-EE": "EE",
"fi-FI": "FI",
"fil-PH": "PH",
"fr-BE": "BE",
"fr-CA": "CA",
"fr-CH": "CH",
"fr-FR": "FR",
"fr-LU": "LU",
"ga-IE": "IE",
"gsw-CH": "CH",
"gsw-LI": "LI",
"hi-IN": "IN",
"hr-HR": "HR",
"hu-HU": "HU",
"id-ID": "ID",
"it-CH": "CH",
"it-IT": "IT",
"ja-JP": "JP",
"ko-KR": "KR",
"lb-LU": "LU",
"lt-LT": "LT",
"lv-LV": "LV",
"mi-NZ": "NZ",
"ms-MY": "MY",
"ms-SG": "SG",
"mt-MT": "MT",
"nb-NO": "NO",
"nl-BE": "BE",
"nl-NL": "NL",
"nn-NO": "NO",
"pl-PL": "PL",
"pt-BR": "BR",
"pt-PT": "PT",
"qu-PE": "PE",
"ro-RO": "RO",
"sk-SK": "SK",
"sl-SI": "SI",
"sv-FI": "FI",
"sv-SE": "SE",
"ta-SG": "SG",
"th-TH": "TH",
"tr-CY": "CY",
"vi-VN": "VN",
"zh-HK": "HK",
"zh-SG": "SG",
"zh-TW": "TW"
}
},
"privacywall images": {
"all_locale": null,
"custom": {},
"data_type": "traits_v1",
"languages": {},
"regions": {
"bg-BG": "BG",
"cs-CZ": "CZ",
"da-DK": "DK",
"de-AT": "AT",
"de-BE": "BE",
"de-CH": "CH",
"de-DE": "DE",
"de-LI": "LI",
"de-LU": "LU",
"el-CY": "CY",
"el-GR": "GR",
"en-AU": "AU",
"en-CA": "CA",
"en-GB": "GB",
"en-HK": "HK",
"en-IE": "IE",
"en-IN": "IN",
"en-MT": "MT",
"en-NZ": "NZ",
"en-PH": "PH",
"en-SG": "SG",
"en-US": "US",
"es-AR": "AR",
"es-CL": "CL",
"es-CO": "CO",
"es-ES": "ES",
"es-MX": "MX",
"es-PE": "PE",
"es-VE": "VE",
"et-EE": "EE",
"fi-FI": "FI",
"fil-PH": "PH",
"fr-BE": "BE",
"fr-CA": "CA",
"fr-CH": "CH",
"fr-FR": "FR",
"fr-LU": "LU",
"ga-IE": "IE",
"gsw-CH": "CH",
"gsw-LI": "LI",
"hi-IN": "IN",
"hr-HR": "HR",
"hu-HU": "HU",
"id-ID": "ID",
"it-CH": "CH",
"it-IT": "IT",
"ja-JP": "JP",
"ko-KR": "KR",
"lb-LU": "LU",
"lt-LT": "LT",
"lv-LV": "LV",
"mi-NZ": "NZ",
"ms-MY": "MY",
"ms-SG": "SG",
"mt-MT": "MT",
"nb-NO": "NO",
"nl-BE": "BE",
"nl-NL": "NL",
"nn-NO": "NO",
"pl-PL": "PL",
"pt-BR": "BR",
"pt-PT": "PT",
"qu-PE": "PE",
"ro-RO": "RO",
"sk-SK": "SK",
"sl-SI": "SI",
"sv-FI": "FI",
"sv-SE": "SE",
"ta-SG": "SG",
"th-TH": "TH",
"tr-CY": "CY",
"vi-VN": "VN",
"zh-HK": "HK",
"zh-SG": "SG",
"zh-TW": "TW"
}
},
"privacywall videos": {
"all_locale": null,
"custom": {},
"data_type": "traits_v1",
"languages": {},
"regions": {
"bg-BG": "BG",
"cs-CZ": "CZ",
"da-DK": "DK",
"de-AT": "AT",
"de-BE": "BE",
"de-CH": "CH",
"de-DE": "DE",
"de-LI": "LI",
"de-LU": "LU",
"el-CY": "CY",
"el-GR": "GR",
"en-AU": "AU",
"en-CA": "CA",
"en-GB": "GB",
"en-HK": "HK",
"en-IE": "IE",
"en-IN": "IN",
"en-MT": "MT",
"en-NZ": "NZ",
"en-PH": "PH",
"en-SG": "SG",
"en-US": "US",
"es-AR": "AR",
"es-CL": "CL",
"es-CO": "CO",
"es-ES": "ES",
"es-MX": "MX",
"es-PE": "PE",
"es-VE": "VE",
"et-EE": "EE",
"fi-FI": "FI",
"fil-PH": "PH",
"fr-BE": "BE",
"fr-CA": "CA",
"fr-CH": "CH",
"fr-FR": "FR",
"fr-LU": "LU",
"ga-IE": "IE",
"gsw-CH": "CH",
"gsw-LI": "LI",
"hi-IN": "IN",
"hr-HR": "HR",
"hu-HU": "HU",
"id-ID": "ID",
"it-CH": "CH",
"it-IT": "IT",
"ja-JP": "JP",
"ko-KR": "KR",
"lb-LU": "LU",
"lt-LT": "LT",
"lv-LV": "LV",
"mi-NZ": "NZ",
"ms-MY": "MY",
"ms-SG": "SG",
"mt-MT": "MT",
"nb-NO": "NO",
"nl-BE": "BE",
"nl-NL": "NL",
"nn-NO": "NO",
"pl-PL": "PL",
"pt-BR": "BR",
"pt-PT": "PT",
"qu-PE": "PE",
"ro-RO": "RO",
"sk-SK": "SK",
"sl-SI": "SI",
"sv-FI": "FI",
"sv-SE": "SE",
"ta-SG": "SG",
"th-TH": "TH",
"tr-CY": "CY",
"vi-VN": "VN",
"zh-HK": "HK",
"zh-SG": "SG",
"zh-TW": "TW"
}
},
"qwant": { "qwant": {
"all_locale": null, "all_locale": null,
"custom": {}, "custom": {},
@@ -7355,6 +7424,222 @@
}, },
"regions": {} "regions": {}
}, },
"resulthunter": {
"all_locale": "all",
"custom": {
"ui_lang": {
"az": "az",
"bg": "bg",
"br": "br",
"ca": "ca",
"cs": "cs",
"cy": "cy",
"da": "da",
"de-DE": "de-de",
"el": "el",
"en-CA": "en-ca",
"en-GB": "en-gb",
"en-IN": "en-in",
"en-US": "en-us",
"es": "es",
"et": "et",
"eu": "eu",
"fi-FI": "fi-fi",
"fr-CA": "fr-ca",
"fr-FR": "fr-fr",
"gl": "gl",
"hr": "hr",
"hu": "hu",
"id": "id",
"it": "it",
"ja-JP": "ja-jp",
"ka": "ka",
"ko": "ko",
"lt": "lt",
"lv": "lv",
"ms": "ms",
"nb": "nb",
"nl": "nl",
"pl": "pl",
"pt-BR": "pt-br",
"ro": "ro",
"ru": "ru",
"sk": "sk",
"sl": "sl",
"sq-AL": "sq-al",
"sr": "sr",
"sr_Latn": "sr-latn",
"sv": "sv",
"sw-KE": "sw-ke",
"th": "th",
"tr": "tr",
"uk": "uk",
"vi": "vi",
"zh": "zh",
"zh-TW": "zh-tw"
}
},
"data_type": "traits_v1",
"languages": {},
"regions": {
"ar-SA": "sa",
"da-DK": "dk",
"de-AT": "at",
"de-BE": "be",
"de-CH": "ch",
"de-DE": "de",
"en-AU": "au",
"en-CA": "ca",
"en-GB": "gb",
"en-HK": "hk",
"en-IN": "in",
"en-NZ": "nz",
"en-PH": "ph",
"en-US": "us",
"en-ZA": "za",
"es-AR": "ar",
"es-CL": "cl",
"es-ES": "es",
"es-MX": "mx",
"fi-FI": "fi",
"fil-PH": "ph",
"fr-BE": "be",
"fr-CA": "ca",
"fr-CH": "ch",
"fr-FR": "fr",
"gsw-CH": "ch",
"hi-IN": "in",
"id-ID": "id",
"it-CH": "ch",
"it-IT": "it",
"ja-JP": "jp",
"ko-KR": "kr",
"mi-NZ": "nz",
"ms-MY": "my",
"nb-NO": "no",
"nl-BE": "be",
"nl-NL": "nl",
"nn-NO": "no",
"pl-PL": "pl",
"pt-BR": "br",
"pt-PT": "pt",
"ru-RU": "ru",
"sv-FI": "fi",
"sv-SE": "se",
"tr-TR": "tr",
"zh-CN": "cn",
"zh-HK": "hk",
"zh-TW": "tw"
}
},
"resulthunter images": {
"all_locale": "all",
"custom": {
"ui_lang": {
"az": "az",
"bg": "bg",
"br": "br",
"ca": "ca",
"cs": "cs",
"cy": "cy",
"da": "da",
"de-DE": "de-de",
"el": "el",
"en-CA": "en-ca",
"en-GB": "en-gb",
"en-IN": "en-in",
"en-US": "en-us",
"es": "es",
"et": "et",
"eu": "eu",
"fi-FI": "fi-fi",
"fr-CA": "fr-ca",
"fr-FR": "fr-fr",
"gl": "gl",
"hr": "hr",
"hu": "hu",
"id": "id",
"it": "it",
"ja-JP": "ja-jp",
"ka": "ka",
"ko": "ko",
"lt": "lt",
"lv": "lv",
"ms": "ms",
"nb": "nb",
"nl": "nl",
"pl": "pl",
"pt-BR": "pt-br",
"ro": "ro",
"ru": "ru",
"sk": "sk",
"sl": "sl",
"sq-AL": "sq-al",
"sr": "sr",
"sr_Latn": "sr-latn",
"sv": "sv",
"sw-KE": "sw-ke",
"th": "th",
"tr": "tr",
"uk": "uk",
"vi": "vi",
"zh": "zh",
"zh-TW": "zh-tw"
}
},
"data_type": "traits_v1",
"languages": {},
"regions": {
"ar-SA": "sa",
"da-DK": "dk",
"de-AT": "at",
"de-BE": "be",
"de-CH": "ch",
"de-DE": "de",
"en-AU": "au",
"en-CA": "ca",
"en-GB": "gb",
"en-HK": "hk",
"en-IN": "in",
"en-NZ": "nz",
"en-PH": "ph",
"en-US": "us",
"en-ZA": "za",
"es-AR": "ar",
"es-CL": "cl",
"es-ES": "es",
"es-MX": "mx",
"fi-FI": "fi",
"fil-PH": "ph",
"fr-BE": "be",
"fr-CA": "ca",
"fr-CH": "ch",
"fr-FR": "fr",
"gsw-CH": "ch",
"hi-IN": "in",
"id-ID": "id",
"it-CH": "ch",
"it-IT": "it",
"ja-JP": "jp",
"ko-KR": "kr",
"mi-NZ": "nz",
"ms-MY": "my",
"nb-NO": "no",
"nl-BE": "be",
"nl-NL": "nl",
"nn-NO": "no",
"pl-PL": "pl",
"pt-BR": "br",
"pt-PT": "pt",
"ru-RU": "ru",
"sv-FI": "fi",
"sv-SE": "se",
"tr-TR": "tr",
"zh-CN": "cn",
"zh-HK": "hk",
"zh-TW": "tw"
}
},
"sepiasearch": { "sepiasearch": {
"all_locale": null, "all_locale": null,
"custom": {}, "custom": {},
@@ -9300,4 +9585,4 @@
}, },
"regions": {} "regions": {}
} }
} }
+161 -111
@@ -3,6 +3,7 @@
- :py:obj:`searx.enginelib.EngineCache` - :py:obj:`searx.enginelib.EngineCache`
- :py:obj:`searx.enginelib.Engine` - :py:obj:`searx.enginelib.Engine`
- :py:obj:`searx.enginelib.EngineAbout`
- :py:obj:`searx.enginelib.traits` - :py:obj:`searx.enginelib.traits`
There is a command line for developer purposes and for deeper analysis. Here is There is a command line for developer purposes and for deeper analysis. Here is
@@ -23,7 +24,7 @@ an example in which the command line is called in the development environment::
""" """
__all__ = ["EngineCache", "Engine", "ENGINES_CACHE"] __all__ = ["EngineCache", "Engine", "EngineAbout", "ENGINES_CACHE"]
import typing as t import typing as t
import abc import abc
@@ -31,6 +32,7 @@ from collections.abc import Callable
import logging import logging
import string import string
import typer import typer
import msgspec
from ..cache import ExpireCacheSQLite, ExpireCacheCfg from ..cache import ExpireCacheSQLite, ExpireCacheCfg
@@ -39,7 +41,7 @@ if t.TYPE_CHECKING:
from searx.enginelib.traits import EngineTraits from searx.enginelib.traits import EngineTraits
from searx.extended_types import SXNG_Response from searx.extended_types import SXNG_Response
from searx.result_types import EngineResults from searx.result_types import EngineResults
from searx.search.processors import OfflineParamTypes, OnlineParamTypes from searx.search.processors import OfflineParamTypes, OnlineParamTypes, ProcessorType
ENGINES_CACHE: ExpireCacheSQLite = ExpireCacheSQLite.build_cache( ENGINES_CACHE: ExpireCacheSQLite = ExpireCacheSQLite.build_cache(
ExpireCacheCfg( ExpireCacheCfg(
@@ -178,111 +180,7 @@ class EngineCache:
return ENGINES_CACHE.secret_hash(name=name) return ENGINES_CACHE.secret_hash(name=name)
class Engine(abc.ABC): # pylint: disable=too-few-public-methods class EngineAbout(msgspec.Struct, kw_only=True):
"""Class of engine instances build from YAML settings.
Further documentation see :ref:`general engine configuration`.
.. hint::
This class is currently never initialized and only used for type hinting.
"""
logger: logging.Logger
# Common options in the engine module
engine_type: str
"""Type of the engine (:ref:`searx.search.processors`)"""
paging: bool
"""Engine supports multiple pages."""
max_page: int = 0
"""If the engine supports paging, then this is the value for the last page
that is still supported. ``0`` means unlimited numbers of pages."""
time_range_support: bool
"""Engine supports search time range."""
safesearch: bool
"""Engine supports SafeSearch"""
language_support: bool
"""Engine supports languages (locales) search."""
language: str
"""For an engine, when there is ``language: ...`` in the YAML settings the engine
does support only this one language:
.. code:: yaml
- name: google french
engine: google
language: fr
"""
region: str
"""For an engine, when there is ``region: ...`` in the YAML settings the engine
does support only this one region::
.. code:: yaml
- name: google belgium
engine: google
region: fr-BE
"""
fetch_traits: "Callable[[EngineTraits, bool], None]"
"""Function to fetch engine's traits from origin."""
traits: "traits.EngineTraits"
"""Traits of the engine."""
# settings.yml
categories: list[str]
"""Specifies to which :ref:`engine categories` the engine should be added."""
name: str
"""Name that will be used across SearXNG to define this engine. In settings, on
the result page .."""
engine: str
"""Name of the python file used to handle requests and responses to and from
this search engine (file name from :origin:`searx/engines` without
``.py``)."""
enable_http: bool
"""Enable HTTP (by default only HTTPS is enabled)."""
shortcut: str
"""Code used to execute bang requests (``!foo``)"""
timeout: float
"""Specific timeout for search-engine."""
display_error_messages: bool
"""Display error messages on the web UI."""
proxies: dict[str, dict[str, str]]
"""Set proxies for a specific engine (YAML):
.. code:: yaml
proxies :
http: socks5://proxy:port
https: socks5://proxy:port
"""
disabled: bool
"""To disable by default the engine, but not deleting it. It will allow the
user to manually activate it in the settings."""
inactive: bool
"""Remove the engine from the settings (*disabled & removed*)."""
about: dict[str, dict[str, str]]
"""Additional fields describing the engine. """Additional fields describing the engine.
.. code:: yaml .. code:: yaml
@@ -296,21 +194,173 @@ class Engine(abc.ABC): # pylint: disable=too-few-public-methods
results: HTML results: HTML
""" """
using_tor_proxy: bool # pylint: disable=too-few-public-methods
website: str = ""
"""Official web-site of the origin."""
wikidata_id: str = ""
"""`Wikidata ID <https://www.wikidata.org/wiki/Wikidata:Identifiers>`_"""
official_api_documentation: str = ""
"""URL of the official API (regardless of whether it is used)"""
use_official_api: bool = False
"""SearXNG engine makes use of the official API or not"""
require_api_key: bool = False
"""API requires a key or not."""
results: str = ""
"""Data format of the source (online-engines: of the response)."""
description: str = ""
"""Brief description of the engine and where it gets its data from.
This value should only be set as long as no description of the data source
is available via a :py:obj:`EngineAbout.wikidata_id`.
"""
language: str = ""
"""Deprecated! Migrate your setting from `engine.about.language` to
`engine.language`"""
class Engine(abc.ABC): # pylint: disable=too-few-public-methods
"""Class of engine instances built from YAML settings.
Further documentation see :ref:`general engine configuration`.
The defaults are taken from :py:obj:`searx.engines.ENGINE_DEFAULT_ARGS`.
.. hint::
This class is currently never initialized and only used for type hinting.
"""
logger: logging.Logger
# Common options of the engine module
engine_type: "ProcessorType" = "online"
"""Type of the engine (:ref:`searx.search.processors`)"""
paging: bool = False
"""Engine supports multiple pages."""
max_page: int = 0
"""If the engine supports paging, then this is the value for the last page
that is still supported. ``0`` means unlimited numbers of pages."""
time_range_support: bool = False
"""Engine supports search time range."""
safesearch: bool = False
"""Engine supports SafeSearch"""
language_support: bool = False
"""Engine supports languages (locales) search."""
fetch_traits: "Callable[[EngineTraits, bool], None]"
"""Function to fetch engine's traits from origin."""
traits: "traits.EngineTraits"
"""Traits of the engine."""
# settings.yml
name: str
"""Name that will be used across SearXNG to define this engine. In settings, on
the result page .."""
engine: str
"""Name of the python file used to handle requests and responses to and from
this search engine (file name from :origin:`searx/engines` without
``.py``)."""
categories: list[str] = ["general"]
"""Specifies to which :ref:`engine categories` the engine should be added."""
language: str = ""
"""If the engine supports only one language, this language is specified here
(``en``, ``de``, ``"no"`` or ..); otherwise, the value remains empty. For
the YAML configuration: think of the `YAML-Norway problem
<https://ruuda.nl/2023/the-yaml-document-from-hell#the-norway-problem>`_
.. code:: yaml
- name: google norway
engine: google
language: "no"
Depending on ``language_support``, this value has similar but also slightly
different meanings.
- When ``language_support`` is **true**, the map of
:py:obj:`traits.EngineTraits.languages` is reduced to the selected
language
- When ``language_support`` is **false**, then the implementation of the
engine only supports this one ``language``
"""
region: str = ""
"""For an engine, when there is ``region: ...`` in the YAML settings the engine
does support only this one region::
.. code:: yaml
- name: google belgium
engine: google
region: fr-BE
"""
enable_http: bool
"""Enable HTTP (by default only HTTPS is enabled)."""
shortcut: str
"""Code used to execute bang requests (``!foo``)"""
timeout: float
"""Specific timeout for search-engine."""
display_error_messages: bool
"""Display error messages on the web UI."""
disabled: bool = False
"""To disable by default the engine, but not deleting it. It will allow the
user to manually activate it in the settings."""
inactive: bool = False
"""Remove the engine from the settings (*disabled & removed*)."""
about: EngineAbout = EngineAbout()
"""Additional fields describing the engine."""
using_tor_proxy: bool = False
"""Using tor proxy (``true``) or not (``false``) for this engine.""" """Using tor proxy (``true``) or not (``false``) for this engine."""
send_accept_language_header: bool send_accept_language_header: bool = True
"""When this option is activated (default), the language (locale) that is """When this option is activated (default), the language (locale) that is
selected by the user is used to build and send a ``Accept-Language`` header selected by the user is used to build and send a ``Accept-Language`` header
in the request to the origin search engine.""" in the request to the origin search engine."""
tokens: list[str] tokens: list[str] = []
"""A list of secret tokens to make this engine *private*, more details see """A list of secret tokens to make this engine *private*, more details see
:ref:`private engines`.""" :ref:`private engines`."""
weight: int weight: float = 1.0
"""Weighting of the results of this engine (:ref:`weight <settings engines>`).""" """Weighting of the results of this engine (:ref:`weight <settings engines>`)."""
proxies: dict[str, dict[str, str]]
"""Set proxies for a specific engine (YAML):
.. code:: yaml
proxies :
http: socks5://proxy:port
https: socks5://proxy:port
"""
def setup(self, engine_settings: dict[str, t.Any]) -> bool: # pylint: disable=unused-argument def setup(self, engine_settings: dict[str, t.Any]) -> bool: # pylint: disable=unused-argument
"""Dynamic setup of the engine settings. """Dynamic setup of the engine settings.
+15 -12
@@ -142,11 +142,11 @@ class EngineTraits:
""" """
if self.data_type == "traits_v1": if self.data_type == "traits_v1":
self._set_traits_v1(engine) self._set_traits_v1(engine) # pyright: ignore[reportArgumentType]
else: else:
raise TypeError("engine traits of type %s is unknown" % self.data_type) raise TypeError("engine traits of type %s is unknown" % self.data_type)
def _set_traits_v1(self, engine: "Engine | types.ModuleType") -> None: def _set_traits_v1(self, engine: "Engine") -> None:
# For an engine, when there is `language: ...` in the YAML settings the engine # For an engine, when there is `language: ...` in the YAML settings the engine
# does support only this one language (region):: # does support only this one language (region)::
# #
@@ -159,22 +159,25 @@ class EngineTraits:
_msg = "settings.yml - engine: '%s' / %s: '%s' not supported" _msg = "settings.yml - engine: '%s' / %s: '%s' not supported"
languages = traits.languages if engine.language:
if hasattr(engine, "language"): if engine.language_support:
if engine.language not in languages: if not len(traits.languages) > 1:
raise ValueError(_msg % (engine.name, "language", engine.language)) raise ValueError(
traits.languages = {engine.language: languages[engine.language]} f"engine {engine.name}: activated language_support with just one or less languages"
)
if engine.language not in traits.languages:
raise ValueError(_msg % (engine.name, "language", engine.language))
traits.languages = {engine.language: traits.languages[engine.language]}
regions = traits.regions if engine.region:
if hasattr(engine, "region"): if engine.region not in traits.regions:
if engine.region not in regions:
raise ValueError(_msg % (engine.name, "region", engine.region)) raise ValueError(_msg % (engine.name, "region", engine.region))
traits.regions = {engine.region: regions[engine.region]} traits.regions = {engine.region: traits.regions[engine.region]}
engine.language_support = bool(traits.languages or traits.regions) engine.language_support = bool(traits.languages or traits.regions)
# set the copied & modified traits in engine's namespace # set the copied & modified traits in engine's namespace
engine.traits = traits # pyright: ignore[reportAttributeAccessIssue] engine.traits = traits
class EngineTraitsMap(dict[str, EngineTraits]): class EngineTraitsMap(dict[str, EngineTraits]):
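The language and region handling changed in `_set_traits_v1` above can be sketched in isolation. This is a minimal sketch under stated assumptions, not the SearXNG implementation: `reduce_traits` and its plain-dict arguments are hypothetical stand-ins for the engine object and `EngineTraits`, and the nesting of the checks is assumed, because indentation is not preserved in this view.

```python
def reduce_traits(name, language, region, language_support, languages, regions):
    """Hypothetical helper: narrow the language/region maps to a single entry."""
    msg = "settings.yml - engine: '%s' / %s: '%s' not supported"

    # Assumption: the language map is only reduced when language_support is on.
    if language and language_support:
        if not len(languages) > 1:
            raise ValueError(
                f"engine {name}: activated language_support with just one or less languages"
            )
        if language not in languages:
            raise ValueError(msg % (name, "language", language))
        languages = {language: languages[language]}

    if region:
        if region not in regions:
            raise ValueError(msg % (name, "region", region))
        regions = {region: regions[region]}

    # Mirrors the final assignment in the diff: support is derived from the maps.
    return languages, regions, bool(languages or regions)


langs, regs, support = reduce_traits(
    "demo", "fr", "fr-BE", True,
    {"fr": "lang_fr", "de": "lang_de"},
    {"fr-BE": "BE", "de-DE": "DE"},
)
```

With an empty `language` and `region`, both maps pass through unchanged, which matches the new default of `""` for these attributes.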
+1 -1
@@ -22,8 +22,8 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": "HTML", "results": "HTML",
"language": "zh",
} }
language = "zh"
# Engine Configuration # Engine Configuration
categories = ["general"] categories = ["general"]
+6 -6
@@ -5,19 +5,19 @@ intended monkey patching of the engine modules.
.. attention:: .. attention::
Monkey-patching modules is a practice from the past that shouldn't be Monkey-patching modules is a practice from the past that shouldn't be
expanded upon. In the long run, there should be an engine class that can be expanded upon. In the long run, engines should be instances of
inherited. However, as long as this class doesn't exist, and as long as all :py:obj:`searx.enginelib.Engine`. However, as long as all engine
engine modules aren't converted to an engine class, these builtin types will modules aren't converted to this class, these builtin types will still be
still be needed. needed.
""" """
import logging import logging
from searx.enginelib import traits as _traits from searx.enginelib import traits as _traits
logger: logging.Logger logger: logging.Logger
supported_languages: str
language_aliases: str
language_support: bool language_support: bool
language: str
region: str
traits: _traits.EngineTraits traits: _traits.EngineTraits
# from searx.engines.ENGINE_DEFAULT_ARGS # from searx.engines.ENGINE_DEFAULT_ARGS
+50 -9
@@ -12,41 +12,50 @@ import typing as t
import sys import sys
import copy import copy
import os
from os.path import realpath, dirname from os.path import realpath, dirname
import warnings
import types import types
import inspect import inspect
import msgspec
from searx import logger, settings from searx import logger, settings
from searx.utils import load_module from searx.utils import load_module
from searx.data import ENGINE_TRAITS
if t.TYPE_CHECKING: from searx.enginelib import Engine, EngineAbout
from searx.enginelib import Engine
logger = logger.getChild('engines') logger = logger.getChild('engines')
ENGINE_DIR = dirname(realpath(__file__)) ENGINE_DIR = dirname(realpath(__file__))
# Defaults for the namespace of an engine module, see load_engine() # Defaults for the namespace of an engine module, see load_engine()
ENGINE_DEFAULT_ARGS: dict[str, int | str | list[t.Any] | dict[str, t.Any] | bool] = { ENGINE_DEFAULT_ARGS: dict[str, t.Any] = {
# Common options in the engine module # Common options in the engine module
"engine_type": "online", "engine_type": "online",
"paging": False, "paging": False,
"max_page": 0,
"time_range_support": False, "time_range_support": False,
"safesearch": False, "safesearch": False,
"language_support": False,
# settings.yml # settings.yml
"categories": ["general"], "categories": ["general"],
"language": "",
"region": "",
"enable_http": False, "enable_http": False,
"shortcut": "-", "shortcut": "-",
"timeout": settings["outgoing"]["request_timeout"], "timeout": settings["outgoing"]["request_timeout"],
"display_error_messages": True, "display_error_messages": True,
"disabled": False, "disabled": False,
"inactive": False, "inactive": False,
"about": {}, "about": EngineAbout(),
"using_tor_proxy": False, "using_tor_proxy": False,
"send_accept_language_header": True, "send_accept_language_header": True,
"tokens": [], "tokens": [],
"max_page": 0, "weight": 1.0,
} }
"""Default values that are set in an engine of type *module*, please compare
with the class :py:obj:`searx.enginelib.Engine`."""
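Because some of these defaults are mutable (`tokens`, `categories`), they are applied further down with `copy.deepcopy`. The sketch below illustrates why, assuming a simplified defaults dict and `SimpleNamespace` objects as hypothetical stand-ins for engine modules; `apply_defaults` is an illustrative name, not a SearXNG function.

```python
import copy
from types import SimpleNamespace

# Simplified, hypothetical subset of the defaults shown above.
DEFAULTS = {"categories": ["general"], "tokens": [], "weight": 1.0}


def apply_defaults(engine, defaults):
    """Set each default only if the engine does not define the attribute."""
    for arg_name, arg_value in defaults.items():
        if not hasattr(engine, arg_name):
            # deepcopy so that engines never share one list object
            setattr(engine, arg_name, copy.deepcopy(arg_value))


engine_a = SimpleNamespace(name="a")
engine_b = SimpleNamespace(name="b", weight=2.0)
apply_defaults(engine_a, DEFAULTS)
apply_defaults(engine_b, DEFAULTS)

engine_a.tokens.append("secret")  # must not leak into engine_b or DEFAULTS
```

Without the deep copy, appending to one engine's `tokens` would silently modify every engine that received the same default list.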
# set automatically when an engine does not have any tab category # set automatically when an engine does not have any tab category
DEFAULT_CATEGORY = 'other' DEFAULT_CATEGORY = 'other'
@@ -176,14 +185,41 @@ def set_loggers(engine: "Engine|types.ModuleType", engine_name: str):
def update_engine_attributes(engine: "Engine | types.ModuleType", engine_data: dict[str, t.Any]): def update_engine_attributes(engine: "Engine | types.ModuleType", engine_data: dict[str, t.Any]):
# pylint: disable=too-many-branches
# set engine attributes from engine_data # set engine attributes from engine_data
kvargs: dict[str, t.Any]
if isinstance(engine.about, EngineAbout):
kvargs = {**msgspec.to_builtins(engine.about), **engine_data.get("about", {})}
else:
kvargs = {**engine.about, **engine_data.get("about", {})}
try:
engine.about = EngineAbout(**kvargs)
except TypeError as exc:
raise TypeError(
f"engine '{engine_data['name']}' ({engine_data['engine']}) - in the about section --> {exc}"
) from exc
# warn about deprecated engine settings
if engine.about.language:
if hasattr(engine, "language") and not engine.language:
engine.language = engine.about.language
warnings.warn(
f"engine '{engine_data['name']}' ({engine_data['engine']})"
f" - migrate engine.about.language to engine.language!",
DeprecationWarning,
2,
)
for param_name, param_value in engine_data.items(): for param_name, param_value in engine_data.items():
if param_name == "about":
continue
if param_name == 'categories': if param_name == 'categories':
if isinstance(param_value, str): if isinstance(param_value, str):
param_value = list(map(str.strip, param_value.split(','))) param_value = list(map(str.strip, param_value.split(',')))
engine.categories = param_value # type: ignore engine.categories = param_value # type: ignore
elif hasattr(engine, 'about') and param_name == 'about':
engine.about = {**engine.about, **engine_data['about']} # type: ignore
else: else:
setattr(engine, param_name, param_value) setattr(engine, param_name, param_value)
@@ -192,6 +228,9 @@ def update_engine_attributes(engine: "Engine | types.ModuleType", engine_data: d
if not hasattr(engine, arg_name): if not hasattr(engine, arg_name):
setattr(engine, arg_name, copy.deepcopy(arg_value)) setattr(engine, arg_name, copy.deepcopy(arg_value))
if ENGINE_TRAITS.get(engine.name, {}).get("languages") and not engine.language_support:
raise ValueError(f"engine '{engine.name}' ({engine_data['engine']}) language_support should be set to True")
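The `about` handling added to this function (merge the module's values with those from `settings.yml`, validate the keys, then migrate the deprecated `about.language`) can be sketched as follows. This is a sketch under stated assumptions: a standard-library dataclass stands in for the `msgspec.Struct` used in the diff, `About`, `merge_about` and `migrate_language` are hypothetical names, and the warning is assumed to fire whenever `about.language` is set, since the original indentation is lost in this view.

```python
import warnings
from dataclasses import asdict, dataclass


@dataclass
class About:
    """Stand-in for EngineAbout (the diff uses msgspec.Struct instead)."""
    website: str = ""
    wikidata_id: str = ""
    results: str = ""
    language: str = ""  # deprecated, see migrate_language()


def merge_about(module_about, settings_about):
    """Values from settings.yml override those of the engine module."""
    base = asdict(module_about) if isinstance(module_about, About) else dict(module_about)
    try:
        return About(**{**base, **settings_about})
    except TypeError as exc:
        # unknown keys in the about section end up here
        raise TypeError(f"in the about section --> {exc}") from exc


def migrate_language(about, language):
    """Return the effective engine language and warn about the old setting."""
    if about.language:
        warnings.warn("migrate engine.about.language to engine.language!", DeprecationWarning, 2)
        if not language:
            return about.language
    return language


about = merge_about({"website": "https://example.org", "language": "zh"}, {"results": "HTML"})
```

An unknown key such as `{"lang": "zh"}` raises `TypeError` instead of being stored silently, which is the main behavioural difference from the previous plain-dict merge.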
def update_attributes_for_tor(engine: "Engine | types.ModuleType"): def update_attributes_for_tor(engine: "Engine | types.ModuleType"):
if using_tor_proxy(engine) and hasattr(engine, 'onion_url'): if using_tor_proxy(engine) and hasattr(engine, 'onion_url'):
@@ -278,6 +317,8 @@ def load_engines(engine_list: list[dict[str, t.Any]]):
else: else:
# if an engine can't be loaded (if for example the engine is missing # if an engine can't be loaded (if for example the engine is missing
# tor or some other requirements) it's set to inactive! # tor or some other requirements) it's set to inactive!
logger.error("loading engine %s failed: set engine to inactive!", engine_data.get("name", "???")) logger.error(
f"(PID {os.getpid()}) loading engine %s failed: set engine to inactive!", engine_data.get("name", "???")
)
engine_data["inactive"] = True engine_data["inactive"] = True
return engines return engines
+1 -1
@@ -16,12 +16,12 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": "HTML", "results": "HTML",
"language": "zh",
} }
# Engine Configuration # Engine Configuration
categories = ["videos"] categories = ["videos"]
paging = True paging = True
language = "zh"
# Base URL # Base URL
base_url = "https://www.acfun.cn" base_url = "https://www.acfun.cn"
+1
@@ -64,6 +64,7 @@ about: dict[str, t.Any] = {
# engine dependent config # engine dependent config
categories = ["files", "books"] categories = ["files", "books"]
paging: bool = True paging: bool = True
language_support = True
# search-url # search-url
base_url: list[str] | str = [] base_url: list[str] | str = []
+1 -1
@@ -42,8 +42,8 @@ about = {
'use_official_api': False, 'use_official_api': False,
'require_api_key': False, 'require_api_key': False,
'results': 'HTML', 'results': 'HTML',
'language': 'it',
} }
language = "it"
def request(query, params): def request(query, params):
-210
@@ -1,210 +0,0 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""AOL supports WEB, image, and video search. Internally, it uses the Bing
index.
AOL doesn't seem to support setting the language via request parameters, instead
the results are based on the URL. For example, there is
- `search.aol.com <https://search.aol.com>`_ for English results
- `suche.aol.de <https://suche.aol.de>`_ for German results
However, AOL offers its services only in a few regions:
- en-US: search.aol.com
- de-DE: suche.aol.de
- fr-FR: recherche.aol.fr
- en-GB: search.aol.co.uk
- en-CA: search.aol.ca
In order to still offer sufficient support for language and region, the `search
keywords`_ known from Bing, ``language`` and ``loc`` (region), are added to the
search term (AOL is basically just a proxy for Bing).
.. _search keywords:
https://support.microsoft.com/en-us/topic/advanced-search-keywords-ea595928-5d63-4a0b-9c6b-0b769865e78a
"""
from urllib.parse import urlencode, unquote_plus
import typing as t
from lxml import html
from dateutil import parser
from searx.result_types import EngineResults
from searx.utils import eval_xpath_list, eval_xpath, extract_text
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://www.aol.com",
"wikidata_id": "Q2407",
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
}
categories = ["general"]
search_type = "search" # supported: search, image, video
paging = True
safesearch = True
time_range_support = True
results_per_page = 10
base_url = "https://search.aol.com"
time_range_map = {"day": "1d", "week": "1w", "month": "1m", "year": "1y"}
safesearch_map = {0: "p", 1: "r", 2: "i"}
enable_http2 = False
def init(_):
if search_type not in ("search", "image", "video"):
raise ValueError(f"unsupported search type {search_type}")
def request(query: str, params: "OnlineParams") -> None:
language, region = (params["searxng_locale"].split("-") + [None])[:2]
if language and language != "all":
query = f"{query} language:{language}"
if region:
query = f"{query} loc:{region}"
args: dict[str, str | int | None] = {
"q": query,
"b": params["pageno"] * results_per_page + 1, # page is 1-indexed
"pz": results_per_page,
}
if params["time_range"]:
args["fr2"] = "time"
args["age"] = params["time_range"]
else:
args["fr2"] = "sb-top-search"
params["cookies"]["sB"] = f"vm={safesearch_map[params['safesearch']]}"
params["url"] = f"{base_url}/aol/{search_type}?{urlencode(args)}"
logger.debug(params)
def _deobfuscate_url(obfuscated_url: str) -> str | None:
# URL looks like "https://search.aol.com/click/_ylt=AwjFSDjd;_ylu=JfsdjDFd/RV=2/RE=1774058166/RO=10/RU=https%3a%2f%2fen.wikipedia.org%2fwiki%2fTree/RK=0/RS=BP2CqeMLjscg4n8cTmuddlEQA2I-" # pylint: disable=line-too-long
if not obfuscated_url:
return None
for part in obfuscated_url.split("/"):
if part.startswith("RU="):
return unquote_plus(part[3:])
# pattern for de-obfuscating URL not found, fall back to Yahoo's tracking link
return obfuscated_url
def _general_results(doc: html.HtmlElement) -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(doc, "//div[@id='web']//ol/li[not(contains(@class, 'first'))]"):
obfuscated_url = extract_text(eval_xpath(result, ".//h3/a/@href"))
if not obfuscated_url:
continue
url = _deobfuscate_url(obfuscated_url)
if not url:
continue
res.add(
res.types.MainResult(
url=url,
title=extract_text(eval_xpath(result, ".//h3/a")) or "",
content=extract_text(eval_xpath(result, ".//div[contains(@class, 'compText')]")) or "",
thumbnail=extract_text(eval_xpath(result, ".//a[contains(@class, 'thm')]/img/@data-src")) or "",
)
)
return res
def _video_results(doc: html.HtmlElement) -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(doc, "//div[contains(@class, 'results')]//ol/li"):
obfuscated_url = extract_text(eval_xpath(result, ".//a/@href"))
if not obfuscated_url:
continue
url = _deobfuscate_url(obfuscated_url)
if not url:
continue
published_date_raw = extract_text(eval_xpath(result, ".//div[contains(@class, 'v-age')]"))
try:
published_date = parser.parse(published_date_raw or "")
except parser.ParserError:
published_date = None
res.add(
res.types.LegacyResult(
{
"template": "videos.html",
"url": url,
"title": extract_text(eval_xpath(result, ".//h3")),
"content": extract_text(eval_xpath(result, ".//div[contains(@class, 'compText')]")),
"thumbnail": extract_text(eval_xpath(result, ".//img[contains(@class, 'thm')]/@src")),
"length": extract_text(eval_xpath(result, ".//span[contains(@class, 'v-time')]")),
"publishedDate": published_date,
}
)
)
return res
def _image_results(doc: html.HtmlElement) -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(doc, "//section[@id='results']//ul/li"):
obfuscated_url = extract_text(eval_xpath(result, "./a/@href"))
if not obfuscated_url:
continue
url = _deobfuscate_url(obfuscated_url)
if not url:
continue
res.add(
res.types.LegacyResult(
{
"template": "images.html",
# results don't have an extra URL, only the image source
"url": url,
"title": extract_text(eval_xpath(result, ".//a/@aria-label")),
"thumbnail_src": extract_text(eval_xpath(result, ".//img/@src")),
"img_src": url,
}
)
)
return res
def response(resp: "SXNG_Response") -> EngineResults:
doc = html.fromstring(resp.text)
match search_type:
case "search":
results = _general_results(doc)
case "image":
results = _image_results(doc)
case "video":
results = _video_results(doc)
case _:
raise ValueError("unsupported search type")
for suggestion in eval_xpath_list(doc, ".//ol[contains(@class, 'searchRightBottom')]//table//a"):
results.add(results.types.LegacyResult({"suggestion": extract_text(suggestion)}))
return results
+1
@@ -35,6 +35,7 @@ about = {
categories = ["it", "software wikis"] categories = ["it", "software wikis"]
paging = True paging = True
main_wiki = "wiki.archlinux.org" main_wiki = "wiki.archlinux.org"
language_support = True
def request(query, params): def request(query, params):
+1 -1
@@ -54,8 +54,8 @@ about = {
"use_official_api": True, "use_official_api": True,
"require_api_key": True, "require_api_key": True,
"results": "JSON", "results": "JSON",
"language": "en",
} }
language = "en"
CACHE: EngineCache CACHE: EngineCache
"""Persistent (SQLite) key/value cache that deletes its values after ``expire`` """Persistent (SQLite) key/value cache that deletes its values after ``expire``
+1 -1
@@ -23,8 +23,8 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": "JSON", "results": "JSON",
"language": "zh",
} }
language = "zh"
paging = True paging = True
categories = [] categories = []
+1
@@ -34,6 +34,7 @@ about = {
categories = ["general", "social media"] categories = ["general", "social media"]
paging = True paging = True
time_range_support = True time_range_support = True
language_support = True
base_url = "https://boardreader.com" base_url = "https://boardreader.com"
time_range_map = {"day": "1", "week": "7", "month": "30", "year": "365"} time_range_map = {"day": "1", "week": "7", "month": "30", "year": "365"}
+1 -1
@@ -13,8 +13,8 @@ about = {
'use_official_api': False, 'use_official_api': False,
'require_api_key': False, 'require_api_key': False,
'results': 'JSON', 'results': 'JSON',
'language': 'de',
} }
language = "de"
paging = True paging = True
categories = ['general'] categories = ['general']
+115
@@ -0,0 +1,115 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Chatnoir is an open source search engine developed by Webis, a network of
researchers from the universities of Weimar, Halle and Leipzig. It supports
different text corpora as indexes, e.g. CommonCrawl. See its
`announcement`_ for more information.
.. _announcement : https://groups.google.com/g/common-crawl/c/3o2dOHpeRxo/m/H2Osqz9dAAAJ
"""
import typing as t
from searx.exceptions import SearxEngineAPIException
from searx.extended_types import SXNG_Response
from searx.network import get, post
from searx.result_types import EngineResults
from searx.utils import html_to_text
if t.TYPE_CHECKING:
from searx.search.processors import OnlineParams
about = {
"website": "https://www.chatnoir.eu",
"official_api_documentation": "https://www.chatnoir.eu/docs/api-general",
"use_official_api": True,
"require_api_key": False,
"results": "JSON",
}
base_url = "https://www.chatnoir.eu"
categories = ["general"]
paging = True
page_size = 10
api_key = ""
"""You can optionally provide your own API key here. This one will then be used
instead of scraping an API key."""
search_index = "cw22"
"""Search index to browse in. See `the API documentation
<https://www.chatnoir.eu/docs/api-general>`_ for a full list."""
def _obtain_api_key() -> tuple[str, str, str]:
home_resp = get(base_url)
if not home_resp.ok:
raise SearxEngineAPIException("failed to obtain api key")
csrf_token = home_resp.cookies["csrftoken"]
token_resp = post(
"https://www.chatnoir.eu/?init",
headers={
"Referer": f"{base_url}/",
"X-Requested-With": "XMLHttpRequest",
"X-Csrf-Token": csrf_token,
},
cookies=home_resp.cookies,
)
if not token_resp.ok:
raise SearxEngineAPIException("failed to obtain api key")
session_id = token_resp.cookies["sessionid"]
scraped_api_key = token_resp.json()["token"]["token"]
return csrf_token, session_id, scraped_api_key
def request(query: str, params: "OnlineParams"):
if api_key:
# use user-provided API key instead of scraping one
headers = {
"Authorization": f"Bearer {api_key}",
}
params["headers"].update(headers)
else:
csrf_token, session_id, scraped_api_key = _obtain_api_key()
headers = {
"Authorization": f"Bearer {scraped_api_key}",
"X-Csrf-Token": csrf_token,
}
params["headers"].update(headers)
params["cookies"] = {"csrftoken": csrf_token, "sessionid": session_id}
params["url"] = f"{base_url}/api/v1/_search"
params["method"] = "POST"
json_data = {
"query": query,
"index": [
search_index,
],
"from": (params["pageno"] - 1) * page_size,
"size": page_size,
"_extended_meta": True,
}
params["json"] = json_data
def response(resp: "SXNG_Response") -> EngineResults:
res = EngineResults()
results = resp.json()["results"]
for result in results:
res.add(
res.types.MainResult(
url=result["target_uri"],
title=html_to_text(result["title"]),
content=html_to_text(result["snippet"]),
)
)
return res
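The `from`/`size` pair in the request body above is plain offset paging. A minimal sketch of that arithmetic, assuming a hypothetical helper `build_search_payload` (not part of the engine) and the defaults shown in the diff (`page_size = 10`, index `cw22`):

```python
def build_search_payload(query: str, pageno: int, page_size: int = 10, index: str = "cw22") -> dict:
    """Build the JSON body for a paged search request (hypothetical helper)."""
    if pageno < 1:
        raise ValueError("pageno starts at 1")
    return {
        "query": query,
        "index": [index],
        # page 1 starts at offset 0, page 2 at offset page_size, and so on
        "from": (pageno - 1) * page_size,
        "size": page_size,
    }


payload = build_search_payload("common crawl", pageno=3)
print(payload["from"], payload["size"])
```

The offset is derived from the page number on every request, so no state has to be kept between pages.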
+1 -1
@@ -10,8 +10,8 @@ about = {
'use_official_api': False, 'use_official_api': False,
'require_api_key': False, 'require_api_key': False,
'results': 'JSON', 'results': 'JSON',
'language': 'de',
} }
language = "de"
paging = True paging = True
categories = [] categories = []
+8 -1
@@ -70,13 +70,13 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": "JSON", "results": "JSON",
"language": "zh",
} }
paging = True paging = True
time_range_support = True time_range_support = True
results_per_page = 10 results_per_page = 10
categories = [] categories = []
language = "zh"
ChinasoCategoryType = t.Literal['news', 'videos', 'images'] ChinasoCategoryType = t.Literal['news', 'videos', 'images']
"""ChinaSo supports news, videos, images search. """ChinaSo supports news, videos, images search.
@@ -156,6 +156,13 @@ def response(resp):
except Exception as e: except Exception as e:
raise SearxEngineAPIException(f"Invalid response: {e}") from e raise SearxEngineAPIException(f"Invalid response: {e}") from e
# Upstream returns {'status': 0, 'msg': 'empty result', 'data': {}} when there
# are no results; this is a valid empty result rather than an API error.
if not isinstance(data, dict) or "data" not in data:
raise SearxEngineAPIException("Invalid response")
if not data["data"]:
return []
parsers = {'news': parse_news, 'images': parse_images, 'videos': parse_videos} parsers = {'news': parse_news, 'images': parse_images, 'videos': parse_videos}
return parsers[chinaso_category](data) return parsers[chinaso_category](data)
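The added guard separates two cases: a well-formed payload with no hits, and a payload that is not usable at all. A small sketch of that distinction, where `InvalidResponse` is a stand-in for `SearxEngineAPIException` and `extract_data` is a hypothetical name that does not appear in the engine:

```python
class InvalidResponse(Exception):
    """Stand-in for SearxEngineAPIException (assumption for this sketch)."""


def extract_data(payload):
    """Return the 'data' member, or None when upstream reports an empty result."""
    # Anything that is not a dict carrying a 'data' key is treated as an API error.
    if not isinstance(payload, dict) or "data" not in payload:
        raise InvalidResponse("Invalid response")
    # An empty 'data' member is a valid "no results" answer, not an error.
    return payload["data"] or None


empty = {"status": 0, "msg": "empty result", "data": {}}
print(extract_data(empty))
```

Checking the shape first and the emptiness second keeps a missing `data` key from being mistaken for an empty result.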
+1
@@ -40,6 +40,7 @@ categories = ["videos"]
paging = True paging = True
page_size = 10 page_size = 10
language_support = True
time_range_support = True time_range_support = True
time_delta_dict = { time_delta_dict = {
"day": timedelta(days=1), "day": timedelta(days=1),
+6 -8
@@ -24,7 +24,7 @@ import typing as t
import json import json
from searx.result_types import EngineResults from searx.result_types import EngineResults
from searx.enginelib import EngineCache from searx.enginelib import EngineCache, EngineAbout
if t.TYPE_CHECKING: if t.TYPE_CHECKING:
from searx.search.processors import RequestParams from searx.search.processors import RequestParams
@@ -35,13 +35,11 @@ categories = ["general"]
disabled = True disabled = True
timeout = 2.0 timeout = 2.0
about = { language = "en"
"wikidata_id": None, about = EngineAbout(
"official_api_documentation": None, results="JSON",
"use_official_api": False, description="Demo offline engine Engine with results in the English language.",
"require_api_key": False, )
"results": "JSON",
}
# if there is a need for globals, use a leading underline # if there is a need for globals, use a leading underline
_my_offline_engine: str = "" _my_offline_engine: str = ""
+9 -8
@@ -25,6 +25,7 @@ import typing as t
from urllib.parse import urlencode from urllib.parse import urlencode
from searx.result_types import EngineResults from searx.result_types import EngineResults
from searx.enginelib import EngineAbout
if t.TYPE_CHECKING: if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response from searx.extended_types import SXNG_Response
@@ -43,14 +44,14 @@ page_size = 20
search_api = "https://api.artic.edu/api/v1/artworks/search" search_api = "https://api.artic.edu/api/v1/artworks/search"
image_api = "https://www.artic.edu/iiif/2/" image_api = "https://www.artic.edu/iiif/2/"
about = { about = EngineAbout(
"website": "https://www.artic.edu", website="https://www.artic.edu",
"wikidata_id": "Q239303", wikidata_id="Q239303",
"official_api_documentation": "http://api.artic.edu/docs/", official_api_documentation="http://api.artic.edu/docs/",
"use_official_api": True, use_official_api=True,
"require_api_key": False, require_api_key=False,
"results": "JSON", results="JSON",
} )
# if there is a need for globals, use a leading underline # if there is a need for globals, use a leading underline
+1 -1
@@ -11,8 +11,8 @@ about = {
'use_official_api': False, 'use_official_api': False,
'require_api_key': False, 'require_api_key': False,
'results': 'HTML', 'results': 'HTML',
'language': 'de',
} }
language = "de"
categories = [] categories = []
paging = True paging = True
+101
@@ -0,0 +1,101 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Dogpile is a metasearch engine by the American advertising company `System1`_.
.. _System1: https://system1.com/
"""
import typing as t
from datetime import datetime, timezone
import html
from searx.utils import format_duration, html_to_text, humanize_number
from searx.result_types import EngineResults
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://www.dogpile.com",
"wikidata_id": "Q3595363",
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
}
paging = True
safesearch = True
categories = ["general"]
dogpile_categ = "search"
"""Category to search in. Can be either "search", "images", "videos" or "news"."""
base_url = "https://www.dogpile.com"
safe_search_map = {0: "none", 1: "moderate", 2: "heavy"}
def init(_):
if dogpile_categ not in ("search", "images", "videos", "news"):
raise ValueError("invalid search type: %s" % dogpile_categ)
def request(query: str, params: "OnlineParams"):
params["url"] = f"{base_url}/api/{dogpile_categ}"
params["method"] = "POST"
params["json"] = {"q": query, "qadf": safe_search_map[params["safesearch"]], "page": params["pageno"]}
return params
def response(resp: "SXNG_Response"):
res = EngineResults()
json_resp = resp.json()
for result in json_resp["results"]:
if dogpile_categ == "search":
res.add(
res.types.MainResult(
url=result["clickUrl"],
title=html_to_text(result["title"]),
content=html_to_text(result["description"]),
)
)
elif dogpile_categ == "news":
res.add(
res.types.MainResult(
url=result["clickUrl"],
title=html_to_text(html.unescape(result["title"])),
content=html_to_text(html.unescape(result["description"])),
thumbnail=result["thumbnailUrl"],
publishedDate=datetime.fromtimestamp(result["date"], tz=timezone.utc),
)
)
elif dogpile_categ == "videos":
res.add(
res.types.LegacyResult(
template="videos.html",
url=result["clickUrl"],
title=html_to_text(result["title"]),
content=html_to_text(result["description"]),
thumbnail=result["thumbnailUrl"],
publishedDate=datetime.fromisoformat(result["publishDate"]),
length=format_duration(result["duration"]),
views=humanize_number(result["viewCount"]),
)
)
elif dogpile_categ == "images":
res.add(
res.types.Image(
url=result["altClickUrl"],
title=html_to_text(result["title"]),
content=html_to_text(result["description"]),
img_src=result["clickUrl"],
thumbnail_src=result["thumbnailUrl"],
resolution=f"{result['width']}x{result['height']}",
img_format=result["format"],
)
)
return res
+1
@@ -203,6 +203,7 @@ about: dict[str, str | bool] = {
categories: list[str] = ["general", "web"] categories: list[str] = ["general", "web"]
paging: bool = True paging: bool = True
time_range_support: bool = True time_range_support: bool = True
language_support = True
safesearch: bool = True safesearch: bool = True
"""DDG-lite: user can't select but the results are filtered.""" """DDG-lite: user can't select but the results are filtered."""
+1
@@ -28,6 +28,7 @@ about = {
"require_api_key": False, "require_api_key": False,
"results": "JSON (site requires js to get images)", "results": "JSON (site requires js to get images)",
} }
language_support = True
# engine dependent config # engine dependent config
categories = [] categories = []
+1
@@ -26,6 +26,7 @@ about = {
"require_api_key": False, "require_api_key": False,
"results": "JSON", "results": "JSON",
} }
language_support = True
# engine dependent config # engine dependent config
categories = ["weather"] categories = ["weather"]
+156
@@ -0,0 +1,156 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""DuckDuckGo Web (general)
This implementation fetches the link to the first API page
(i.e. ``links.duckduckgo.com/d.js?...``) from duckduckgo.com and uses the ``n``
parameter of the API to fetch all subsequent pages.
This also means that it's not possible to immediately search for the third
page - the first and the second page would need to be loaded first.
The reason we can't simply use the `vqd` value is that the API URLs
require an additional parameter `dp`, which seems to be generated server-side, so we
can't build it ourselves and must scrape it from the HTML pages.
"""
import typing as t
from urllib.parse import quote_plus
from lxml import html
from searx.utils import html_to_text, gen_useragent, extract_text, eval_xpath
from searx.result_types import EngineResults
from searx.enginelib import EngineCache
from searx.network import get
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://duckduckgo.com/",
"wikidata_id": "Q12805",
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
}
# engine dependent config
categories = ["general"]
paging = True
_HTTP_User_Agent: str = gen_useragent()
base_url = "https://duckduckgo.com"
CACHE: EngineCache
"""Cache to store the API URLs for combinations of (query, page)."""
def setup(engine_settings: dict[str, str]):
global CACHE # pylint:disable=global-statement
CACHE = EngineCache(engine_settings["name"])
return CACHE
def _fetch_first_page_link(
query: str,
headers: dict[str, str],
):
"""Search for a::
<link id="deep_preload_link" rel="preload" as="script"
href="https://links.duckduckgo.com/d.js?q=rust&t=D&l=us-en&s=0&a=h_&ct=DE&vqd=VQD_VALUE&bing_market=en-US&p_ent=&ex=-1&dp=LONG_TOKEN
>
This points to the first page
""" # pylint:disable=line-too-long
cache_key = _cache_key(query, 1)
cached: str | None = CACHE.get(cache_key)
if cached:
return cached
resp = get(
url=f"{base_url}/?q={quote_plus(query)}&t=h_&ia=web",
headers=headers,
timeout=2,
)
if resp.status_code != 200:
logger.error("vqd: got HTTP %s from duckduckgo.com", resp.status_code)
dom = html.fromstring(resp.text)
first_page_link = extract_text(eval_xpath(dom, "//link[@id='deep_preload_link']/@href"))
if not first_page_link:
logger.error("vqd: failed to load first page JS url from ddg response (return empty string)")
return ""
logger.debug("got link to first page from duckduckgo.com request: '%s'", first_page_link)
CACHE.set(cache_key, first_page_link, expire=7200)
return first_page_link
def _cache_key(query: str, pageno: int) -> str:
return f"nextpage_url|{query}|{pageno}"
def request(query: str, params: "OnlineParams") -> None:
if len(query) >= 500:
# DDG does not accept queries with more than 499 chars
params["url"] = None
return
headers = params["headers"]
# The vqd value is generated from the query and the UA header. To be able
# to reuse the vqd value, the UA header must be static.
headers["User-Agent"] = _HTTP_User_Agent
headers["Accept"] = "*/*"
headers["Referer"] = f"{base_url}/"
headers["Host"] = "duckduckgo.com"
# Sec-Fetch headers are required to not get blocked when sending a Firefox user agent
headers["Sec-Fetch-Dest"] = "script"
headers["Sec-Fetch-Mode"] = "no-cors"
headers["Sec-Fetch-Site"] = "same-site"
api_url = ""
if params["pageno"] > 1:
api_url = CACHE.get(_cache_key(query, params["pageno"]))
else:
api_url = _fetch_first_page_link(query, headers)
if not api_url:
params["url"] = None
return
params["url"] = api_url.replace("/d.js?", "/d.js?o=json&")
# TODO: support safesearch, timerange and engine traits # pylint:disable=fixme
def response(resp: "SXNG_Response"):
res = EngineResults()
res_json = resp.json()
for result in res_json["results"]:
if "u" not in result:
continue
res.add(
res.types.MainResult(url=result["u"], title=html_to_text(result["t"]), content=html_to_text(result["a"]))
)
# link to next page
next_page_path = res_json["results"][-1].get("n")
if next_page_path:
CACHE.set(
_cache_key(resp.search_params["query"], resp.search_params["pageno"] + 1),
base_url + next_page_path,
expire=60 * 60,
)
return res
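The paging scheme in this module is a chain: each response stores the URL of the following page under a `(query, pageno)` key, so a page can only be requested once its predecessor has been loaded. A minimal sketch of that idea, using a plain dict in place of `EngineCache` and a hypothetical class name `NextPageChain`; the sample path is a made-up placeholder, not a real link:

```python
from typing import Optional

BASE_URL = "https://duckduckgo.com"


class NextPageChain:
    """In-memory stand-in for the engine cache (assumption for this sketch)."""

    def __init__(self) -> None:
        self._urls = {}

    @staticmethod
    def cache_key(query: str, pageno: int) -> str:
        return f"nextpage_url|{query}|{pageno}"

    def url_for(self, query: str, pageno: int) -> Optional[str]:
        """URL of a page, or None if the preceding page was never loaded."""
        return self._urls.get(self.cache_key(query, pageno))

    def remember_next(self, query: str, pageno: int, next_path: str) -> None:
        """Store the link found on page `pageno` under the key of page `pageno + 1`."""
        self._urls[self.cache_key(query, pageno + 1)] = BASE_URL + next_path


chain = NextPageChain()
# "/d.js?q=rust&s=10" is a made-up placeholder path for illustration only
chain.remember_next("rust", 1, "/d.js?q=rust&s=10")
print(chain.url_for("rust", 2))
print(chain.url_for("rust", 3))
```

In the sketch, asking for page 3 before page 2 has been loaded yields no URL. The engine handles that case by setting `params["url"] = None`.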
+1 -1
@@ -14,8 +14,8 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": 'HTML', "results": 'HTML',
"language": 'de',
} }
language = "de"
categories = ['dictionaries'] categories = ['dictionaries']
paging = True paging = True
+1 -1
@@ -55,7 +55,7 @@ about = {
'official_api_documentation': 'https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html', 'official_api_documentation': 'https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html',
'use_official_api': True, 'use_official_api': True,
'require_api_key': False, 'require_api_key': False,
'format': 'JSON', "results": "JSON",
} }
base_url = 'http://localhost:9200' base_url = 'http://localhost:9200'
+169
@@ -0,0 +1,169 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Fireball_ is a Germany-based, privacy-focused search engine.
It likely doesn't have its own index, but it's unclear where its results come
from.
.. _Fireball: https://fireball.com
"""
import typing as t
from datetime import datetime
from urllib.parse import urlencode
from searx.enginelib import EngineCache
from searx.exceptions import SearxEngineAPIException
from searx.extended_types import SXNG_Response
from searx.result_types import EngineResults
from searx.network import post
from searx.utils import html_to_text
if t.TYPE_CHECKING:
from searx.search.processors import OnlineParams
about = {
"website": "https://fireball.com",
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
}
base_url = "https://fireball.com"
categories = ["general"]
fireball_category = "web" # values: "web", "news", "videos"
paging = False
safesearch = True
safe_search_map = {0: "off", 1: "moderate", 2: "strict"}
CACHE: EngineCache
"""Cache to store the settings cookie (contains e.g. language, safesearch, ...)."""
CACHE_VALID_DURATION = 30 * 24 * 3600 # one month, same as website
"""Duration how long settings cookies are valid."""
def init(engine_settings: dict[str, t.Any]):
global CACHE # pylint: disable=global-statement
CACHE = EngineCache(engine_settings["name"])
if fireball_category not in ("web", "news", "videos"):
raise ValueError(f"Unsupported category: {fireball_category}")
def _cache_key(fireball_settings: dict[str, str]) -> str:
return f"fireball_settings_{fireball_settings['safesearch']}_{fireball_settings['market']}"
def _get_search_settings_cookie(params: 'OnlineParams') -> str:
"""Get a 'fireball' cookie for the given locale and safesearch setting set
in params."""
# the language is set by only specifying the search country on their
# website, they only list DE and US, but in fact it supports many more
# countries
country = "US"
if params["searxng_locale"] != "all":
language_parts = params["searxng_locale"].split("-")
country = language_parts[-1].upper()
fireball_settings = {
"action": "save",
"language": "en", # language is irrelevant, only changes UI language
"market": country,
"adprovider": "automatic",
"target": "_blank",
"tiles": "on",
"safesearch": safe_search_map[params["safesearch"]],
}
cache_key = _cache_key(fireball_settings)
cached_cookie = CACHE.get(cache_key)
if cached_cookie:
return cached_cookie
resp = post("https://fireball.com/settings", data=fireball_settings)
if not resp.ok:
raise SearxEngineAPIException("failed to obtain cookie for settings")
cookie = resp.cookies.get("fireball")
if not cookie:
raise SearxEngineAPIException("failed to obtain cookie for settings")
CACHE.set(cache_key, cookie, expire=CACHE_VALID_DURATION)
return cookie
def request(query: str, params: "OnlineParams"):
# no matter the category, the request is always the same, i.e. we get all
# different categories with one HTTP request
args = {
"f": "web",
"q": query,
}
params["url"] = f"{base_url}/getResults/?{urlencode(args)}"
params["cookies"]["fireball"] = _get_search_settings_cookie(params)
# referer header has to be set, otherwise the requests get blocked
params["headers"]["Referer"] = f"{base_url}/search?{urlencode(args)}"
def response(resp: "SXNG_Response") -> EngineResults:
res = EngineResults()
json_data = resp.json()
for result in json_data.get(fireball_category, {}).get("results", []):
published_date = None
if result.get("page_age"):
published_date = datetime.fromisoformat(result["page_age"])
if fireball_category == "web":
res.add(
res.types.MainResult(
url=result["url"],
title=html_to_text(result["title"]),
content=html_to_text(result["description"]),
publishedDate=published_date,
)
)
elif fireball_category == "news":
thumbnail: str | None = None
if result.get("thumbnail"):
thumbnail = result["thumbnail"]["src"]
res.add(
res.types.MainResult(
url=result["url"],
title=html_to_text(result["title"]),
content=html_to_text(result["description"]),
thumbnail=thumbnail or "",
publishedDate=published_date,
)
)
elif fireball_category == "videos":
length = None
if result.get("video"):
length = result["video"].get("duration")
res.add(
res.types.LegacyResult(
{
"template": "videos.html",
"url": result["url"],
"title": html_to_text(result["title"]),
"content": html_to_text(result["description"]),
"thumbnail": result.get("thumbnail", {}).get("original"),
"length": length,
"publishedDate": published_date,
}
)
)
return res
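The settings cookie above is cached per combination of safesearch level and market, and the market is taken from the last part of the SearXNG locale. A small sketch of those two steps in isolation, with hypothetical helper names `market_from_locale` and `settings_cache_key` (the engine does both inline):

```python
SAFE_SEARCH = {0: "off", 1: "moderate", 2: "strict"}


def market_from_locale(searxng_locale: str, default: str = "US") -> str:
    """Derive the 'market' value from a SearXNG locale such as 'de-AT'."""
    if searxng_locale == "all":
        return default
    # the last part of the locale is taken as the country and upper-cased
    return searxng_locale.split("-")[-1].upper()


def settings_cache_key(safesearch: str, market: str) -> str:
    """One cached cookie per (safesearch, market) combination."""
    return f"fireball_settings_{safesearch}_{market}"


key = settings_cache_key(SAFE_SEARCH[1], market_from_locale("de-AT"))
print(key)
```

With this logic a locale without a region part, such as `en`, yields `EN`. The diff does not say whether the site accepts that as a market.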
+5 -1
@@ -53,13 +53,17 @@ def response(resp: "SXNG_Response"):
result: dict[str, str] # TBH: dict[str, t.Any] result: dict[str, str] # TBH: dict[str, t.Any]
for result in resp.json()["items"]: for result in resp.json()["items"]:
tags = [
tag_info["tag"] for tag_info in result["tags"] if tag_info["tag"] # pyright: ignore[reportArgumentType]
]
res.add( res.add(
res.types.Image( res.types.Image(
title=result["name"], title=result["name"],
content=", ".join([tag["tag"] for tag in result["tags"]]), # pyright: ignore[reportArgumentType] content=", ".join(tags),
url=_fix_url(result["slug"]), url=_fix_url(result["slug"]),
thumbnail_src=_fix_url(result["png"]), thumbnail_src=_fix_url(result["png"]),
img_src=_fix_url(result["png512"]), img_src=_fix_url(result["png512"]),
img_format="PNG",
author=result["team_name"], author=result["team_name"],
) )
) )
+1 -1
@@ -27,8 +27,8 @@ about = {
'official_api_documentation': None, 'official_api_documentation': None,
'require_api_key': False, 'require_api_key': False,
'results': 'HTML', 'results': 'HTML',
'language': 'de',
} }
language = "de"
paging = True paging = True
categories = ['shopping'] categories = ['shopping']
+1
@@ -57,6 +57,7 @@ max_page = 50
.. _Google max 50 pages: https://github.com/searxng/searxng/issues/2982 .. _Google max 50 pages: https://github.com/searxng/searxng/issues/2982
""" """
time_range_support = True time_range_support = True
language_support = True
safesearch = True safesearch = True
time_range_dict = {"day": "d", "week": "w", "month": "m", "year": "y"} time_range_dict = {"day": "d", "week": "w", "month": "m", "year": "y"}
+1
@@ -43,6 +43,7 @@ max_page = 50
""" """
time_range_support = True time_range_support = True
language_support = True
safesearch = True safesearch = True
filter_mapping = {0: 'images', 1: 'active', 2: 'active'} filter_mapping = {0: 'images', 1: 'active', 2: 'active'}
+1
@@ -66,6 +66,7 @@ about = {
categories = ["news"] categories = ["news"]
paging = False paging = False
time_range_support = False time_range_support = False
language_support = True
# Google-News results are always *SafeSearch*. Option 'safesearch' is set to # Google-News results are always *SafeSearch*. Option 'safesearch' is set to
# False here. # False here.
+90
@@ -0,0 +1,90 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Heexy_ is a minimalist search engine that focuses on privacy.
Although it also supports news and videos, these are not implemented here
because they usually return either no results or only a few irrelevant ones.
It seems to use Bing internally, as the image thumbnails are loaded from Bing.
.. _Heexy: https://docs.heexy.org/introduction
"""
from urllib.parse import urlencode
import typing as t
from searx.exceptions import SearxEngineAccessDeniedException
from searx.result_types import EngineResults
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://heexy.org",
"wikidata_id": None,
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
}
paging = True
safesearch = True
categories = ["general"]
heexy_categ = "web"
"""Category to search in. Can be either "web" or "image"."""
base_url = "https://seapi.heexy.org"
safe_search_map = {0: "off", 1: "on", 2: "on"}
def init(_):
if heexy_categ not in ("web", "image"):
raise ValueError("invalid search category: %s" % heexy_categ)
def request(query: str, params: "OnlineParams") -> None:
args = {
"q": query,
"page": params["pageno"],
"safe": safe_search_map[params["safesearch"]],
}
if params["searxng_locale"] != "all":
args["lang"] = params["searxng_locale"].split("-")[0]
params["url"] = f"{base_url}/search/{heexy_categ}?{urlencode(args)}"
params["headers"]["Origin"] = base_url
def response(resp: "SXNG_Response"):
res = EngineResults()
json_resp = resp.json()
if not json_resp["success"]:
raise SearxEngineAccessDeniedException()
result: dict[str, str]
for result in json_resp["results"]:
if heexy_categ == "web":
res.add(
res.types.MainResult(
url=result["url"],
title=result["title"],
content=result["description"],
)
)
elif heexy_categ == "image":
res.add(
res.types.Image(
title=result["description"],
url=result["url"],
thumbnail_src=result["image"],
img_src=result["rawImage"],
)
)
return res
+1 -1
@@ -34,8 +34,8 @@ about = {
"use_official_api": True, "use_official_api": True,
"require_api_key": False, "require_api_key": False,
"results": "JSON", "results": "JSON",
"language": "it",
} }
language = "it"
def request(query, params): def request(query, params):
+1 -1
@@ -16,8 +16,8 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": 'HTML', "results": 'HTML',
"language": 'fr',
} }
language = "fr"
# engine dependent config # engine dependent config
categories = ['videos'] categories = ['videos']
+1 -1
@@ -14,9 +14,9 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": "JSON", "results": "JSON",
"language": "zh",
} }
language = "zh"
paging = True paging = True
time_range_support = True time_range_support = True
categories = ["videos"] categories = ["videos"]
+3 -3
@@ -13,8 +13,8 @@ about = {
"use_official_api": True, "use_official_api": True,
"require_api_key": False, "require_api_key": False,
"results": 'JSON', "results": 'JSON',
"language": 'ja',
} }
language = "ja"
categories = ['dictionaries'] categories = ['dictionaries']
paging = False paging = False
@@ -110,8 +110,8 @@ def get_infobox(alt_forms, result_url, definitions):
# definitions # definitions
infobox_content.append( infobox_content.append(
''' '''
<small><a href="https://www.edrdg.org/wiki/index.php/JMdict-EDICT_Dictionary_Project">JMdict</a> <small><a href="https://www.edrdg.org/wiki/index.php/JMdict-EDICT_Dictionary_Project">JMdict</a>
and <a href="https://www.edrdg.org/enamdict/enamdict_doc.html">JMnedict</a> and <a href="https://www.edrdg.org/enamdict/enamdict_doc.html">JMnedict</a>
by <a href="https://www.edrdg.org/edrdg/licence.html">EDRDG</a>, CC BY-SA 3.0.</small> by <a href="https://www.edrdg.org/edrdg/licence.html">EDRDG</a>, CC BY-SA 3.0.</small>
<ul> <ul>
''' '''
+3
@@ -79,6 +79,9 @@ from json import loads
from urllib.parse import urlencode from urllib.parse import urlencode
from searx.utils import to_string, html_to_text from searx.utils import to_string, html_to_text
from searx.network import raise_for_httperror from searx.network import raise_for_httperror
from searx.enginelib import EngineAbout
about = EngineAbout()
search_url = None search_url = None
""" """
+190
@@ -0,0 +1,190 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Kagi_ is a paid, privacy-focused search engine.
Using it requires an API key. If you have a Kagi account, you can obtain an API
key in the `API portal`_.
To enable Kagi, add the following to the ``engines`` section of
``settings.yml``:
.. code:: yaml
- name: kagi
engine: kagi
categories: [general, web]
shortcut: kg
api_key: ""
kagi_categ: search
- name: kagi.news
engine: kagi
categories: [news, web]
shortcut: kgn
api_key: ""
kagi_categ: news
- name: kagi.images
engine: kagi
categories: [images, web]
shortcut: kgi
paging: false
api_key: ""
kagi_categ: images
- name: kagi.videos
engine: kagi
categories: [videos, web]
shortcut: kgv
api_key: ""
kagi_categ: videos
.. _Kagi: https://kagi.com
.. _Api Portal: https://help.kagi.com/kagi/api/overview.html
"""
from datetime import datetime, timedelta
import typing as t
import html
from searx.extended_types import SXNG_Response
from searx.result_types import EngineResults
from searx.utils import parse_duration_string
if t.TYPE_CHECKING:
from searx.search.processors import OnlineParams
TimeRangeType = t.Literal["day", "week", "month", "year"]
about = {
"website": "https://kagi.com",
"wikidata_id": "Q26000117",
"official_api_documentation": "https://kagi.com/api/docs/openapi",
"use_official_api": True,
"require_api_key": True,
"results": "JSON",
}
paging = True
"""All categories except the ``images`` category support paging."""
safesearch = True
time_range_support = True
categories = ["general"]
kagi_categ: t.Literal["search", "images", "news", "videos"] = "search"
"""Search category. Supported values: "search" (general), "images", "news", "videos"."""
base_url = "https://kagi.com"
safe_search_map = {0: False, 1: True, 2: True}
time_range_to_days_map: dict[TimeRangeType, int] = {"day": 1, "week": 7, "month": 30, "year": 365}
api_key = ""
"""Kagi API key. Required for using this engine."""
def init(_):
if not api_key:
raise ValueError("api_key is required for using kagi")
if kagi_categ not in ("search", "images", "news", "videos"):
raise ValueError(f"Unsupported category: {kagi_categ}") # pyright: ignore[reportUnreachable]
def request(query: str, params: "OnlineParams"):
# According to the API docs, Kagi supports at maximum page 10
if params["pageno"] > 10:
return
params["headers"]["Authorization"] = f"Bearer {api_key}"
params["url"] = f"{base_url}/api/v1/search"
filters = {}
time_range = params.get("time_range")
if time_range:
# Kagi expects the minimum date to return results from as argument to `after`
time_period = timedelta(days=time_range_to_days_map[time_range])
oldest_result_date = datetime.now() - time_period
filters["after"] = oldest_result_date.strftime("%Y-%m-%d")
# there doesn't seem to be a list of languages anywhere,
# so we just assume that it supports all languages
filters["region"] = "no_region"
if params["searxng_locale"] != "all":
_locale = params["searxng_locale"].split("-")
if len(_locale) > 1:
filters["region"] = _locale[-1].lower()
args: dict[str, t.Any] = {
"query": query,
"page": params["pageno"],
"workflow": kagi_categ,
"safe_search": safe_search_map[params["safesearch"]],
"filters": filters,
}
params["method"] = "POST"
params["json"] = args
def response(resp: "SXNG_Response") -> EngineResults:
res = EngineResults()
json_data: dict[str, t.Any] = resp.json()
if kagi_categ in ("images", "videos"):
# the JSON key is "image" for "images" and "video" for "videos"
json_results = json_data["data"][kagi_categ[:-1]]
else:
json_results = json_data["data"][kagi_categ]
for result in json_results:
published_date: datetime | None = None
if result.get("time"):
published_date = datetime.fromisoformat(result["time"])
if kagi_categ in ("search", "news"):
res.add(
res.types.MainResult(
url=result["url"],
title=html.unescape(result["title"]),
content=html.unescape(result["snippet"]),
thumbnail=result.get("image", {}).get("url") or "",
publishedDate=published_date,
)
)
elif kagi_categ == "images":
res.add(
res.types.Image(
url=result["url"],
title=html.unescape(result.get("title") or ""),
img_src=result.get("image", {}).get("url"),
resolution=f"{result['image']['width']}x{result['image']['height']}",
thumbnail_src=result.get("props", {}).get("thumbnail", {}).get("url"),
)
)
elif kagi_categ == "videos":
length: timedelta | None = None
if result["props"].get("duration"):
length = parse_duration_string(result["props"]["duration"])
res.add(
res.types.LegacyResult(
{
"template": "videos.html",
"url": result["url"],
"title": html.unescape(result["title"]),
"content": html.unescape(result["snippet"]),
"thumbnail": result.get("image", {}).get("url"),
"publishedDate": published_date,
"author": result["props"].get("creator_name"),
"length": length,
}
)
)
for suggestion in json_data["data"].get("related_search", []):
res.add(res.types.LegacyResult({"suggestion": suggestion["title"]}))
return res
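The date and region filters built in `request()` above can be sketched in isolation. This is a minimal sketch, not the engine's code: `TIME_RANGE_TO_DAYS` is a hypothetical stand-in for `time_range_to_days_map` (its real values are not part of this hunk), and both function names are illustrative.

```python
from datetime import datetime, timedelta

# Hypothetical values; the engine's own time_range_to_days_map may differ.
TIME_RANGE_TO_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}


def after_filter(time_range: str, now: datetime) -> str:
    """Return the oldest accepted result date, formatted as YYYY-MM-DD."""
    oldest = now - timedelta(days=TIME_RANGE_TO_DAYS[time_range])
    return oldest.strftime("%Y-%m-%d")


def region_filter(searxng_locale: str) -> str:
    """Return the lowercased country part of a locale, else "no_region"."""
    if searxng_locale != "all":
        parts = searxng_locale.split("-")
        if len(parts) > 1:
            return parts[-1].lower()
    return "no_region"
```

Passing `now` explicitly keeps the date arithmetic testable; the engine itself calls `datetime.now()`.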
-205
@@ -1,205 +0,0 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Karmasearch uses Brave's index, so the results should be the same as Brave's.
However, the advantages of this engine are:
- it has less strict rate-limits
- it has a JSON API, so it's less likely to break
"""
from datetime import datetime
from urllib.parse import urlencode
import typing as t
from dateutil import parser
from searx.enginelib.traits import EngineTraits
from searx.utils import html_to_text
from searx.result_types import EngineResults, MainResult
from searx.result_types._base import LegacyResult
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://karmasearch.org",
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
}
base_url = "https://api.karmasearch.org"
categories = ["web", "general"]
search_type = "web" # supported: web, images, videos, news
# all types except "images" support pagination
paging = True
safesearch = True
time_range_support = True
safe_search_map = {0: "off", 1: "moderate", 2: "strict"}
time_range_map = {"day": "Day", "week": "Week", "month": "Month", "year": "Year"}
def init(_):
if search_type not in ("web", "images", "videos", "news"):
raise ValueError(f"invalid search type: {search_type}")
def request(query: str, params: "OnlineParams") -> None:
engine_region: str = traits.get_region(params["searxng_locale"]) or "en-US"
args: dict[str, str | int] = {
"searchTerm": query,
"adultFilter": safe_search_map[params["safesearch"]],
"pageNumber": params["pageno"],
"country": engine_region.split("-")[-1],
"userLanguage": "en", # UI language: en, es or fr / no effect on search results
"market": engine_region,
}
if params["time_range"]:
args["freshness"] = time_range_map[params["time_range"]]
# Needed to circumvent Cloudflare bot protection
params['headers']['Referer'] = "https://karmasearch.org"
params["url"] = f"{base_url}/search/{search_type}?{urlencode(args)}"
def _parse_date(date_string: str) -> datetime | None:
try:
return parser.parse(date_string)
except parser.ParserError:
return None
def _parse_general(result: dict[str, str]):
return MainResult(
url=result["url"],
title=result["title"],
content=html_to_text(result["description"]),
thumbnail=result.get("thumbnail", ""),
)
def _parse_news(result: dict[str, str]) -> LegacyResult:
return LegacyResult(
{
"url": result["url"],
"title": result["title"],
"content": html_to_text(result["description"]),
"thumbnail": result.get("thumbnail"),
"publishedDate": _parse_date(result.get("age", "")),
}
)
def _parse_videos(result: dict[str, t.Any]) -> LegacyResult:
return LegacyResult(
{
"template": "videos.html",
"url": result["url"],
"title": result["title"],
"content": html_to_text(result["description"]),
"thumbnail": result.get("thumbnail"),
"publishedDate": _parse_date(result.get("age", "")),
"length": result.get("video", {}).get("duration"),
}
)
def _parse_images(result: dict[str, t.Any]) -> LegacyResult:
return LegacyResult(
{
"template": "images.html",
"url": result["url"],
"title": result["title"],
"content": "",
"img_src": result.get("properties", {}).get("url"),
"thumbnail_src": result.get("thumbnail", {}).get("src"),
}
)
def response(resp: "SXNG_Response") -> EngineResults:
res = EngineResults()
json_resp: dict[str, t.Any] = resp.json()
if not isinstance(json_resp, dict):
return res # pyright: ignore[reportUnreachable]
for result in json_resp["results"]:
# hide sponsored results
if result.get("sponsored", False):
continue
if "videos" in result:
for videos_result in result["videos"]:
res.add(_parse_videos(videos_result))
continue
if "news" in result:
for news_result in result["news"]:
res.add(_parse_news(news_result))
continue
if search_type == "news":
res.add(_parse_news(result))
elif search_type == "videos":
res.add(_parse_videos(result))
elif search_type == "images":
res.add(_parse_images(result))
else:
res.add(_parse_general(result))
return res
def fetch_traits(engine_traits: EngineTraits):
"""Fetch :ref:`languages <brave languages>` and :ref:`regions <brave
regions>` from Brave."""
# pylint: disable=import-outside-toplevel, too-many-branches
from lxml import html
import babel
from searx.locales import region_tag
from searx.network import get # see https://github.com/searxng/searxng/issues/762
# from searx.engines.xpath import extract_text
from searx.utils import gen_useragent
headers = {
"Accept-Encoding": "gzip, deflate",
"Cache-Control": "no-cache",
"DNT": "1",
"Connection": "keep-alive",
"Accept-Language": "en,en-US;q=0.7,en;q=0.3",
"User-Agent": gen_useragent(),
}
resp = get("https://karmasearch.org/settings", headers=headers, timeout=5)
if not resp.ok:
raise RuntimeError("Response from Brave languages is not OK.")
dom = html.fromstring(resp.text)
for option in dom.xpath("//select[@name='country']/option"):
country_tag: str = option.get("value", "")
try:
sxng_tag = region_tag(babel.Locale.parse(country_tag, sep="-"))
except babel.UnknownLocaleError:
# silently ignore unknown languages
continue
# print("%-20s: %s <-- %s" % (extract_text(option), country_tag, sxng_tag))
conflict = engine_traits.regions.get(sxng_tag)
if conflict:
if conflict != country_tag:
print("CONFLICT: babel %s --> %s, %s" % (sxng_tag, conflict, country_tag))
continue
engine_traits.regions[sxng_tag] = country_tag
+210
@@ -0,0 +1,210 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Luxxle_ is an American search engine focusing on providing "unbiased"
results.
.. _Luxxle: https://luxxle.com
"""
from json import dumps
from urllib.parse import quote_plus, unquote_plus
import typing as t
from lxml import html
from searx.result_types import EngineResults
from searx.network import get
from searx.utils import (
extr,
gen_useragent,
eval_xpath_list,
extract_text,
eval_xpath,
parse_duration_string,
ElementType,
)
if t.TYPE_CHECKING:
from searx.search.processors import OnlineParams
from searx.extended_types import SXNG_Response
about = {
"website": "https://luxxle.com",
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
}
categories = []
safesearch = True
base_url = "https://luxxle.com"
luxxle_categ = "search"
"""Supported categories: "search", "news", "images", "videos"."""
# otherwise all requests get blocked (http2-fingerprinted probably)
enable_http2 = False
safe_search_map = {0: "Off", 1: "Moderate", 2: "Strict"}
def init(_):
if luxxle_categ not in ("search", "images", "videos", "news"):
raise ValueError("invalid luxxle category: %s" % luxxle_categ)
def _obtain_telemetry_data(query: str) -> dict[str, str]:
"""This data is required for sending search queries.
The luxsearch page (for general results) has a JS dict called ``telemetryData``
that contains all the important info, but the others don't, so we don't use it
here. But it's useful to understand which info is needed.
.. code-block:: javascript
var telemetryData = {
errorInformation: errorInformation,
query: "youapps club",
ip: "10.10.10.10",
timeOf: "1781119224",
authorization: "db889e0ae67d3c320858ad97f51cc4f0a4d8e1913c4f5ebe5d2eafef606521dd",
};
This data is only valid for a very short time.
"""
resp = get(
f"{base_url}/lux{luxxle_categ}?q={quote_plus(query)}", headers={"User-Agent": gen_useragent(), "Sec-GPC": "1"}
)
def extr_js_variable(name: str) -> str:
val = extr(resp.text, f"var {name} = \"", "\";")
if not val:
val = extr(resp.text, f"var {name} = '", "';")
return val
return {
"ip": extr_js_variable("ip"),
"timeOf": extr_js_variable("timeOf"),
"authorization": extr_js_variable("authorization"),
"preferencesCookie": extr_js_variable("preferencesCookie"),
}
def request(query: str, params: "OnlineParams") -> None:
telemetry_data = _obtain_telemetry_data(query)
market = params["searxng_locale"]
if market == "all":
market = "en-US"
params["url"] = f"{base_url}/load_{luxxle_categ}.php"
search_data = {
**telemetry_data,
"query": query,
"market": market,
"safeSearch": safe_search_map[params["safesearch"]],
"freshness": "",
"language": "english", # UI language
}
if luxxle_categ == "images":
# for some reason this is sent as form data
params["data"] = {"searchData": dumps(search_data)}
else:
params["json"] = {"searchData": search_data}
params["method"] = "POST"
def _extract_url_from_redirect(url: str):
# urls usually look like "/redirect?url=<url>"
query_start_idx = url.find("?url=")
if query_start_idx < 0:
return url
url_start_idx = query_start_idx + len("?url=")
return unquote_plus(url[url_start_idx:])
def _general_results(doc: ElementType, res: EngineResults):
for result in eval_xpath_list(doc, "//div[@id='mainResults']/div[contains(@class, 'resultsContainer')]"):
res.add(
res.types.MainResult(
url=_extract_url_from_redirect(
extract_text(eval_xpath(result, "./div[contains(@class, 'urlAddressLink')]/a/@href")) or ""
),
title=extract_text(eval_xpath(result, "./div[contains(@class, 'urlname')]")) or "",
content=extract_text(eval_xpath(result, "./div[contains(@class, 'urlSnippet')]")) or "",
)
)
def _news_results(doc: ElementType, res: EngineResults):
for result in eval_xpath_list(
doc, "//div[contains(@class, 'newsResults')]/div[contains(@class, 'mediaResultNewsPage')]"
):
res.add(
res.types.MainResult(
url=_extract_url_from_redirect(
extract_text(eval_xpath(result, ".//div[contains(@class, 'mediaResultNewsPageTitle')]/a/@href"))
or ""
),
title=extract_text(eval_xpath(result, ".//div[contains(@class, 'mediaResultNewsPageTitle')]/a")) or "",
content=extract_text(eval_xpath(result, ".//div[contains(@class, 'mediaResultNewsPageDescription')]"))
or "",
thumbnail=extract_text(eval_xpath(result, ".//div[contains(@class, 'mediaResultThumbnail')]//img/@src"))
or "",
)
)
def _video_results(doc: ElementType, res: EngineResults):
for result in eval_xpath_list(doc, "//div[@id='mainResults']/div[contains(@class, 'mediaResult')]"):
res.add(
res.types.MainResult(
template="videos.html",
url=extract_text(eval_xpath(result, "./@data-url")) or "",
title=extract_text(eval_xpath(result, ".//div[contains(@class, 'mediaResultTitleVideo')]/a")) or "",
content=extract_text(eval_xpath(result, ".//div[contains(@class, 'mediaResultDescription')]")) or "",
thumbnail=extract_text(eval_xpath(result, ".//img[contains(@class, 'videoThumbnail')]/@src")) or "",
author=extract_text(eval_xpath(result, ".//div[contains(@class, 'videoCreator')]")) or "",
length=parse_duration_string(
extract_text(eval_xpath(result, ".//span[contains(@class, 'mediaResultDuration')]")) or ""
),
)
)
def _image_results(doc: ElementType, res: EngineResults):
for result in eval_xpath_list(doc, "//div[contains(@class, 'imageResultsWrapper')]/div"):
res.add(
res.types.Image(
url=_extract_url_from_redirect(
extract_text(eval_xpath(result, ".//a[contains(@class, 'imageResultSource')]/@href")) or ""
),
title=extract_text(eval_xpath(result, ".//a[contains(@class, 'imageResultTitle')]")) or "",
source=extract_text(eval_xpath(result, ".//div[contains(@class, 'imageResultSource')]")) or "",
thumbnail_src=extract_text(eval_xpath(result, "./@data-thumbnail-src")) or "",
img_src=extract_text(eval_xpath(result, "./@data-image-src")) or "",
)
)
def response(resp: "SXNG_Response") -> EngineResults:
doc = html.fromstring(resp.text)
res = EngineResults()
match luxxle_categ:
case "search":
_general_results(doc, res)
case "images":
_image_results(doc, res)
case "videos":
_video_results(doc, res)
case "news":
_news_results(doc, res)
case _:
raise ValueError("unsupported category: %s" % luxxle_categ)
return res
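The lookup done by `extr_js_variable` in `_obtain_telemetry_data` can be illustrated without `searx.utils.extr`. The sketch below swaps in a regular expression instead of the engine's two `extr` calls; `extract_js_string` and the sample page text are hypothetical and only assume that the values appear as `var name = "..."` or `var name = '...'` assignments.

```python
import re


def extract_js_string(page: str, name: str) -> str:
    """Return the string literal assigned to ``var <name>``, or "" if absent."""
    # Accept either quote style; the backreference \1 requires the closing
    # quote to match the opening one.
    pattern = r"var " + re.escape(name) + r" = (['\"])(.*?)\1;"
    match = re.search(pattern, page)
    return match.group(2) if match else ""


# Hypothetical page fragment mixing both quote styles.
SAMPLE_PAGE = "var ip = \"10.10.10.10\";\nvar timeOf = '1781119224';"
```

Returning an empty string for a missing variable mirrors the fallback behaviour of the engine helper, so a missing value surfaces as an empty telemetry field.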
+1 -1
@@ -11,9 +11,9 @@ about = {
"use_official_api": True, "use_official_api": True,
"require_api_key": False, "require_api_key": False,
"results": 'JSON', "results": 'JSON',
"language": "de",
} }
language = "de"
categories = ['videos'] categories = ['videos']
paging = True paging = True
time_range_support = False time_range_support = False
+1
@@ -20,6 +20,7 @@ about = {
} }
paging = True # paging is only supported for general search paging = True # paging is only supported for general search
safesearch = True safesearch = True
language_support = True
time_range_support = True # time range search is supported for general and news time_range_support = True # time range search is supported for general and news
max_page = 10 max_page = 10
+2 -1
@@ -35,8 +35,9 @@ about = {
'use_official_api': False, 'use_official_api': False,
'require_api_key': False, 'require_api_key': False,
'results': 'JSON', 'results': 'JSON',
'language': 'de',
} }
language = "de"
paging = True paging = True
categories = ["movies"] categories = ["movies"]
+1 -1
@@ -26,8 +26,8 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": "HTML", "results": "HTML",
"language": "ko",
} }
language = "ko"
categories = [] categories = []
paging = True paging = True
+1 -1
@@ -13,8 +13,8 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": "HTML", "results": "HTML",
"language": "ja",
} }
language = "ja"
categories = ["videos"] categories = ["videos"]
paging = True paging = True
+2 -10
@@ -4,7 +4,6 @@
.. _Odysee: https://github.com/OdyseeTeam/odysee-frontend .. _Odysee: https://github.com/OdyseeTeam/odysee-frontend
""" """
import time
from datetime import datetime from datetime import datetime
from urllib.parse import urlencode from urllib.parse import urlencode
@@ -12,6 +11,7 @@ import babel
from searx.enginelib.traits import EngineTraits from searx.enginelib.traits import EngineTraits
from searx.locales import language_tag from searx.locales import language_tag
from searx.utils import format_duration
# Engine metadata # Engine metadata
about = { about = {
@@ -26,6 +26,7 @@ about = {
# Engine configuration # Engine configuration
paging = True paging = True
time_range_support = True time_range_support = True
language_support = True
results_per_page = 20 results_per_page = 20
categories = ["videos"] categories = ["videos"]
@@ -61,15 +62,6 @@ def request(query, params):
return params return params
# Format the video duration
def format_duration(duration):
seconds = int(duration)
length = time.gmtime(seconds)
if length.tm_hour:
return time.strftime("%H:%M:%S", length)
return time.strftime("%M:%S", length)
def response(resp): def response(resp):
data = resp.json() data = resp.json()
results = [] results = []
+1
@@ -25,6 +25,7 @@ about = {
"require_api_key": False, "require_api_key": False,
"results": "JSON", "results": "JSON",
} }
language_support = True
# engine dependent config # engine dependent config
categories = ["videos"] categories = ["videos"]
+2 -2
@@ -28,7 +28,7 @@ search_string = 'api/?{query}&limit={limit}'
result_base_url = 'https://openstreetmap.org/{osm_type}/{osm_id}' result_base_url = 'https://openstreetmap.org/{osm_type}/{osm_id}'
# list of supported languages # list of supported languages
supported_languages = ['de', 'en', 'fr', 'it'] photon_supported_languages = ["de", "en", "fr", "it"]
# do search-request # do search-request
@@ -37,7 +37,7 @@ def request(query, params):
if params['language'] != 'all': if params['language'] != 'all':
language = params['language'].split('_')[0] language = params['language'].split('_')[0]
if language in supported_languages: if language in photon_supported_languages:
params['url'] = params['url'] + "&lang=" + language params['url'] = params['url'] + "&lang=" + language
# using SearXNG User-Agent # using SearXNG User-Agent
+62
@@ -0,0 +1,62 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Podchaser (podcasts)"""
import typing as t
from datetime import datetime
from urllib.parse import urlencode
from searx.result_types import EngineResults
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://www.podchaser.com",
"official_api_documentation": "https://www.podchaser.com/api",
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
}
categories = []
paging = True
base_url = "https://api.podchaser.com"
page_size = 25
def request(query: str, params: "OnlineParams") -> None:
args = {
"filters[term]": query,
"limit": page_size,
"offset": (params["pageno"] - 1) * page_size,
"sort_direction": "desc",
"sort_order": "SORT_ORDER_RELEVANCE",
}
params["url"] = f"{base_url}/podcasts?{urlencode(args)}"
params["headers"]["Accept"] = "application/prs.podchaser.v2+json"
def response(resp: "SXNG_Response"):
res = EngineResults()
json_results: list[dict[str, str]] = resp.json()["entities"] # pyright: ignore[reportAny]
for result in json_results:
metadata = [f"{result['number_of_episodes']} episodes"]
if result["categories"]:
metadata.append(", ".join(c["text"] for c in result["categories"])) # pyright: ignore[reportArgumentType]
res.add(
res.types.MainResult(
url=result["feed_url"],
title=result["title"],
content=result["description"],
thumbnail=result["image_url"],
publishedDate=datetime.strptime(result["created_at"], "%Y-%m-%d %H:%M:%S"),
metadata=" | ".join(metadata),
)
)
return res
+1 -1
@@ -77,7 +77,7 @@ from searx.utils import gen_useragent, html_to_text, parse_duration_string
about = { about = {
"website": "https://presearch.io", "website": "https://presearch.io",
"wikidiata_id": "Q7240905", "wikidata_id": "Q7240905",
"official_api_documentation": "https://docs.presearch.io/nodes/api", "official_api_documentation": "https://docs.presearch.io/nodes/api",
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
+217
@@ -0,0 +1,217 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Privacywall_ claims to be a "privacy-friendly" search engine,
but according to a `Privacyguides discussion`_ it's sharing private
user information with Microsoft and Amazon.
.. _Privacywall : https://www.privacywall.org
.. _`Privacyguides discussion` : https://discuss.privacyguides.net/t/how-is-privacy-wall-search-engine/29486
"""
import typing as t
from urllib.parse import urlencode, unquote_plus
from lxml import html
import babel
from searx.enginelib.traits import EngineTraits
from searx.utils import eval_xpath_list, eval_xpath, extract_text, get_embeded_stream_url, extr
from searx.locales import region_tag
from searx.result_types import EngineResults
if t.TYPE_CHECKING:
from lxml.etree import ElementBase
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://privacywall.org",
"wikidata_id": None,
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
}
paging = True
safesearch = True
time_range_support = True
base_url = "https://www.privacywall.org"
privacywall_category = "general"
"""Supported categories are ``general``, ``videos`` and ``images``."""
# corresponds to the "k" query param
safesearch_map = {0: "off", 1: "on", 2: "on"}
# page number sent for videos (is independent of the query) - certainly there's
# a pattern in this, but for our use case it's enough to just support the first
# 10 pages by hardcoding the page "numbers"
video_page_map = {
2: "CAoQAA",
3: "CBQQAA",
4: "CB4QAA",
5: "CCgQAA",
6: "CDIQAA",
7: "CDwQAA",
8: "CEYQAA",
9: "CFAQAA",
10: "CFoQAA",
}
def init(_):
if privacywall_category not in ("general", "images", "videos"):
raise ValueError("invalid category: %s" % privacywall_category)
def request(query: str, params: "OnlineParams") -> None:
if params["pageno"] > 10:
params["url"] = None
return
args = {"q": query, "safesearch": safesearch_map[params["safesearch"]]}
if params["searxng_locale"] != "all":
args["cc"] = traits.get_region(params["searxng_locale"]) or "US"
if params["time_range"]:
# time range uses the same "day", "week", "month", "year" naming scheme as SearXNG
args["time"] = params["time_range"]
if params["pageno"] > 1:
if privacywall_category == "images":
args["page"] = str(params["pageno"])
elif privacywall_category == "videos":
args["page"] = video_page_map[params["pageno"]]
else:
raise ValueError("general engine does not support pagination")
if privacywall_category == "general":
params["url"] = f"{base_url}/search/secure/?{urlencode(args)}"
else:
params["url"] = f"{base_url}/{privacywall_category}/?{urlencode(args)}"
def _general_results(doc: "ElementBase") -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(doc, "//div[@id='pw-results-main']/div[contains(@class, 'result-card')]"):
(
res.add(
res.types.MainResult(
url=extract_text(eval_xpath(result, ".//a[contains(@class, 'result-url-anchor')]/@href")) or "",
title=extract_text(eval_xpath(result, ".//div[contains(@class, 'result_title')]")) or "",
content=extract_text(eval_xpath(result, ".//div[contains(@class, 'result-description')]")) or "",
),
)
)
return res
def _extract_thumbnail_url(url: str) -> str:
"""
Get the URL from strings like "/videos/video.php?id=<urlencoded-urlhere>".
"""
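url_start = url.find("?id=")
if url_start < 0:
# no "?id=" parameter: return the URL unchanged instead of slicing at a bogus offset
return url
thumbnail = unquote_plus(url[url_start + len("?id=") :])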
return thumbnail
def _image_results(doc: "ElementBase") -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(doc, "//div[@id='container']/div[contains(@class, 'imgcontainer')]"):
(
res.add(
res.types.Image(
url=extract_text(eval_xpath(result, "./a/@href")) or "",
content=extract_text(eval_xpath(result, "./a/@alt")) or "",
thumbnail_src=_extract_thumbnail_url(extract_text(eval_xpath(result, ".//img/@src")) or ""),
source=extract_text(eval_xpath(result, ".//div[contains(@class, 'image-source-badge')]")) or "",
),
)
)
return res
def _video_results(doc: "ElementBase") -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(
doc, "//div[contains(@class, 'video-container')]/div[contains(@class, 'video-card')]"
):
url = extract_text(eval_xpath(result, "./a/@href")) or ""
if not url:
continue
thumbnail = None
# looks like <div style="background-image:url(/videos/video.php?id=<urlencoded-urlhere>);position:relative">
thumbnail_style = extract_text(eval_xpath(result, ".//div[contains(@class, 'video-img')]/@style"))
if thumbnail_style:
thumbnail = _extract_thumbnail_url(extr(thumbnail_style, ":url(", ")"))
res.add(
res.types.LegacyResult(
template="videos.html",
url=url,
title=extract_text(eval_xpath(result, ".//h2[contains(@class, 'video-card-title')]")) or "",
content=extract_text(eval_xpath(result, ".//p")) or "",
thumbnail=thumbnail or "",
iframe_src=get_embeded_stream_url(url) or "",
)
)
return res
def response(resp: "SXNG_Response") -> EngineResults:
doc = html.fromstring(resp.text)
match privacywall_category:
case "general":
return _general_results(doc)
case "images":
return _image_results(doc)
case "videos":
return _video_results(doc)
case _:
raise ValueError("invalid category: %s" % privacywall_category)
def fetch_traits(engine_traits: EngineTraits) -> None:
"""Fetch regions from Privacywall."""
# pylint: disable=import-outside-toplevel
from searx.network import get # see https://github.com/searxng/searxng/issues/762
from searx.utils import gen_useragent
headers = {
"User-Agent": gen_useragent(),
}
resp = get(base_url, headers=headers)
if not resp.ok:
raise RuntimeError("Response from Privacywall is not OK.")
dom = html.fromstring(resp.text)
# <div class="dropdown-option" onclick="changeMenuLanguage(&quot;CZ&quot;)"></div>
for onclick_listener in eval_xpath(
dom, "//div[contains(@class, 'lang-menu')]//div[contains(@class, 'dropdown-option')]/@onclick"
):
# this is either a normal lang-country tag (e.g. cs-cz) or only a country code (e.g. de, at, ...)
country_tag = extr(onclick_listener, "(\"", "\")")
# the locale tag is only a country tag, so we get the languages from the list of official languages
# of the country
lang_tag: str
for lang_tag in babel.languages.get_official_languages(country_tag, de_facto=True): # pyright: ignore
try:
sxng_tag = region_tag(babel.Locale.parse(f"{lang_tag}_{country_tag.upper()}"))
except babel.UnknownLocaleError:
# silently ignore unknown languages
continue
conflict = engine_traits.regions.get(sxng_tag)
if conflict:
if conflict != country_tag:
print("CONFLICT: babel %s --> %s, %s" % (sxng_tag, conflict, country_tag))
continue
engine_traits.regions[sxng_tag] = country_tag
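The comment above `video_page_map` suspects a pattern in the hardcoded tokens. Decoding them suggests one: each token is the unpadded base64 encoding of the four bytes `08 <offset> 10 00`, where the offset is `(pageno - 1) * 10`. The sketch below reproduces the nine known tokens under that assumption. `video_page_token` is a hypothetical helper, the meaning of the individual bytes is a guess, and nothing here is confirmed by Privacywall, so pages beyond those observed may behave differently.

```python
import base64

# Tokens copied from the engine's video_page_map.
KNOWN_TOKENS = {
    2: "CAoQAA",
    3: "CBQQAA",
    4: "CB4QAA",
    5: "CCgQAA",
    6: "CDIQAA",
    7: "CDwQAA",
    8: "CEYQAA",
    9: "CFAQAA",
    10: "CFoQAA",
}


def video_page_token(pageno: int, page_size: int = 10) -> str:
    """Build a page token, assuming the byte layout 08 <offset> 10 00."""
    offset = (pageno - 1) * page_size
    # Only single-byte offsets are handled; larger values would presumably
    # need a multi-byte encoding that has not been observed.
    if not 0 < offset < 128:
        raise ValueError("offset must be between 1 and 127")
    raw = bytes([0x08, offset, 0x10, 0x00])
    return base64.b64encode(raw).decode("ascii").rstrip("=")
```

Keeping the hardcoded map, as the engine does, is the conservative choice; the generator is mainly useful for checking that the map contains no typos.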
+1 -1
@@ -16,8 +16,8 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": "HTML", "results": "HTML",
"language": "zh",
} }
language = "zh"
# Engine Configuration # Engine Configuration
categories = [] categories = []
+15 -1
@@ -6,6 +6,7 @@
""" """
import os
import random import random
import socket import socket
from urllib.parse import urlencode from urllib.parse import urlencode
@@ -25,6 +26,7 @@ about = {
"require_api_key": False, "require_api_key": False,
"results": "JSON", "results": "JSON",
} }
language_support = True
paging = True paging = True
categories = ["music", "radio"] categories = ["music", "radio"]
@@ -59,7 +61,19 @@ seconds."""
def init(_): def init(_):
global CACHE # pylint: disable=global-statement global CACHE # pylint: disable=global-statement
CACHE = EngineCache("radio_browser") CACHE = EngineCache("radio_browser")
server_list()
# In an environment with competing processes, the initial loading of the
# cache is required only once.
eng_state: str | None = CACHE.get("eng_state")
if not eng_state or not eng_state.startswith("STATE:"):
CACHE.set("eng_state", f"STATE: being initialized by PID {os.getpid()}")
try:
server_list()
except Exception:
CACHE.set("eng_state", f"ERROR: initialization by PID {os.getpid()} failed.")
raise
else:
logger.debug(eng_state)
def server_list() -> list[str]: def server_list() -> list[str]:
+120
@@ -0,0 +1,120 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Resulthunter_ is an American search engine with results from Brave.
.. _Resulthunter : https://resulthunter.com
"""
import typing as t
from urllib.parse import urlencode
from lxml import html
from searx import locales
from searx.result_types import EngineResults
from searx.utils import eval_xpath_list, eval_xpath, extract_text
# as it uses brave internally, it has the same locales and timerange/safesearch types
from searx.engines.brave import safesearch_map, time_range_map, fetch_traits # pylint: disable=unused-import
if t.TYPE_CHECKING:
from lxml.etree import ElementBase
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
from searx.enginelib.traits import EngineTraits
traits: EngineTraits
about = {
"website": "https://resulthunter.com",
"wikidata_id": None,
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
}
paging = True
safesearch = True
time_range_support = True
base_url = "https://resulthunter.com"
resulthunter_categ = "web"
"""Supported categories are ``web`` and ``images``."""
def init(_):
if resulthunter_categ not in ("web", "images"):
raise ValueError("invalid category: %s" % resulthunter_categ)
def request(query: str, params: "OnlineParams") -> None:
args = {
"q": query,
"search_type": resulthunter_categ,
"offset": params["pageno"] - 1,
}
# uses Brave's engine traits
ui_lang = locales.get_engine_locale(params["searxng_locale"], traits.custom["ui_lang"], "all")
if ui_lang and ui_lang != "all":
args["search_lang"] = ui_lang.split("-")[0]
engine_region = traits.get_region(params["searxng_locale"], "all")
if engine_region and engine_region != "all":
args["country"] = engine_region
if params["time_range"]:
args["freshness"] = time_range_map[params["time_range"]]
params["cookies"]["safesearch"] = safesearch_map[params["safesearch"]]
params["url"] = f"{base_url}/search?{urlencode(args)}"
def _general_results(doc: "ElementBase") -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(
doc, "//div[contains(@class, 'organic-results-container')]/div/div[contains(@class, 'group')]"
):
url = extract_text(eval_xpath(result, ".//a/@href"))
if not url:
continue
(
res.add(
res.types.MainResult(
url=url,
title=extract_text(eval_xpath(result, ".//a/h3")) or "",
content=extract_text(eval_xpath(result, ".//p")) or "",
),
)
)
return res
def _image_results(doc: "ElementBase") -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(
doc, "//div[contains(@class, 'organic-results-container')]//a[contains(@class, 'group')]"
):
(
res.add(
res.types.Image(
url=extract_text(eval_xpath(result, "./@href")) or "",
title=extract_text(eval_xpath(result, "./img/@alt")) or "",
thumbnail_src=extract_text(eval_xpath(result, "./img/@src")) or "",
),
)
)
return res
def response(resp: "SXNG_Response") -> EngineResults:
doc = html.fromstring(resp.text)
match resulthunter_categ:
case "web":
return _general_results(doc)
case "images":
return _image_results(doc)
case _:
raise ValueError("invalid resulthunter category: %s" % resulthunter_categ)
+98
@@ -0,0 +1,98 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Search engines by System1 (general).
System1 is an advertising company that serves each of its search engines from a
subdomain of ``s1search.co``. As a result, it has more than 1000 subdomains,
some of which work and some of which don't.
Some of the engines get their results from Google, others get them from Yahoo.
"""
import typing as t
from urllib.parse import urlencode, urlparse, parse_qs
from lxml import html
from searx.result_types import EngineResults
from searx.enginelib import EngineCache
from searx.utils import eval_xpath_list, eval_xpath, extract_text
if t.TYPE_CHECKING:
from searx.search.processors import OnlineParams
from searx.extended_types import SXNG_Response
about = {
"website": "https://s1search.co",
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
}
base_url = "" # alternatively: search.gmx.net
categories = ["general"]
paging = True
CACHE: EngineCache
"""Cache to store verification tokens for pagination."""
def init(_):
if not base_url:
raise ValueError("base_url must be set")
def setup(engine_settings: dict[str, t.Any]) -> bool:
global CACHE # pylint: disable=global-statement
CACHE = EngineCache(engine_settings["name"])
return True
def _cache_key(query: str, pageno: int) -> str:
return f"{query}|{pageno}"
def request(query: str, params: "OnlineParams"):
args = {"q": query, "page": params["pageno"]}
if params["pageno"] > 1:
sc = CACHE.get(_cache_key(query, params["pageno"]))
# sc is required for pagination to avoid rate-limits
if not sc:
params["url"] = None
return
args["sc"] = sc
params["url"] = f"{base_url}/serp?{urlencode(args)}"
def response(resp: "SXNG_Response") -> EngineResults:
res = EngineResults()
doc = html.fromstring(resp.text)
for suggestion in eval_xpath_list(doc, "//div[@class='aylf-yahoo-bottom' or @class='aylf-yahoo-sidebar']/div"):
res.add(res.types.LegacyResult({"suggestion": extract_text(suggestion)}))
for result in eval_xpath_list(
doc, "//div[contains(@class, 'web-yahoo') or contains(@class, 'web-google')]/div[contains(@class, '__result')]"
):
res.add(
res.types.MainResult(
url=extract_text(eval_xpath(result, ".//a[contains(@class, 'title')]/@href")),
title=extract_text(eval_xpath(result, ".//a[contains(@class, 'title')]")),
content=extract_text(eval_xpath(result, ".//span[contains(@class, 'description') or @class='']")),
)
)
# store pagination keys to be able to access next pages
for page_href in eval_xpath_list(doc, "//a[contains(@class, 'pagination__num')]"):
# target_url looks like "/serp?q=test&page=2&sc=RVlBPMDPVhWR20"
target_url = extract_text(eval_xpath(page_href, "./@href"))
target_url = parse_qs(urlparse(target_url).query)
pageno = int(target_url["page"][0])
sc = target_url["sc"][0]
CACHE.set(_cache_key(resp.search_params["query"], pageno), sc)
return res
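The pagination handling above can be sketched in isolation. This is a minimal illustration only: a plain dict stands in for `EngineCache`, and the helper name `store_pagination_tokens` is hypothetical. The href format is the one quoted in the engine's comment.

```python
from urllib.parse import parse_qs, urlparse

# Stand-in for EngineCache (assumption: a plain dict is enough for a demo).
token_cache: dict[str, str] = {}


def cache_key(query: str, pageno: int) -> str:
    return f"{query}|{pageno}"


def store_pagination_tokens(query: str, hrefs: list[str]) -> None:
    """Remember the ``sc`` token of every pagination link."""
    for href in hrefs:
        qs = parse_qs(urlparse(href).query)
        if "page" not in qs or "sc" not in qs:
            continue  # skip links without a token instead of raising KeyError
        token_cache[cache_key(query, int(qs["page"][0]))] = qs["sc"][0]


store_pagination_tokens(
    "test",
    ["/serp?q=test&page=2&sc=RVlBPMDPVhWR20", "/serp?q=test&page=3"],
)
print(token_cache)
```

Unlike the engine code, the sketch skips links that lack a `page` or `sc` parameter; whether such links occur on the real site is not known from the diff.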
+113
@@ -0,0 +1,113 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Seek ninja (general)"""
from json import loads
from hashlib import sha256
from urllib.parse import urlencode, quote_plus
import typing as t
from searx.extended_types import SXNG_Response
from searx.network import get
from searx.result_types import EngineResults
from searx.utils import extr, html_to_text
if t.TYPE_CHECKING:
from searx.search.processors import OnlineParams
about = {
"website": "https://seek.ninja",
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
}
safesearch = True
base_url = "https://seek.ninja"
categories = ["general"]
safe_search_map = {0: "off", 1: "moderate", 2: "strict"}
PowChallenge = dict[str, t.Any]
def _get_challenge(query: str) -> PowChallenge:
"""Extract the challenge parameters (i.e. nonce, difficulty, ...) from the
search website."""
resp = get(f"{base_url}/s?q={quote_plus(query)}")
challenge_raw_json = "{" + extr(resp.text, "pow: {", "},") + "}"
return loads(challenge_raw_json)
def _solve_pow(challenge: PowChallenge) -> list[int]:
"""Solves a proof-of-work SHA256 challenge. This is a 1:1 port of the
site's JS code.
At a high level, it searches for ``k`` solutions whose sha256 hash begins
with ``leading`` zeros, i.e.
.. code:: js
sha256(nonce || solution).startswith("0" * leading)
"""
nonce = challenge["nonce"]
k = int(challenge["k"])
indifficulty = float(challenge["indifficulty"])
leading = int(indifficulty)
frac = indifficulty - leading
prefix = "".join("0" for _ in range(0, leading))
maxNib = 15 - int(frac * 16) if frac else 15
solutions: list[int] = []
ans = 0
while len(solutions) < k:
h = sha256(f"{nonce}{ans}".encode()).hexdigest()
if h.startswith(prefix) and (not frac or int(h[leading], base=16) <= maxNib):
solutions.append(ans)
ans += 1
return solutions
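The search loop above can be tried standalone with a small difficulty. The sketch below is an assumption-laden restatement, not the site's code: the function name `solve_pow`, the parameter name `difficulty` (standing in for the `indifficulty` field) and the sample nonce are all made up for illustration.

```python
from hashlib import sha256


def solve_pow(nonce: str, k: int, difficulty: float) -> list[int]:
    """Collect the first ``k`` counters whose hash meets the difficulty."""
    leading = int(difficulty)          # number of leading "0" hex digits
    frac = difficulty - leading        # fractional part tightens the next digit
    prefix = "0" * leading
    max_nibble = 15 - int(frac * 16) if frac else 15
    solutions: list[int] = []
    candidate = 0
    while len(solutions) < k:
        digest = sha256(f"{nonce}{candidate}".encode()).hexdigest()
        if digest.startswith(prefix) and (not frac or int(digest[leading], 16) <= max_nibble):
            solutions.append(candidate)
        candidate += 1
    return solutions


# Difficulty 1.5: one leading zero, and the following hex digit must be <= 7.
solutions = solve_pow("demo-nonce", k=3, difficulty=1.5)
print(solutions)
```

With difficulty 1.5 roughly one in 32 candidates qualifies, so the demo finishes almost instantly; real challenges may be considerably more expensive.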
def request(query: str, params: 'OnlineParams') -> None:
challenge = _get_challenge(query)
solution = _solve_pow(challenge)
args = {
"q": query,
"panswers": ",".join(str(s) for s in solution),
"pid": challenge["challengeId"],
"adult": safe_search_map[params["safesearch"]],
}
params["url"] = f"{base_url}/search-sse?{urlencode(args)}"
def response(resp: 'SXNG_Response') -> EngineResults:
res = EngineResults()
# The response is a stream of server-sent events,
# so it is split into `event: <type>` and `data: {"results": ...}`
# see https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/
events = resp.text.split("\n\n")
for event in events:
event_parts = event.split("\n", maxsplit=2)
if len(event_parts) != 2:
continue
event_name, data = event_parts
if not event_name.endswith("resultsUpdate"):
continue
json_data = loads(data.removeprefix("data: "))
for result in json_data["results"]:
res.add(
res.types.MainResult(
url=result["url"],
title=result["title"],
content=html_to_text(result["blurb"]),
)
)
return res
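The event-stream parsing above can be illustrated without a network call. The sample stream below is invented for the demo, and `iter_sse_events` is a hypothetical helper; unlike the engine, it also tolerates events with more than one `data:` line.

```python
import json

# Made-up sample stream; real responses will contain more fields.
SAMPLE_STREAM = (
    "event: resultsUpdate\n"
    'data: {"results": [{"url": "https://example.org", "title": "Example"}]}\n'
    "\n"
    "event: done\n"
    "data: {}\n"
    "\n"
)


def iter_sse_events(text: str):
    """Yield ``(event_name, data)`` pairs from a server-sent event stream."""
    for block in text.split("\n\n"):
        name, data_lines = "", []
        for line in block.split("\n"):
            if line.startswith("event:"):
                name = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].lstrip())
        if name and data_lines:
            yield name, "\n".join(data_lines)


results = [
    item
    for name, data in iter_sse_events(SAMPLE_STREAM)
    if name == "resultsUpdate"
    for item in json.loads(data)["results"]
]
print(results)
```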
+1 -1
@@ -13,8 +13,8 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": 'JSON', "results": 'JSON',
'language': 'fr',
} }
language = "fr"
categories = ['movies'] categories = ['movies']
paging = True paging = True
+1
@@ -25,6 +25,7 @@ about = {
"require_api_key": False, "require_api_key": False,
"results": 'JSON', "results": 'JSON',
} }
language_support = True
# engine dependent config # engine dependent config
categories = ['videos'] categories = ['videos']
+1 -1
@@ -19,8 +19,8 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": "HTML", "results": "HTML",
"language": "cz",
} }
language = "cz"
categories = ['general', 'web'] categories = ['general', 'web']
base_url = 'https://search.seznam.cz/' base_url = 'https://search.seznam.cz/'
+1 -1
@@ -16,8 +16,8 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": "HTML", "results": "HTML",
"language": "zh",
} }
language = "zh"
# Engine Configuration # Engine Configuration
categories = ["general"] categories = ["general"]
+1 -1
@@ -11,8 +11,8 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": "JSON", "results": "JSON",
"language": "zh",
} }
language = "zh"
categories = ["videos"] categories = ["videos"]
paging = True paging = True
+1 -1
@@ -14,8 +14,8 @@ about = {
"use_official_api": False, "use_official_api": False,
"require_api_key": False, "require_api_key": False,
"results": "HTML", "results": "HTML",
"language": "zh",
} }
language = "zh"
# Engine Configuration # Engine Configuration
categories = ["news"] categories = ["news"]
+5 -1
@@ -131,6 +131,7 @@ max_page = 18
"""Tested 18 pages maximum (argument ``page``), to be save max is set to 20.""" """Tested 18 pages maximum (argument ``page``), to be save max is set to 20."""
time_range_support = True time_range_support = True
language_support = True
safesearch = True safesearch = True
time_range_dict = {"day": "d", "week": "w", "month": "m", "year": "y"} time_range_dict = {"day": "d", "week": "w", "month": "m", "year": "y"}
@@ -382,6 +383,9 @@ def _get_image_result(result) -> dict[str, t.Any] | None:
size_str = "".join(filter(str.isdigit, result["filesize"])) size_str = "".join(filter(str.isdigit, result["filesize"]))
filesize = humanize_bytes(int(size_str)) filesize = humanize_bytes(int(size_str))
img_format = (result.get("format") or "").upper()
if img_format == "UNKNOWN":
img_format = ""
return { return {
"template": "images.html", "template": "images.html",
"url": url, "url": url,
@@ -390,7 +394,7 @@ def _get_image_result(result) -> dict[str, t.Any] | None:
"img_src": result.get("rawImageUrl"), "img_src": result.get("rawImageUrl"),
"thumbnail_src": thumbnailUrl, "thumbnail_src": thumbnailUrl,
"resolution": resolution, "resolution": resolution,
"img_format": result.get("format"), "img_format": img_format,
"filesize": filesize, "filesize": filesize,
} }
+287
@@ -0,0 +1,287 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=invalid-name
"""Swisscows (general, images, videos)"""
import typing as t
import base64
import codecs
import hashlib
import json
import random
from datetime import datetime
from urllib.parse import urlencode
from babel.core import get_global
from searx.result_types import EngineResults, LegacyResult # pyright: ignore[reportPrivateLocalImportUsage]
from searx.utils import humanize_number, html_to_text
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://swisscows.com",
"wikidata_id": "Q22937452",
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
}
categories = ["general"]
swisscows_category = "web" # possible: "web", "videos", "images"
results_per_page = 50
time_range_support = True
paging = True
base_url = "https://api.swisscows.com"
CAESAR_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
NONCE_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
time_range_map = {"day": "Day", "week": "Week", "month": "Month", "year": "Year"}
# fmt: off
swisscows_regions: list[str] = [
"AR", "AU", "AT", "BE", "BR", "CA", "CL", "CN", "DK", "FI",
"FR", "DE", "HK", "HU", "IN", "ID", "IT", "JP", "KR", "LV",
"MY", "MX", "NL", "NZ", "NO", "PH", "PL", "PT", "RU", "SA",
"ZA", "ES", "SE", "CH", "TW", "TR", "UA", "GB", "US"
]
"""Regions supported by swisscows."""
# fmt: on
# swisscows_languages = [
# "GB", "DE", "ES", "FR", "IT", "LV", "HU", "NL", "PT", "RU", "UA"
# ]
def appropriate_locale(searxng_locale: str, regions: list[str], default: str) -> str:
"""Returns the appropriate swisscows locale for the region or language
selected by the user. If no value can be determined, ``default`` is returned.
"""
_locale = searxng_locale.split("-")
if _locale[0] == "all":
return default
if len(_locale) == 1 or _locale[1] in regions:
return searxng_locale
sxng_lang = _locale[0]
if sxng_lang.upper() in regions:
return f"{sxng_lang}-{sxng_lang.upper()}"
likely_subtag: str | None = get_global("likely_subtags").get(sxng_lang)
if likely_subtag:
_tag: list[str] = likely_subtag.split("_")
if _tag[-1] in regions:
return f"{_tag[0]}-{_tag[-1]}"
return default
def generate_nonce(length: int = 32) -> str:
"""
Generate a random char sequence with the given length.
"""
return "".join([random.choice(NONCE_ALPHABET) for _ in range(length)])
def caesar_shift_with_switch_case(s: str, offset: int = 13) -> str:
"""
Caesar shift by :py:obj:`offset` that additionally inverts the casing of all letters
(i.e. from lowercase to uppercase and vice versa).
"""
out = ""
for c in s:
if c.upper() in CAESAR_ALPHABET:
alphabet_index = ord(c.upper()) - ord("A")
shifted = CAESAR_ALPHABET[(alphabet_index + offset) % len(CAESAR_ALPHABET)]
case_switched = shifted.lower() if c.isupper() else shifted.upper()
out += case_switched
else:
out += c
return out
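With the default offset of 13, the function above amounts to ROT13 followed by a case swap. A compact sketch of that equivalence, assuming ASCII-only input (which holds for nonces drawn from `NONCE_ALPHABET`); the name `rot13_swapcase` is made up for illustration:

```python
import codecs


def rot13_swapcase(s: str) -> str:
    """ROT13 the ASCII letters, then invert their case; other characters pass through."""
    return codecs.encode(s, "rot13").swapcase()


print(rot13_swapcase("Abc-19"))
# Applying the transformation twice restores the input.
print(rot13_swapcase(rot13_swapcase("Nonce~42")))
```

For non-ASCII letters `str.swapcase` would still change the case while the loop above leaves such characters untouched, so the two are not interchangeable in general.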
def sha256_hash_b64_url(s: str) -> str:
"""
Calculate the SHA256 hash and base64 URL-encodes it.
"""
hasher = hashlib.sha256()
hasher.update(s.encode())
hashed_bytes = hasher.digest()
# hashlib generates a byte digest, but since we need to convert it to base64, we
# need to do that by hand
hash_base64 = codecs.encode(hashed_bytes, "base64").decode("utf-8").rstrip('\n')
hash_base64_url_encoded = hash_base64.replace("=", "").replace("+", '-').replace("/", '_')
return hash_base64_url_encoded
def generate_nonce_and_signature(base_path: str, args: dict[str, t.Any]) -> tuple[str, str]:
"""
Generate "X-Request-Nonce" and "X-Request-Signature" which are required for accessing
Swisscows images (reverse engineered from their official website).
"""
nonce = generate_nonce()
nonce_shifted = caesar_shift_with_switch_case(nonce, 13)
# in the path, all keys must be sorted in alphabetic order,
# otherwise the generated signature won't be accepted!
# additionally, the values may not be URL encoded, they have to be plain text
# hence we don't use urlencode here
args_sorted = sorted(args.items(), key=lambda arg: arg[0])
query_string = "&".join(f"{key}={value}" for (key, value) in args_sorted)
full_path = f"{base_path}?{query_string}"
signature = sha256_hash_b64_url(full_path + nonce_shifted)
return (nonce, signature)
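The signing scheme described above can be condensed with the standard library. This is a sketch under assumptions: `sign_request` is a hypothetical name, the nonce is passed in so the result is reproducible, and `base64.urlsafe_b64encode` replaces the manual character substitution.

```python
import base64
import codecs
import hashlib


def sign_request(base_path: str, args: dict[str, object], nonce: str) -> str:
    """Return the URL-safe, unpadded base64 SHA256 of path, sorted query and shifted nonce."""
    shifted = codecs.encode(nonce, "rot13").swapcase()
    # Keys sorted alphabetically, values left un-encoded (as in the engine).
    query = "&".join(f"{key}={value}" for key, value in sorted(args.items()))
    digest = hashlib.sha256(f"{base_path}?{query}{shifted}".encode()).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode()


signature = sign_request("/v5/web/search", {"query": "test", "offset": 0}, nonce="Abc123")
print(signature)
```

Because the keys are sorted before hashing, the order in which `args` is built does not affect the signature.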
maximum_page_size = {"web": 20, "images": 50, "videos": 10}
def init(_):
if swisscows_category not in ("web", "images", "videos"):
raise ValueError("illegal swisscows category: %s" % swisscows_category)
if results_per_page > maximum_page_size[swisscows_category]:
raise ValueError(
"results_per_page for swisscows %s can be at most %d"
% (swisscows_category, maximum_page_size[swisscows_category])
)
def request(query: str, params: "OnlineParams") -> None:
# swisscows images only supports 2 pages
if swisscows_category == "images" and params["pageno"] > 2:
params["url"] = None
return
locale = appropriate_locale(params["searxng_locale"], swisscows_regions, "en-US")
base_path = ""
args: dict[str, t.Any]
if swisscows_category == "web":
freshness = "All"
if params["time_range"]:
freshness = time_range_map[params["time_range"]]
args = {
"freshness": freshness,
"itemsCount": results_per_page,
"locale": locale,
"offset": (params["pageno"] - 1) * results_per_page,
"query": query,
"spellcheck": True,
}
base_path = "/v5/web/search"
elif swisscows_category == "images":
args = {
"itemsCount": results_per_page,
"locale": locale,
"offset": (params["pageno"] - 1) * results_per_page,
"query": query,
"spellcheck": True,
}
base_path = "/v5/images/search"
else:
args = {
"itemsCount": results_per_page,
"offset": (params["pageno"] - 1) * results_per_page,
"query": query,
"region": locale,
"spellcheck": True,
}
base_path = "/v2/videos/search"
nonce, signature = generate_nonce_and_signature(base_path, args)
params["headers"].update(
{
"X-Request-Nonce": nonce,
"X-Request-Signature": signature,
}
)
params["url"] = f"{base_url}{base_path}?{urlencode(args)}"
def _video_result(result: dict[str, str]) -> LegacyResult:
published_date = None
if result.get("datePublished"):
published_date = datetime.fromisoformat(result["datePublished"])
view_count = None
if result.get("viewCount"):
view_count = humanize_number(result["viewCount"]) # pyright: ignore[reportArgumentType]
return LegacyResult(
{
"template": "videos.html",
"url": result["url"],
"title": html_to_text(result.get("title") or result["name"]),
"content": result["description"],
"thumbnail": result.get("thumbnailUrl")
or result.get("thumbnail", {}).get("url"), # pyright: ignore[reportAttributeAccessIssue]
"length": result.get("duration"),
"iframe_src": result.get("embedUrl"),
"publishedDate": published_date,
"views": view_count,
}
)
def response(resp: "SXNG_Response") -> EngineResults:
res = EngineResults()
json_data = resp.json()
# the payload encoding is only used for web and image results;
# for videos the data is returned directly as a normal JSON response
# the payload is encoded like a JSON Web Token -> 3 parts, separated by "."
# the actual data is the middle part of the encoded string
if "payload" in json_data:
payload = json_data["payload"].split(".")[1]
# pad with '=' to be valid base64
payload = payload + '=' * (-len(payload) % 4)
decoded = base64.urlsafe_b64decode(payload)
json_data = json.loads(decoded.decode())
result: dict[str, t.Any]
for result in json_data["items"]:
if result["type"] == "WebPage":
res.add(
res.types.MainResult(
url=result["url"],
title=result["name"],
content=html_to_text(result["description"]),
thumbnail=result.get("thumbnail", {}).get("url"),
)
)
elif swisscows_category == "videos" and result["type"] == "VideoCollection":
for video in result["hasPart"]:
res.add(_video_result(video))
elif result["type"] == "ImageObject":
res.add(
res.types.LegacyResult(
{
"template": "images.html",
"url": result["url"],
"thumbnail_src": result["thumbnail"]["url"],
"img_src": result["contentUrl"],
"title": result["name"],
}
)
)
elif result["type"] == "video":
res.add(_video_result(result))
return res
+83
@@ -0,0 +1,83 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
# pylint: disable=invalid-name
"""Swisscows news"""
from datetime import datetime
from urllib.parse import urlencode
import typing as t
from searx.utils import html_to_text
from searx.result_types import EngineResults
from searx.engines.swisscows import appropriate_locale
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://swisscows.com",
"wikidata_id": "Q22937452",
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
}
categories = ["news"]
results_per_page = 20
time_range_support = True
paging = True
base_url = "https://api.swisscows.com"
time_range_map = {"day": "Day", "week": "Week", "month": "Month", "year": "Year"}
swisscows_regions: list[str] = ["DE"]
"""Regions supported by swisscows News."""
def request(query: str, params: "OnlineParams") -> None:
sxng_locale = params["searxng_locale"].split("-", maxsplit=1)[0]
locale: str = appropriate_locale(sxng_locale, swisscows_regions, default="de-DE")
if not locale:
return
freshness = "All"
if params["time_range"]:
freshness = time_range_map[params["time_range"]]
args = {
"query": query,
"itemsCount": results_per_page,
"region": locale,
"language": locale.split("-", maxsplit=1)[0],
"offset": (params["pageno"] - 1) * results_per_page,
"freshness": freshness,
"sortOrder": "Desc",
"sortBy": "Created",
}
url_path = f"/news/search?{urlencode(args)}"
params["url"] = base_url + url_path
def response(resp: "SXNG_Response") -> EngineResults:
res = EngineResults()
result: dict[str, str]
for result in resp.json()["items"]: # pyright: ignore[reportAny]
res.add(
res.types.MainResult(
url=result["uri"],
title=html_to_text(result["title"]),
content=result["description"],
publishedDate=datetime.fromisoformat(result["created"]),
thumbnail=result.get("og:image") or "",
)
)
return res
+2 -1
@@ -27,8 +27,9 @@ about = {
'use_official_api': True, 'use_official_api': True,
'require_api_key': False, 'require_api_key': False,
'results': 'JSON', 'results': 'JSON',
'language': 'de',
} }
language = "de"
categories = ['general', 'news'] categories = ['general', 'news']
paging = True paging = True
+170
@@ -0,0 +1,170 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Tiger_ is a Swiss meta search engine.
.. _Tiger: https://tiger.ch
"""
from json import loads
import random
from urllib.parse import urlencode
import typing as t
from dateutil import parser
from lxml import html
from searx.exceptions import SearxEngineAPIException
from searx.extended_types import SXNG_Response
from searx.network import get, post
from searx.result_types import EngineResults
from searx.utils import extr, eval_xpath_list, eval_xpath, extract_text
from searx.enginelib import EngineCache
if t.TYPE_CHECKING:
from searx.search.processors import OnlineParams
about = {
"website": "https://tiger.ch",
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
}
paging = True
base_url = "https://tiger.ch"
categories = []
tiger_category = "Websuche"
"""
Possible values: "Websuche", "News".
"""
CACHE: EngineCache
"""Cache to store session codes (result of solved CAPTCHA)."""
def init(_):
if tiger_category not in ("Websuche", "News"):
raise ValueError("invalid search category: %s" % tiger_category)
def setup(engine_settings: dict[str, t.Any]) -> bool:
global CACHE # pylint: disable=global-statement
CACHE = EngineCache(engine_settings["name"])
return True
def _obtain_session_code() -> str:
"""The challenge works like this:
- We first generate 3 random numbers.
- Then we send them to /Human.svc/Make to get the operators (+, -) for the
math challenge (i.e. a simple calculation)
- Based on the operators, we calculate a result (usually done by the user by
hand)
- We send the result of the math calculation to the server to obtain a
session "code" that has to be sent as cookie parameter for all searches
E.g., challenges look like ``19-3+5``.
"""
cached_session = CACHE.get("session")
if cached_session:
return cached_session
results_page = get(f"{base_url}/_internCode.aspx")
doc = html.fromstring(results_page.text)
extra_data: dict[str, str] = {}
for extra_param in ("__VIEWSTATE", "__VIEWSTATEGENERATOR", "__EVENTVALIDATION"):
extra_data[extra_param] = doc.xpath(f"//input[@name='{extra_param}']/@value")[0]
# var z1 = Math.floor((Math.random() * 8) + 11);
# var z2 = Math.floor((Math.random() * 8) + 1);
# var z3 = Math.floor((Math.random() * 8) + 1);
num1 = random.randint(11, 19)
num2 = random.randint(1, 9)
num3 = random.randint(1, 9)
challenge = get(f"{base_url}/Services/Human.svc/Make?M1={num1}&M2={num2}&M3={num3}", cookies=results_page.cookies)
signs = loads(challenge.json()["d"])[0]
sign1 = signs["Z1"]
sign2 = signs["Z2"]
result = num1
for num, sign in [(num2, sign1), (num3, sign2)]:
if sign == "+":
result += num
else:
result -= num
logger.debug(f"got challenge: {num1} {sign1} {num2} {sign2} {num3} = {result}")
data = {
**extra_data,
"txtM": str(result),
"btnHuman": "OK",
}
challenge_response = post(
f"{base_url}/_internCode.aspx",
cookies=results_page.cookies,
data=data,
)
cookie = challenge_response.cookies["Tiger.ch"]
code = extr(cookie, "Code=", "&")
if not code:
raise SearxEngineAPIException("failed to obtain session code")
CACHE.set("session", code, expire=60 * 24 * 60) # 86400, i.e. one day if expire is in seconds; the cookie itself is valid for two months
return code
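The arithmetic part of the challenge can be isolated as follows. This is a small sketch with a hypothetical helper `solve_challenge`; the engine treats every sign other than `+` as a subtraction, whereas the sketch rejects unknown signs explicitly.

```python
def solve_challenge(numbers: list[int], signs: list[str]) -> int:
    """Evaluate ``n0 s0 n1 s1 n2 ...`` strictly left to right."""
    result = numbers[0]
    for number, sign in zip(numbers[1:], signs):
        if sign == "+":
            result += number
        elif sign == "-":
            result -= number
        else:
            raise ValueError(f"unexpected sign: {sign!r}")
    return result


# The example from the docstring: 19-3+5
print(solve_challenge([19, 3, 5], ["-", "+"]))
```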
def request(query: str, params: "OnlineParams"):
code = _obtain_session_code()
args = {"w": query, "page": params["pageno"]}
params["url"] = f"{base_url}/{tiger_category}?{urlencode(args)}"
params["cookies"]["Tiger.ch"] = f"Code={code}"
def response(resp: "SXNG_Response") -> EngineResults:
res = EngineResults()
doc = html.fromstring(resp.text)
if tiger_category == "Websuche":
for result in eval_xpath_list(doc, "//div[@id='mainContainer']//table/tr"):
url = extract_text(eval_xpath(result, ".//a[contains(@class, 'weblink')]/@href"))
if not url:
continue
res.add(
res.types.MainResult(
url=url,
title=extract_text(eval_xpath(result, ".//a[contains(@class, 'weblink')]")) or "",
content=extract_text(eval_xpath(result, ".//*[contains(@class, 'webbodynopic')]")) or "",
)
)
elif tiger_category == "News":
for result in eval_xpath_list(doc, "//div[@id='panNews']/div"):
publishedDate = None
try:
date_str = extract_text(eval_xpath(result, ".//span[contains(@class, 'help')]/span")) or ""
date_str = date_str.strip().removeprefix("-").strip()
publishedDate = parser.parse(date_str)
except parser.ParserError:
pass
thumbnail = extract_text(eval_xpath(result, "./img/@src"))
if thumbnail:
thumbnail = base_url + thumbnail
res.add(
res.types.MainResult(
url=extract_text(eval_xpath(result, ".//a[contains(@class, 'webLink')]/@href")),
title=extract_text(eval_xpath(result, ".//a[contains(@class, 'webLink')]")) or "",
thumbnail=thumbnail or "",
publishedDate=publishedDate,
)
)
return res
+148
@@ -0,0 +1,148 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""T-Online_ is a German news portal, which is powered by Ströer, a German
advertising company, not by Deutsche Telekom (contrary to its name).
It gets its web results from Google, image results from Flickr and video
results from YouTube.
.. _T-Online: https://www.t-online.de/
"""
import typing as t
from urllib.parse import urlencode
from lxml import html
from searx.utils import eval_xpath_list, eval_xpath, extract_text, get_embeded_stream_url, ElementType
from searx.result_types import EngineResults
from searx.enginelib import EngineAbout
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = EngineAbout(
website="https://www.t-online.de",
wikidata_id="Q590940",
results="HTML",
)
paging = True
time_range_support = True
base_url = "https://suche.t-online.de"
tonline_categ = "web"
"""Supported categories are ``web``, ``videos``, ``news`` and ``images``."""
time_range_map = {"day": "d", "week": "w", "month": "m", "year": "y"}
# result provider has to be specified during pagination, pagination can alternatively
# use "tonline" to only search for results from t-online news articles
tonline_channel_map = {"images": "flickr", "videos": "yt"}
language = "de"
def init(_):
if tonline_categ not in ("web", "images", "videos", "news"):
raise ValueError("invalid category: %s" % tonline_categ)
def request(query: str, params: "OnlineParams") -> None:
# "mandant", "dia" and "ptl" are not needed, but this might reduce the chance of captchas
args = {"q": query, "mandant": "toi", "dia": "suche", "ptl": "std"}
if params["time_range"]:
args["age"] = time_range_map[params["time_range"]]
if params["pageno"] > 1 and tonline_categ in tonline_channel_map:
ch = tonline_channel_map[tonline_categ]
args["ch"] = ch
args[f"{ch}_page"] = str(params["pageno"])
else:
args["page"] = str(params["pageno"])
params["url"] = f"{base_url}/{tonline_categ}?{urlencode(args)}"
def _general_results(doc: ElementType, res: EngineResults):
result: ElementType
for result in eval_xpath_list(doc, "//div[@id='google_re']/div[contains(@class, 'doc')]"):
(
res.add(
res.types.MainResult(
url=extract_text(eval_xpath(result, "./a/@href") or ""),
title=extract_text(eval_xpath(result, ".//span[contains(@class, 'tMMReshl')]") or "") or "",
content=extract_text(eval_xpath(result, ".//div[contains(@class, 'tMMRest')]") or "") or "",
),
)
)
suggestion: ElementType
for suggestion in eval_xpath_list(doc, "//div[starts-with(@class, 'rsbl')]/a"):
res.add(res.types.LegacyResult({"suggestion": extract_text(suggestion)}))
def _image_results(doc: ElementType, res: EngineResults):
result: ElementType
for result in eval_xpath_list(doc, "//div[@class='doc']"):
(
res.add(
res.types.Image(
url=extract_text(eval_xpath(result, "./a/@href") or ""),
title=extract_text(eval_xpath(result, ".//div[contains(@class, 'doc_info')]") or "") or "",
thumbnail_src=extract_text(eval_xpath(result, ".//img/@src") or "") or "",
),
)
)
def _news_results(doc: ElementType, res: EngineResults):
result: ElementType
title_parts: list[ElementType]
for result in eval_xpath_list(doc, "//div[@id='portal_re']/div[contains(@class, 'doc')]"):
title_parts = eval_xpath(result, ".//a[starts-with(@class, 'tMMReshl')]")
(
res.add(
res.types.MainResult(
url=extract_text(eval_xpath(result, "(./a/@href)[1]") or ""),
title=" - ".join(extract_text(part) or "" for part in title_parts),
content=extract_text(eval_xpath(result, ".//div[contains(@class, 'tMMRest')]") or "") or "",
thumbnail=extract_text(eval_xpath(result, ".//img[contains(@class, 'desk')]/@src") or "") or "",
),
)
)
def _video_results(doc: ElementType, res: EngineResults):
result: ElementType
for result in eval_xpath_list(doc, "//div[@class='doc']"):
url: str | None = extract_text(eval_xpath(result, "./a/@href") or "")
if url is None:
continue
title_parts: list[ElementType] = eval_xpath(result, ".//a[starts-with(@class, 'tMMReshl')]")
res.add(
res.types.LegacyResult(
template="videos.html",
url=url,
title=" - ".join(extract_text(part) or "" for part in title_parts),
thumbnail=extract_text(eval_xpath(result, ".//img/@src") or "") or "",
iframe_src=get_embeded_stream_url(url) or "",
)
)
def response(resp: "SXNG_Response") -> EngineResults:
doc = html.fromstring(resp.text)
res = EngineResults()
match tonline_categ:
case "web":
_general_results(doc, res)
case "news":
_news_results(doc, res)
case "images":
_image_results(doc, res)
case "videos":
_video_results(doc, res)
case _:
raise ValueError("invalid category: %s" % tonline_categ)
return res
+114
@@ -0,0 +1,114 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Vuhuv_ is a Turkish search engine that also provides English results.
.. _Vuhuv: https://vuhuv.com
"""
import typing as t
from urllib.parse import urlencode
from lxml import html
from searx.result_types import EngineResults
from searx.utils import eval_xpath_list, eval_xpath, extract_text
if t.TYPE_CHECKING:
from lxml.etree import ElementBase
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://vuhuv.com",
"wikidata_id": None,
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
}
paging = True
base_url = "https://vuhuv.com"
vuhuv_category = "general"
"""Supported categories are ``general``, ``videos`` and ``images``."""
# corresponds to the "k" query param
category_map = {"general": 1, "images": 2, "videos": 3}
def init(_):
if vuhuv_category not in category_map:
raise ValueError("invalid category: %s" % vuhuv_category)
def request(query: str, params: "OnlineParams") -> None:
# the purpose of "d" and "dh" is unknown, but the website
# sends them, and without them the results are different
args = {"k": category_map[vuhuv_category], "p": params["pageno"], "q": query, "d": 1, "dh": 1}
params["url"] = f"{base_url}/veri2/?{urlencode(args)}"
params["headers"]["Referer"] = f"{base_url}/"
def _general_results(doc: "ElementBase") -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(doc, "//div[contains(@class, 'sonuc')]/div"):
(
res.add(
res.types.MainResult(
url=extract_text(eval_xpath(result, "./a/@href")) or "",
title=extract_text(eval_xpath(result, "./a/span")) or "",
content=extract_text(eval_xpath(result, "./ins")) or "",
),
)
)
return res
def _image_results(doc: "ElementBase") -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(doc, "//div[contains(@class, 'item gorsel')]"):
(
res.add(
res.types.Image(
url=extract_text(eval_xpath(result, "./a/@href")) or "",
title=extract_text(eval_xpath(result, "./a/@title")) or "",
resolution=extract_text(eval_xpath(result, "div[contains(@class, 'olculeri')]")) or "",
thumbnail_src="https:" + str(extract_text(eval_xpath(result, "./@data-kgorsel"))),
img_src=extract_text(eval_xpath(result, "./@data-resimurl")) or "",
),
)
)
return res
def _video_results(doc: "ElementBase") -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(doc, "//div[contains(@class, 'item video')]"):
(
res.add(
res.types.MainResult(
template="videos.html",
url=extract_text(eval_xpath(result, "./a/@href")) or "",
title=extract_text(eval_xpath(result, "./a/@title")) or "",
content=extract_text(eval_xpath(result, ".//div[contains(@class, 'abaslik')]")) or "",
thumbnail=extract_text(eval_xpath(result, "./@data-kgorsel")) or "",
iframe_src=extract_text(eval_xpath(result, "./@data-embedurl")) or "",
),
)
)
return res
def response(resp: "SXNG_Response") -> EngineResults:
doc = html.fromstring(resp.text)
match vuhuv_category:
case "general":
return _general_results(doc)
case "images":
return _image_results(doc)
case "videos":
return _video_results(doc)
case _:
raise ValueError("invalid vuhuv category: %s" % vuhuv_category)

Some files were not shown because too many files have changed in this diff