47 Commits

Author SHA1 Message Date
Bnyro 952896d29e [feat] image results: automatically guess mimetype based on path 2026-06-22 12:46:22 +02:00
Bnyro 4cc32b2457 [fix] kozmonavt: remove pagination and set to inactive by default
Pagination seems to require a different nextpage query parameter each
day, so it can't be implemented in the XPath engine.
2026-06-22 10:06:09 +02:00
Bnyro cce0957f54 [feat] engines: add support for iseek.com (general) 2026-06-22 09:51:57 +02:00
Bnyro 9375c0a6b6 [feat] engines: add netherlands startpagina (general, videos, images, news) 2026-06-22 09:50:19 +02:00
Bnyro a702741e4e [feat] engines: add giphy (images/videos) 2026-06-22 09:49:47 +02:00
Bnyro aeced67249 [feat] engines: add findfiles.net file search engine
FindFiles.net is a specialized file search engine designed to help you search
files online with precision. Unlike traditional search engines that mainly index
web pages, FindFiles focuses on finding real files on the internet - including
PDFs, documents, archives, videos, datasets, and more. [1]

[1] https://findfiles.net
2026-06-22 09:44:27 +02:00
Bnyro 199e03de1d [feat] engines: add kozmonavt.su (general) 2026-06-22 09:42:55 +02:00
Bnyro 9cd2439e5e [feat] engines: add kukei.eu (general) 2026-06-22 09:42:45 +02:00
Bnyro 9f4d8bca02 [feat] engines: add xonaly.com (general) 2026-06-22 09:41:29 +02:00
Bnyro de76a4a39b [feat] engines: add cl0q.com (foss domain search) 2026-06-22 09:41:18 +02:00
Bnyro a85a5e2794 [feat] engines: add unobtanium.rocks (personal websites search) 2026-06-22 09:41:07 +02:00
Bnyro 92abd98a55 [feat] engines: add tusksearch (web, news, videos, images) (#6267)
The code that reads the value of variable `x` from `embed.js`, decodes
it to ASCII and based on that sets `window["tuskheader"]` and `window["tuskkey"]`
is attached below. The only real way to figure out what this is doing is
by stepping through it with the debugger, otherwise it's almost hopeless.

```js
function fe() {
  const B = pe => pe.map(_e => String.fromCharCode(_e)).join(''),
  ae = window,
  o = ae.x;
  if (o?.length) {
    const pe = o.length / 2;
    for (let _e = 0; _e < pe; _e++) ae[B(o[_e])] = B(o[pe + _e]);
    ae.x = void 0
  }
}
```

Minimal script for testing the engine:

```py
import random
from json import loads
import requests

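resp = requests.get("https://api.tusksearch.com/revcontent/embed.js")
# Skip the first 6 characters (presumably a `var x=` style prefix) so that the
# remainder parses as a JSON array of character-code lists.
data = loads(resp.text[6:])

def _decode(text: list[int]) -> str:
    """Turn a list of character codes into a string."""
    return "".join([chr(x) for x in text])

# Entries 3 and 4 are used as the header name and the header value.
header = _decode(data[3])
value = _decode(data[4])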

resp = requests.get(
    "https://api.tusksearch.com/Search/Web?q=test&p=1&l=center&nextArgs=&prevArgs=",
    # "https://api.tusksearch.com/Search/Image?q=test&p=1&l=center",
    headers={
        header: value,
        'x-lon': str(random.random() * 90),
        'x-lat': str(random.random() * 90),
    },
)
print(resp.text)
```
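
For reference, the pairing done by the JavaScript above can be sketched in Python as follows. This is only an illustration under the assumption that `x` is an even-length list of character-code lists; `decode_pairs` is a hypothetical name. With six entries, the keys would sit at indices 0-2 and their values at indices 3-5, which would be consistent with the test script reading indices 3 and 4.

```python
def decode_pairs(x: list[list[int]]) -> dict[str, str]:
    """Map the first half of ``x`` (keys) onto the second half (values).

    Mirrors the loop in ``fe()``: ``window[B(x[i])] = B(x[half + i])``.
    Assumes ``x`` has an even number of entries.
    """

    def _decode(codes: list[int]) -> str:
        # Same job as ``B`` in the JavaScript: char codes -> string
        return "".join(chr(c) for c in codes)

    half = len(x) // 2
    return {_decode(x[i]): _decode(x[half + i]) for i in range(half)}
```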
2026-06-22 09:40:32 +02:00
Bnyro 93e867c6b1 [feat] engine categories: add blogs category
Category for searching personal blogs and websites.
Useful if searching for interesting articles on a topic
rather than the mainstream Wikipedia etc. results.
2026-06-22 09:39:40 +02:00
dependabot[bot] 75c1b1dade [upd] web-client (simple): Bump less (#6289)
Bumps the minor group in /client/simple with 1 update: [less](https://github.com/less/less.js).


Updates `less` from 4.6.4 to 4.6.6
- [Release notes](https://github.com/less/less.js/releases)
- [Changelog](https://github.com/less/less.js/blob/master/CHANGELOG.md)
- [Commits](https://github.com/less/less.js/commits/v4.6.6)

---
updated-dependencies:
- dependency-name: less
  dependency-version: 4.6.6
  dependency-type: direct:development
  update-type: version-update:semver-patch
  dependency-group: minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-22 08:03:15 +02:00
Bnyro 097ab64c70 [del] aol: remove engine (eol) (#6299) 2026-06-22 07:32:23 +02:00
dependabot[bot] 0e9f513efc [upd] pypi: Bump the minor group with 5 updates (#6291)
Bumps the minor group with 5 updates:

| Package | From | To |
| --- | --- | --- |
| [certifi](https://github.com/certifi/python-certifi) | `2026.5.20` | `2026.6.17` |
| [pylint](https://github.com/pylint-dev/pylint) | `4.0.5` | `4.0.6` |
| [selenium](https://github.com/SeleniumHQ/Selenium) | `4.44.0` | `4.45.0` |
| [sphinxcontrib-programoutput](https://github.com/OpenNTI/sphinxcontrib-programoutput) | `0.19` | `0.20` |
| [basedpyright](https://github.com/detachhead/basedpyright) | `1.39.7` | `1.39.8` |
2026-06-22 07:30:41 +02:00
Bnyro fd42d4fda1 [fix] chatnoir: don't re-use/cache session keys
They're invalidated very quickly, so even caching them for
60 seconds results in a lot of unauthorized access errors.
2026-06-20 21:52:14 +02:00
dependabot[bot] 5c38d2feab [upd] web-client (simple): Bump @types/node in /client/simple (#6290)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 25.9.3 to 26.0.0.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-version: 26.0.0
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-19 16:58:47 +02:00
dependabot[bot] 38b678c493 [upd] github-actions: Bump actions/checkout from 6.0.3 to 7.0.0 (#6288)
Bumps [actions/checkout](https://github.com/actions/checkout) from 6.0.3 to 7.0.0.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/df4cb1c069e1874edd31b4311f1884172cec0e10...9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 7.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-19 16:58:27 +02:00
github-actions[bot] fe1848673f [l10n] update translations from Weblate (#6293)
0f1c1d570 - 2026-06-18 - lugged9922 <lugged9922@noreply.codeberg.org>
81d208307 - 2026-06-18 - Raithlin <raithlin@noreply.codeberg.org>
bf09069e8 - 2026-06-17 - return42 <return42@noreply.codeberg.org>
c010ba929 - 2026-06-17 - return42 <return42@noreply.codeberg.org>
f92ba4e98 - 2026-06-17 - M Alif fadlan <maliffadlan@gmail.com>
442e504e2 - 2026-06-17 - return42 <return42@noreply.codeberg.org>
e2ffb2275 - 2026-06-17 - return42 <return42@noreply.codeberg.org>
cc26d0794 - 2026-06-17 - return42 <return42@noreply.codeberg.org>
9639f4e84 - 2026-06-17 - return42 <return42@noreply.codeberg.org>
63059d4e7 - 2026-06-15 - AndersNordh <andersnordh@noreply.codeberg.org>
460c5260f - 2026-06-15 - kratos <makesocialfoss32@keemail.me>
b212184d9 - 2026-06-16 - ghose <ghose@noreply.codeberg.org>
c9ac8e6d7 - 2026-06-15 - AndersNordh <andersnordh@noreply.codeberg.org>
cc1f5ab59 - 2026-06-15 - Fjuro <fjuro@noreply.codeberg.org>
84f985a9f - 2026-06-14 - Outbreak2096 <outbreak2096@noreply.codeberg.org>
bdb7e25bc - 2026-06-13 - SomeTr <sometr@noreply.codeberg.org>
c3eac4c37 - 2026-06-14 - Stephan-P <stephan-p@noreply.codeberg.org>
d94ab494b - 2026-06-13 - Priit Jõerüüt <jrtcdbrg@noreply.codeberg.org>
3387bab27 - 2026-06-13 - gallegonovato <gallegonovato@noreply.codeberg.org>
2026-06-19 15:11:48 +02:00
Bnyro 8b10095e8a [fix] settings.yml: explicitly set category for xpath engines (ayo, gabanza, zapmeta, abcnyheter) (#6282) 2026-06-19 09:10:27 +02:00
Jayant Sharma b5ef7ec8f3 [fix] calculator: move math.parse inside try-catch (#6278) (#6280)
* [fix] calculator: move math.parse inside try-catch (#6278)

* build static

---------

Co-authored-by: Ivan Gabaldon <igabaldon@inetol.net>
2026-06-18 17:36:47 +02:00
Bnyro bd73cc09ea [feat] engines: add support for search.ch/web (Swiss) 2026-06-18 14:02:52 +02:00
Butui Hu 4dfdc822cf [fix] engines: chinaso: handle empty upstream results gracefully (#6266)
Signed-off-by: Hu Butui <hot123tea123@gmail.com>
2026-06-17 19:36:22 +02:00
Ivan Gabaldon 502c820a25 [fix] container: setup minimal (#6268)
Start minimal, use defaults, and extend later on. The templates are no longer
checked for changes, which was confusing and annoying after a while.

See: https://github.com/searxng/searxng/issues/6261#issuecomment-4716008282
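
The "create once, never touch again" behaviour described above can be sketched in Python. This is not the entrypoint script itself (that is a shell script); the function name `create_settings_if_missing` is hypothetical, and `secrets.token_hex` stands in for the shell pipeline that generates the random key:

```python
import secrets
import shutil
from pathlib import Path


def create_settings_if_missing(
    template: Path, target: Path, placeholder: str = "ultrasecretkey"
) -> bool:
    """Copy ``template`` to ``target`` only if ``target`` does not exist.

    Returns True when the file was created. An existing file is left
    untouched, so later template changes are never compared or merged.
    """
    if target.exists():
        return False
    shutil.copy2(template, target)
    # Replace the placeholder secret with a freshly generated random value.
    text = target.read_text(encoding="utf-8")
    target.write_text(
        text.replace(placeholder, secrets.token_hex(24)), encoding="utf-8"
    )
    return True
```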
2026-06-16 15:32:47 +02:00
Markus Heiser 4fb49b4498 [chore] add DeprecationWarning for obsolete engine.about.language property (#6265)
The old property should still be supported for a transitional period; the
reasons for this, and how to proceed from here, are discussed in [1].

[1] https://github.com/searxng/searxng/issues/6261
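
A minimal sketch of how such a transitional warning could look, assuming the engine configuration is a plain dict. The function name `migrate_about_language` and the dict layout are assumptions for illustration, not the actual SearXNG code:

```python
import warnings


def migrate_about_language(engine_cfg: dict) -> dict:
    """Move a legacy ``about.language`` value to ``language`` and warn.

    Hypothetical helper: returns a new dict and leaves the input unchanged.
    """
    about = dict(engine_cfg.get("about") or {})
    if "language" not in about:
        return engine_cfg
    warnings.warn(
        "engine.about.language is obsolete, use engine.language instead",
        DeprecationWarning,
        stacklevel=2,
    )
    cfg = dict(engine_cfg)
    legacy = about.pop("language")
    # An explicitly configured engine.language takes precedence.
    cfg.setdefault("language", legacy)
    cfg["about"] = about
    return cfg
```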

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-16 10:31:21 +02:00
Markus Heiser cf1410af8d [fix] set language_support for engines with languages in traits (#6258)
In the past, the engine option ``language_support`` was not consistently
maintained; with this patch, a ValueError is now raised if an engine has
languages in its traits but ``language_support`` is not set to True.
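
The check can be pictured roughly like this. It is a sketch only: the function name `check_language_support` and its parameters are hypothetical, and the real check presumably works on engine objects rather than on plain arguments:

```python
def check_language_support(
    engine_name: str, traits_languages: dict[str, str], language_support: bool
) -> None:
    """Raise ValueError if traits list languages but the option is off."""
    if traits_languages and not language_support:
        raise ValueError(
            f"engine {engine_name!r} has languages in its traits, "
            "but language_support is not set to True"
        )
```

An engine without languages in its traits passes regardless of the option's value.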

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-15 10:52:00 +02:00
Markus Heiser 6c9dcd4242 [chore] complete and normalize the attributes of engine objects (#6258)
Drop outdated engine attributes: supported_languages, language_aliases

Complete, normalize and document the type definitions for the engine-module and
engine-class.

For the ``engine.about`` section of the configuration, a type check is performed
based on structure ``searx.enginelib.EngineAbout``.

The property ``engine.about.language`` no longer exists; existing values have
been migrated to ``engine.language``.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-15 10:52:00 +02:00
Bnyro b3e08f2a44 [feat] engines: add searchzee engine (general, news)
The results seem to be from Brave (i.e. they are exactly
the same). But it doesn't have any strict rate-limits,
so that's nice.

News supports time ranges, but apart from that, it unfortunately doesn't
support any advanced features like safesearch or languages.
2026-06-14 09:59:39 +02:00
Bnyro a857041afc [feat] engines: add support for search.ayo.de 2026-06-14 09:32:58 +02:00
Bnyro 31a8a22aa6 [feat] engines: add German tonline engine (general, news, images, videos) (#6250)
T-Online is a German news portal.

It gets its web results from Google, image results from Flickr and video results
from YouTube.

For images and videos, it additionally returns results from its own news
catalog. However, pagination requires specifying the result type (e.g. videos
either from YouTube or from T-Online), so we use flickr/youtube there instead
of tonline, because the tonline results are usually irrelevant.
2026-06-14 08:46:07 +02:00
Bnyro a29cda858c [feat] engines: add luxxle (general, news, images, videos)
Add support for https://luxxle.com

Localization is not yet supported because it doesn't seem to work on their
website either: no matter which language I select, it only returns English web
results.
2026-06-13 20:39:31 +02:00
Bnyro 2e10a2f614 [feat] engines: add rawweb engine (foss, hand-indexed blogs) (#6234)
RawWeb is a search engine for personal websites / blog posts.
It has its own index and the personal websites were selected
by hand. Results are quite good for what it is imo. [^1]

[^1]: https://github.com/0x2E/RawWeb.org
2026-06-13 19:09:58 +02:00
Bnyro 2100eb04e1 [feat] engines: add reloado engine (general, german) (#6233)
- adds support for https://reloado.com (German)
- as it has its own index, the results are hit or miss and mostly German,
  but still worth integrating imo
2026-06-13 19:06:18 +02:00
Bnyro c58391d673 [feat] engines: add fastbot engine (general) (#6232)
- adds support for https://fastbot.de
- the results are really fast and mostly in English (even though it's a German
  engine)
2026-06-13 19:04:39 +02:00
Bnyro c3284c8238 [chore] make data.traits (#6211) 2026-06-13 18:37:57 +02:00
Bnyro 290d3e0c6a [feat] engines: add privacywall engine (#6211)
- add https://privacywall.org support
- the engine seems to use the Bing index, but not 100% sure
- it claims to be privacy friendly, but it's not really by itself [1]

[1]: https://discuss.privacyguides.net/t/how-is-privacy-wall-search-engine/29486
2026-06-13 18:37:57 +02:00
Bnyro 0608dfa4d1 [feat] autocomplete: add privacywall autocompleter (#6211) 2026-06-13 18:37:57 +02:00
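A sketch of how the suggestion URL for this autocompleter can be built from a SearXNG locale, using the endpoint shown in the diff further down. The helper name `build_suggestion_url` is hypothetical. Unlike the committed code, this variant leaves out `cc` when the locale carries no country, because `urlencode` would otherwise serialize a `None` value as the literal string `None`:

```python
from urllib.parse import urlencode


def build_suggestion_url(query: str, sxng_locale: str) -> str:
    """Build the privacywall suggestions URL for ``query``.

    ``sxng_locale`` is expected to look like ``de-CH`` or ``en``.
    """
    args = {"q": query}
    if "-" in sxng_locale:
        # The country code is the part after the dash, e.g. "CH" in "de-CH".
        args["cc"] = sxng_locale.split("-")[1]
    return (
        "https://www.privacywall.org/search/secure/suggestions.php?"
        + urlencode(args)
    )
```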
Bnyro 1184b3212f [feat] engines: add podchaser podcast engine (#6202)
- add podchaser podcast engine
- the motivation is that podcastindex had to be removed, see #6140
2026-06-13 18:04:21 +02:00
Bnyro 65e0e4c069 [feat] engines: add vuhuv engine (#6196) 2026-06-13 17:52:43 +02:00
Bnyro d14fa1f6e2 [chore] data: add resulthunter engine traits 2026-06-13 17:21:52 +02:00
Bnyro 2d248704fa [feat] engines: add resulthunter 2026-06-13 17:21:52 +02:00
Markus Heiser 3096b1218f [mod] add type definitions for engine's "about" section (#6231)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-13 17:05:59 +02:00
Bnyro 82a8a90230 [feat] engines: add abcnyheter engine (general, norway) (#6231)
Add support for https://startsiden.abcnyheter.no, a Norwegian search engine
that probably uses Google or Bing. It also returns English results, but
e.g. ``test`` returns mostly results from Norway.
2026-06-13 17:05:59 +02:00
Bnyro e3d4fbe570 [feat] engines: add s1search general engine (#6186)
S1Search provides various different search services, which all seem
to be somewhat based on Google and Yahoo. The site looks kinda suspicious,
but the results are fine.

You can find a list of their engines by using a subdomain finder like
https://web-toolbox.dev/en/tools/subdomain-lookup and search for `s1search.co`.
2026-06-13 14:18:04 +02:00
Bnyro 031747f29e [feat] engines: add chatnoir general engine (#6183)
Chatnoir is an open source search engine developed by universities, based on
CommonCrawl (and others).  It's commented out by default: we don't want to
overload the universities with bot traffic that targets SearXNG (the sad truth
of why we can't have nice things anymore).
2026-06-13 13:52:01 +02:00
Markus Heiser e3bd7f5df1 [mod] image results: add list of alternative formats (#6153)
* [mod] template images.html: reformatted for readability (no func change)

In preparation for upcoming changes, the template is being reformatted for
better readability; no functional changes are being made.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>

* [mod] image results: add list of alternative formats

To test alternative formats, apply the patch below, query ``!flaticon bmw`` and
open the detail view for the image.

    diff --git a/searx/engines/flaticon.py b/searx/engines/flaticon.py
    index 06b6a8e25..d88388705 100644
    --- a/searx/engines/flaticon.py
    +++ b/searx/engines/flaticon.py
    @@ -8,7 +8,7 @@ from urllib.parse import urlencode

     import typing as t

    -from searx.result_types import EngineResults
    +from searx.result_types import EngineResults, ImageRef

     if t.TYPE_CHECKING:
         from searx.extended_types import SXNG_Response
    @@ -61,6 +61,14 @@ def response(resp: "SXNG_Response"):
                     thumbnail_src=_fix_url(result["png"]),
                     img_src=_fix_url(result["png512"]),
                     author=result["team_name"],
    +                formats=[
    +                    ImageRef(label="PNG 100x100", url="https://example.org/test.png", subtype="png"),
    +                    ImageRef(label="SVG", url="https://example.org/test.svg", subtype="svg+xml"),
    +                    ImageRef(url="https://example.org/test.jpg", subtype="jpeg"),
    +                    ImageRef(url="https://example.org/test.bmp", subtype="bmp"),
    +                    ImageRef(url="https://example.org/test.ico", subtype="x-icon"),
    +                    ImageRef(url="https://example.org/test.tif", subtype="tiff"),
    +                ],
                 )
             )

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>

---------

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2026-06-13 13:28:05 +02:00
224 changed files with 4877 additions and 1808 deletions
+1
@@ -1,5 +1,6 @@
*
!container/*.template.*
!container/entrypoint.sh
!searx/**
!requirements*.txt
+3 -3
@@ -78,7 +78,7 @@ jobs:
python-version: "${{ env.PYTHON_VERSION }}"
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with:
persist-credentials: "false"
fetch-depth: "0"
@@ -141,7 +141,7 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with:
persist-credentials: "false"
@@ -175,7 +175,7 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with:
persist-credentials: "false"
+1 -1
@@ -46,7 +46,7 @@ jobs:
python-version: "${{ env.PYTHON_VERSION }}"
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with:
persist-credentials: "false"
+1 -1
@@ -37,7 +37,7 @@ jobs:
python-version: "${{ env.PYTHON_VERSION }}"
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with:
persist-credentials: "false"
fetch-depth: "0"
+2 -2
@@ -39,7 +39,7 @@ jobs:
python-version: "${{ matrix.python-version }}"
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with:
persist-credentials: "false"
@@ -67,7 +67,7 @@ jobs:
python-version: "${{ env.PYTHON_VERSION }}"
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with:
persist-credentials: "false"
+2 -2
@@ -40,7 +40,7 @@ jobs:
python-version: "${{ env.PYTHON_VERSION }}"
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with:
token: "${{ secrets.WEBLATE_GITHUB_TOKEN }}"
fetch-depth: "0"
@@ -88,7 +88,7 @@ jobs:
python-version: "${{ env.PYTHON_VERSION }}"
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with:
token: "${{ secrets.WEBLATE_GITHUB_TOKEN }}"
fetch-depth: "0"
+1 -1
@@ -24,7 +24,7 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@df4cb1c069e1874edd31b4311f1884172cec0e10 # v6.0.3
uses: actions/checkout@9c091bb21b7c1c1d1991bb908d89e4e9dddfe3e0 # v7.0.0
with:
persist-credentials: "false"
+20 -43
@@ -16,11 +16,11 @@
},
"devDependencies": {
"@biomejs/biome": "2.5.0",
"@types/node": "^25.9.3",
"@types/node": "^26.0.0",
"browserslist": "^4.28.2",
"browserslist-to-esbuild": "^2.1.1",
"edge.js": "^6.5.1",
"less": "^4.6.4",
"less": "^4.6.6",
"mathjs": "^15.2.0",
"sharp": "~0.35.1",
"sort-package-json": "^4.0.0",
@@ -1570,13 +1570,13 @@
}
},
"node_modules/@types/node": {
"version": "25.9.3",
"resolved": "https://registry.npmjs.org/@types/node/-/node-25.9.3.tgz",
"integrity": "sha512-603BddQMv3pUcr4U2dhujk83N2tTDVr/34wII2B6bJy6g+8WD6yUb11jszNs0gdi4PesVWl7ABt8nYMVpnLUcg==",
"version": "26.0.0",
"resolved": "https://registry.npmjs.org/@types/node/-/node-26.0.0.tgz",
"integrity": "sha512-vf2YFi1iY9lHGwNJMs01biZFbKJkrZR1T6/MlzjhJLPdntOHLhTrDSnSVcdtvjihi4VQNlrFRIxLsDBlQpAipA==",
"dev": true,
"license": "MIT",
"dependencies": {
"undici-types": ">=7.24.0 <7.24.7"
"undici-types": "~8.3.0"
}
},
"node_modules/@types/pluralize": {
@@ -2890,9 +2890,9 @@
"license": "Apache-2.0"
},
"node_modules/less": {
"version": "4.6.4",
"resolved": "https://registry.npmjs.org/less/-/less-4.6.4.tgz",
"integrity": "sha512-OJmO5+HxZLLw0RLzkqaNHzcgEAQG7C0y3aMbwtCzIUFZsLMNNq/1IdAdHEycQ58CwUO3jPTHmoN+tE5I7FQxNg==",
"version": "4.6.6",
"resolved": "https://registry.npmjs.org/less/-/less-4.6.6.tgz",
"integrity": "sha512-ooPSwQGQ2sVe8Dh1jVsbKKsRR2gd8lFK72BDkeSzjnD1T5aIHL65hCMfO0GVmtriKgDKrQv6xp9UrihUsWuAzA==",
"dev": true,
"license": "Apache-2.0",
"dependencies": {
@@ -2909,7 +2909,7 @@
"errno": "^0.1.1",
"graceful-fs": "^4.1.2",
"image-size": "~0.5.0",
"make-dir": "^2.1.0",
"make-dir": "^5.1.0",
"mime": "^1.4.1",
"needle": "^3.1.0",
"source-map": "~0.6.0"
@@ -3191,18 +3191,17 @@
"license": "MIT"
},
"node_modules/make-dir": {
"version": "2.1.0",
"resolved": "https://registry.npmjs.org/make-dir/-/make-dir-2.1.0.tgz",
"integrity": "sha512-LS9X+dc8KLxXCb8dni79fLIIUA5VyZoyjSMCwTluaXA0o27cCK0bhXkpgw+sTXVpPy/lSO57ilRixqk0vDmtRA==",
"version": "5.1.0",
"resolved": "https://registry.npmjs.org/make-dir/-/make-dir-5.1.0.tgz",
"integrity": "sha512-IfpFq6UM39dUNiphpA6uDezNx/AvWyhwfICWPR3t1VspkgkMZrL+Rk1RbN1bx+aeNYwOrqGJgEgV3yotk+ZUVw==",
"dev": true,
"license": "MIT",
"optional": true,
"dependencies": {
"pify": "^4.0.1",
"semver": "^5.6.0"
},
"engines": {
"node": ">=6"
"node": ">=18"
},
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/mathjs": {
@@ -3491,17 +3490,6 @@
"url": "https://github.com/sponsors/jonschlinkert"
}
},
"node_modules/pify": {
"version": "4.0.1",
"resolved": "https://registry.npmjs.org/pify/-/pify-4.0.1.tgz",
"integrity": "sha512-uB80kBFb/tfd68bVleG9T5GGsGPjJrLAUpR5PZIrhBnIaRTQRjqdJSsIKkOP6OAIFbj7GOrcudc5pNjZ+geV2g==",
"dev": true,
"license": "MIT",
"optional": true,
"engines": {
"node": ">=6"
}
},
"node_modules/pluralize": {
"version": "8.0.0",
"resolved": "https://registry.npmjs.org/pluralize/-/pluralize-8.0.0.tgz",
@@ -3861,17 +3849,6 @@
"dev": true,
"license": "MIT"
},
"node_modules/semver": {
"version": "5.7.2",
"resolved": "https://registry.npmjs.org/semver/-/semver-5.7.2.tgz",
"integrity": "sha512-cBznnQ9KjJqU67B52RMC65CMarK2600WFnbkcaiwWq3xy/5haFJlshgnpjovMVJ+Hff49d8GEn0b87C5pDQ10g==",
"dev": true,
"license": "ISC",
"optional": true,
"bin": {
"semver": "bin/semver"
}
},
"node_modules/sharp": {
"version": "0.35.1",
"resolved": "https://registry.npmjs.org/sharp/-/sharp-0.35.1.tgz",
@@ -4515,9 +4492,9 @@
}
},
"node_modules/undici-types": {
"version": "7.24.6",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-7.24.6.tgz",
"integrity": "sha512-WRNW+sJgj5OBN4/0JpHFqtqzhpbnV0GuB+OozA9gCL7a993SmU+1JBZCzLNxYsbMfIeDL+lTsphD5jN5N+n0zg==",
"version": "8.3.0",
"resolved": "https://registry.npmjs.org/undici-types/-/undici-types-8.3.0.tgz",
"integrity": "sha512-j375ScV60dom+YkPFIfTLcOiPxkN/buHz5GobjLhixFuANaNs3C9l4GmrWqejgXWJ7BbJcFYpTEUkS1Ge8bpZQ==",
"dev": true,
"license": "MIT"
},
+2 -2
@@ -30,11 +30,11 @@
},
"devDependencies": {
"@biomejs/biome": "2.5.0",
"@types/node": "^25.9.3",
"@types/node": "^26.0.0",
"browserslist": "^4.28.2",
"browserslist-to-esbuild": "^2.1.1",
"edge.js": "^6.5.1",
"less": "^4.6.4",
"less": "^4.6.6",
"mathjs": "^15.2.0",
"sharp": "~0.35.1",
"sort-package-json": "^4.0.0",
+1 -1
@@ -77,9 +77,9 @@ export default class Calculator extends Plugin {
protected async run(): Promise<string | undefined> {
const searchInput = getElement<HTMLInputElement>("q");
const node = Calculator.math.parse(searchInput.value);
try {
const node = Calculator.math.parse(searchInput.value);
return `${node.toString()} = ${node.evaluate()}`;
} catch {
// not a compatible math expression
+1 -4
@@ -21,8 +21,6 @@ RUN --mount=type=cache,id=uv,target=/root/.cache/uv set -eux -o pipefail; \
COPY --exclude=./searx/version_frozen.py ./searx/ ./searx/
ARG TIMESTAMP_SETTINGS="0"
RUN set -eux -o pipefail; \
python -m compileall -q -f -j 0 --invalidation-mode=unchecked-hash ./searx/; \
find ./searx/static/ -type f \
@@ -30,5 +28,4 @@ RUN set -eux -o pipefail; \
-exec gzip -9 -k {} + \
-exec brotli -9 -k {} + \
-exec gzip --test {}.gz + \
-exec brotli --test {}.br +; \
touch -c --date="@$TIMESTAMP_SETTINGS" ./searx/settings.yml
-exec brotli --test {}.br +
+9 -30
@@ -77,43 +77,23 @@ volume_handler() {
setup_ownership "$target" "directory"
}
# Handle configuration file updates
config_handler() {
local target="$1"
local template="$2"
local new_template_target="$target.new"
setup() {
local template_settings="/usr/local/searxng/settings.template.yml"
local target_settings="$__SEARXNG_CONFIG_PATH/settings.yml"
# Create/Update the configuration file
if [ -f "$target" ]; then
setup_ownership "$target" "file"
if [ "$template" -nt "$target" ]; then
cp -pfT "$template" "$new_template_target"
cat <<EOF
...
... INFORMATION
... Update available for "$target"
... It is recommended to update the configuration file to ensure proper functionality
...
... New version placed at "$new_template_target"
... Please review and merge changes
...
EOF
fi
else
if [ ! -f "$target_settings" ]; then
cat <<EOF
...
... INFORMATION
... "$target" does not exist, creating from template...
... "$target_settings" does not exist, creating from template...
...
EOF
cp -pfT "$template" "$target"
cp -pfT "$template_settings" "$target_settings"
sed -i "s/ultrasecretkey/$(head -c 24 /dev/urandom | base64 | tr -dc 'a-zA-Z0-9')/g" "$target"
sed -i "s/ultrasecretkey/$(head -c 24 /dev/urandom | base64 | tr -dc 'a-zA-Z0-9')/g" "$target_settings"
fi
check_file "$target"
check_file "$target_settings"
}
cat <<EOF
@@ -124,8 +104,7 @@ EOF
volume_handler "$__SEARXNG_CONFIG_PATH"
volume_handler "$__SEARXNG_DATA_PATH"
# Check for files
config_handler "$__SEARXNG_SETTINGS_PATH" "/usr/local/searxng/searx/settings.yml"
setup
# root only features
if [ "$(id -u)" -eq 0 ]; then
+8
@@ -0,0 +1,8 @@
# Read the documentation before extending the defaults:
# https://docs.searxng.org/admin/settings/
use_default_settings: true
server:
secret_key: "ultrasecretkey"
image_proxy: true
+1
@@ -43,6 +43,7 @@
- ``google``
- ``mwmbl``
- ``naver``
- ``privacywall``
- ``quark``
- ``qwant``
- ``seznam``
-8
@@ -1,8 +0,0 @@
.. _aol engine:
===
AOL
===
.. automodule:: searx.engines.aol
:members:
+1 -1
@@ -87,7 +87,7 @@ Parameters
``autocomplete`` : default from :ref:`settings search`
[ ``google``, ``dbpedia``, ``duckduckgo``, ``mwmbl``, ``startpage``,
``wikipedia``, ``swisscows``, ``qwant`` ]
``privacywall``, ``wikipedia``, ``swisscows``, ``qwant`` ]
Service which completes words as you type.
+2 -2
@@ -58,8 +58,8 @@ Configured Engines
{% for mod in engines %}
* - `{{mod.name}} <{{mod.about and mod.about.website}}>`_
{%- if mod.about and mod.about.language %}
({{mod.about.language | upper}})
{%- if mod.language %}
({{mod.language | upper}})
{%- endif %}
- ``!{{mod.shortcut}}``
- {%- if 'searx.engines.' + mod.__name__ in documented_modules %}
+4 -4
@@ -2,16 +2,16 @@ mock==5.2.0
nose2[coverage_plugin]==0.16.0
cov-core==1.15.0
black==25.9.0
pylint==4.0.5
pylint==4.0.6
splinter==0.21.0
selenium==4.44.0
selenium==4.45.0
Sphinx==8.2.3;python_version <= "3.11"
Sphinx==9.1.0; python_version > "3.11"
sphinx-issues==6.0.0
sphinx-jinja==2.0.2
sphinx-tabs==3.5.0
furo==2025.12.19
sphinxcontrib-programoutput==0.19
sphinxcontrib-programoutput==0.20
sphinx-autobuild==2025.8.25
sphinx-notfound-page==1.1.0
myst-parser==5.0.0
@@ -24,5 +24,5 @@ docutils>=0.21.2;python_version <= "3.11"
docutils>=0.22.4; python_version > "3.11"
parameterized==0.9.0
granian[reload]==2.7.6
basedpyright==1.39.7
basedpyright==1.39.8
types-lxml==2026.2.16
+1 -1
@@ -1,4 +1,4 @@
certifi==2026.5.20
certifi==2026.6.17
babel==2.18.0
flask-babel==4.0.0
flask==3.1.3
+18
@@ -179,6 +179,23 @@ def naver(query: str, _sxng_locale: str) -> list[str]:
return results
def privacywall(query: str, sxng_locale: str) -> list[str]:
# Privacywall search autocompleter
country = None
if "-" in sxng_locale:
country = sxng_locale.split("-")[1]
args = {'q': query, 'cc': country}
url = f"https://www.privacywall.org/search/secure/suggestions.php?{urlencode(args)}"
response = get(url)
if not response.ok:
return []
data: list[list[str]] = response.json()
return data[1]
def qihu360search(query: str, _sxng_locale: str) -> list[str]:
# 360Search search autocompleter
url = f"https://sug.so.360.cn/suggest?{urlencode({'format': 'json', 'word': query})}"
@@ -361,6 +378,7 @@ backends: dict[str, t.Callable[[str, str], list[str]]] = {
'google': google_complete,
'mwmbl': mwmbl,
'naver': naver,
'privacywall': privacywall,
'quark': quark,
'qwant': qwant,
'seznam': seznam,
+466 -1
@@ -6634,6 +6634,255 @@
},
"regions": {}
},
"privacywall": {
"all_locale": null,
"custom": {},
"data_type": "traits_v1",
"languages": {},
"regions": {
"bg-BG": "BG",
"cs-CZ": "CZ",
"da-DK": "DK",
"de-AT": "AT",
"de-BE": "BE",
"de-CH": "CH",
"de-DE": "DE",
"de-LI": "LI",
"de-LU": "LU",
"el-CY": "CY",
"el-GR": "GR",
"en-AU": "AU",
"en-CA": "CA",
"en-GB": "GB",
"en-HK": "HK",
"en-IE": "IE",
"en-IN": "IN",
"en-MT": "MT",
"en-NZ": "NZ",
"en-PH": "PH",
"en-SG": "SG",
"en-US": "US",
"es-AR": "AR",
"es-CL": "CL",
"es-CO": "CO",
"es-ES": "ES",
"es-MX": "MX",
"es-PE": "PE",
"es-VE": "VE",
"et-EE": "EE",
"fi-FI": "FI",
"fil-PH": "PH",
"fr-BE": "BE",
"fr-CA": "CA",
"fr-CH": "CH",
"fr-FR": "FR",
"fr-LU": "LU",
"ga-IE": "IE",
"gsw-CH": "CH",
"gsw-LI": "LI",
"hi-IN": "IN",
"hr-HR": "HR",
"hu-HU": "HU",
"id-ID": "ID",
"it-CH": "CH",
"it-IT": "IT",
"ja-JP": "JP",
"ko-KR": "KR",
"lb-LU": "LU",
"lt-LT": "LT",
"lv-LV": "LV",
"mi-NZ": "NZ",
"ms-MY": "MY",
"ms-SG": "SG",
"mt-MT": "MT",
"nb-NO": "NO",
"nl-BE": "BE",
"nl-NL": "NL",
"nn-NO": "NO",
"pl-PL": "PL",
"pt-BR": "BR",
"pt-PT": "PT",
"qu-PE": "PE",
"ro-RO": "RO",
"sk-SK": "SK",
"sl-SI": "SI",
"sv-FI": "FI",
"sv-SE": "SE",
"ta-SG": "SG",
"th-TH": "TH",
"tr-CY": "CY",
"vi-VN": "VN",
"zh-HK": "HK",
"zh-SG": "SG",
"zh-TW": "TW"
}
},
"privacywall images": {
"all_locale": null,
"custom": {},
"data_type": "traits_v1",
"languages": {},
"regions": {
"bg-BG": "BG",
"cs-CZ": "CZ",
"da-DK": "DK",
"de-AT": "AT",
"de-BE": "BE",
"de-CH": "CH",
"de-DE": "DE",
"de-LI": "LI",
"de-LU": "LU",
"el-CY": "CY",
"el-GR": "GR",
"en-AU": "AU",
"en-CA": "CA",
"en-GB": "GB",
"en-HK": "HK",
"en-IE": "IE",
"en-IN": "IN",
"en-MT": "MT",
"en-NZ": "NZ",
"en-PH": "PH",
"en-SG": "SG",
"en-US": "US",
"es-AR": "AR",
"es-CL": "CL",
"es-CO": "CO",
"es-ES": "ES",
"es-MX": "MX",
"es-PE": "PE",
"es-VE": "VE",
"et-EE": "EE",
"fi-FI": "FI",
"fil-PH": "PH",
"fr-BE": "BE",
"fr-CA": "CA",
"fr-CH": "CH",
"fr-FR": "FR",
"fr-LU": "LU",
"ga-IE": "IE",
"gsw-CH": "CH",
"gsw-LI": "LI",
"hi-IN": "IN",
"hr-HR": "HR",
"hu-HU": "HU",
"id-ID": "ID",
"it-CH": "CH",
"it-IT": "IT",
"ja-JP": "JP",
"ko-KR": "KR",
"lb-LU": "LU",
"lt-LT": "LT",
"lv-LV": "LV",
"mi-NZ": "NZ",
"ms-MY": "MY",
"ms-SG": "SG",
"mt-MT": "MT",
"nb-NO": "NO",
"nl-BE": "BE",
"nl-NL": "NL",
"nn-NO": "NO",
"pl-PL": "PL",
"pt-BR": "BR",
"pt-PT": "PT",
"qu-PE": "PE",
"ro-RO": "RO",
"sk-SK": "SK",
"sl-SI": "SI",
"sv-FI": "FI",
"sv-SE": "SE",
"ta-SG": "SG",
"th-TH": "TH",
"tr-CY": "CY",
"vi-VN": "VN",
"zh-HK": "HK",
"zh-SG": "SG",
"zh-TW": "TW"
}
},
"privacywall videos": {
"all_locale": null,
"custom": {},
"data_type": "traits_v1",
"languages": {},
"regions": {
"bg-BG": "BG",
"cs-CZ": "CZ",
"da-DK": "DK",
"de-AT": "AT",
"de-BE": "BE",
"de-CH": "CH",
"de-DE": "DE",
"de-LI": "LI",
"de-LU": "LU",
"el-CY": "CY",
"el-GR": "GR",
"en-AU": "AU",
"en-CA": "CA",
"en-GB": "GB",
"en-HK": "HK",
"en-IE": "IE",
"en-IN": "IN",
"en-MT": "MT",
"en-NZ": "NZ",
"en-PH": "PH",
"en-SG": "SG",
"en-US": "US",
"es-AR": "AR",
"es-CL": "CL",
"es-CO": "CO",
"es-ES": "ES",
"es-MX": "MX",
"es-PE": "PE",
"es-VE": "VE",
"et-EE": "EE",
"fi-FI": "FI",
"fil-PH": "PH",
"fr-BE": "BE",
"fr-CA": "CA",
"fr-CH": "CH",
"fr-FR": "FR",
"fr-LU": "LU",
"ga-IE": "IE",
"gsw-CH": "CH",
"gsw-LI": "LI",
"hi-IN": "IN",
"hr-HR": "HR",
"hu-HU": "HU",
"id-ID": "ID",
"it-CH": "CH",
"it-IT": "IT",
"ja-JP": "JP",
"ko-KR": "KR",
"lb-LU": "LU",
"lt-LT": "LT",
"lv-LV": "LV",
"mi-NZ": "NZ",
"ms-MY": "MY",
"ms-SG": "SG",
"mt-MT": "MT",
"nb-NO": "NO",
"nl-BE": "BE",
"nl-NL": "NL",
"nn-NO": "NO",
"pl-PL": "PL",
"pt-BR": "BR",
"pt-PT": "PT",
"qu-PE": "PE",
"ro-RO": "RO",
"sk-SK": "SK",
"sl-SI": "SI",
"sv-FI": "FI",
"sv-SE": "SE",
"ta-SG": "SG",
"th-TH": "TH",
"tr-CY": "CY",
"vi-VN": "VN",
"zh-HK": "HK",
"zh-SG": "SG",
"zh-TW": "TW"
}
},
"qwant": {
"all_locale": null,
"custom": {},
@@ -7175,6 +7424,222 @@
},
"regions": {}
},
"resulthunter": {
"all_locale": "all",
"custom": {
"ui_lang": {
"az": "az",
"bg": "bg",
"br": "br",
"ca": "ca",
"cs": "cs",
"cy": "cy",
"da": "da",
"de-DE": "de-de",
"el": "el",
"en-CA": "en-ca",
"en-GB": "en-gb",
"en-IN": "en-in",
"en-US": "en-us",
"es": "es",
"et": "et",
"eu": "eu",
"fi-FI": "fi-fi",
"fr-CA": "fr-ca",
"fr-FR": "fr-fr",
"gl": "gl",
"hr": "hr",
"hu": "hu",
"id": "id",
"it": "it",
"ja-JP": "ja-jp",
"ka": "ka",
"ko": "ko",
"lt": "lt",
"lv": "lv",
"ms": "ms",
"nb": "nb",
"nl": "nl",
"pl": "pl",
"pt-BR": "pt-br",
"ro": "ro",
"ru": "ru",
"sk": "sk",
"sl": "sl",
"sq-AL": "sq-al",
"sr": "sr",
"sr_Latn": "sr-latn",
"sv": "sv",
"sw-KE": "sw-ke",
"th": "th",
"tr": "tr",
"uk": "uk",
"vi": "vi",
"zh": "zh",
"zh-TW": "zh-tw"
}
},
"data_type": "traits_v1",
"languages": {},
"regions": {
"ar-SA": "sa",
"da-DK": "dk",
"de-AT": "at",
"de-BE": "be",
"de-CH": "ch",
"de-DE": "de",
"en-AU": "au",
"en-CA": "ca",
"en-GB": "gb",
"en-HK": "hk",
"en-IN": "in",
"en-NZ": "nz",
"en-PH": "ph",
"en-US": "us",
"en-ZA": "za",
"es-AR": "ar",
"es-CL": "cl",
"es-ES": "es",
"es-MX": "mx",
"fi-FI": "fi",
"fil-PH": "ph",
"fr-BE": "be",
"fr-CA": "ca",
"fr-CH": "ch",
"fr-FR": "fr",
"gsw-CH": "ch",
"hi-IN": "in",
"id-ID": "id",
"it-CH": "ch",
"it-IT": "it",
"ja-JP": "jp",
"ko-KR": "kr",
"mi-NZ": "nz",
"ms-MY": "my",
"nb-NO": "no",
"nl-BE": "be",
"nl-NL": "nl",
"nn-NO": "no",
"pl-PL": "pl",
"pt-BR": "br",
"pt-PT": "pt",
"ru-RU": "ru",
"sv-FI": "fi",
"sv-SE": "se",
"tr-TR": "tr",
"zh-CN": "cn",
"zh-HK": "hk",
"zh-TW": "tw"
}
},
"resulthunter images": {
"all_locale": "all",
"custom": {
"ui_lang": {
"az": "az",
"bg": "bg",
"br": "br",
"ca": "ca",
"cs": "cs",
"cy": "cy",
"da": "da",
"de-DE": "de-de",
"el": "el",
"en-CA": "en-ca",
"en-GB": "en-gb",
"en-IN": "en-in",
"en-US": "en-us",
"es": "es",
"et": "et",
"eu": "eu",
"fi-FI": "fi-fi",
"fr-CA": "fr-ca",
"fr-FR": "fr-fr",
"gl": "gl",
"hr": "hr",
"hu": "hu",
"id": "id",
"it": "it",
"ja-JP": "ja-jp",
"ka": "ka",
"ko": "ko",
"lt": "lt",
"lv": "lv",
"ms": "ms",
"nb": "nb",
"nl": "nl",
"pl": "pl",
"pt-BR": "pt-br",
"ro": "ro",
"ru": "ru",
"sk": "sk",
"sl": "sl",
"sq-AL": "sq-al",
"sr": "sr",
"sr_Latn": "sr-latn",
"sv": "sv",
"sw-KE": "sw-ke",
"th": "th",
"tr": "tr",
"uk": "uk",
"vi": "vi",
"zh": "zh",
"zh-TW": "zh-tw"
}
},
"data_type": "traits_v1",
"languages": {},
"regions": {
"ar-SA": "sa",
"da-DK": "dk",
"de-AT": "at",
"de-BE": "be",
"de-CH": "ch",
"de-DE": "de",
"en-AU": "au",
"en-CA": "ca",
"en-GB": "gb",
"en-HK": "hk",
"en-IN": "in",
"en-NZ": "nz",
"en-PH": "ph",
"en-US": "us",
"en-ZA": "za",
"es-AR": "ar",
"es-CL": "cl",
"es-ES": "es",
"es-MX": "mx",
"fi-FI": "fi",
"fil-PH": "ph",
"fr-BE": "be",
"fr-CA": "ca",
"fr-CH": "ch",
"fr-FR": "fr",
"gsw-CH": "ch",
"hi-IN": "in",
"id-ID": "id",
"it-CH": "ch",
"it-IT": "it",
"ja-JP": "jp",
"ko-KR": "kr",
"mi-NZ": "nz",
"ms-MY": "my",
"nb-NO": "no",
"nl-BE": "be",
"nl-NL": "nl",
"nn-NO": "no",
"pl-PL": "pl",
"pt-BR": "br",
"pt-PT": "pt",
"ru-RU": "ru",
"sv-FI": "fi",
"sv-SE": "se",
"tr-TR": "tr",
"zh-CN": "cn",
"zh-HK": "hk",
"zh-TW": "tw"
}
},
"sepiasearch": {
"all_locale": null,
"custom": {},
@@ -9120,4 +9585,4 @@
},
"regions": {}
}
}
}
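The `regions` maps above translate a SearXNG locale tag into the code the origin engine expects. A minimal sketch of such a lookup follows; the helper `lookup_region` and the two excerpt dictionaries are hypothetical illustrations built from a few entries of the data above, not the lookup code SearXNG itself uses.

```python
from __future__ import annotations

# Hypothetical excerpts of the "regions" maps shown in the traits data above.
PRIVACYWALL_REGIONS = {"de-AT": "AT", "de-CH": "CH", "de-DE": "DE", "en-US": "US"}
RESULTHUNTER_REGIONS = {"de-AT": "at", "de-CH": "ch", "de-DE": "de", "en-US": "us"}


def lookup_region(regions: dict[str, str], sxng_locale: str, default: str | None = None) -> str | None:
    """Return the engine-specific region code for a SearXNG locale tag.

    Falls back to ``default`` when the engine has no entry for the locale.
    """
    return regions.get(sxng_locale, default)


if __name__ == "__main__":
    print(lookup_region(PRIVACYWALL_REGIONS, "de-CH"))
    print(lookup_region(RESULTHUNTER_REGIONS, "de-CH"))
```

The same locale tag resolves to differently cased codes per engine, which is why each engine carries its own map instead of sharing one.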
+161 -111
@@ -3,6 +3,7 @@
- :py:obj:`searx.enginelib.EngineCache`
- :py:obj:`searx.enginelib.Engine`
- :py:obj:`searx.enginelib.EngineAbout`
- :py:obj:`searx.enginelib.traits`
There is a command line for developer purposes and for deeper analysis. Here is
@@ -23,7 +24,7 @@ an example in which the command line is called in the development environment::
"""
__all__ = ["EngineCache", "Engine", "ENGINES_CACHE"]
__all__ = ["EngineCache", "Engine", "EngineAbout", "ENGINES_CACHE"]
import typing as t
import abc
@@ -31,6 +32,7 @@ from collections.abc import Callable
import logging
import string
import typer
import msgspec
from ..cache import ExpireCacheSQLite, ExpireCacheCfg
@@ -39,7 +41,7 @@ if t.TYPE_CHECKING:
from searx.enginelib.traits import EngineTraits
from searx.extended_types import SXNG_Response
from searx.result_types import EngineResults
from searx.search.processors import OfflineParamTypes, OnlineParamTypes
from searx.search.processors import OfflineParamTypes, OnlineParamTypes, ProcessorType
ENGINES_CACHE: ExpireCacheSQLite = ExpireCacheSQLite.build_cache(
ExpireCacheCfg(
@@ -178,111 +180,7 @@ class EngineCache:
return ENGINES_CACHE.secret_hash(name=name)
class Engine(abc.ABC): # pylint: disable=too-few-public-methods
"""Class of engine instances build from YAML settings.
Further documentation see :ref:`general engine configuration`.
.. hint::
This class is currently never initialized and only used for type hinting.
"""
logger: logging.Logger
# Common options in the engine module
engine_type: str
"""Type of the engine (:ref:`searx.search.processors`)"""
paging: bool
"""Engine supports multiple pages."""
max_page: int = 0
"""If the engine supports paging, then this is the value for the last page
that is still supported. ``0`` means unlimited numbers of pages."""
time_range_support: bool
"""Engine supports search time range."""
safesearch: bool
"""Engine supports SafeSearch"""
language_support: bool
"""Engine supports languages (locales) search."""
language: str
"""For an engine, when there is ``language: ...`` in the YAML settings the engine
does support only this one language:
.. code:: yaml
- name: google french
engine: google
language: fr
"""
region: str
"""For an engine, when there is ``region: ...`` in the YAML settings the engine
does support only this one region::
.. code:: yaml
- name: google belgium
engine: google
region: fr-BE
"""
fetch_traits: "Callable[[EngineTraits, bool], None]"
"""Function to to fetch engine's traits from origin."""
traits: "traits.EngineTraits"
"""Traits of the engine."""
# settings.yml
categories: list[str]
"""Specifies to which :ref:`engine categories` the engine should be added."""
name: str
"""Name that will be used across SearXNG to define this engine. In settings, on
the result page .."""
engine: str
"""Name of the python file used to handle requests and responses to and from
this search engine (file name from :origin:`searx/engines` without
``.py``)."""
enable_http: bool
"""Enable HTTP (by default only HTTPS is enabled)."""
shortcut: str
"""Code used to execute bang requests (``!foo``)"""
timeout: float
"""Specific timeout for search-engine."""
display_error_messages: bool
"""Display error messages on the web UI."""
proxies: dict[str, dict[str, str]]
"""Set proxies for a specific engine (YAML):
.. code:: yaml
proxies :
http: socks5://proxy:port
https: socks5://proxy:port
"""
disabled: bool
"""To disable by default the engine, but not deleting it. It will allow the
user to manually activate it in the settings."""
inactive: bool
"""Remove the engine from the settings (*disabled & removed*)."""
about: dict[str, dict[str, str]]
class EngineAbout(msgspec.Struct, kw_only=True):
"""Additional fields describing the engine.
.. code:: yaml
@@ -296,21 +194,173 @@ class Engine(abc.ABC): # pylint: disable=too-few-public-methods
results: HTML
"""
using_tor_proxy: bool
# pylint: disable=too-few-public-methods
website: str = ""
"""Official web-site of the origin."""
wikidata_id: str = ""
"""`Wikidata ID <https://www.wikidata.org/wiki/Wikidata:Identifiers>`_"""
official_api_documentation: str = ""
"""URL of the official API documentation (regardless of whether the API is used)"""
use_official_api: bool = False
"""SearXNG engine makes use of the official API or not"""
require_api_key: bool = False
"""API requires a key or not."""
results: str = ""
"""Data format of the source (online-engines: of the response)."""
description: str = ""
"""Brief description of the engine and where it gets its data from.
This value should only be set as long as no description of the data source
is available via a :py:obj:`EngineAbout.wikidata_id`.
"""
language: str = ""
"""Deprecated! Migrate your setting from `engine.about.language` to
`engine.language`"""
class Engine(abc.ABC): # pylint: disable=too-few-public-methods
"""Class of engine instances built from YAML settings.
Further documentation see :ref:`general engine configuration`.
The defaults are taken from :py:obj:`searx.engines.ENGINE_DEFAULT_ARGS`.
.. hint::
This class is currently never initialized and only used for type hinting.
"""
logger: logging.Logger
# Common options of the engine module
engine_type: "ProcessorType" = "online"
"""Type of the engine (:ref:`searx.search.processors`)"""
paging: bool = False
"""Engine supports multiple pages."""
max_page: int = 0
"""If the engine supports paging, then this is the value for the last page
that is still supported. ``0`` means unlimited numbers of pages."""
time_range_support: bool = False
"""Engine supports search time range."""
safesearch: bool = False
"""Engine supports SafeSearch"""
language_support: bool = False
"""Engine supports languages (locales) search."""
fetch_traits: "Callable[[EngineTraits, bool], None]"
"""Function to fetch the engine's traits from the origin."""
traits: "traits.EngineTraits"
"""Traits of the engine."""
# settings.yml
name: str
"""Name that will be used across SearXNG to define this engine. In settings, on
the result page .."""
engine: str
"""Name of the python file used to handle requests and responses to and from
this search engine (file name from :origin:`searx/engines` without
``.py``)."""
categories: list[str] = ["general"]
"""Specifies to which :ref:`engine categories` the engine should be added."""
language: str = ""
"""If the engine supports only one language, this language is specified here
(``en``, ``de``, ``"no"`` or ..); otherwise, the value remains empty. In the
YAML configuration, quote values such as ``"no"`` to avoid the `YAML-Norway
problem <https://ruuda.nl/2023/the-yaml-document-from-hell#the-norway-problem>`_.
.. code:: yaml
- name: google norway
engine: google
language: "no"
Depending on ``language_support``, this value has similar but also slightly
different meanings.
- When ``language_support`` is **true**, the map of
:py:obj:`traits.EngineTraits.languages` is reduced to the selected
language
- When ``language_support`` is **false**, then the implementation of the
engine only supports this one ``language``
"""
region: str = ""
"""For an engine, when there is ``region: ...`` in the YAML settings, the engine
supports only this one region:
.. code:: yaml
- name: google belgium
engine: google
region: fr-BE
"""
enable_http: bool
"""Enable HTTP (by default only HTTPS is enabled)."""
shortcut: str
"""Code used to execute bang requests (``!foo``)"""
timeout: float
"""Specific timeout for search-engine."""
display_error_messages: bool
"""Display error messages on the web UI."""
disabled: bool = False
"""Disable the engine by default without deleting it. Users can still
activate it manually in the settings."""
inactive: bool = False
"""Remove the engine from the settings (*disabled & removed*)."""
about: EngineAbout = EngineAbout()
"""Additional fields describing the engine."""
using_tor_proxy: bool = False
"""Using tor proxy (``true``) or not (``false``) for this engine."""
send_accept_language_header: bool
send_accept_language_header: bool = True
"""When this option is activated (default), the language (locale) that is
selected by the user is used to build and send an ``Accept-Language`` header
in the request to the origin search engine."""
tokens: list[str]
tokens: list[str] = []
"""A list of secret tokens to make this engine *private*, more details see
:ref:`private engines`."""
weight: int
weight: float = 1.0
"""Weighting of the results of this engine (:ref:`weight <settings engines>`)."""
proxies: dict[str, dict[str, str]]
"""Set proxies for a specific engine (YAML):
.. code:: yaml
proxies :
http: socks5://proxy:port
https: socks5://proxy:port
"""
def setup(self, engine_settings: dict[str, t.Any]) -> bool: # pylint: disable=unused-argument
"""Dynamic setup of the engine settings.
+15 -12
@@ -142,11 +142,11 @@ class EngineTraits:
"""
if self.data_type == "traits_v1":
self._set_traits_v1(engine)
self._set_traits_v1(engine) # pyright: ignore[reportArgumentType]
else:
raise TypeError("engine traits of type %s is unknown" % self.data_type)
def _set_traits_v1(self, engine: "Engine | types.ModuleType") -> None:
def _set_traits_v1(self, engine: "Engine") -> None:
# For an engine, when there is `language: ...` in the YAML settings the engine
# does support only this one language (region)::
#
@@ -159,22 +159,25 @@ class EngineTraits:
_msg = "settings.yml - engine: '%s' / %s: '%s' not supported"
languages = traits.languages
if hasattr(engine, "language"):
if engine.language not in languages:
raise ValueError(_msg % (engine.name, "language", engine.language))
traits.languages = {engine.language: languages[engine.language]}
if engine.language:
if engine.language_support:
if not len(traits.languages) > 1:
raise ValueError(
f"engine {engine.name}: activated language_support with just one or fewer languages"
)
if engine.language not in traits.languages:
raise ValueError(_msg % (engine.name, "language", engine.language))
traits.languages = {engine.language: traits.languages[engine.language]}
regions = traits.regions
if hasattr(engine, "region"):
if engine.region not in regions:
if engine.region:
if engine.region not in traits.regions:
raise ValueError(_msg % (engine.name, "region", engine.region))
traits.regions = {engine.region: regions[engine.region]}
traits.regions = {engine.region: traits.regions[engine.region]}
engine.language_support = bool(traits.languages or traits.regions)
# set the copied & modified traits in engine's namespace
engine.traits = traits # pyright: ignore[reportAttributeAccessIssue]
engine.traits = traits
class EngineTraitsMap(dict[str, EngineTraits]):
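The hunk above narrows the traits to the single `language` or `region` configured for an engine and rejects values the traits do not know. The following is a minimal standalone sketch of that narrowing on plain dictionaries; the function `narrow_traits` and its parameters are hypothetical and only mirror the checks visible in the diff, they are not part of `EngineTraits`.

```python
def narrow_traits(languages, regions, name, language="", region=""):
    """Reduce the language and region maps to the configured single value.

    An empty string means "not configured", in which case the map is
    returned unchanged. Unknown values raise ValueError, using the same
    message template as the diff above.
    """
    msg = "settings.yml - engine: '%s' / %s: '%s' not supported"
    if language:
        if language not in languages:
            raise ValueError(msg % (name, "language", language))
        languages = {language: languages[language]}
    if region:
        if region not in regions:
            raise ValueError(msg % (name, "region", region))
        regions = {region: regions[region]}
    return languages, regions


if __name__ == "__main__":
    langs = {"de": "de-de", "fr": "fr-fr"}
    regs = {"fr-BE": "be", "fr-FR": "fr"}
    print(narrow_traits(langs, regs, "demo", region="fr-BE"))
```

Testing for a non-empty string instead of `hasattr` only works because the defaults now provide `language = ""` and `region = ""` for every engine.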
+1 -1
@@ -22,8 +22,8 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
"language": "zh",
}
language = "zh"
# Engine Configuration
categories = ["general"]
+6 -6
@@ -5,19 +5,19 @@ intended monkey patching of the engine modules.
.. attention::
Monkey-patching modules is a practice from the past that shouldn't be
expanded upon. In the long run, there should be an engine class that can be
inherited. However, as long as this class doesn't exist, and as long as all
engine modules aren't converted to an engine class, these builtin types will
still be needed.
expanded upon. In the long run, engines should be instances of
:py:obj:`searx.enginelib.Engine`. However, as long as all engine
modules aren't converted to this class, these builtin types will still be
needed.
"""
import logging
from searx.enginelib import traits as _traits
logger: logging.Logger
supported_languages: str
language_aliases: str
language_support: bool
language: str
region: str
traits: _traits.EngineTraits
# from searx.engines.ENGINE_DEFAULT_ARGS
+46 -8
@@ -14,40 +14,48 @@ import sys
import copy
import os
from os.path import realpath, dirname
import warnings
import types
import inspect
import msgspec
from searx import logger, settings
from searx.utils import load_module
if t.TYPE_CHECKING:
from searx.enginelib import Engine
from searx.data import ENGINE_TRAITS
from searx.enginelib import Engine, EngineAbout
logger = logger.getChild('engines')
ENGINE_DIR = dirname(realpath(__file__))
# Defaults for the namespace of an engine module, see load_engine()
ENGINE_DEFAULT_ARGS: dict[str, int | str | list[t.Any] | dict[str, t.Any] | bool] = {
ENGINE_DEFAULT_ARGS: dict[str, t.Any] = {
# Common options in the engine module
"engine_type": "online",
"paging": False,
"max_page": 0,
"time_range_support": False,
"safesearch": False,
"language_support": False,
# settings.yml
"categories": ["general"],
"language": "",
"region": "",
"enable_http": False,
"shortcut": "-",
"timeout": settings["outgoing"]["request_timeout"],
"display_error_messages": True,
"disabled": False,
"inactive": False,
"about": {},
"about": EngineAbout(),
"using_tor_proxy": False,
"send_accept_language_header": True,
"tokens": [],
"max_page": 0,
"weight": 1.0,
}
"""Default values that are set in an engine of type *module*, please compare
with the class :py:obj:`searx.enginelib.Engine`."""
# set automatically when an engine does not have any tab category
DEFAULT_CATEGORY = 'other'
@@ -177,14 +185,41 @@ def set_loggers(engine: "Engine|types.ModuleType", engine_name: str):
def update_engine_attributes(engine: "Engine | types.ModuleType", engine_data: dict[str, t.Any]):
# pylint: disable=too-many-branches
# set engine attributes from engine_data
kvargs: dict[str, t.Any]
if isinstance(engine.about, EngineAbout):
kvargs = {**msgspec.to_builtins(engine.about), **engine_data.get("about", {})}
else:
kvargs = {**engine.about, **engine_data.get("about", {})}
try:
engine.about = EngineAbout(**kvargs)
except TypeError as exc:
raise TypeError(
f"engine '{engine_data['name']}' ({engine_data['engine']}) - in the about section --> {exc}"
) from exc
# warn about deprecated engine settings
if engine.about.language:
if hasattr(engine, "language") and not engine.language:
engine.language = engine.about.language
warnings.warn(
f"engine '{engine_data['name']}' ({engine_data['engine']})"
f" - migrate engine.about.language to engine.language!",
DeprecationWarning,
2,
)
for param_name, param_value in engine_data.items():
if param_name == "about":
continue
if param_name == 'categories':
if isinstance(param_value, str):
param_value = list(map(str.strip, param_value.split(',')))
engine.categories = param_value # type: ignore
elif hasattr(engine, 'about') and param_name == 'about':
engine.about = {**engine.about, **engine_data['about']} # type: ignore
else:
setattr(engine, param_name, param_value)
@@ -193,6 +228,9 @@ def update_engine_attributes(engine: "Engine | types.ModuleType", engine_data: d
if not hasattr(engine, arg_name):
setattr(engine, arg_name, copy.deepcopy(arg_value))
if ENGINE_TRAITS.get(engine.name, {}).get("languages") and not engine.language_support:
raise ValueError(f"engine '{engine.name}' ({engine_data['engine']}) language_support should be set to True")
def update_attributes_for_tor(engine: "Engine | types.ModuleType"):
if using_tor_proxy(engine) and hasattr(engine, 'onion_url'):
+1 -1
@@ -16,12 +16,12 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
"language": "zh",
}
# Engine Configuration
categories = ["videos"]
paging = True
language = "zh"
# Base URL
base_url = "https://www.acfun.cn"
+1
@@ -64,6 +64,7 @@ about: dict[str, t.Any] = {
# engine dependent config
categories = ["files", "books"]
paging: bool = True
language_support = True
# search-url
base_url: list[str] | str = []
+1 -1
@@ -42,8 +42,8 @@ about = {
'use_official_api': False,
'require_api_key': False,
'results': 'HTML',
'language': 'it',
}
language = "it"
def request(query, params):
-210
@@ -1,210 +0,0 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""AOL supports WEB, image, and video search. Internally, it uses the Bing
index.
AOL doesn't seem to support setting the language via request parameters, instead
the results are based on the URL. For example, there is
- `search.aol.com <https://search.aol.com>`_ for English results
- `suche.aol.de <https://suche.aol.de>`_ for German results
However, AOL offers its services only in a few regions:
- en-US: search.aol.com
- de-DE: suche.aol.de
- fr-FR: recherche.aol.fr
- en-GB: search.aol.co.uk
- en-CA: search.aol.ca
In order to still offer sufficient support for language and region, the `search
keywords`_ known from Bing, ``language`` and ``loc`` (region), are added to the
search term (AOL is basically just a proxy for Bing).
.. _search keywords:
https://support.microsoft.com/en-us/topic/advanced-search-keywords-ea595928-5d63-4a0b-9c6b-0b769865e78a
"""
from urllib.parse import urlencode, unquote_plus
import typing as t
from lxml import html
from dateutil import parser
from searx.result_types import EngineResults
from searx.utils import eval_xpath_list, eval_xpath, extract_text
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://www.aol.com",
"wikidata_id": "Q27585",
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
}
categories = ["general"]
search_type = "search" # supported: search, image, video
paging = True
safesearch = True
time_range_support = True
results_per_page = 10
base_url = "https://search.aol.com"
time_range_map = {"day": "1d", "week": "1w", "month": "1m", "year": "1y"}
safesearch_map = {0: "p", 1: "r", 2: "i"}
enable_http2 = False
def init(_):
if search_type not in ("search", "image", "video"):
raise ValueError(f"unsupported search type {search_type}")
def request(query: str, params: "OnlineParams") -> None:
language, region = (params["searxng_locale"].split("-") + [None])[:2]
if language and language != "all":
query = f"{query} language:{language}"
if region:
query = f"{query} loc:{region}"
args: dict[str, str | int | None] = {
"q": query,
"b": params["pageno"] * results_per_page + 1, # page is 1-indexed
"pz": results_per_page,
}
if params["time_range"]:
args["fr2"] = "time"
args["age"] = params["time_range"]
else:
args["fr2"] = "sb-top-search"
params["cookies"]["sB"] = f"vm={safesearch_map[params['safesearch']]}"
params["url"] = f"{base_url}/aol/{search_type}?{urlencode(args)}"
logger.debug(params)
def _deobfuscate_url(obfuscated_url: str) -> str | None:
# URL looks like "https://search.aol.com/click/_ylt=AwjFSDjd;_ylu=JfsdjDFd/RV=2/RE=1774058166/RO=10/RU=https%3a%2f%2fen.wikipedia.org%2fwiki%2fTree/RK=0/RS=BP2CqeMLjscg4n8cTmuddlEQA2I-" # pylint: disable=line-too-long
if not obfuscated_url:
return None
for part in obfuscated_url.split("/"):
if part.startswith("RU="):
return unquote_plus(part[3:])
# pattern for de-obfuscating URL not found, fall back to Yahoo's tracking link
return obfuscated_url
def _general_results(doc: html.HtmlElement) -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(doc, "//div[@id='web']//ol/li[not(contains(@class, 'first'))]"):
obfuscated_url = extract_text(eval_xpath(result, ".//h3/a/@href"))
if not obfuscated_url:
continue
url = _deobfuscate_url(obfuscated_url)
if not url:
continue
res.add(
res.types.MainResult(
url=url,
title=extract_text(eval_xpath(result, ".//h3/a")) or "",
content=extract_text(eval_xpath(result, ".//div[contains(@class, 'compText')]")) or "",
thumbnail=extract_text(eval_xpath(result, ".//a[contains(@class, 'thm')]/img/@data-src")) or "",
)
)
return res
def _video_results(doc: html.HtmlElement) -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(doc, "//div[contains(@class, 'results')]//ol/li"):
obfuscated_url = extract_text(eval_xpath(result, ".//a/@href"))
if not obfuscated_url:
continue
url = _deobfuscate_url(obfuscated_url)
if not url:
continue
published_date_raw = extract_text(eval_xpath(result, ".//div[contains(@class, 'v-age')]"))
try:
published_date = parser.parse(published_date_raw or "")
except parser.ParserError:
published_date = None
res.add(
res.types.LegacyResult(
{
"template": "videos.html",
"url": url,
"title": extract_text(eval_xpath(result, ".//h3")),
"content": extract_text(eval_xpath(result, ".//div[contains(@class, 'compText')]")),
"thumbnail": extract_text(eval_xpath(result, ".//img[contains(@class, 'thm')]/@src")),
"length": extract_text(eval_xpath(result, ".//span[contains(@class, 'v-time')]")),
"publishedDate": published_date,
}
)
)
return res
def _image_results(doc: html.HtmlElement) -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(doc, "//section[@id='results']//ul/li"):
obfuscated_url = extract_text(eval_xpath(result, "./a/@href"))
if not obfuscated_url:
continue
url = _deobfuscate_url(obfuscated_url)
if not url:
continue
res.add(
res.types.LegacyResult(
{
"template": "images.html",
# results don't have an extra URL, only the image source
"url": url,
"title": extract_text(eval_xpath(result, ".//a/@aria-label")),
"thumbnail_src": extract_text(eval_xpath(result, ".//img/@src")),
"img_src": url,
}
)
)
return res
def response(resp: "SXNG_Response") -> EngineResults:
doc = html.fromstring(resp.text)
match search_type:
case "search":
results = _general_results(doc)
case "image":
results = _image_results(doc)
case "video":
results = _video_results(doc)
case _:
raise ValueError("unsupported search type")
for suggestion in eval_xpath_list(doc, ".//ol[contains(@class, 'searchRightBottom')]//table//a"):
results.add(results.types.LegacyResult({"suggestion": extract_text(suggestion)}))
return results
+1
@@ -35,6 +35,7 @@ about = {
categories = ["it", "software wikis"]
paging = True
main_wiki = "wiki.archlinux.org"
language_support = True
def request(query, params):
+1 -1
@@ -54,8 +54,8 @@ about = {
"use_official_api": True,
"require_api_key": True,
"results": "JSON",
"language": "en",
}
language = "en"
CACHE: EngineCache
"""Persistent (SQLite) key/value cache that deletes its values after ``expire``
+1 -1
@@ -23,8 +23,8 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
"language": "zh",
}
language = "zh"
paging = True
categories = []
+1
@@ -34,6 +34,7 @@ about = {
categories = ["general", "social media"]
paging = True
time_range_support = True
language_support = True
base_url = "https://boardreader.com"
time_range_map = {"day": "1", "week": "7", "month": "30", "year": "365"}
+1 -1
@@ -13,8 +13,8 @@ about = {
'use_official_api': False,
'require_api_key': False,
'results': 'JSON',
'language': 'de',
}
language = "de"
paging = True
categories = ['general']
+115
@@ -0,0 +1,115 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Chatnoir is an open source search engine developed by Webis, a network of
researchers from the universities of Weimar, Halle and Leipzig. It supports
different text corpora as indexes, e.g. CommonCrawl. See its
`announcement`_ for more information.
.. _announcement : https://groups.google.com/g/common-crawl/c/3o2dOHpeRxo/m/H2Osqz9dAAAJ
"""
import typing as t
from searx.exceptions import SearxEngineAPIException
from searx.extended_types import SXNG_Response
from searx.network import get, post
from searx.result_types import EngineResults
from searx.utils import html_to_text
if t.TYPE_CHECKING:
from searx.search.processors import OnlineParams
about = {
"website": "https://www.chatnoir.eu",
"official_api_documentation": "https://www.chatnoir.eu/docs/api-general",
"use_official_api": True,
"require_api_key": False,
"results": "JSON",
}
base_url = "https://www.chatnoir.eu"
categories = ["general"]
paging = True
page_size = 10
api_key = ""
"""You can optionally provide your own API key here. This one will then be used
instead of scraping an API key."""
search_index = "cw22"
"""Search index to browse in. See `the API documentation
<https://www.chatnoir.eu/docs/api-general>`_ for a full list."""
def _obtain_api_key() -> tuple[str, str, str]:
home_resp = get(base_url)
if not home_resp.ok:
raise SearxEngineAPIException("failed to obtain api key")
csrf_token = home_resp.cookies["csrftoken"]
token_resp = post(
"https://www.chatnoir.eu/?init",
headers={
"Referer": f"{base_url}/",
"X-Requested-With": "XMLHttpRequest",
"X-Csrf-Token": csrf_token,
},
cookies=home_resp.cookies,
)
if not token_resp.ok:
raise SearxEngineAPIException("failed to obtain api key")
session_id = token_resp.cookies["sessionid"]
scraped_api_key = token_resp.json()["token"]["token"]
return csrf_token, session_id, scraped_api_key
def request(query: str, params: "OnlineParams"):
if api_key:
# use user-provided API key instead of scraping one
headers = {
"Authorization": f"Bearer {api_key}",
}
params["headers"].update(headers)
else:
csrf_token, session_id, scraped_api_key = _obtain_api_key()
headers = {
"Authorization": f"Bearer {scraped_api_key}",
"X-Csrf-Token": csrf_token,
}
params["headers"].update(headers)
params["cookies"] = {"csrftoken": csrf_token, "sessionid": session_id}
params["url"] = f"{base_url}/api/v1/_search"
params["method"] = "POST"
json_data = {
"query": query,
"index": [
search_index,
],
"from": (params["pageno"] - 1) * page_size,
"size": page_size,
"_extended_meta": True,
}
params["json"] = json_data
def response(resp: "SXNG_Response") -> EngineResults:
res = EngineResults()
results = resp.json()["results"]
for result in results:
res.add(
res.types.MainResult(
url=result["target_uri"],
title=html_to_text(result["title"]),
content=html_to_text(result["snippet"]),
)
)
return res
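The `request` function above converts SearXNG's 1-based page number into the zero-based `from` offset of the search payload. A minimal sketch of that payload construction follows; `build_search_payload` is a hypothetical helper, and the defaults `page_size=10` and `index="cw22"` are taken from the module-level settings in the diff.

```python
def build_search_payload(query: str, pageno: int, page_size: int = 10, index: str = "cw22") -> dict:
    """Build the JSON body for the search request.

    ``pageno`` is 1-based, while the ``from`` offset is zero-based, so the
    first page starts at offset 0.
    """
    return {
        "query": query,
        "index": [index],
        "from": (pageno - 1) * page_size,
        "size": page_size,
        "_extended_meta": True,
    }


if __name__ == "__main__":
    print(build_search_payload("searxng", pageno=3))
```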
+1 -1
@@ -10,8 +10,8 @@ about = {
'use_official_api': False,
'require_api_key': False,
'results': 'JSON',
'language': 'de',
}
language = "de"
paging = True
categories = []
+8 -1
@@ -70,13 +70,13 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
"language": "zh",
}
paging = True
time_range_support = True
results_per_page = 10
categories = []
language = "zh"
ChinasoCategoryType = t.Literal['news', 'videos', 'images']
"""ChinaSo supports news, videos, images search.
@@ -156,6 +156,13 @@ def response(resp):
except Exception as e:
raise SearxEngineAPIException(f"Invalid response: {e}") from e
# Upstream returns {'status': 0, 'msg': 'empty result', 'data': {}} when there
# are no results; this is a valid empty result rather than an API error.
if not isinstance(data, dict) or "data" not in data:
raise SearxEngineAPIException("Invalid response")
if not data["data"]:
return []
parsers = {'news': parse_news, 'images': parse_images, 'videos': parse_videos}
return parsers[chinaso_category](data)
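The guard added in this hunk separates a malformed response from a valid but empty one. The sketch below shows the same distinction in isolation; `EngineAPIError` is a hypothetical stand-in for `SearxEngineAPIException`, and `has_results` is an illustrative helper, not a function from the engine.

```python
class EngineAPIError(Exception):
    """Hypothetical stand-in for searx.exceptions.SearxEngineAPIException."""


def has_results(data) -> bool:
    """Return True when the decoded response carries results to parse.

    A response without a "data" key is treated as an API error, while an
    empty "data" value is a valid response with no results.
    """
    if not isinstance(data, dict) or "data" not in data:
        raise EngineAPIError("Invalid response")
    return bool(data["data"])


if __name__ == "__main__":
    print(has_results({"status": 0, "msg": "empty result", "data": {}}))
```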
+1
@@ -40,6 +40,7 @@ categories = ["videos"]
paging = True
page_size = 10
language_support = True
time_range_support = True
time_delta_dict = {
"day": timedelta(days=1),
+6 -8
@@ -24,7 +24,7 @@ import typing as t
import json
from searx.result_types import EngineResults
from searx.enginelib import EngineCache
from searx.enginelib import EngineCache, EngineAbout
if t.TYPE_CHECKING:
from searx.search.processors import RequestParams
@@ -35,13 +35,11 @@ categories = ["general"]
disabled = True
timeout = 2.0
about = {
"wikidata_id": None,
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
}
language = "en"
about = EngineAbout(
results="JSON",
description="Demo offline engine with results in the English language.",
)
# if there is a need for globals, use a leading underline
_my_offline_engine: str = ""
+9 -8
@@ -25,6 +25,7 @@ import typing as t
from urllib.parse import urlencode
from searx.result_types import EngineResults
from searx.enginelib import EngineAbout
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
@@ -43,14 +44,14 @@ page_size = 20
search_api = "https://api.artic.edu/api/v1/artworks/search"
image_api = "https://www.artic.edu/iiif/2/"
about = {
"website": "https://www.artic.edu",
"wikidata_id": "Q239303",
"official_api_documentation": "http://api.artic.edu/docs/",
"use_official_api": True,
"require_api_key": False,
"results": "JSON",
}
about = EngineAbout(
website="https://www.artic.edu",
wikidata_id="Q239303",
official_api_documentation="http://api.artic.edu/docs/",
use_official_api=True,
require_api_key=False,
results="JSON",
)
# if there is a need for globals, use a leading underline
+1 -1
@@ -11,8 +11,8 @@ about = {
'use_official_api': False,
'require_api_key': False,
'results': 'HTML',
'language': 'de',
}
language = "de"
categories = []
paging = True
+1
@@ -203,6 +203,7 @@ about: dict[str, str | bool] = {
categories: list[str] = ["general", "web"]
paging: bool = True
time_range_support: bool = True
language_support = True
safesearch: bool = True
"""DDG-lite: user can't select but the results are filtered."""
+1
@@ -28,6 +28,7 @@ about = {
"require_api_key": False,
"results": "JSON (site requires js to get images)",
}
language_support = True
# engine dependent config
categories = []
+1
@@ -26,6 +26,7 @@ about = {
"require_api_key": False,
"results": "JSON",
}
language_support = True
# engine dependent config
categories = ["weather"]
+1 -1
@@ -14,8 +14,8 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": 'HTML',
"language": 'de',
}
language = "de"
categories = ['dictionaries']
paging = True
+1 -1
@@ -55,7 +55,7 @@ about = {
'official_api_documentation': 'https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html',
'use_official_api': True,
'require_api_key': False,
'format': 'JSON',
"results": "JSON",
}
base_url = 'http://localhost:9200'
+118
@@ -0,0 +1,118 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""FindFiles.net_ is a Germany-based file search engine.
FindFiles.net_ is a specialized file search engine designed to help you search
files online with precision. Unlike traditional search engines that mainly index
web pages, FindFiles focuses on finding real files on the internet - including
PDFs, documents, archives, videos, datasets, and more.
.. _FindFiles.net: https://findfiles.net
"""
from os.path import basename
from urllib.parse import urlencode
import typing as t
from lxml import html
from searx.result_types import EngineResults
from searx.utils import extract_text, eval_xpath, eval_xpath_list
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://findfiles.net",
"wikidata_id": None,
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
}
base_url = "https://findfiles.net"
categories = ["files"]
paging = True
safesearch = True
safesearch_map = {
0: "contentguard.off",
1: "contentguard.moderate",
2: "contentguard.strict",
}
FindFilesCategory = t.Literal[
"all",
"document",
"text",
"image",
"audio",
"video",
]
FINDFILES_CATEGORIES = t.get_args(FindFilesCategory)
findfiles_categ: FindFilesCategory = "all"
"""Category to search in."""
def setup(_: dict[str, t.Any]) -> bool:
if findfiles_categ not in FINDFILES_CATEGORIES:
raise ValueError("invalid category: %s" % findfiles_categ)
return True
def request(query: str, params: "OnlineParams") -> None:
args = {
"query": query,
"contentguard": safesearch_map[params["safesearch"]],
"page": params["pageno"],
}
# the language in the path doesn't change anything about the results, it
# only changes the UI
params["url"] = f"{base_url}/en/serp/{findfiles_categ}/?{urlencode(args)}"
def response(resp: "SXNG_Response") -> EngineResults:
res = EngineResults()
dom = html.fromstring(resp.text)
if findfiles_categ == "image":
for result in eval_xpath_list(
dom, "//div[contains(@class, 'image-mosaic')]/div[contains(@class, 'image-item')]"
):
res.add(
res.types.Image(
url=extract_text(eval_xpath(result, ".//div[contains(@class, 'caption')]/a/@href")) or "",
title=extract_text(eval_xpath(result, ".//div[contains(@class, 'caption')]/a")) or "",
thumbnail_src=extract_text(eval_xpath(result, ".//img/@src")) or "",
)
)
elif findfiles_categ == "video":
for result in eval_xpath_list(
dom, "//div[contains(@class, 'video-mosaic')]/div[contains(@class, 'video-item')]"
):
video_src = extract_text(eval_xpath(result, ".//video/@src")) or ""
res.add(
res.types.LegacyResult(
template="videos.html",
url=video_src,
title=extract_text(eval_xpath(result, ".//div[contains(@class, 'caption')]/span")) or "",
iframe_src=video_src or "",
)
)
else:
for result in eval_xpath_list(dom, "//ol/li[contains(@class, 'result-item')]/article"):
filename = basename(extract_text(eval_xpath(result, ".//h3")) or "")
res.add(
res.types.File(
url=extract_text(eval_xpath(result, ".//h3/a/@href")) or "",
title=filename,
content=" ".join(extract_text(el) or "" for el in eval_xpath_list(result, "./div/span")),
filename=filename,
size=extract_text(eval_xpath(result, "(.//span[@id])[1]")) or "",
embedded=extract_text(eval_xpath(result, ".//audio/@src")) or "",
)
)
return res
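Review note on the FindFiles engine above: the combination of a `Literal` type for the category, a runtime check via `t.get_args`, and the safesearch mapping can be sketched as a standalone helper. `build_url` is a hypothetical name introduced here; the engine splits this logic between `setup()` and `request()`.

```python
import typing as t
from urllib.parse import urlencode

FindFilesCategory = t.Literal["all", "document", "text", "image", "audio", "video"]
# get_args() turns the Literal into a plain tuple usable for runtime checks.
FINDFILES_CATEGORIES = t.get_args(FindFilesCategory)

SAFESEARCH_MAP = {
    0: "contentguard.off",
    1: "contentguard.moderate",
    2: "contentguard.strict",
}


def build_url(query: str, category: str, safesearch: int, pageno: int) -> str:
    """Hypothetical helper mirroring setup() and request() of the engine."""
    if category not in FINDFILES_CATEGORIES:
        raise ValueError("invalid category: %s" % category)
    args = {
        "query": query,
        "contentguard": SAFESEARCH_MAP[safesearch],
        "page": pageno,
    }
    # The "en" path segment only selects the UI language, as noted in the engine.
    return f"https://findfiles.net/en/serp/{category}/?{urlencode(args)}"
```

Keeping the allowed values in one `Literal` means the type checker and the runtime check cannot drift apart.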
+1
@@ -63,6 +63,7 @@ def response(resp: "SXNG_Response"):
url=_fix_url(result["slug"]),
thumbnail_src=_fix_url(result["png"]),
img_src=_fix_url(result["png512"]),
img_format="PNG",
author=result["team_name"],
)
)
+1 -1
@@ -27,8 +27,8 @@ about = {
'official_api_documentation': None,
'require_api_key': False,
'results': 'HTML',
'language': 'de',
}
language = "de"
paging = True
categories = ['shopping']
+127
@@ -0,0 +1,127 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Giphy (images)"""
import random
from urllib.parse import urlencode
import re
import typing as t
from lxml import html
from searx.enginelib import EngineCache
from searx.exceptions import SearxEngineAPIException
from searx.network import get
from searx.result_types import EngineResults
from searx.result_types.image import ImageRef
from searx.utils import eval_xpath_list, humanize_bytes
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://giphy.com",
"wikidata_id": "Q17054335",
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
}
base_url = "https://giphy.com"
api_url = "https://api.giphy.com"
categories = ["images"]
paging = True
page_size = 15
GiphyCategs = t.Literal["gifs", "stickers", "clips"]
giphy_categ: GiphyCategs = "gifs"
"""Giphy category to search in."""
CACHE: EngineCache
"""Cache for storing the extracted api key."""
_GIPHY_API_KEY_RE = re.compile(r"[Aa]piKey\s*:\s*\"(\w+)\"")
def setup(engine_settings: dict[str, str]) -> bool:
if giphy_categ not in t.get_args(GiphyCategs):
raise ValueError("invalid category: %s" % giphy_categ)
global CACHE # pylint: disable=global-statement
CACHE = EngineCache(engine_settings["name"])
return True
def _get_api_key() -> str:
"""
Extract the Giphy API key from the JavaScript code. There are different API keys
(e.g. for mobile, desktop, ...), so we just pick a random one of these.
"""
cached = CACHE.get("api_key")
if cached:
return cached
homepage_resp = get(base_url)
homepage_doc = html.fromstring(homepage_resp.text)
for script_src in eval_xpath_list(homepage_doc, "//script[contains(@src, 'layout')]/@src"):
script_resp = get(base_url + script_src)
api_keys = _GIPHY_API_KEY_RE.findall(script_resp.text)
if api_keys:
api_key = random.choice(api_keys)
CACHE.set("api_key", api_key, expire=60 * 60 * 6) # 6 hours
return api_key
raise SearxEngineAPIException("failed to extract api keys")
def request(query: str, params: "OnlineParams") -> None:
args = {
"q": query,
"api_key": _get_api_key(),
"limit": page_size,
"offset": (params["pageno"] - 1) * page_size,
"type": giphy_categ,
}
params["url"] = f"{api_url}/v1/{giphy_categ}/search?{urlencode(args)}"
def response(resp: "SXNG_Response"):
res = EngineResults()
result: dict[str, t.Any]
for result in resp.json()["data"]:
img = result['images']['original']
formats = [
ImageRef(url=img["mp4"], subtype="mp4"), # type: ignore
ImageRef(url=img["webp"], subtype="webp"), # type: ignore
]
thumb = (
result["images"].get("downsized")
or result["images"].get("downsized_medium")
or result["images"].get("downsized_small")
or result["images"].get("downsized_large")
)
res.add(
res.types.Image(
title=result["title"],
content=", ".join(result.get("tags", [])),
url=result["url"],
thumbnail_src=(thumb or {}).get("url") or img["url"],
img_src=img["url"],
resolution=f"{img['width']}x{img['height']}",
img_format="GIF",
formats=formats,
author=result["username"],
filesize=humanize_bytes(int(img["size"])),
source=result.get("source_tld") or "",
)
)
return res
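Review note on `_get_api_key()` above: the key extraction is a plain regular expression over the script text, followed by a random choice. The sketch below runs the same pattern against a made-up script snippet; `SAMPLE_JS`, `extract_api_keys` and `pick_api_key` are illustrative names, and real Giphy layout scripts will look different. Network access and the `EngineCache` are left out on purpose.

```python
import random
import re
import typing as t

# Same pattern as in the engine.
_GIPHY_API_KEY_RE = re.compile(r"[Aa]piKey\s*:\s*\"(\w+)\"")

# Made-up script content, only for demonstrating the pattern.
SAMPLE_JS = 'var cfg={apiKey:"abc123DEF",mobile:{ApiKey : "zzz999"},other:"apiKey"};'


def extract_api_keys(script_text: str) -> t.List[str]:
    """Return every key the pattern finds, in order of appearance."""
    return _GIPHY_API_KEY_RE.findall(script_text)


def pick_api_key(script_text: str) -> t.Optional[str]:
    """Pick one key at random, or None if the script contains none."""
    keys = extract_api_keys(script_text)
    return random.choice(keys) if keys else None
```

The string value `"apiKey"` in the sample is not matched, because the pattern requires a colon after the property name.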
+1
@@ -57,6 +57,7 @@ max_page = 50
.. _Google max 50 pages: https://github.com/searxng/searxng/issues/2982
"""
time_range_support = True
language_support = True
safesearch = True
time_range_dict = {"day": "d", "week": "w", "month": "m", "year": "y"}
+1
@@ -43,6 +43,7 @@ max_page = 50
"""
time_range_support = True
language_support = True
safesearch = True
filter_mapping = {0: 'images', 1: 'active', 2: 'active'}
+1
@@ -66,6 +66,7 @@ about = {
categories = ["news"]
paging = False
time_range_support = False
language_support = True
# Google-News results are always *SafeSearch*. Option 'safesearch' is set to
# False here.
+1 -1
@@ -34,8 +34,8 @@ about = {
"use_official_api": True,
"require_api_key": False,
"results": "JSON",
"language": "it",
}
language = "it"
def request(query, params):
+1 -1
@@ -16,8 +16,8 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": 'HTML',
"language": 'fr',
}
language = "fr"
# engine dependent config
categories = ['videos']
+1 -1
@@ -14,9 +14,9 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
"language": "zh",
}
language = "zh"
paging = True
time_range_support = True
categories = ["videos"]
+88
@@ -0,0 +1,88 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""iseek_ is a search engine by the AI company Vantage Labs LLC,
that focuses on medical and educational applications.
Although it's an AI company, it doesn't include any AI stuff in its results.
.. _iseek : https://www.iseek.ai/
"""
import base64
from hashlib import sha256
import typing as t
from urllib.parse import urlencode
from searx.result_types import EngineResults
if t.TYPE_CHECKING:
from searx.search.processors import OnlineParams
from searx.extended_types import SXNG_Response
about = {
"website": 'https://www.iseek.com',
"wikidata_id": None,
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
}
categories = ["general"]
paging = True
base_url = "https://api.iseek.com"
page_size = 10
def _get_new_token(query: str, pageno: int) -> str:
"""Create a new ``qToken``. This reduced the time for fetching subsequent pages
from 4 seconds to 200ms when testing."""
# The website uses a random value as qToken for the first page. For our use case,
# it's easier if the qToken can be deterministically re-calculated based on the search query,
# so that we get the same result when calling _get_new_token for the second, third, ... page
#
# var qToken = Math.ceil(Math.random() * parseInt("ZZZZ", 36)).toString(36);
# while (qToken.length < 4) qToken = '0' + qToken;
# qToken = qToken + "_" + pageno
query_hash = sha256(query.encode()).digest()
hash_start = base64.b64encode(query_hash).decode()[0:4]
return f"{hash_start}_{pageno}"
def request(query: str, params: "OnlineParams"):
offset = (params["pageno"] - 1) * page_size
# always seems to find 20 results max
if offset >= 20:
params["url"] = None
return
args = {
"q": query,
"key": "core-web",
"num": str(page_size),
"off": offset,
"rSort": "__metasearch_score_d:desc",
# it supports many more fields, but none of them are really relevant
"names": "title_t,content_txt,url_s",
"qNames": "title_t",
"qToken": _get_new_token(query, params["pageno"]),
}
params["url"] = f"{base_url}/search?{urlencode(args)}"
def response(resp: "SXNG_Response") -> EngineResults:
res = EngineResults()
for group in resp.json()["data"]:
group: dict[str, t.Any]
for result in group["doclist"]["docs"]:
result: dict[str, str]
res.add(
res.types.MainResult(
url=result["url_s"],
title=result["title_t"],
content="".join(result["content_txt"]),
)
)
return res
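Review note on the iseek engine above: the deterministic `qToken` and the 20-result cap can be tried out without any SearXNG imports. The sketch copies the token logic from `_get_new_token`; `page_offset` is a hypothetical helper that restates the early return in `request()`.

```python
import base64
import typing as t
from hashlib import sha256

PAGE_SIZE = 10
MAX_RESULTS = 20  # the engine comment says upstream seems to return 20 results at most


def get_new_token(query: str, pageno: int) -> str:
    """Derive the token prefix from the query so every page reuses it."""
    query_hash = sha256(query.encode()).digest()
    hash_start = base64.b64encode(query_hash).decode()[0:4]
    return f"{hash_start}_{pageno}"


def page_offset(pageno: int) -> t.Optional[int]:
    """Return the result offset, or None when the page lies beyond the cap."""
    offset = (pageno - 1) * PAGE_SIZE
    return None if offset >= MAX_RESULTS else offset
```

Because the prefix depends only on the query, page 2 and later send the same prefix as page 1, which is what the docstring credits for the faster follow-up requests.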
+3 -3
@@ -13,8 +13,8 @@ about = {
"use_official_api": True,
"require_api_key": False,
"results": 'JSON',
"language": 'ja',
}
language = "ja"
categories = ['dictionaries']
paging = False
@@ -110,8 +110,8 @@ def get_infobox(alt_forms, result_url, definitions):
# definitions
infobox_content.append(
'''
<small><a href="https://www.edrdg.org/wiki/index.php/JMdict-EDICT_Dictionary_Project">JMdict</a>
and <a href="https://www.edrdg.org/enamdict/enamdict_doc.html">JMnedict</a>
<small><a href="https://www.edrdg.org/wiki/index.php/JMdict-EDICT_Dictionary_Project">JMdict</a>
and <a href="https://www.edrdg.org/enamdict/enamdict_doc.html">JMnedict</a>
by <a href="https://www.edrdg.org/edrdg/licence.html">EDRDG</a>, CC BY-SA 3.0.</small>
<ul>
'''
+3
@@ -79,6 +79,9 @@ from json import loads
from urllib.parse import urlencode
from searx.utils import to_string, html_to_text
from searx.network import raise_for_httperror
from searx.enginelib import EngineAbout
about = EngineAbout()
search_url = None
"""
+210
@@ -0,0 +1,210 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Luxxle_ is an American search engine focusing on providing "unbiased"
results.
.. _Luxxle: https://luxxle.com
"""
from json import dumps
from urllib.parse import quote_plus, unquote_plus
import typing as t
from lxml import html
from searx.result_types import EngineResults
from searx.network import get
from searx.utils import (
extr,
gen_useragent,
eval_xpath_list,
extract_text,
eval_xpath,
parse_duration_string,
ElementType,
)
if t.TYPE_CHECKING:
from searx.search.processors import OnlineParams
from searx.extended_types import SXNG_Response
about = {
"website": "https://luxxle.com",
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
}
categories = []
safesearch = True
base_url = "https://luxxle.com"
luxxle_categ = "search"
"""Supported categories: "search", "news", "images", "videos"."""
# otherwise all requests get blocked (http2-fingerprinted probably)
enable_http2 = False
safe_search_map = {0: "Off", 1: "Moderate", 2: "Strict"}
def init(_):
if luxxle_categ not in ("search", "images", "videos", "news"):
raise ValueError("invalid luxxle category: %s" % luxxle_categ)
def _obtain_telemetry_data(query: str) -> dict[str, str]:
"""This data is required for sending search queries.
The luxsearch page (for general results) has a JS dict called ``telemetryData``
that contains all the important info, but the others don't, so we don't use it
here. But it's useful to understand which info is needed.
.. code-block:: javascript
var telemetryData = {
errorInformation: errorInformation,
query: "youapps club",
ip: "10.10.10.10",
timeOf: "1781119224",
authorization: "db889e0ae67d3c320858ad97f51cc4f0a4d8e1913c4f5ebe5d2eafef606521dd",
};
This data is only valid for very short times
"""
resp = get(
f"{base_url}/lux{luxxle_categ}?q={quote_plus(query)}", headers={"User-Agent": gen_useragent(), "Sec-GPC": "1"}
)
def extr_js_variable(name: str) -> str:
val = extr(resp.text, f"var {name} = \"", "\";")
if not val:
val = extr(resp.text, f"var {name} = '", "';")
return val
return {
"ip": extr_js_variable("ip"),
"timeOf": extr_js_variable("timeOf"),
"authorization": extr_js_variable("authorization"),
"preferencesCookie": extr_js_variable("preferencesCookie"),
}
def request(query: str, params: "OnlineParams") -> None:
telemetry_data = _obtain_telemetry_data(query)
market = params["searxng_locale"]
if market == "all":
market = "en-US"
params["url"] = f"{base_url}/load_{luxxle_categ}.php"
search_data = {
**telemetry_data,
"query": query,
"market": market,
"safeSearch": safe_search_map[params["safesearch"]],
"freshness": "",
"language": "english", # UI language
}
if luxxle_categ == "images":
# for some reason this is sent as form data
params["data"] = {"searchData": dumps(search_data)}
else:
params["json"] = {"searchData": search_data}
params["method"] = "POST"
def _extract_url_from_redirect(url: str):
# urls usually look like "/redirect?url=<url>"
query_start_idx = url.find("?url=")
if query_start_idx < 0:
return url
url_start_idx = query_start_idx + len("?url=")
return unquote_plus(url[url_start_idx:])
def _general_results(doc: ElementType, res: EngineResults):
for result in eval_xpath_list(doc, "//div[@id='mainResults']/div[contains(@class, 'resultsContainer')]"):
res.add(
res.types.MainResult(
url=_extract_url_from_redirect(
extract_text(eval_xpath(result, "./div[contains(@class, 'urlAddressLink')]/a/@href")) or ""
),
title=extract_text(eval_xpath(result, "./div[contains(@class, 'urlname')]")) or "",
content=extract_text(eval_xpath(result, "./div[contains(@class, 'urlSnippet')]")) or "",
)
)
def _news_results(doc: ElementType, res: EngineResults):
for result in eval_xpath_list(
doc, "//div[contains(@class, 'newsResults')]/div[contains(@class, 'mediaResultNewsPage')]"
):
res.add(
res.types.MainResult(
url=_extract_url_from_redirect(
extract_text(eval_xpath(result, ".//div[contains(@class, 'mediaResultNewsPageTitle')]/a/@href"))
or ""
),
title=extract_text(eval_xpath(result, ".//div[contains(@class, 'mediaResultNewsPageTitle')]/a")) or "",
content=extract_text(eval_xpath(result, ".//div[contains(@class, 'mediaResultNewsPageDescription')]"))
or "",
thumbnail=extract_text(eval_xpath(result, ".//div[contains(@class, 'mediaResultThumbnail')]//img/@src"))
or "",
)
)
def _video_results(doc: ElementType, res: EngineResults):
for result in eval_xpath_list(doc, "//div[@id='mainResults']/div[contains(@class, 'mediaResult')]"):
res.add(
res.types.MainResult(
template="videos.html",
url=extract_text(eval_xpath(result, "./@data-url")) or "",
title=extract_text(eval_xpath(result, ".//div[contains(@class, 'mediaResultTitleVideo')]/a")) or "",
content=extract_text(eval_xpath(result, ".//div[contains(@class, 'mediaResultDescription')]")) or "",
thumbnail=extract_text(eval_xpath(result, ".//img[contains(@class, 'videoThumbnail')]/@src")) or "",
author=extract_text(eval_xpath(result, ".//div[contains(@class, 'videoCreator')]")) or "",
length=parse_duration_string(
extract_text(eval_xpath(result, ".//span[contains(@class, 'mediaResultDuration')]")) or ""
),
)
)
def _image_results(doc: ElementType, res: EngineResults):
for result in eval_xpath_list(doc, "//div[contains(@class, 'imageResultsWrapper')]/div"):
res.add(
res.types.Image(
url=_extract_url_from_redirect(
extract_text(eval_xpath(result, ".//a[contains(@class, 'imageResultSource')]/@href")) or ""
),
title=extract_text(eval_xpath(result, ".//a[contains(@class, 'imageResultTitle')]")) or "",
source=extract_text(eval_xpath(result, ".//div[contains(@class, 'imageResultSource')]")) or "",
thumbnail_src=extract_text(eval_xpath(result, "./@data-thumbnail-src")) or "",
img_src=extract_text(eval_xpath(result, "./@data-image-src")) or "",
)
)
def response(resp: "SXNG_Response") -> EngineResults:
doc = html.fromstring(resp.text)
res = EngineResults()
match luxxle_categ:
case "search":
_general_results(doc, res)
case "images":
_image_results(doc, res)
case "videos":
_video_results(doc, res)
case "news":
_news_results(doc, res)
case _:
raise ValueError("unsupported category: %s" % luxxle_categ)
return res
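Review note on the Luxxle engine above: two small string helpers carry most of the scraping logic, namely reading `var name = "...";` assignments from the page and unwrapping `/redirect?url=` links. The sketch below reproduces both against made-up input. The local `extr` is a simplified stand-in written for this example, not the implementation from `searx.utils`, and `SAMPLE_PAGE` is invented.

```python
from urllib.parse import unquote_plus


def extr(txt: str, begin: str, end: str, default: str = "") -> str:
    """Simplified stand-in: return the text between two markers."""
    try:
        first = txt.index(begin) + len(begin)
        return txt[first : txt.index(end, first)]
    except ValueError:
        return default


def extr_js_variable(page: str, name: str) -> str:
    """Read a JS string variable, trying double quotes, then single quotes."""
    val = extr(page, f'var {name} = "', '";')
    if not val:
        val = extr(page, f"var {name} = '", "';")
    return val


def extract_url_from_redirect(url: str) -> str:
    """Unwrap "/redirect?url=<url>"; other URLs pass through unchanged."""
    query_start_idx = url.find("?url=")
    if query_start_idx < 0:
        return url
    return unquote_plus(url[query_start_idx + len("?url=") :])


# Made-up page snippet, only for demonstration.
SAMPLE_PAGE = 'var ip = "10.10.10.10";\n' + "var timeOf = '1781119224';\n"
```

A variable that is missing from the page yields an empty string rather than an exception, so the request would be sent with empty fields.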
+1 -1
@@ -44,7 +44,7 @@ about = {
base_url = "https://api2.marginalia-search.com"
safesearch = True
categories = ["general"]
categories = ["general", "blogs"]
paging = True
results_per_page = 20
api_key = None
+1 -1
@@ -11,9 +11,9 @@ about = {
"use_official_api": True,
"require_api_key": False,
"results": 'JSON',
"language": "de",
}
language = "de"
categories = ['videos']
paging = True
time_range_support = False
+1
@@ -20,6 +20,7 @@ about = {
}
paging = True # paging is only supported for general search
safesearch = True
language_support = True
time_range_support = True # time range search is supported for general and news
max_page = 10
+2 -1
@@ -35,8 +35,9 @@ about = {
'use_official_api': False,
'require_api_key': False,
'results': 'JSON',
'language': 'de',
}
language = "de"
paging = True
categories = ["movies"]
+1 -1
@@ -26,8 +26,8 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
"language": "ko",
}
language = "ko"
categories = []
paging = True
+1 -1
@@ -13,8 +13,8 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
"language": "ja",
}
language = "ja"
categories = ["videos"]
paging = True
+1
@@ -26,6 +26,7 @@ about = {
# Engine configuration
paging = True
time_range_support = True
language_support = True
results_per_page = 20
categories = ["videos"]
+1
@@ -25,6 +25,7 @@ about = {
"require_api_key": False,
"results": "JSON",
}
language_support = True
# engine dependent config
categories = ["videos"]
+2 -2
@@ -28,7 +28,7 @@ search_string = 'api/?{query}&limit={limit}'
result_base_url = 'https://openstreetmap.org/{osm_type}/{osm_id}'
# list of supported languages
supported_languages = ['de', 'en', 'fr', 'it']
photon_supported_languages = ["de", "en", "fr", "it"]
# do search-request
@@ -37,7 +37,7 @@ def request(query, params):
if params['language'] != 'all':
language = params['language'].split('_')[0]
if language in supported_languages:
if language in photon_supported_languages:
params['url'] = params['url'] + "&lang=" + language
# using SearXNG User-Agent
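Review note on the Photon hunk above: the renamed list is used to decide whether a `lang` parameter is appended. A minimal sketch of that decision, assuming the locale arrives in the `de_DE` form the hunk splits on; `add_lang` and the example URL are hypothetical.

```python
PHOTON_SUPPORTED_LANGUAGES = ["de", "en", "fr", "it"]


def add_lang(url: str, language: str) -> str:
    """Append "&lang=<code>" only for languages the upstream API supports."""
    if language != "all":
        # "de_DE" -> "de"; a bare "de" is left as is.
        lang = language.split("_")[0]
        if lang in PHOTON_SUPPORTED_LANGUAGES:
            return url + "&lang=" + lang
    return url
```

Unsupported languages fall through silently, so the upstream default language applies.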
+62
@@ -0,0 +1,62 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Podchaser (podcasts)"""
import typing as t
from datetime import datetime
from urllib.parse import urlencode
from searx.result_types import EngineResults
if t.TYPE_CHECKING:
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://www.podchaser.com",
"official_api_documentation": "https://www.podchaser.com/api",
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
}
categories = []
paging = True
base_url = "https://api.podchaser.com"
page_size = 25
def request(query: str, params: "OnlineParams") -> None:
args = {
"filters[term]": query,
"limit": page_size,
"offset": (params["pageno"] - 1) * page_size,
"sort_direction": "desc",
"sort_order": "SORT_ORDER_RELEVANCE",
}
params["url"] = f"{base_url}/podcasts?{urlencode(args)}"
params["headers"]["Accept"] = "application/prs.podchaser.v2+json"
def response(resp: "SXNG_Response"):
res = EngineResults()
json_results: list[dict[str, str]] = resp.json()["entities"] # pyright: ignore[reportAny]
for result in json_results:
metadata = [f"{result['number_of_episodes']} episodes"]
if result["categories"]:
metadata.append(", ".join(c["text"] for c in result["categories"])) # pyright: ignore[reportArgumentType]
res.add(
res.types.MainResult(
url=result["feed_url"],
title=result["title"],
content=result["description"],
thumbnail=result["image_url"],
publishedDate=datetime.strptime(result["created_at"], "%Y-%m-%d %H:%M:%S"),
metadata=" | ".join(metadata),
)
)
return res
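Review note on the Podchaser engine above: the paging arithmetic and the timestamp format are easy to check in isolation. `build_query` and `parse_created_at` are hypothetical helper names; the parameter names and the `strptime` format are taken from the engine code.

```python
from datetime import datetime
from urllib.parse import urlencode

PAGE_SIZE = 25


def build_query(query: str, pageno: int) -> str:
    """Build the query string; the offset counts results, not pages."""
    args = {
        "filters[term]": query,
        "limit": PAGE_SIZE,
        "offset": (pageno - 1) * PAGE_SIZE,
        "sort_direction": "desc",
        "sort_order": "SORT_ORDER_RELEVANCE",
    }
    # urlencode percent-encodes the brackets in "filters[term]".
    return urlencode(args)


def parse_created_at(value: str) -> datetime:
    """Parse timestamps in the "YYYY-MM-DD HH:MM:SS" form the engine expects."""
    return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
```

A `created_at` value in any other format would raise `ValueError`, which the engine does not catch.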
+1 -1
@@ -77,7 +77,7 @@ from searx.utils import gen_useragent, html_to_text, parse_duration_string
about = {
"website": "https://presearch.io",
"wikidiata_id": "Q7240905",
"wikidata_id": "Q7240905",
"official_api_documentation": "https://docs.presearch.io/nodes/api",
"use_official_api": False,
"require_api_key": False,
+217
@@ -0,0 +1,217 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Privacywall_ claims to be a "privacy-friendly" search engine,
but according to a `Privacyguides discussion`_ it's sharing private
user information with Microsoft and Amazon.
.. _Privacywall : https://www.privacywall.org
.. _`Privacyguides discussion` : https://discuss.privacyguides.net/t/how-is-privacy-wall-search-engine/29486
"""
import typing as t
from urllib.parse import urlencode, unquote_plus
from lxml import html
import babel
from searx.enginelib.traits import EngineTraits
from searx.utils import eval_xpath_list, eval_xpath, extract_text, get_embeded_stream_url, extr
from searx.locales import region_tag
from searx.result_types import EngineResults
if t.TYPE_CHECKING:
from lxml.etree import ElementBase
from searx.extended_types import SXNG_Response
from searx.search.processors import OnlineParams
about = {
"website": "https://privacywall.org",
"wikidata_id": None,
"official_api_documentation": None,
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
}
paging = True
safesearch = True
time_range_support = True
base_url = "https://www.privacywall.org"
privacywall_category = "general"
"""Supported categories are ``general``, ``videos`` and ``images``."""
# corresponds to the "k" query param
safesearch_map = {0: "off", 1: "on", 2: "on"}
# page number sent for videos (is independent of the query) - certainly there's
# a pattern in this, but for our use case it's enough to just support the first
# 10 pages by hardcoding the page "numbers"
video_page_map = {
2: "CAoQAA",
3: "CBQQAA",
4: "CB4QAA",
5: "CCgQAA",
6: "CDIQAA",
7: "CDwQAA",
8: "CEYQAA",
9: "CFAQAA",
10: "CFoQAA",
}
def init(_):
if privacywall_category not in ("general", "images", "videos"):
raise ValueError("invalid category: %s" % privacywall_category)
def request(query: str, params: "OnlineParams") -> None:
if params["pageno"] > 10:
params["url"] = None
return
args = {"q": query, "safesearch": safesearch_map[params["safesearch"]]}
if params["searxng_locale"] != "all":
args["cc"] = traits.get_region(params["searxng_locale"]) or "US"
if params["time_range"]:
# time range uses the same "day", "week", "month", "year" naming scheme as SearXNG
args["time"] = params["time_range"]
if params["pageno"] > 1:
if privacywall_category == "images":
args["page"] = str(params["pageno"])
elif privacywall_category == "videos":
args["page"] = video_page_map[params["pageno"]]
else:
raise ValueError("general engine does not support pagination")
if privacywall_category == "general":
params["url"] = f"{base_url}/search/secure/?{urlencode(args)}"
else:
params["url"] = f"{base_url}/{privacywall_category}/?{urlencode(args)}"
def _general_results(doc: "ElementBase") -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(doc, "//div[@id='pw-results-main']/div[contains(@class, 'result-card')]"):
(
res.add(
res.types.MainResult(
url=extract_text(eval_xpath(result, ".//a[contains(@class, 'result-url-anchor')]/@href")) or "",
title=extract_text(eval_xpath(result, ".//div[contains(@class, 'result_title')]")) or "",
content=extract_text(eval_xpath(result, ".//div[contains(@class, 'result-description')]")) or "",
),
)
)
return res
def _extract_thumbnail_url(url: str) -> str:
"""
Get the URL from strings like "/videos/video.php?id=<urlencoded-urlhere>".
"""
url_start = url.find("?id=") + len("?id=")
thumbnail = unquote_plus(url[url_start:])
return thumbnail
def _image_results(doc: "ElementBase") -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(doc, "//div[@id='container']/div[contains(@class, 'imgcontainer')]"):
(
res.add(
res.types.Image(
url=extract_text(eval_xpath(result, "./a/@href")) or "",
content=extract_text(eval_xpath(result, "./a/@alt")) or "",
thumbnail_src=_extract_thumbnail_url(extract_text(eval_xpath(result, ".//img/@src")) or ""),
source=extract_text(eval_xpath(result, ".//div[contains(@class, 'image-source-badge')]")) or "",
),
)
)
return res
def _video_results(doc: "ElementBase") -> EngineResults:
res = EngineResults()
for result in eval_xpath_list(
doc, "//div[contains(@class, 'video-container')]/div[contains(@class, 'video-card')]"
):
url = extract_text(eval_xpath(result, "./a/@href")) or ""
if not url:
continue
thumbnail = None
# looks like <div style="background-image:url(/videos/video.php?id=<urlencoded-urlhere>);position:relative">
thumbnail_style = extract_text(eval_xpath(result, ".//div[contains(@class, 'video-img')]/@style"))
if thumbnail_style:
thumbnail = _extract_thumbnail_url(extr(thumbnail_style, ":url(", ")"))
res.add(
res.types.LegacyResult(
template="videos.html",
url=url,
title=extract_text(eval_xpath(result, ".//h2[contains(@class, 'video-card-title')]")) or "",
content=extract_text(eval_xpath(result, ".//p")) or "",
thumbnail=thumbnail or "",
iframe_src=get_embeded_stream_url(url) or "",
)
)
return res
def response(resp: "SXNG_Response") -> EngineResults:
doc = html.fromstring(resp.text)
match privacywall_category:
case "general":
return _general_results(doc)
case "images":
return _image_results(doc)
case "videos":
return _video_results(doc)
case _:
raise ValueError("invalid category: %s" % privacywall_category)
def fetch_traits(engine_traits: EngineTraits) -> None:
"""Fetch regions from Privacywall."""
# pylint: disable=import-outside-toplevel
from searx.network import get # see https://github.com/searxng/searxng/issues/762
from searx.utils import gen_useragent
headers = {
"User-Agent": gen_useragent(),
}
resp = get(base_url, headers=headers)
if not resp.ok:
raise RuntimeError("Response from Privacywall is not OK.")
dom = html.fromstring(resp.text)
# <div class="dropdown-option" onclick="changeMenuLanguage(&quot;CZ&quot;)"></div>
for onclick_listener in eval_xpath(
dom, "//div[contains(@class, 'lang-menu')]//div[contains(@class, 'dropdown-option')]/@onclick"
):
# this is either a normal lang-country tag (e.g. cs-cz) or only a country code (e.g. de, at, ...)
country_tag = extr(onclick_listener, "(\"", "\")")
# the locale tag is only a country tag, so we get the languages from the list of official languages
# of the country
lang_tag: str
for lang_tag in babel.languages.get_official_languages(country_tag, de_facto=True): # pyright: ignore
try:
sxng_tag = region_tag(babel.Locale.parse(f"{lang_tag}_{country_tag.upper()}"))
except babel.UnknownLocaleError:
# silently ignore unknown languages
continue
conflict = engine_traits.regions.get(sxng_tag)
if conflict:
if conflict != sxng_tag:
print("CONFLICT: babel %s --> %s" % (sxng_tag, conflict))
continue
engine_traits.regions[sxng_tag] = country_tag
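Review note on `video_page_map` in the Privacywall engine above: the nine hardcoded values are consistent with an unpadded base64 encoding of the four bytes `08 <offset> 10 00`, where the offset is `(pageno - 1) * 10`. This is an observation about the listed values only. Whether upstream actually builds its tokens this way, and what it expects beyond page 10, is an assumption that would need testing against the site. `video_page_token` is a hypothetical helper.

```python
import base64

# Values copied from the engine.
VIDEO_PAGE_MAP = {
    2: "CAoQAA",
    3: "CBQQAA",
    4: "CB4QAA",
    5: "CCgQAA",
    6: "CDIQAA",
    7: "CDwQAA",
    8: "CEYQAA",
    9: "CFAQAA",
    10: "CFoQAA",
}


def video_page_token(pageno: int) -> str:
    """Reproduce the hardcoded tokens from the page number (pages 2 and up)."""
    offset = (pageno - 1) * 10
    # Offsets of 128 or more no longer fit in the single byte used here.
    if not 0 < offset < 128:
        raise ValueError("unsupported page number: %s" % pageno)
    raw = bytes([0x08, offset, 0x10, 0x00])
    return base64.b64encode(raw).decode().rstrip("=")
```

If the pattern holds upstream, the map could be replaced by this function; until that is confirmed, the hardcoded map is the safer choice.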
+1 -1
@@ -16,8 +16,8 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
"language": "zh",
}
language = "zh"
# Engine Configuration
categories = []
+1
@@ -26,6 +26,7 @@ about = {
"require_api_key": False,
"results": "JSON",
}
language_support = True
paging = True
categories = ["music", "radio"]
+120
@@ -0,0 +1,120 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Resulthunter_ is an American search engine with results from Brave.

.. _Resulthunter : https://resulthunter.com
"""

import typing as t
from urllib.parse import urlencode

from lxml import html

from searx import locales
from searx.result_types import EngineResults
from searx.utils import eval_xpath_list, eval_xpath, extract_text

# as it uses brave internally, it has the same locales and timerange/safesearch types
from searx.engines.brave import safesearch_map, time_range_map, fetch_traits  # pylint: disable=unused-import

if t.TYPE_CHECKING:
    from lxml.etree import ElementBase
    from searx.extended_types import SXNG_Response
    from searx.search.processors import OnlineParams
    from searx.enginelib.traits import EngineTraits

    traits: EngineTraits

about = {
    "website": "https://resulthunter.com",
    "wikidata_id": None,
    "official_api_documentation": None,
    "use_official_api": False,
    "require_api_key": False,
    "results": "HTML",
}

paging = True
safesearch = True
time_range_support = True

base_url = "https://resulthunter.com"

resulthunter_categ = "web"
"""Supported categories are ``web`` and ``images``."""


def init(_):
    if resulthunter_categ not in ("web", "images"):
        raise ValueError("invalid category: %s" % resulthunter_categ)


def request(query: str, params: "OnlineParams") -> None:
    args = {
        "q": query,
        "search_type": resulthunter_categ,
        "offset": params["pageno"] - 1,
    }

    # uses Brave's engine traits
    ui_lang = locales.get_engine_locale(params["searxng_locale"], traits.custom["ui_lang"], "all")
    if ui_lang and ui_lang != "all":
        args["search_lang"] = ui_lang.split("-")[0]

    engine_region = traits.get_region(params["searxng_locale"], "all")
    if engine_region and engine_region != "all":
        args["country"] = engine_region

    if params["time_range"]:
        args["freshness"] = time_range_map[params["time_range"]]

    params["cookies"]["safesearch"] = safesearch_map[params["safesearch"]]
    params["url"] = f"{base_url}/search?{urlencode(args)}"


def _general_results(doc: "ElementBase") -> EngineResults:
    res = EngineResults()

    for result in eval_xpath_list(
        doc, "//div[contains(@class, 'organic-results-container')]/div/div[contains(@class, 'group')]"
    ):
        url = extract_text(eval_xpath(result, ".//a/@href"))
        if not url:
            continue

        (
            res.add(
                res.types.MainResult(
                    url=url,
                    title=extract_text(eval_xpath(result, ".//a/h3")) or "",
                    content=extract_text(eval_xpath(result, ".//p")) or "",
                ),
            )
        )

    return res


def _image_results(doc: "ElementBase") -> EngineResults:
    res = EngineResults()

    for result in eval_xpath_list(
        doc, "//div[contains(@class, 'organic-results-container')]//a[contains(@class, 'group')]"
    ):
        (
            res.add(
                res.types.Image(
                    url=extract_text(eval_xpath(result, "./@href")) or "",
                    title=extract_text(eval_xpath(result, "./img/@alt")) or "",
                    thumbnail_src=extract_text(eval_xpath(result, "./img/@src")) or "",
                ),
            )
        )

    return res


def response(resp: "SXNG_Response") -> EngineResults:
    doc = html.fromstring(resp.text)

    match resulthunter_categ:
        case "web":
            return _general_results(doc)
        case "images":
            return _image_results(doc)
        case _:
            raise ValueError("invalid resulthunter category: %s" % resulthunter_categ)
+98
@@ -0,0 +1,98 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Search engines by System1 (general).

System1 is an advertising company, and provides all its search engines as a
subdomain of ``s1search.co``. As a result, it has more than 1000 subdomains, of
which some work, and some don't.

Some of the engines get their results from Google, others get them from Yahoo.
"""

import typing as t
from urllib.parse import urlencode, urlparse, parse_qs

from lxml import html

from searx.result_types import EngineResults
from searx.enginelib import EngineCache
from searx.utils import eval_xpath_list, eval_xpath, extract_text

if t.TYPE_CHECKING:
    from searx.search.processors import OnlineParams
    from searx.extended_types import SXNG_Response

about = {
    "website": "https://s1search.co",
    "official_api_documentation": None,
    "use_official_api": False,
    "require_api_key": False,
    "results": "HTML",
}

base_url = ""  # alternatively: search.gmx.net
categories = ["general"]
paging = True

CACHE: EngineCache
"""Cache to store verification tokens for pagination."""


def init(_):
    if not base_url:
        raise ValueError("base_url must be set")


def setup(engine_settings: dict[str, t.Any]) -> bool:
    global CACHE  # pylint: disable=global-statement
    CACHE = EngineCache(engine_settings["name"])
    return True


def _cache_key(query: str, pageno: int) -> str:
    return f"{query}|{pageno}"


def request(query: str, params: "OnlineParams"):
    args = {"q": query, "page": params["pageno"]}

    if params["pageno"] > 1:
        sc = CACHE.get(_cache_key(query, params["pageno"]))
        # sc is required for pagination to avoid rate-limits
        if not sc:
            params["url"] = None
            return
        args["sc"] = sc

    params["url"] = f"{base_url}/serp?{urlencode(args)}"


def response(resp: "SXNG_Response") -> EngineResults:
    res = EngineResults()
    doc = html.fromstring(resp.text)

    for suggestion in eval_xpath_list(doc, "//div[@class='aylf-yahoo-bottom' or @class='aylf-yahoo-sidebar']/div"):
        res.add(res.types.LegacyResult({"suggestion": extract_text(suggestion)}))

    for result in eval_xpath_list(
        doc, "//div[contains(@class, 'web-yahoo') or contains(@class, 'web-google')]/div[contains(@class, '__result')]"
    ):
        res.add(
            res.types.MainResult(
                url=extract_text(eval_xpath(result, ".//a[contains(@class, 'title')]/@href")),
                title=extract_text(eval_xpath(result, ".//a[contains(@class, 'title')]")),
                content=extract_text(eval_xpath(result, ".//span[contains(@class, 'description') or @class='']")),
            )
        )

    # store pagination keys to be able to access next pages
    for page_href in eval_xpath_list(doc, "//a[contains(@class, 'pagination__num')]"):
        # target_url looks like "/serp?q=test&page=2&sc=RVlBPMDPVhWR20"
        target_url = extract_text(eval_xpath(page_href, "./@href"))
        target_url = parse_qs(urlparse(target_url).query)
        pageno = int(target_url["page"][0])
        sc = target_url["sc"][0]
        CACHE.set(_cache_key(resp.search_params["query"], pageno), sc)

    return res
+1 -1
@@ -13,8 +13,8 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": 'JSON',
'language': 'fr',
}
language = "fr"
categories = ['movies']
paging = True
+1
@@ -25,6 +25,7 @@ about = {
"require_api_key": False,
"results": 'JSON',
}
language_support = True
# engine dependent config
categories = ['videos']
+1 -1
@@ -19,8 +19,8 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
"language": "cz",
}
language = "cz"
categories = ['general', 'web']
base_url = 'https://search.seznam.cz/'
+1 -1
@@ -16,8 +16,8 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
"language": "zh",
}
language = "zh"
# Engine Configuration
categories = ["general"]
+1 -1
@@ -11,8 +11,8 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": "JSON",
"language": "zh",
}
language = "zh"
categories = ["videos"]
paging = True
+1 -1
@@ -14,8 +14,8 @@ about = {
"use_official_api": False,
"require_api_key": False,
"results": "HTML",
"language": "zh",
}
language = "zh"
# Engine Configuration
categories = ["news"]
+5 -1
@@ -131,6 +131,7 @@ max_page = 18
"""Tested 18 pages maximum (argument ``page``), to be safe max is set to 20."""
time_range_support = True
language_support = True
safesearch = True
time_range_dict = {"day": "d", "week": "w", "month": "m", "year": "y"}
@@ -382,6 +383,9 @@ def _get_image_result(result) -> dict[str, t.Any] | None:
size_str = "".join(filter(str.isdigit, result["filesize"]))
filesize = humanize_bytes(int(size_str))
img_format = result.get("format").upper()
if img_format == "UNKNOWN":
img_format = ""
return {
"template": "images.html",
"url": url,
@@ -390,7 +394,7 @@ def _get_image_result(result) -> dict[str, t.Any] | None:
"img_src": result.get("rawImageUrl"),
"thumbnail_src": thumbnailUrl,
"resolution": resolution,
"img_format": result.get("format"),
"img_format": img_format,
"filesize": filesize,
}
+107
@@ -0,0 +1,107 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Startpagina is a Dutch search engine by `Kompas`_. It takes all its
results from Google.

.. _Kompas: https://www.kompaspublishing.nl/
"""

import typing as t
from urllib.parse import urlencode

from dateutil import parser

from searx.utils import format_duration
from searx.result_types import EngineResults

if t.TYPE_CHECKING:
    from searx.extended_types import SXNG_Response
    from searx.search.processors import OnlineParams

about = {
    "website": "https://startpagina.nl",
    "wikidata_id": None,
    "official_api_documentation": None,
    "use_official_api": False,
    "require_api_key": False,
    "results": "JSON",
}

# ISO 639-1 code for Dutch ("ne" would be Nepali)
language = "nl"
paging = True
safesearch = True
categories = ["general"]

startpagina_categ = "web"
"""Category to search in. Can be either "web", "images", "videos" or "news"."""

page_size = 10
api_url = "https://search.kompas.services"


def init(_):
    if startpagina_categ not in ("web", "images", "videos", "news"):
        raise ValueError("invalid search type: %s" % startpagina_categ)


def request(query: str, params: "OnlineParams") -> None:
    args = {"q": query, "page_size": page_size, "page": params["pageno"]}
    params["url"] = f"{api_url}/api/v2/search/{startpagina_categ}/?{urlencode(args)}"


def response(resp: "SXNG_Response"):
    res = EngineResults()
    json_resp = resp.json()

    for result in json_resp["results"]:
        if startpagina_categ == "web":
            res.add(
                res.types.MainResult(
                    url=result["original_url"],
                    title=result["title"],
                    content=result["description"],
                )
            )

        elif startpagina_categ == "news":
            publishedDate = None
            try:
                publishedDate = parser.parse(result["date"])
            except parser.ParserError:
                pass

            res.add(
                res.types.MainResult(
                    url=result["original_url"],
                    title=result["title"],
                    content=result["description"],
                    thumbnail=result["image"]["thumbnail_url"],
                    publishedDate=publishedDate,
                )
            )

        elif startpagina_categ == "videos":
            res.add(
                res.types.LegacyResult(
                    template="videos.html",
                    url=result["original_url"],
                    title=result["title"],
                    content=result["description"],
                    thumbnail=result["video"]["thumbnail_url"],
                    length=format_duration(result["video"]["duration"]),
                )
            )

        elif startpagina_categ == "images":
            res.add(
                res.types.Image(
                    url=result["original_url"],
                    title=result["title"],
                    content=result["description"],
                    thumbnail_src=result["image"]["thumbnail_url"],
                    resolution=f"{result['image']['width']}x{result['image']['height']}",
                )
            )

    for related in json_resp["related_searches"]:
        res.add(res.types.LegacyResult(suggestion=related["query"]))

    return res
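
To make the request side of the Startpagina engine easier to follow, here is a small standalone sketch of how the API URL is assembled from the category, page size and page number. The function name `build_search_url` and the constant names are illustrative assumptions; only the URL layout is taken from the engine above.

```python
from urllib.parse import urlencode

API_URL = "https://search.kompas.services"
CATEGORIES = ("web", "images", "videos", "news")


def build_search_url(query: str, pageno: int, categ: str = "web", page_size: int = 10) -> str:
    """Assemble the search URL the same way the engine's ``request()`` does."""
    if categ not in CATEGORIES:
        raise ValueError(f"invalid search type: {categ}")
    args = {"q": query, "page_size": page_size, "page": pageno}
    # urlencode() quotes the query, e.g. spaces become "+"
    return f"{API_URL}/api/v2/search/{categ}/?{urlencode(args)}"


url = build_search_url("fiets kopen", 2, "news")
```

Validating the category before building the URL mirrors the check the engine performs once in `init()`.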
+2 -1
@@ -27,8 +27,9 @@ about = {
'use_official_api': True,
'require_api_key': False,
'results': 'JSON',
'language': 'de',
}
language = "de"
categories = ['general', 'news']
paging = True
+148
@@ -0,0 +1,148 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""T-Online_ is a German news portal, which is powered by Ströer, a German
advertising company, not by Deutsche Telekom (contrary to its name).

It gets its web results from Google, image results from Flickr and video
results from YouTube.

.. _T-Online: https://www.t-online.de/
"""

import typing as t
from urllib.parse import urlencode

from lxml import html

from searx.utils import eval_xpath_list, eval_xpath, extract_text, get_embeded_stream_url, ElementType
from searx.result_types import EngineResults
from searx.enginelib import EngineAbout

if t.TYPE_CHECKING:
    from searx.extended_types import SXNG_Response
    from searx.search.processors import OnlineParams

about = EngineAbout(
    website="https://www.t-online.de",
    wikidata_id="Q590940",
    results="HTML",
)

paging = True
time_range_support = True

base_url = "https://suche.t-online.de"

tonline_categ = "web"
"""Supported categories are ``web``, ``videos``, ``news`` and ``images``."""

time_range_map = {"day": "d", "week": "w", "month": "m", "year": "y"}

# result provider has to be specified during pagination, pagination can alternatively
# use "tonline" to only search for results from t-online news articles
tonline_channel_map = {"images": "flickr", "videos": "yt"}

language = "de"


def init(_):
    if tonline_categ not in ("web", "images", "videos", "news"):
        raise ValueError("invalid category: %s" % tonline_categ)


def request(query: str, params: "OnlineParams") -> None:
    # "mandant", "dia" and "ptl" are not needed, but they might reduce the chance of captchas
    args = {"q": query, "mandant": "toi", "dia": "suche", "ptl": "std"}

    if params["time_range"]:
        args["age"] = time_range_map[params["time_range"]]

    if params["pageno"] > 1 and tonline_categ in tonline_channel_map:
        ch = tonline_channel_map[tonline_categ]
        args["ch"] = ch
        args[f"{ch}_page"] = str(params["pageno"])
    else:
        args["page"] = str(params["pageno"])

    params["url"] = f"{base_url}/{tonline_categ}?{urlencode(args)}"


def _general_results(doc: ElementType, res: EngineResults):
    result: ElementType
    for result in eval_xpath_list(doc, "//div[@id='google_re']/div[contains(@class, 'doc')]"):
        (
            res.add(
                res.types.MainResult(
                    url=extract_text(eval_xpath(result, "./a/@href") or ""),
                    title=extract_text(eval_xpath(result, ".//span[contains(@class, 'tMMReshl')]") or "") or "",
                    content=extract_text(eval_xpath(result, ".//div[contains(@class, 'tMMRest')]") or "") or "",
                ),
            )
        )

    suggestion: ElementType
    for suggestion in eval_xpath_list(doc, "//div[starts-with(@class, 'rsbl')]/a"):
        res.add(res.types.LegacyResult({"suggestion": extract_text(suggestion)}))


def _image_results(doc: ElementType, res: EngineResults):
    result: ElementType
    for result in eval_xpath_list(doc, "//div[@class='doc']"):
        (
            res.add(
                res.types.Image(
                    url=extract_text(eval_xpath(result, "./a/@href") or ""),
                    title=extract_text(eval_xpath(result, ".//div[contains(@class, 'doc_info')]") or "") or "",
                    thumbnail_src=extract_text(eval_xpath(result, ".//img/@src") or "") or "",
                ),
            )
        )


def _news_results(doc: ElementType, res: EngineResults):
    result: ElementType
    title_parts: list[ElementType]
    for result in eval_xpath_list(doc, "//div[@id='portal_re']/div[contains(@class, 'doc')]"):
        title_parts = eval_xpath(result, ".//a[starts-with(@class, 'tMMReshl')]")
        (
            res.add(
                res.types.MainResult(
                    url=extract_text(eval_xpath(result, "(./a/@href)[1]") or ""),
                    title=" - ".join(extract_text(part) or "" for part in title_parts),
                    content=extract_text(eval_xpath(result, ".//div[contains(@class, 'tMMRest')]") or "") or "",
                    thumbnail=extract_text(eval_xpath(result, ".//img[contains(@class, 'desk')]/@src") or "") or "",
                ),
            )
        )


def _video_results(doc: ElementType, res: EngineResults):
    result: ElementType
    for result in eval_xpath_list(doc, "//div[@class='doc']"):
        url: str | None = extract_text(eval_xpath(result, "./a/@href") or "")
        # skip results without a link (a missing href yields an empty string, not None)
        if not url:
            continue

        title_parts: list[ElementType] = eval_xpath(result, ".//a[starts-with(@class, 'tMMReshl')]")
        res.add(
            res.types.LegacyResult(
                template="videos.html",
                url=url,
                title=" - ".join(extract_text(part) or "" for part in title_parts),
                thumbnail=extract_text(eval_xpath(result, ".//img/@src") or "") or "",
                iframe_src=get_embeded_stream_url(url) or "",
            )
        )


def response(resp: "SXNG_Response") -> EngineResults:
    doc = html.fromstring(resp.text)
    res = EngineResults()

    match tonline_categ:
        case "web":
            _general_results(doc, res)
        case "news":
            _news_results(doc, res)
        case "images":
            _image_results(doc, res)
        case "videos":
            _video_results(doc, res)
        case _:
            raise ValueError("invalid category: %s" % tonline_categ)

    return res
+162
@@ -0,0 +1,162 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Tusksearch_ is an American search engine that claims to fight censorship.

Its search results are (at least partially) from Brave.

.. _Tusksearch: https://tusksearch.com/about
"""

from json import loads
import random
import typing as t
from urllib.parse import urlencode

from dateutil import parser

from searx.exceptions import SearxEngineAPIException
from searx.network import get
from searx.utils import html_to_text
from searx.result_types import EngineResults

if t.TYPE_CHECKING:
    from searx.extended_types import SXNG_Response
    from searx.search.processors import OnlineParams

about = {
    "website": "https://tusksearch.com",
    "wikidata_id": None,
    "official_api_documentation": None,
    "use_official_api": False,
    "require_api_key": False,
    "results": "JSON",
}

paging = True
categories = ["general"]

tusk_categ = "web"
"""Category to search in. Can be either "web", "images", "videos" or "news"."""

api_url = "https://api.tusksearch.com"


def init(_):
    if tusk_categ not in ("web", "images", "videos", "news"):
        raise ValueError("invalid search type: %s" % tusk_categ)


def _obtain_x_sid() -> tuple[str, str]:
    """
    The session ID ("sid") is encoded as a byte array in ``embed.js``.
    It is only valid for exactly one request, so we can't cache it.

    The header key is usually called ``x-sid-{UUIDv4}``, and the value is
    usually a plain UUIDv4 (but a different one than in the header key).
    """
    resp = get(f"{api_url}/revcontent/embed.js")
    if not resp.ok:
        raise SearxEngineAPIException("failed to obtain request x-sid token")

    # data is prefixed by 'var x='
    data_array = loads(resp.text[6:])

    def _byte_array_to_ascii(text: list[int]) -> str:
        """
        Converts a byte array (e.g. [83, 101, 97, 114, 88, 78, 71]) to the ASCII
        string representation (e.g. "SearXNG").
        """
        return "".join([chr(x) for x in text])

    x_sid_header = _byte_array_to_ascii(data_array[3])
    x_sid_value = _byte_array_to_ascii(data_array[4])

    return x_sid_header, x_sid_value


def request(query: str, params: "OnlineParams") -> None:
    # images don't support pagination, news and videos only support two pages
    if tusk_categ == "images" and params["pageno"] > 1 or tusk_categ in ("news", "videos") and params["pageno"] > 2:
        params["url"] = None
        return

    args = {
        "q": query,
        "p": params["pageno"],
        "l": "center",  # political direction: "left", "center" or "right"
    }

    if tusk_categ == "images":
        params["url"] = f"{api_url}/Search/Image?{urlencode(args)}"
    else:
        # web response also contains news and videos
        params["url"] = f"{api_url}/Search/Web?{urlencode(args)}"

    x_sid_header, x_sid_value = _obtain_x_sid()
    params["headers"] = {
        x_sid_header: x_sid_value,
        # required - we send a random longitude and latitude instead of the actual user location
        'x-lon': str(random.random() * 90),
        'x-lat': str(random.random() * 90),
    }


def response(resp: "SXNG_Response"):
    res = EngineResults()
    json_resp = resp.json()["results"]

    if tusk_categ == "web":
        for result in (json_resp.get("web") or {}).get("results", []):
            res.add(
                res.types.MainResult(
                    url=result["url"],
                    title=html_to_text(result["title"]),
                    content=html_to_text(result["description"]),
                    thumbnail=(result["thumbnail"] or {}).get("src") or "",
                )
            )

    elif tusk_categ == "news":
        for result in (json_resp.get("news") or {}).get("results", []):
            publishedDate = None
            try:
                publishedDate = parser.parse(result["age"])
            except parser.ParserError:
                pass

            res.add(
                res.types.MainResult(
                    url=result["url"],
                    title=html_to_text(result["title"]),
                    content=html_to_text(result["description"]),
                    thumbnail=result["thumbnail"]["src"],
                    publishedDate=publishedDate,
                )
            )

    elif tusk_categ == "videos":
        for result in (json_resp.get("videos") or {}).get("results", []):
            publishedDate = None
            try:
                publishedDate = parser.parse(result["age"])
            except parser.ParserError:
                pass

            res.add(
                res.types.LegacyResult(
                    template="videos.html",
                    url=result["url"],
                    title=html_to_text(result["title"]),
                    content=html_to_text(result["description"]),
                    thumbnail=result["thumbnail"]["src"],
                    publishedDate=publishedDate,
                    length=result["video"].get("duration"),
                )
            )

    elif tusk_categ == "images":
        for result in json_resp:
            res.add(
                res.types.Image(
                    url=result["url"],
                    title=html_to_text(result["title"]),
                    img_src=result["properties"]["url"],
                    thumbnail_src=result["thumbnail"]["src"],
                )
            )

    return res
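
The token decoding in `_obtain_x_sid()` can be tried without any network access. The sketch below feeds a made-up payload through the same steps: strip the `var x=` prefix, parse the remainder as JSON, and turn the integer arrays at index 3 and 4 into strings. The sample payload, the helper names `parse_embed_js` and `to_codes`, and the assumption that nothing (such as a semicolon) follows the JSON array are all illustrative, not taken from the real `embed.js`.

```python
from json import dumps, loads

PREFIX = "var x="


def byte_array_to_ascii(codes: list[int]) -> str:
    """Convert a list of character codes to the string they spell."""
    return "".join(chr(code) for code in codes)


def parse_embed_js(text: str) -> tuple[str, str]:
    """Return (header name, header value) from an ``embed.js``-like payload."""
    if not text.startswith(PREFIX):
        raise ValueError("unexpected payload: missing 'var x=' prefix")
    data = loads(text[len(PREFIX):])
    return byte_array_to_ascii(data[3]), byte_array_to_ascii(data[4])


def to_codes(text: str) -> list[int]:
    """Inverse of byte_array_to_ascii(), used only to build the sample."""
    return [ord(char) for char in text]


# made-up payload with the same shape the engine expects (assumption)
sample = PREFIX + dumps([[], [], [], to_codes("x-sid-demo-key"), to_codes("demo-value")])
header, value = parse_embed_js(sample)
```

Checking the prefix explicitly, rather than slicing a fixed six characters, makes a changed payload fail loudly instead of producing a confusing JSON error.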
+114
@@ -0,0 +1,114 @@
# SPDX-License-Identifier: AGPL-3.0-or-later
"""Vuhuv_ is a Turkish search engine that also provides English results.

.. _Vuhuv : https://vuhuv.com
"""

import typing as t
from urllib.parse import urlencode

from lxml import html

from searx.result_types import EngineResults
from searx.utils import eval_xpath_list, eval_xpath, extract_text

if t.TYPE_CHECKING:
    from lxml.etree import ElementBase
    from searx.extended_types import SXNG_Response
    from searx.search.processors import OnlineParams

about = {
    "website": "https://vuhuv.com",
    "wikidata_id": None,
    "official_api_documentation": None,
    "use_official_api": False,
    "require_api_key": False,
    "results": "HTML",
}

paging = True

base_url = "https://vuhuv.com"

vuhuv_category = "general"
"""Supported categories are ``general``, ``videos`` and ``images``."""

# corresponds to the "k" query param
category_map = {"general": 1, "images": 2, "videos": 3}


def init(_):
    if vuhuv_category not in category_map:
        raise ValueError("invalid category: %s" % vuhuv_category)


def request(query: str, params: "OnlineParams") -> None:
    # the purpose of "d" and "dh" is unknown, but the website
    # sends them, and without them the results are different
    args = {"k": category_map[vuhuv_category], "p": params["pageno"], "q": query, "d": 1, "dh": 1}

    params["url"] = f"{base_url}/veri2/?{urlencode(args)}"
    params["headers"]["Referer"] = f"{base_url}/"


def _general_results(doc: "ElementBase") -> EngineResults:
    res = EngineResults()

    for result in eval_xpath_list(doc, "//div[contains(@class, 'sonuc')]/div"):
        (
            res.add(
                res.types.MainResult(
                    url=extract_text(eval_xpath(result, "./a/@href")) or "",
                    title=extract_text(eval_xpath(result, "./a/span")) or "",
                    content=extract_text(eval_xpath(result, "./ins")) or "",
                ),
            )
        )

    return res


def _image_results(doc: "ElementBase") -> EngineResults:
    res = EngineResults()

    for result in eval_xpath_list(doc, "//div[contains(@class, 'item gorsel')]"):
        (
            res.add(
                res.types.Image(
                    url=extract_text(eval_xpath(result, "./a/@href")) or "",
                    title=extract_text(eval_xpath(result, "./a/@title")) or "",
                    resolution=extract_text(eval_xpath(result, "div[contains(@class, 'olculeri')]")) or "",
                    thumbnail_src="https:" + str(extract_text(eval_xpath(result, "./@data-kgorsel"))),
                    img_src=extract_text(eval_xpath(result, "./@data-resimurl")) or "",
                ),
            )
        )

    return res


def _video_results(doc: "ElementBase") -> EngineResults:
    res = EngineResults()

    for result in eval_xpath_list(doc, "//div[contains(@class, 'item video')]"):
        (
            res.add(
                res.types.MainResult(
                    template="videos.html",
                    url=extract_text(eval_xpath(result, "./a/@href")) or "",
                    title=extract_text(eval_xpath(result, "./a/@title")) or "",
                    content=extract_text(eval_xpath(result, ".//div[contains(@class, 'abaslik')]")) or "",
                    thumbnail=extract_text(eval_xpath(result, "./@data-kgorsel")) or "",
                    iframe_src=extract_text(eval_xpath(result, "./@data-embedurl")) or "",
                ),
            )
        )

    return res


def response(resp: "SXNG_Response") -> EngineResults:
    doc = html.fromstring(resp.text)

    match vuhuv_category:
        case "general":
            return _general_results(doc)
        case "images":
            return _image_results(doc)
        case "videos":
            return _video_results(doc)
        case _:
            raise ValueError("invalid vuhuv category: %s" % vuhuv_category)
+1
@@ -40,6 +40,7 @@ about = {
"require_api_key": False,
"results": 'JSON',
}
language_support = True
display_type = ["infobox"]
"""A list of display types composed from ``infobox`` and ``list``. The latter
+1
@@ -72,6 +72,7 @@ about = {
"require_api_key": False,
"results": "JSON",
}
language_support = True
display_type = ["infobox"]
"""A list of display types composed from ``infobox`` and ``list``. The latter
+4 -1
@@ -76,6 +76,9 @@ from lxml import html
from searx.utils import extract_text, extract_url, eval_xpath, eval_xpath_list
from searx.network import raise_for_httperror
from searx.result_types import EngineResults
from searx.enginelib import EngineAbout
about = EngineAbout()
search_url = None
"""
@@ -289,7 +292,7 @@ def response(resp) -> EngineResults: # pylint: disable=too-many-branches
if results_xpath:
for result in eval_xpath_list(dom, results_xpath):
url = extract_url(eval_xpath_list(result, url_xpath, min_len=1), search_url)
url = extract_url(eval_xpath(result, url_xpath), search_url)
title = extract_text(eval_xpath_list(result, title_xpath, min_len=1))
content = extract_text(eval_xpath_list(result, content_xpath))
tmp_result = {'url': url, 'title': title, 'content': content}
+1
@@ -22,6 +22,7 @@ about = {
"require_api_key": False,
"results": "JSON",
}
language_support = True
base_url = "https://api.yep.com"
web_base_url = "https://yep.com"
+1
@@ -61,6 +61,7 @@ about: dict[str, t.Any] = {
categories: list[str] = ["files", "books"]
paging: bool = True
language_support = True
base_url: str = "https://zlibrary-global.se"
zlib_year_from: str = ""
+3 -1
@@ -24,6 +24,8 @@ __all__ = [
"Code",
"Paper",
"File",
"Image",
"ImageRef",
]
import typing as t
@@ -35,7 +37,7 @@ from .keyvalue import KeyValue
from .code import Code
from .paper import Paper
from .file import File
from .image import Image
from .image import Image, ImageRef
class ResultList(list[Result | LegacyResult], abc.ABC):
+81 -4
@@ -7,14 +7,52 @@ template.
:members:
:show-inheritance:
.. autoclass:: ImageRef
:members:
"""
# pylint: disable=too-few-public-methods
__all__ = ["Image", "ImageRef"]
__all__ = ["Image"]
import mimetypes
import types
import typing as t
from collections.abc import Callable
import msgspec
from ._base import MainResult, Result, log, LegacyResult
MimeSubType = t.Literal["png", "svg+xml", "jpeg", "bmp", "x-icon", "tiff"]
MIMESUB: dict[MimeSubType, str] = {
"png": "PNG",
"svg+xml": "SVG",
"jpeg": "JPG",
"bmp": "BMP",
"x-icon": "ICO",
"tiff": "TIF",
}
from ._base import MainResult
class ImageRef(msgspec.Struct, kw_only=True):
"""Reference to an (alternative) image format"""
url: str
"""URL of the image reference."""
subtype: MimeSubType
"""Subtype (mimetype) of the image format."""
label: str = ""
"""Label of the reference, default is built from the uppercase of
:py:obj:`Image.ImageRef.subtype`."""
mtype: t.Literal["image"] = "image"
def __post_init__(self):
if not self.label:
self.label = MIMESUB.get(self.subtype, self.subtype.upper())
@t.final
@@ -34,7 +72,7 @@ class Image(MainResult, kw_only=True):
"""The resolution of the image (e.g. ``1920 x 1080`` pixel)"""
img_format: str = ""
"""The format of the image (e.g. ``png``)."""
"""The format of the image :py:obj:`.MainResult.img_src` (e.g. ``png``)."""
source: str = ""
"""Source of the image."""
@@ -42,3 +80,42 @@ class Image(MainResult, kw_only=True):
filesize: str = ""
"""Size of bytes in :py:obj:`human readable <searx.humanize_bytes>` notation
(e.g. ``1MB`` for ``1024*1024`` Bytes filesize)."""
formats: list[ImageRef] = []
"""List of links to alternative image formats."""
def __post_init__(self):
super().__post_init__()
if not self.img_format:
# automatically guess the image format based on the path of the image
mimetype = mimetypes.guess_type(self.img_src)[0]
if mimetype:
subtype = mimetype.split("/")[-1]
if subtype in MIMESUB:
self.img_format = MIMESUB[subtype]
else:
self.img_format = subtype.upper()
def filter_urls(self, filter_func: "Callable[[Result | LegacyResult, str, str], str | bool ]"):
for _ref in self.formats[:]:
_name = f"Image.formats:{_ref.label}"
try:
_url = filter_func(self, _name, _ref.url)
except Exception as exc: # pylint: disable=broad-exception-caught
# pylint: disable=no-member
_tb: types.TracebackType = exc.__traceback__.tb_next.tb_next # type: ignore
_fn = _tb.tb_frame.f_code.co_filename
_lno = _tb.tb_lineno
log.error("filter_urls: [%s] ignore %s from callback %s:%s", _name, repr(exc), _fn, _lno)
continue
if isinstance(_url, str):
log.debug("filter_urls: [%s] URL %s -> %s", _name, _ref.url, _url)
_ref.url = _url
elif not _url:
log.debug("filter_urls: [%s] drop ref %s", _name, _ref)
self.formats.remove(_ref)
return super().filter_urls(filter_func)
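
The format guessing added to `Image.__post_init__` can be exercised on its own with the standard library. The sketch below copies the `MIMESUB` table from the diff; `guess_img_format` is a hypothetical free function standing in for the method body, and collapsing the `if`/`else` into `dict.get()` is a stylistic choice of this sketch, not the code in the commit.

```python
import mimetypes

MIMESUB = {
    "png": "PNG",
    "svg+xml": "SVG",
    "jpeg": "JPG",
    "bmp": "BMP",
    "x-icon": "ICO",
    "tiff": "TIF",
}


def guess_img_format(img_src: str) -> str:
    """Guess a display label for the image format from the URL path.

    Returns an empty string when the extension is unknown or missing.
    """
    mimetype = mimetypes.guess_type(img_src)[0]
    if not mimetype:
        return ""
    subtype = mimetype.split("/")[-1]
    # known subtypes get a short label, anything else is upper-cased
    return MIMESUB.get(subtype, subtype.upper())


label = guess_img_format("https://example.org/pics/cat.png")
```

Because `mimetypes.guess_type()` only looks at the file extension, an image served from an extension-less URL keeps an empty `img_format`. Results for less common extensions can also depend on the platform's MIME tables.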
+9
@@ -11,6 +11,7 @@ __all__ = [
"PROCESSORS",
"ParamTypes",
"RequestParams",
"ProcessorType",
]
import typing as t
@@ -27,6 +28,14 @@ from .online_url_search import OnlineUrlSearchProcessor, OnlineUrlSearchParams
logger = logger.getChild("search.processors")
ProcessorType = t.Literal[
"offline",
"online",
"online_currency",
"online_dictionary",
"online_url_search",
]
OnlineParamTypes: t.TypeAlias = OnlineParams | OnlineDictParams | OnlineCurrenciesParams | OnlineUrlSearchParams
OfflineParamTypes: t.TypeAlias = RequestParams
ParamTypes: t.TypeAlias = OfflineParamTypes | OnlineParamTypes
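
A `typing.Literal` alias such as `ProcessorType` is only checked by static type checkers, but its members can also be read at runtime with `typing.get_args()`. The following sketch shows one way to reuse the alias for validating a configured value; `is_processor_type` is a hypothetical helper, not something the diff adds.

```python
import typing as t

ProcessorType = t.Literal[
    "offline",
    "online",
    "online_currency",
    "online_dictionary",
    "online_url_search",
]


def is_processor_type(name: str) -> bool:
    """Check a string against the members of the Literal alias at runtime."""
    return name in t.get_args(ProcessorType)


known = is_processor_type("online_currency")
unknown = is_processor_type("onlne")
```

Deriving the runtime check from the alias keeps the list of valid names in a single place.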
+511 -27
@@ -41,8 +41,8 @@ search:
# Filter results. 0: None, 1: Moderate, 2: Strict
safe_search: 0
# Existing autocomplete backends: "360search", "baidu", "bing", "brave", "dbpedia", "duckduckgo", "google",
# "yandex", "mwmbl", "naver", "seznam", "sogou", "startpage", "swisscows", "quark", "qwant", "wikipedia" -
# leave blank to turn it off by default.
# "yandex", "privacywall", "mwmbl", "naver", "seznam", "sogou", "startpage", "swisscows", "quark", "qwant",
# "wikipedia" - leave blank to turn it off by default.
autocomplete: ""
# minimum characters to type before autocompleter starts
autocomplete_min: 4
@@ -320,6 +320,24 @@ engines:
shortcut: 9g
disabled: true
- name: abcnyheter
engine: xpath
categories: general
paging: true
search_url: https://startsiden.abcnyheter.no/sok/?q={query}&page={pageno}
shortcut: abc
disabled: true
results_xpath: //ul[contains(@class, "results__list")]/li[contains(@class, "result")]
url_xpath: ./a/@href
title_xpath: ./a/h3
content_xpath: ./div
language: "no"
about:
website: https://abcnyheter.no
use_official_api: false
require_api_key: false
results: HTML
- name: acfun
engine: acfun
shortcut: acf
@@ -426,27 +444,6 @@ engines:
shortcut: conda
disabled: true
- name: aol
engine: aol
search_type: search
categories: [general]
shortcut: aol
disabled: true
- name: aol images
engine: aol
search_type: image
categories: [images]
shortcut: aoli
disabled: true
- name: aol videos
engine: aol
search_type: video
categories: [videos]
shortcut: aolv
disabled: true
- name: arch linux wiki
engine: archlinux
shortcut: al
@@ -474,6 +471,23 @@ engines:
engine: arxiv
shortcut: arx
- name: ayo
engine: xpath
categories: general
shortcut: ayo
search_url: https://search.ayo.de/search?q={query}
results_xpath: //div[contains(@class, 'search-result')]/div
url_xpath: .//a/@href
title_xpath: .//h3
content_xpath: .//p
suggestion_xpath: .//a[starts-with(@href, "https://search.ayo.de")]
disabled: true
about:
website: https://search.ayo.de
use_official_api: false
require_api_key: false
results: HTML
- name: azure
engine: azure
shortcut: az
@@ -609,6 +623,12 @@ engines:
shortcut: ca
disabled: true
# - name: chatnoir
# engine: chatnoir
# shortcut: cha
# search_index: cw22
# disabled: true
- name: chefkoch
engine: chefkoch
shortcut: chef
@@ -647,6 +667,26 @@ engines:
disabled: true
inactive: true
- name: cl0q
engine: json_engine
shortcut: cl
categories: general
paging: true
first_page_num: 0
page_size: 20
search_url: https://cl0q.com/search?q={query}&limit=20&offset={pageno}
results_query: results
url_query: domain
url_prefix: https://
title_query: title
content_query: description
disabled: true
inactive: true
about:
website: https://cl0q.com
description: "Open source network for searching domains"
results: JSON
- name: cloudflareai
engine: cloudflareai
shortcut: cfai
@@ -914,11 +954,57 @@ engines:
timeout: 3.0
disabled: true
- name: fastbot
engine: xpath
search_url: https://fastbot.de/search?q={query}
results_xpath: //section[contains(@class, 'organic-results')]/div[contains(@class, 'result-item')]
url_xpath: (./a/@href)[last()]
title_xpath: (./a)[last()]
content_xpath: ./div[contains(@class, 'snippet')]
suggestion_xpath: //section[contains(@class, 'related-searches')]//a/span[1]
shortcut: fa
categories: general
disabled: true
about:
website: https://fastbot.de
official_api_documentation:
use_official_api: false
require_api_key: false
results: HTML
- name: fdroid
engine: fdroid
shortcut: fd
disabled: true
- name: findfiles
engine: findfiles
findfiles_categ: all
categories: files
shortcut: fif
disabled: true
- name: findfiles images
engine: findfiles
findfiles_categ: image
categories: images
shortcut: fifi
disabled: true
- name: findfiles videos
engine: findfiles
findfiles_categ: video
categories: videos
shortcut: fifv
disabled: true
- name: findfiles music
engine: findfiles
findfiles_categ: audio
categories: music
shortcut: fifm
disabled: true
- name: findthatmeme
engine: findthatmeme
shortcut: ftm
@@ -1022,6 +1108,7 @@ engines:
- name: gabanza
engine: xpath
categories: general
search_url: https://www.gabanza.com/search?query={query}
shortcut: gab
timeout: 4
@@ -1054,6 +1141,11 @@ engines:
search_type: text
timeout: 10
- name: giphy
engine: giphy
shortcut: gif
disabled: true
- name: gitlab
engine: gitlab
base_url: https://gitlab.com
@@ -1235,6 +1327,13 @@ engines:
require_api_key: false
results: JSON
- name: iseek
engine: iseek
shortcut: isk
timeout: 4
disabled: true
inactive: true
- name: il post
engine: il_post
shortcut: pst
@@ -1323,6 +1422,21 @@ engines:
# api_key: "" # required
# kagi_categ: videos
- name: kozmonavt
engine: xpath
search_url: https://kozmonavt.su/s?q={query}
shortcut: koz
disabled: true
inactive: true
results_xpath: //div[contains(@class, 'list')]/section
url_xpath: concat('https://', substring-after(.//a/@href, '?q='))
title_xpath: .//a
content_xpath: .//div[contains(@class, 'snip')]
about:
website: https://kozmonavt.su
description: "Web site directory with its own index"
results: HTML
- name: jisho
engine: jisho
shortcut: js
@@ -1340,6 +1454,22 @@ engines:
shortcut: kc
timeout: 4.0
- name: kukei
engine: xpath
categories: [general, blogs]
search_url: https://kukei.eu/?q={query}
shortcut: kuk
disabled: true
inactive: true
results_xpath: //ul/li[contains(@class, "result-item-first-level")]
url_xpath: .//a/@href
title_xpath: .//h3
content_xpath: .//p
about:
website: https://kukei.eu
description: "Curated search for the small web"
results: HTML
- name: lemmy communities
engine: lemmy
lemmy_type: Communities
@@ -1436,6 +1566,38 @@ engines:
shortcut: luc
timeout: 3.0
- name: luxxle
engine: luxxle
categories: general
luxxle_categ: search
shortcut: lux
disabled: true
inactive: true
- name: luxxle images
engine: luxxle
categories: images
luxxle_categ: images
shortcut: luxi
disabled: true
inactive: true
- name: luxxle videos
engine: luxxle
categories: videos
luxxle_categ: videos
shortcut: luxv
disabled: true
inactive: true
- name: luxxle news
engine: luxxle
categories: news
luxxle_categ: news
shortcut: luxn
disabled: true
inactive: true
- name: marginalia
engine: marginalia
shortcut: mar
@@ -1794,6 +1956,11 @@ engines:
# query_str: 'SELECT * from my_table WHERE my_column = %(query)s'
# shortcut : psql
- name: podchaser
engine: podchaser
shortcut: poc
disabled: true
- name: presearch
engine: presearch
search_type: search
@@ -1933,6 +2100,27 @@ engines:
engine: radio_browser
shortcut: rb
- name: rawweb
engine: json_engine
shortcut: rw
categories: [general, blogs]
paging: true
search_url: 'https://api.rawweb.org/api/search?keyword={query}&page={pageno}&lang=*'
results_query: data
url_query: link
title_query: title
content_query: content
title_html_to_text: true
content_html_to_text: true
disabled: true
inactive: true
about:
website: https://rawweb.org
official_api_documentation:
use_official_api: false
require_api_key: false
results: JSON
- name: reddit
engine: reddit
shortcut: re
@@ -1970,7 +2158,7 @@ engines:
- name: searchmysite
engine: xpath
shortcut: sms
categories: general
categories: [general, blogs]
paging: true
search_url: https://searchmysite.net/search/?q={query}&page={pageno}
results_xpath: //div[contains(@class,'search-result')]
@@ -2053,6 +2241,28 @@ engines:
base_url: 'https://discourse.pi-hole.net'
disabled: true
- name: privacywall
engine: privacywall
categories: general
privacywall_category: general
paging: false # only images and videos support pagination
shortcut: pw
disabled: true
- name: privacywall images
engine: privacywall
categories: images
privacywall_category: images
shortcut: pwi
disabled: true
- name: privacywall videos
engine: privacywall
categories: videos
privacywall_category: videos
shortcut: pwv
disabled: true
# - name: searx
# engine: searx_engine
# shortcut: se
@@ -2192,6 +2402,36 @@ engines:
shortcut: tm
disabled: true
- name: tonline
engine: tonline
shortcut: tol
disabled: true
inactive: true
- name: tonline images
engine: tonline
categories: images
tonline_categ: images
shortcut: toli
disabled: true
inactive: true
- name: tonline videos
engine: tonline
categories: videos
tonline_categ: videos
shortcut: tolv
disabled: true
inactive: true
- name: tonline news
engine: tonline
categories: news
tonline_categ: news
shortcut: toln
disabled: true
inactive: true
# Requires Tor
- name: torch
engine: xpath
@@ -2234,6 +2474,35 @@ engines:
- 5000
inactive: true
- name: tusksearch
engine: tusksearch
shortcut: tu
tusk_categ: web
categories: general
disabled: true
- name: tusksearch images
engine: tusksearch
shortcut: tui
paging: false
tusk_categ: images
categories: images
disabled: true
- name: tusksearch videos
engine: tusksearch
shortcut: tuv
tusk_categ: videos
categories: videos
disabled: true
- name: tusksearch news
engine: tusksearch
shortcut: tun
tusk_categ: news
categories: news
disabled: true
# tmp suspended - too slow, too many errors
# - name: urbandictionary
# engine : xpath
@@ -2247,6 +2516,24 @@ engines:
engine: unsplash
shortcut: us
- name: unobtanium
engine: xpath
shortcut: uno
categories: [general, blogs]
paging: true
first_page_num: 0
search_url: https://unobtanium.rocks/search?search={query}&page={pageno}
results_xpath: //ul[contains(@class, 'search-results')]/li
url_xpath: .//a/@href
title_xpath: .//a/b
content_xpath: .//p[2]
disabled: true
inactive: true
about:
website: https://unobtanium.rocks
description: "Personal websites focused search engine."
results: HTML
- name: yandex
engine: yandex
categories: general
@@ -2306,7 +2593,7 @@ engines:
url_query: URL
title_query: Title
content_query: Snippet
categories: [general, web]
categories: [general, blogs]
shortcut: wib
disabled: true
about:
@@ -2630,12 +2917,139 @@ engines:
categories: videos
disabled: true
- name: reloado
engine: xpath
paging: true
search_url: https://reloado.com/search?q={query}&page={pageno}
results_xpath: //div[contains(@class, 'result-item')]
url_xpath: .//div[contains(@class, 'result-title')]/a/@href
title_xpath: .//div[contains(@class, 'result-title')]/a
content_xpath: .//div[contains(@class, 'result-excerpt')]
shortcut: rel
categories: general
disabled: true
language: de
about:
website: https://reloado.com
official_api_documentation:
use_official_api: false
require_api_key: false
results: HTML
- name: repology
engine: repology
shortcut: rep
disabled: true
inactive: true
- name: resulthunter
engine: resulthunter
resulthunter_categ: web
categories: general
shortcut: reh
disabled: true
- name: resulthunter images
engine: resulthunter
resulthunter_categ: images
categories: images
shortcut: rehi
disabled: true
- name: searchch
engine: xpath
shortcut: sch
paging: true
search_url: https://search.ch/web/api/loadmore.html?path=/web/&q={query}&page={pageno}
results_xpath: //div[contains(@class, 'www-feed-web-result')]
# the URL looks like "//search.ch/web/r/redirect?...result!uffe8997e781e241d/https://example.com"
url_xpath: concat('https://', substring-after(.//a[contains(@class, 'sl-gus-result-url')]/@href, '://'))
title_xpath: .//a[contains(@class, 'sl-gus-result-title')]
content_xpath: .//div[contains(@class, 'sl-gus-result-body')]
language: "ch"
disabled: true
about:
website: https://search.ch/web
description: "Swiss search engine with its own Swiss index"
use_official_api: false
require_api_key: false
results: HTML
- name: searchzee
engine: json_engine
search_url: https://searchzee.com/api/search?q={query}&type=web&offset={pageno}
paging: true
first_page_num: 0
results_query: results
url_query: url
title_query: title
content_query: summary
content_html_to_text: true
categories: general
shortcut: sz
disabled: true
inactive: true
about:
website: https://searchzee.com
use_official_api: false
require_api_key: false
results: JSON
- name: searchzee news
engine: json_engine
search_url: https://searchzee.com/api/search?q={query}&type=news&offset={pageno}{time_range}
paging: true
first_page_num: 0
time_range_support: true
time_range_url: "&freshness={time_range_val}"
time_range_map:
day: pd
week: pw
month: pm
year: py
results_query: results
url_query: url
title_query: title
content_query: summary
thumbnail_query: thumbnail
content_html_to_text: true
categories: news
shortcut: sznw
disabled: true
inactive: true
- name: startpagina
engine: startpagina
shortcut: spnl
startpagina_categ: web
categories: general
disabled: true
inactive: true
- name: startpagina images
engine: startpagina
shortcut: spnli
startpagina_categ: images
categories: images
disabled: true
inactive: true
- name: startpagina videos
engine: startpagina
shortcut: spnlv
startpagina_categ: videos
categories: videos
disabled: true
inactive: true
- name: startpagina news
engine: startpagina
shortcut: spnln
startpagina_categ: news
categories: news
disabled: true
inactive: true
- name: swisscows
engine: swisscows
categories: general
@@ -2684,13 +3098,13 @@ engines:
content_xpath: //div[@class="synonyms-list-group"]
title_xpath: //div[@class="upper-synonyms"]/a
no_result_for_http_status: [404]
language: de
about:
website: https://www.woxikon.de/
wikidata_id: # No Wikidata ID
use_official_api: false
require_api_key: false
results: HTML
language: de
- name: tootfinder
engine: tootfinder
@@ -2706,6 +3120,27 @@ engines:
shortcut: void
disabled: true
- name: vuhuv
engine: vuhuv
categories: general
vuhuv_category: general
shortcut: vu
disabled: true
- name: vuhuv images
engine: vuhuv
categories: images
vuhuv_category: images
shortcut: vui
disabled: true
- name: vuhuv videos
engine: vuhuv
categories: videos
vuhuv_category: videos
shortcut: vuv
disabled: true
- name: wallhaven
engine: wallhaven
# api_key: abcdefghijklmnopqrstuvwxyz
@@ -2724,22 +3159,39 @@ engines:
content_xpath: //li/div[@class="searchresult"]
categories: general
disabled: true
language: fr
about:
website: https://wikimini.org/
wikidata_id: Q3568032
use_official_api: false
require_api_key: false
results: HTML
language: fr
- name: wttr.in
engine: wttr
shortcut: wttr
timeout: 9.0
- name: xonaly
engine: xpath
search_url: https://xonaly.com/?query={query}
categories: general
shortcut: xo
disabled: true
inactive: true
results_xpath: //div[contains(@class, 'results-block')]/ul/li
url_xpath: ./div[contains(@class, 'result-title')]/a/@href
title_xpath: ./div[contains(@class, 'result-title')]/a
content_xpath: ./p[contains(@class, 'excerpt')]
about:
website: https://xonaly.com
description: "Canadian search engine with its own index"
results: HTML
- name: zapmeta
engine: xpath
shortcut: zpm
categories: general
search_url: https://www.zapmeta.com/search?q={query}&pg={pageno}
results_xpath: //article[contains(@class, "organic-results-item")]
url_xpath: ./h2/a/@href
@@ -2839,6 +3291,38 @@ engines:
website: https://minecraft.wiki/
wikidata_id: Q105533483
# s1search google engines / mirrors
- name: searchtoday
engine: s1search
shortcut: std
base_url: https://info.searchtoday.site
disabled: true
# - name: webcrawler
# engine: s1search
# shortcut: wc
# base_url: https://www.webcrawler.com
# disabled: true
# s1search yahoo engines / mirrors
# - name: excite
# engine: s1search
# shortcut: exc
# base_url: https://results.excite.com.s1search.co
# disabled: true
# - name: metacrawler
# engine: s1search
# shortcut: mec
# base_url: https://search.metacrawler.com
# disabled: true
- name: infospace
engine: s1search
shortcut: ifs
base_url: https://search.infospace.com
disabled: true
# Doku engine lets you access any Doku wiki instance:
# A public one or a private/corporate one.
# - name: ubuntuwiki
File diff suppressed because one or more lines are too long

Some files were not shown because too many files have changed in this diff