Google Engines

Google API

SearXNG’s implementation of the Google API is mainly done in get_google_info.

For a detailed description of the RESTful API see: Query Parameter Definitions. The linked API documentation can sometimes be helpful during reverse engineering. However, it cannot be applied directly to the freely accessible WEB services: not all parameters can be used, and some engines are more special than others (e.g. Google News).

Google WEB

This is the implementation of the Google WEB engine. Some parts of this implementation (mainly get_google_info) are shared by other engines:

searx.engines.google.fetch_traits(engine_traits: EngineTraits, add_domains: bool = True)[source]

Fetch languages from Google.

searx.engines.google.get_google_info(params, eng_traits)[source]

Composes various (language) properties for the Google engines (Google API).

This function is called by the various google engines (Google WEB, Google Images, Google News and Google Videos).

Parameters:
  • params (dict) – Request parameters of the engine. At least a searxng_locale key should be in the dictionary.

  • eng_traits – Engine’s traits fetched from google preferences (searx.enginelib.traits.EngineTraits)

Return type:

dict

Returns:

Py-Dictionary with the key/value pairs:

language:

The language code that is used by google (e.g. lang_en or lang_zh-TW)

country:

The country code that is used by google (e.g. US or TW)

locale:

An instance of babel.core.Locale built from the searxng_locale value.

subdomain:

Google subdomain (from google_domains) that fits the country code.

params:

Py-Dictionary with additional request arguments (can be passed to urllib.parse.urlencode()).

  • hl parameter: specifies the language of the user interface.

  • lr parameter: restricts search results to documents written in a particular language.

  • cr parameter: restricts search results to documents originating in a particular country.

  • ie parameter: sets the character encoding scheme that should be used to interpret the query string (‘utf8’).

  • oe parameter: sets the character encoding scheme that should be used to decode the XML result (‘utf8’).

headers:

Py-Dictionary with additional HTTP headers (can be passed to request’s headers)

  • Accept: '*/*'
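The following is a minimal sketch of how a dictionary with the shape described above could be turned into a request URL. The concrete values in google_info, the /search path, the q argument and the helper name build_search_url are assumptions made for illustration; they are not taken from the SearXNG sources.

```python
from urllib.parse import urlencode

# Hypothetical example of the dictionary shape described above; the values
# are assumptions for illustration, not output captured from SearXNG.
google_info = {
    "language": "lang_en",
    "country": "US",
    "subdomain": "www.google.com",
    "params": {
        "hl": "en-US",
        "lr": "lang_en",
        "cr": "countryUS",
        "ie": "utf8",
        "oe": "utf8",
    },
    "headers": {"Accept": "*/*"},
}


def build_search_url(query: str, info: dict) -> str:
    """Merge the query with info['params'] and encode them as a query string."""
    args = {"q": query, **info["params"]}  # 'q' is an assumed argument name
    return "https://" + info["subdomain"] + "/search?" + urlencode(args)


url = build_search_url("searxng docs", google_info)
```

The headers entry would be passed separately to whatever HTTP client sends the request.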

searx.engines.google.request(query, params)[source]

Google search request

searx.engines.google.response(resp)[source]

Get response from google’s search request

searx.engines.google.UI_ASYNC = 'use_ac:true,_fmt:prog'

Format of the response from UI’s async request.

Google Autocomplete

searx.autocomplete.google_complete(query, sxng_locale)[source]

Autocomplete from Google. Supports Google’s languages and subdomains (searx.engines.google.get_google_info) by using the async REST API:

https://{subdomain}/complete/search?{args}
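As a rough sketch, the URL template above could be filled in as follows. The helper name build_complete_url and the argument names q and hl are assumptions for this example; the arguments actually sent by google_complete may differ.

```python
from urllib.parse import urlencode


def build_complete_url(subdomain: str, query: str, hl: str) -> str:
    """Fill the https://{subdomain}/complete/search?{args} template."""
    # 'q' and 'hl' are assumed argument names for this sketch.
    args = urlencode({"q": query, "hl": hl})
    return f"https://{subdomain}/complete/search?{args}"


example_url = build_complete_url("www.google.de", "searx", "de")
```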

Google Images

This is the implementation of the Google Images engine using the internal Google API used by the Google Go Android app.

This internal API offers results in

  • JSON (_fmt:json)

  • Protobuf (_fmt:pb)

  • Protobuf compressed? (_fmt:pc)

  • HTML (_fmt:html)

  • Protobuf encoded in JSON (_fmt:jspb).

searx.engines.google_images.request(query, params)[source]

Google-Image search request

searx.engines.google_images.response(resp)[source]

Get response from google’s search request

Google Videos

This is the implementation of the Google Videos engine.

Content-Security-Policy (CSP)

This engine needs to allow images from the data URLs (prefixed with the data: scheme):

Header set Content-Security-Policy "img-src 'self' data: ;"

searx.engines.google_videos.request(query, params)[source]

Google-Video search request

searx.engines.google_videos.response(resp)[source]

Get response from google’s search request

Google News

This is the implementation of the Google News engine.

Google News handles regions differently from Google WEB.

  • the ceid argument has to be set (ceid_list)

  • the hl argument has to be set correctly (and different to Google WEB)

  • the gl argument is mandatory

If one of these arguments is not set correctly, the request is redirected to the CONSENT dialog:

https://consent.google.com/m?continue=

The google news API ignores some parameters from the common Google API:

  • num: the number of search results is ignored; there is no paging, and all results for a query term are in the first response.

  • safe: is ignored; Google News results are always SafeSearch.

searx.engines.google_news.request(query, params)[source]

Google-News search request

searx.engines.google_news.response(resp)[source]

Get response from google’s search request

searx.engines.google_news.ceid_list = ['AE:ar', 'AR:es-419', 'AT:de', 'AU:en', 'BD:bn', 'BE:fr', 'BE:nl', 'BG:bg', 'BR:pt-419', 'BW:en', 'CA:en', 'CA:fr', 'CH:de', 'CH:fr', 'CL:es-419', 'CN:zh-Hans', 'CO:es-419', 'CU:es-419', 'CZ:cs', 'DE:de', 'EG:ar', 'ES:es', 'ET:en', 'FR:fr', 'GB:en', 'GH:en', 'GR:el', 'HK:zh-Hant', 'HU:hu', 'ID:en', 'ID:id', 'IE:en', 'IL:en', 'IL:he', 'IN:bn', 'IN:en', 'IN:hi', 'IN:ml', 'IN:mr', 'IN:ta', 'IN:te', 'IT:it', 'JP:ja', 'KE:en', 'KR:ko', 'LB:ar', 'LT:lt', 'LV:en', 'LV:lv', 'MA:fr', 'MX:es-419', 'MY:en', 'NA:en', 'NG:en', 'NL:nl', 'NO:no', 'NZ:en', 'PE:es-419', 'PH:en', 'PK:en', 'PL:pl', 'PT:pt-150', 'RO:ro', 'RS:sr', 'RU:ru', 'SA:ar', 'SE:sv', 'SG:en', 'SI:sl', 'SK:sk', 'SN:fr', 'TH:th', 'TR:tr', 'TW:zh-Hant', 'TZ:en', 'UA:ru', 'UA:uk', 'UG:en', 'US:en', 'US:es-419', 'VE:es-419', 'VN:vi', 'ZA:en', 'ZW:en']

List of region/language combinations supported by Google News. Values of the ceid argument of the Google News REST API.
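Each ceid value has the form region:language. A minimal sketch of splitting such a value and assembling the three mandatory arguments might look like this. The helper names are hypothetical, and using the language part unchanged as hl is an assumption; the engine may map it differently.

```python
from typing import Dict, Tuple


def split_ceid(ceid: str) -> Tuple[str, str]:
    """Split a value such as 'US:es-419' into ('US', 'es-419')."""
    region, lang = ceid.split(":", 1)
    return region, lang


def news_region_args(ceid: str) -> Dict[str, str]:
    """Assemble the ceid, gl and hl arguments for one ceid_list entry."""
    region, lang = split_ceid(ceid)
    # Assumption: 'hl' is taken directly from the language part of the ceid.
    return {"ceid": ceid, "gl": region, "hl": lang}


args = news_region_args("US:es-419")
```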

Google Scholar

This is the implementation of the Google Scholar engine.

Compared to other Google services, the Scholar engine has a simple GET REST API and there is no async API. Even though the API is slightly vintage, we can make use of the Google API to assemble the arguments of the GET request.

searx.engines.google_scholar.detect_google_captcha(dom)[source]

In case of a CAPTCHA, Google Scholar opens its own “not a Robot” dialog and is not redirected to sorry.google.com.

searx.engines.google_scholar.parse_gs_a(text: Optional[str])[source]

Parse the text written in green.

Possible formats:

  • “{authors} - {journal}, {year} - {publisher}”

  • “{authors} - {year} - {publisher}”

  • “{authors} - {publisher}”
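The three formats can be told apart by splitting on " - " and looking for a trailing four-digit year in the middle part. The function below is only a sketch of that idea under those assumptions; its name and return shape are hypothetical and it is not the implementation of parse_gs_a.

```python
import re
from typing import List, Optional, Tuple


def parse_green_line(text: Optional[str]) -> Tuple[List[str], str, Optional[int], str]:
    """Return (authors, journal, year, publisher) for the formats listed above."""
    if not text:
        return [], "", None, ""
    parts = text.split(" - ")
    authors = [name.strip() for name in parts[0].split(",")]
    publisher = parts[-1].strip() if len(parts) > 1 else ""
    journal, year = "", None
    if len(parts) == 3:
        middle = parts[1]
        # A trailing four-digit number in the middle part is taken as the year.
        match = re.search(r"(\d{4})\s*$", middle)
        if match:
            year = int(match.group(1))
            journal = middle[: match.start()].rstrip(", ")
        else:
            journal = middle.strip()
    return authors, journal, year, publisher


parsed = parse_green_line("A Author, B Writer - Journal of Things, 2020 - example.org")
```

Input that contains " - " inside an author or journal name is not handled by this sketch.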

searx.engines.google_scholar.request(query, params)[source]

Google-Scholar search request

searx.engines.google_scholar.response(resp)[source]

Parse response from Google Scholar

searx.engines.google_scholar.time_range_args(params)[source]

Returns a dictionary with time range arguments based on params['time_range'].

Google Scholar supports a detailed search by year. Searching by last month or last week (as offered by SearXNG) is uncommon for scientific publications and is not supported by Google Scholar.

To limit the result list when the user selects a range, all the SearXNG ranges (day, week, month, year) are mapped to year. If no range is set, an empty dictionary of arguments is returned. Example: when the user selects a time range (current year minus one in 2022):

{ 'as_ylo' : 2021 }
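The mapping described above can be sketched as follows. This is an illustration under the stated rules, not the engine's code: the function name and the optional today argument (added so the example is reproducible) are assumptions.

```python
from datetime import date
from typing import Optional


def time_range_args_sketch(params: dict, today: Optional[date] = None) -> dict:
    """Map any SearXNG time range to 'since last year'; no range gives {}."""
    today = today or date.today()
    if params.get("time_range") in ("day", "week", "month", "year"):
        # Every supported range is mapped to year: current year minus one.
        return {"as_ylo": today.year - 1}
    return {}


args_2022 = time_range_args_sketch({"time_range": "month"}, today=date(2022, 6, 1))
```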