Searches

../_images/search.png

Making searches

Sosse uses PostgreSQL’s Full Text Search to perform keyword based searches. This makes the search bar behave like most search engine websites 🦡:

  • Typing multiple space-separated keywords returns pages containing all of them.

  • Separating search terms with OR returns pages containing one of them.

  • Keywords enclosed in double-quotes match consecutive words.

  • Using - in front of a search term removes matching pages from the result list.

  • Parenthesis can be used to make complex queries and prioritize operators.

Tags filtering

../_images/tags_filter.png

Clicking the Tags button, shows a list of tags that can be used to filter the search results. Tags are organized in a tree structure, and filtering on a tag also include all its children.

Avanced filtering

More search options are available when clicking on Params:

../_images/extended_search.png
  • Language: select document which detected language matches

  • Sort: sort order of matching documents

  • Include hidden documents: show hidden documents in results, requires the “Can change documents” permission

Below, more field can be added to perform exact text match as opposed to the search bar that has natural-language processing features (word stemming, diactric removal, …). Any number of extra filter can be added using the plus_button button. Each field in the filter is:

Function of the filter

  • Keep: pages matching the filter are displayed in the results

  • Exclude: pages matching the filter are removed from the results

Field

This defines against which field the keyword is matched:

  • Document: this matches against the Content, the Title or the URL

  • Content: the text content of the page

  • Title: the title of the page

  • URL: the URL of the page

  • Mimetype: the mimetype of the document

  • Links to url: returns documents containing links which target URLs matching the keyword

  • Links to text: returns documents containing links which text (the text of the link, not the text of the target document) matching the keyword

  • Linked by url: returns documents which are the target of the links of URLs matching the keyword

  • Linked by text: returns documents which are pointed by links whose text match the keyword

  • Tag: this finds matching tags and returns documents containing them or any of their descendants

Operator

This defines how the keyword is matched against the field:

  • Containing: this matches when the keyword is contained inside the field.

  • Equal to: this matches when the keyword is exactly to entire field.

  • Matching Regex: matching is done using Posix regular expressions (see PostgreSQL documention for details)

Results

../_images/search_result.png

From top to bottom, left to right, the elements displayed are:

  • the favicon of the page

  • the title of the page, or its URL if it has no title

  • the URL

  • the tags of the page

  • the score of the page for the provided search keywords from 0.0 to 1.0

  • the language of the page

  • the archive link to the archived version, or source link to the original page (depending on the related option)

Atom feeds

The atom_button button, gives access to an Atom feeds for the current search terms ⚛:

  • Atom results feed has entries with links to the original website

  • Atom archive feed has entries with links to the archived website

In case anonymous searches are disabled, a token can be defined to access the Atom feed without authenticating. This is done by appending a token=<Atom access token> parameter to the Atom feeds URL.

CSV export

The atom_button button, gives access to a CSV export for the current search terms.

CSV export can be disabled using the csv_export option.