Boolean Search
In a regular search, the query sent to the search engine is just a list of words. The Tocqueville search engine accepts a query text of any length and returns a list of documents in order of decreasing relevance (or similarity to the query). This is the preferred search mode.
A boolean search has the advantage of being more controlled, in that the query specifies which words may occur, must occur or must not occur in the resulting documents. For example, the query
(villepin and xavier) and not dominique
should return only documents referring to Xavier de Villepin and not to his son Dominique. Capitalization and diacritics are optional: FRançois and francois will work just as well.
In practice, things are rarely that simple: for example, an article referring to both Xavier and Dominique de Villepin would be rejected by this query, as would an article referring to Xavier de Villepin and Dominique Dupont. Another drawback is that, in a strictly boolean approach, there is no way to order the returned list of documents by relevance. However, if the query is well designed, this list should be short.
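The example above can be reproduced in a few lines of Python. This is only a sketch of boolean matching on word sets, not the Tocqueville engine itself: the function names (normalize, words, matches) and the three sample documents are invented for illustration.

```python
import re
import unicodedata

def normalize(text):
    """Lowercase and strip diacritics, so 'FRançois' becomes 'francois'."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).lower()

def words(text):
    """Return the set of normalized words found in a text."""
    return set(re.findall(r"[a-z0-9]+", normalize(text)))

def matches(document):
    """Evaluate (villepin and xavier) and not dominique on one document."""
    w = words(document)
    return ("villepin" in w and "xavier" in w) and "dominique" not in w

# Hypothetical documents illustrating the cases discussed above.
docs = [
    "Xavier de Villepin spoke in the Senate.",
    "Xavier de Villepin and his son Dominique de Villepin attended.",
    "Xavier de Villepin met Dominique Dupont.",
]
results = [matches(d) for d in docs]
```

Only the first document matches. The other two are rejected because the word dominique appears in them, whichever Dominique is meant, and the matching documents come back as a plain yes/no answer with no relevance score.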
This is why the boolean search mode is usually not recommended, except in very specific circumstances where the results are interpreted with care. Boolean queries are also subject to a number of limitations:
- They should not contain empty (i.e. very frequent) words such as the or are.
- They should not contain square brackets, braces or angle brackets [ ] { } < >, or punctuation marks.
- For readability, and also to avoid issues of operator precedence, it is advisable to use as many parentheses ( ) as possible.
- These parentheses must be balanced, i.e. there must be as many opening as closing parentheses.
- They can use only the following logical operators: and, or, not.
- They cannot contain more than 8 words, not including logical operators.
However, an error in a boolean query is never fatal: the boolean query is simply converted into a regular query, and the error is reported on the result page.
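The limitations on boolean queries can be checked mechanically before a query is submitted. The sketch below is one possible pre-check, not the engine's own validation. The stop-word list is a two-word placeholder (the real list is not given here), and to_regular_query is only a guess at what converting a faulty boolean query into a regular query might look like.

```python
import re

OPERATORS = {"and", "or", "not"}
STOP_WORDS = {"the", "are"}  # placeholder: the engine's actual list is unknown
MAX_WORDS = 8                # logical operators are not counted

def check_boolean_query(query):
    """Return the list of problems found in a boolean query (empty if none)."""
    problems = []
    # Anything other than word characters, whitespace and ( ) is rejected.
    if re.search(r"[^\w\s()]", query):
        problems.append("forbidden character")
    # Parentheses must be balanced and never close before they open.
    depth = 0
    for ch in query:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                break
    if depth != 0:
        problems.append("unbalanced parentheses")
    terms = [t for t in re.findall(r"\w+", query.lower()) if t not in OPERATORS]
    if any(t in STOP_WORDS for t in terms):
        problems.append("stop word")
    if len(terms) > MAX_WORDS:
        problems.append("too many words")
    return problems

def to_regular_query(query):
    """Assumed fallback: keep only the plain words, as in a regular search."""
    return [t for t in re.findall(r"\w+", query.lower()) if t not in OPERATORS]
```

For instance, check_boolean_query("(villepin and xavier") reports unbalanced parentheses, and to_regular_query on the same string yields the word list villepin, xavier that a regular search would use.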