These guidelines cover how special characters are handled, search limitations, and general rules for running Simple and Advanced searches.
Special characters are treated as white space during a search, which makes it possible to find strings that contain them.
The search engine's analyzer follows the Unicode Text Segmentation standard (UAX #29):
https://unicode.org/reports/tr29/
Simple Search
- multiple terms: CONTAINS ANY
- pre-analyzed: DOMAIN, EMAIL, OTHER (REGULAR TEXT)
Email address:
- all header fields (FROM, TO, CC, BCC, Hidden, Mailbox, Recipients) are searchable as a single term.
- Wildcards (*, ?) are allowed.
- on Subject, BODY, and Attachment fields, analyzed as REGULAR TEXT (split on dots and the @ sign)
DOMAIN:
- on header fields (FROM, TO, CC, BCC, Hidden, Mailbox, Recipients), analyzed as a single term.
- Wildcards (*, ?) are allowed.
- on Subject, BODY, and Attachment fields, analyzed as REGULAR TEXT (split on dots)
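As an illustration of the address rules above, the following Python sketch models how an email address or domain might be turned into terms depending on the field. The names `HEADER_FIELDS`, `analyze_address`, and `header_matches`, and the use of `fnmatchcase` for the * and ? wildcards, are assumptions made for this example. It is not the search engine's actual implementation, and case handling is not modelled.

```python
import re
from fnmatch import fnmatchcase

# Header fields listed above; the set name is hypothetical.
HEADER_FIELDS = {"FROM", "TO", "CC", "BCC", "Hidden", "Mailbox", "Recipients"}


def analyze_address(value: str, field: str) -> list[str]:
    """Return the terms an address or domain would produce in a field."""
    if field in HEADER_FIELDS:
        # Header fields keep the address or domain as one term.
        return [value]
    # Subject, BODY, and Attachment: regular text, split on dots and @.
    return [term for term in re.split(r"[.@]", value) if term]


def header_matches(pattern: str, value: str) -> bool:
    """Match a single-term header value against a * / ? wildcard pattern.

    Note: fnmatchcase also gives [ ] a special meaning, which the
    guidelines above do not mention.
    """
    return fnmatchcase(value, pattern)
```

With this sketch, `analyze_address("john.doe@example.com", "FROM")` yields one term, while the same address in BODY yields `john`, `doe`, `example`, and `com`. A pattern such as `*@example.com` matches the header term as a whole.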
REGULAR TEXT:
- punctuation marks and special characters (including $, #, @, superscripts, and subscripts) are ignored and act as term separators (some exceptions exist, see below).
- a search query that contains a punctuation mark will most likely be split into multiple search terms.
- Fuzzy search is allowed
- wildcard characters (*, ?) are allowed, but fuzzy search takes priority
EXCEPTIONS (for regular text search):
- decimal numbers are treated as single terms (example: 23.345)
- apostrophes in possessives and contractions (John's, it's) are not treated as separators
- the copyright sign (©) and related symbols are each treated as a term
- the underscore (_) is not a separator
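The regular-text rules and their exceptions can be approximated with a small tokenizer. The sketch below is a rough Python model, not the engine's analyzer (which follows UAX #29). The name `tokenize`, the ASCII-only treatment of letters and digits, and the choice of ® and ™ as the "related" symbols are assumptions made for this example.

```python
import re

# One alternative per rule, tried left to right:
#   1. decimal numbers stay whole (23.345)
#   2. words, where _ is a word character and an inner apostrophe is kept
#   3. copyright and related symbols (® and ™ are assumed here)
# Everything else (punctuation, $, #, @, superscripts) acts as a separator.
TOKEN = re.compile(r"\d+\.\d+|\w+(?:'\w+)?|[©®™]", re.ASCII)


def tokenize(text: str) -> list[str]:
    """Split regular text into search terms under the rules above."""
    return TOKEN.findall(text)
```

For example, `tokenize("John's e-mail")` gives `["John's", "e", "mail"]`: the apostrophe is kept, while the hyphen splits the word into two terms.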
NOTE:
- If other (non-address) fields are included in a Simple Search field set, EMAILS and DOMAINS are broken up into multiple terms
Advanced Search:
- search target fields can be selected more precisely
- everything related to Simple Search analysis applies here as well
- multiple terms: CONTAINS ALL, CONTAINS ANY, CONTAINS PHRASE are selectable:
- CONTAINS ALL, sometimes represented as the logical operator "AND", returns only results that contain every specified term. It narrows the search. For example, "cat AND dog" returns only results that contain both "cat" and "dog".
- CONTAINS ANY, often represented as the logical operator "OR", returns results that contain at least one of the specified terms. It broadens the search. For example, "cat OR dog" returns results that contain "cat", "dog", or both.
- CONTAINS PHRASE, also called a phrase search, returns results that contain an exact sequence of words. For example, "best pizza in New York" returns only results that include that exact phrase, not results where the words "best", "pizza", "in", and "New York" are scattered throughout the content. This operator is particularly useful for searching email address strings.
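The difference between the three operators can be shown with a minimal Python sketch over lists of terms. The function names are hypothetical, and the documents are assumed to be already split into terms. This illustrates the semantics only, not how the search engine evaluates queries.

```python
def contains_all(doc_terms: list[str], query_terms: list[str]) -> bool:
    """AND: every query term must appear somewhere in the document."""
    return all(term in doc_terms for term in query_terms)


def contains_any(doc_terms: list[str], query_terms: list[str]) -> bool:
    """OR: at least one query term must appear in the document."""
    return any(term in doc_terms for term in query_terms)


def contains_phrase(doc_terms: list[str], phrase_terms: list[str]) -> bool:
    """Phrase: the query terms must appear consecutively, in order."""
    n = len(phrase_terms)
    if n == 0:
        return False
    return any(
        doc_terms[i:i + n] == phrase_terms
        for i in range(len(doc_terms) - n + 1)
    )
```

For a document split into `the`, `cat`, `chased`, `the`, `dog`, the query terms `cat` and `dog` satisfy CONTAINS ALL and CONTAINS ANY, but `cat dog` does not satisfy CONTAINS PHRASE because the two words are not adjacent.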
Keyword List:
- everything related to Simple Search analysis applies here as well
- multiple terms from the keyword list, separated by commas, are searched with the CONTAINS PHRASE parameter
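A keyword list can be pictured as a comma-separated string in which each entry is searched as a phrase. The Python sketch below assumes that reading. The names `parse_keyword_list` and `phrase_hits` are hypothetical, and because the guidelines do not say how hits for different entries are combined, the sketch reports each entry separately.

```python
def parse_keyword_list(raw: str) -> list[list[str]]:
    """Split a keyword list on commas; each entry becomes one phrase."""
    entries = [entry.strip() for entry in raw.split(",")]
    return [entry.split() for entry in entries if entry]


def phrase_hits(raw: str, doc_terms: list[str]) -> dict[str, bool]:
    """Report, per keyword-list entry, whether it occurs as a phrase."""
    hits = {}
    for phrase in parse_keyword_list(raw):
        n = len(phrase)
        # A phrase matches when its terms appear consecutively, in order.
        hits[" ".join(phrase)] = any(
            doc_terms[i:i + n] == phrase
            for i in range(len(doc_terms) - n + 1)
        )
    return hits
```

For instance, the list `quarterly report, budget` becomes two phrases, and only `quarterly report` is found in a document containing "the quarterly report is late".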
Please note that the table below is only an overview; for a more precise explanation, see the text above.
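SPECIAL CHARACTER/PUNCTUATION | SIMPLE SEARCH | ADVANCED SEARCH | EXCEPTION |
any combination of three special characters in a row | produces zero results | not allowed | n/a |
~ | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
` | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
! | splits the word in two and searches for strings that contain either part | ignored | |
@ | ignored | ignored | email address is considered as a string in keyword lists and FROM, TO, CC, BCC fields |
# | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
$ | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
% | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
^ | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
( | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
) | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
- | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
_ | searchable | searchable | |
= | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
+ | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
[ | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
] | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
{ | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
} | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
\ | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
| | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
; | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
: | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
' | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | acts like a separator in a keyword list |
" | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
< | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
> | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
, | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | |
. | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | email address is considered as a string in keyword lists and FROM, TO, CC, BCC fields |
/ | recognized by the search engine as white space/separator | recognized by the search engine as white space/separator | returns 0 results in a keyword list |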
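Because a run of three special characters produces zero results in Simple Search and is not allowed in Advanced Search, a query can be checked before it is submitted. The Python sketch below is one possible client-side check. The name `has_special_run` is hypothetical, and the character set is assumed to be exactly the characters listed in the table, with the underscore excluded because it is searchable.

```python
import re

# Characters taken from the table; _ is left out because it is searchable.
SPECIALS = "~`!@#$%^()-=+[]{}\\|;:'\"<>,./"

# Three consecutive characters from the set, in any combination.
SPECIAL_RUN = re.compile("[" + re.escape(SPECIALS) + "]{3}")


def has_special_run(query: str) -> bool:
    """Return True if the query contains three special characters in a row."""
    return SPECIAL_RUN.search(query) is not None
```

A query such as `budget!!!` would be flagged, while `john.doe@example.com` would pass because its special characters are never three in a row.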