- Search features
- How does Simpplr search work?
- 'Did you mean' suggestions
- Coveo Search integration
- Searching for files
- Best practices
Simpplr Search (also known as global search) is one of the most powerful features in Simpplr. It's prominently displayed at the top left of every page because data shows that intranet users want to get in, complete their task, and get out. For this reason we put Search front and center, and we've invested heavily in making it best in class. Search is powerful for a number of reasons: it's smart, federated, and curated. Search results are also faceted. More on this below.
Simpplr Search is smart: it's powered by artificial intelligence (AI) and adaptive machine learning (ML). It takes into account your Simpplr profile data, such as department and location, to serve personalized search results. As a result, two users running the same search may not see the same results.
Simpplr search will also show you recent searches you've made each time you go to search. This makes it easy to find recently used information.
Finally, our auto-suggested results feature will suggest results based on the initial characters you type into the search box. This saves time and allows users to be more efficient.
Simpplr Search is federated, meaning in addition to searching content from the entire intranet, it also searches any integrations plugged into Simpplr, such as file repositories like Dropbox, Google MyDrive, or SharePoint. Again, this allows your team members to find any content from one centralized location.
As of the 23.09 release, Search has been enhanced to include Neural search. Note: this feature is being rolled out gradually and will not be immediately available to all customers.
With the introduction of Neural search, the search experience is shifting from traditional keyword matching to a more powerful hybrid search that combines the best of both worlds. It uses AI to determine relationships between data points and converts data to vectors, which facilitates speed and flexibility. By using vectors (numerical arrays that capture meaning and similarity), Neural search goes beyond matching keywords to find results that are more relevant and accurate. It uses natural language processing to understand the user's search intent, particularly for long-tail queries (triggered for two or more words) where keyword/result pairs don't match perfectly.
For example, if you search for 'product manager in Canada', with Neural search, the product manager based in Canada is listed at the top. The same would be applied to 'product managers in UK'.
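To make the idea of hybrid search more concrete, here is a minimal sketch that blends a keyword-overlap score with a vector-similarity score. It is an illustration only, not Simpplr's implementation: the function names, the equal weighting (`alpha`), and the hand-made three-number vectors are all assumptions. A real system would obtain its vectors from a trained language model.

```python
import math
import re


def tokens(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def keyword_overlap(query, text):
    """Fraction of distinct query words that also appear in the text."""
    query_words = set(tokens(query))
    text_words = set(tokens(text))
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_score(query, query_vec, doc_text, doc_vec, alpha=0.5):
    """Blend keyword and vector scores; alpha is a hypothetical mixing weight."""
    keyword_part = keyword_overlap(query, doc_text)
    vector_part = cosine_similarity(query_vec, doc_vec)
    return alpha * keyword_part + (1 - alpha) * vector_part


# Toy vectors, invented for this example: [job role, location, calendar].
query = "product manager in Canada"
query_vec = [1.0, 1.0, 0.0]
docs = {
    "Product Manager, Toronto office": [0.9, 0.8, 0.0],
    "Canada holiday calendar": [0.0, 0.6, 0.9],
}

ranked = sorted(
    docs,
    key=lambda text: hybrid_score(query, query_vec, text, docs[text]),
    reverse=True,
)
print(ranked)
```

In this toy data the Toronto profile never mentions the word "Canada", yet it ranks first because its vector is close to the query's vector. That is the kind of match keyword search alone would miss.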
Exact match results
By default, global search looks across multiple sources based on the keywords and phrases you enter. The introduction of Neural search reinforces this default method of searching. However, when you know exactly which words or phrases the content contains, you can filter the results down to an 'exact match' with the entered keywords (or phrases) to get to the content quickly.
When you check the Exact match checkbox, the search results are filtered to display only content (including content from third parties) that has:
- All the words typed in the search box present
- All the words present in the same sequence as in the search phrase.
The 'exact match' checkbox is displayed to all end users/employees after the initial set of results has loaded.
For example, suppose you want to find your company's Employee Handbook, specifically the one for India. You type 'India Employee Handbook', which returns 40+ results without Exact match. Checking the Exact match checkbox filters the results down to 1: the most relevant result, which has the words "India Employee Handbook" in the same sequence.
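The two Exact match conditions (all words present, in the same sequence) can be sketched as a simple filter. This is a hypothetical illustration, not Simpplr's code: the function names are invented, and treating the match as case-insensitive is an assumption.

```python
import re


def words(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def is_exact_match(query, text):
    """True if every query word appears in the text, consecutively and in order."""
    query_words = words(query)
    text_words = words(text)
    if not query_words:
        return False
    n = len(query_words)
    # Slide a window of n words across the text and compare it to the query.
    return any(
        text_words[i:i + n] == query_words
        for i in range(len(text_words) - n + 1)
    )


def filter_exact(query, results):
    """Keep only the results that satisfy the exact-match rule."""
    return [text for text in results if is_exact_match(query, text)]


results = [
    "India Employee Handbook 2023",
    "Employee Handbook for India contractors",
    "Employee onboarding in India",
]
print(filter_exact("India Employee Handbook", results))
```

The second result contains all three words but not in the same sequence, so it is filtered out along with the third.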
How does Simpplr search work?
There are various ways to weigh the search. The main types we use are relevancy, recency, popularity and personalization. We’ll discuss what these mean below.
These can be added cumulatively or work independently. When the different methods are combined, the overall outcome is often difficult to predict.
|Weighting being used
|Relevancy > Recency
|Relevancy > Recency
|Relevancy > Recency
|Relevancy > Recency
|Relevancy > Recency
The fields/values below have a differential weight in Search, making them appear higher or lower in the search results if there is a match with the search keyword.
- Page title
- Page body content (does not include HTML embed code and hyperlinks)
- Site name
- Site category
- Page category name
- Topic names
- Question titles
- File title
- File body
- Video title
- Tile titles
Once the search calculates relevancy scores, the app further adjusts the relevancy using the following 'boosts':
- People in the same department - boosted 1.1x
- Content published more than 18 months ago - boosted 0.1x
- Page published in the last 12 months - boosted 2.5x
- User location appears in the title or body of content published in the last 12 months - boosted 2x
- Topics matching for content published in the last 12 months - boosted 2x
- Page category for content published in the last 12 months - boosted 2x
- Event name (last 3 months or next 3 months) - boosted 1.1x
- Album (edited in the last 12 months) - boosted 1.1x
- Files (published in the last 12 months) - boosted 2.5x
- Videos (published in the last 12 months) - boosted 2.5x
- Files (published more than 18 months ago) - boosted 0.1x
- Videos (published more than 18 months ago) - boosted 0.1x
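As a rough illustration of the recency boosts listed above, the sketch below multiplies a relevancy score by 2.5 for content published in the last 12 months and by 0.1 for content older than 18 months. How Simpplr actually combines boosts is not documented here, so the multiplication, the function names, and the day-based approximation of months are all assumptions.

```python
from datetime import date, timedelta

# Assumption: months are approximated in days for simplicity.
RECENT_WINDOW = timedelta(days=365)  # roughly 12 months
STALE_WINDOW = timedelta(days=548)   # roughly 18 months


def recency_boost(published, today):
    """Return the multiplier for a page, file, or video based on its age."""
    age = today - published
    if age <= RECENT_WINDOW:
        return 2.5  # published in the last 12 months
    if age > STALE_WINDOW:
        return 0.1  # published more than 18 months ago
    return 1.0      # in between: no boost either way


def boosted_score(relevancy, published, today):
    """Apply the recency boost to a base relevancy score (hypothetical)."""
    return relevancy * recency_boost(published, today)


today = date(2024, 1, 1)
print(boosted_score(2.0, date(2023, 9, 1), today))   # recent content
print(boosted_score(2.0, date(2022, 10, 1), today))  # 12 to 18 months old
print(boosted_score(2.0, date(2022, 1, 1), today))   # older than 18 months
```

With these numbers, a recent item outranks an older item even if the older one started with a somewhat higher relevancy score, which matches the recency-first behavior described in this article.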
Note: Currently, Simpplr Search does not support Boolean operators.
Native Video and File search results
As of the 23.09 release, Simpplr Native Video content and files uploaded directly to Simpplr can be searched and displayed in the top search results based on relevancy and recency. With most intranet content (except feed and apps) now displayed in a single unified ranked list, there is a greater likelihood of finding what you need without applying additional filters or navigating multiple pages.
The below fields for Video and Files have a differential ‘weight’ thus making them appear higher or lower in the search result if there is a match with the search keyword:
Video Title - 0.8
File Title - 0.8
File Body - 0.2
Note: Only Video titles are searched and returned in top search results, not Video transcripts or descriptions. To find specific video transcript content, you'll need to filter the search results by Video only.
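The field weights above (0.8 for titles, 0.2 for file body) can be pictured as a weighted sum. The sketch below is an assumption about how such weights might be combined, not a description of Simpplr's scoring: the `term_fraction` helper and the additive formula are hypothetical.

```python
import re

# Weights taken from the list above.
FIELD_WEIGHTS = {"video_title": 0.8, "file_title": 0.8, "file_body": 0.2}


def term_fraction(query, text):
    """Fraction of query words found in the text (hypothetical match measure)."""
    query_words = re.findall(r"\w+", query.lower())
    text_words = set(re.findall(r"\w+", text.lower()))
    if not query_words:
        return 0.0
    hits = sum(1 for word in query_words if word in text_words)
    return hits / len(query_words)


def file_score(query, title, body):
    """Weighted sum of title and body matches for a file."""
    return (
        FIELD_WEIGHTS["file_title"] * term_fraction(query, title)
        + FIELD_WEIGHTS["file_body"] * term_fraction(query, body)
    )


title_hit = file_score("handbook", "Employee Handbook", "Welcome to the team")
body_hit = file_score("handbook", "Policies", "Please read the handbook")
print(title_hit, body_hit)
```

A match in the title contributes four times as much as a match in the body, so the file with "Handbook" in its title scores higher.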
Further boosting takes place to get you the most recent content ranked higher.
- Video added in the last 12 months boosted higher - 2.5x
- File added in the last 12 months boosted higher - 2.5x
- Video added earlier than 18 months boosted lower - 0.1x
- File added earlier than 18 months boosted lower - 0.1x
When searching for internally stored files (i.e., files uploaded and stored directly in Simpplr), Simpplr Search looks at the file name and text within the file to determine keyword matching. All text within the file is searched, regardless of length of file text/content. If there is a match in the file name, that will weigh higher than a match in the body of file content. This is applicable to all file types.
File search results appear in a separate tab from the main list of content-related search results. The following file types are not searched: "jpg","gif","png","jpeg","JPG","GIF","PNG","JPEG".
For third-party file storage integrations, Simpplr sends the vendor the keyword or phrase you're searching for, and the vendor returns the results in whatever order it determines. We don't know the logic the vendor uses (i.e., relevancy/weight, etc.). The only exception is the Confluence integration: wherever there's a match in the title of a Confluence doc, that doc sorts first in the top 10 results.
Relevancy is the starting point for all searches. This is the process of matching the query to results.
Some analyzers run both while indexing the documents and while making the query:
- Words are made lowercase to increase the chance of matches
- Special characters are removed
- E.g. ‘Wi-fi’ matches ‘wifi’ and ‘wi fi’
- Word variations are matched
- E.g. ‘runs’ matches ‘runs’, ‘running’, ‘runners’
Neural search looks at 'stop' words like the ones below, because it uses all the words in the query to form context. For example, in "HR Managers in Canada", the word "in" is part of that context.
- a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, such, that, the, their, then, there, these, they, this, to, was, will, with
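The lowercasing and special-character steps described above can be sketched as follows. This is a simplified illustration rather than the analyzer Simpplr runs: the function names are invented, and generating two variants (special characters dropped, and special characters turned into spaces) is an assumption made so that ‘Wi-fi’ can match both ‘wifi’ and ‘wi fi’.

```python
import re


def variants(term):
    """Return normalized forms of a term: lowercase, special characters handled."""
    lowered = term.lower()
    joined = re.sub(r"[^\w\s]", "", lowered)   # special characters dropped
    spaced = re.sub(r"[^\w\s]", " ", lowered)  # special characters become spaces
    # Collapse repeated whitespace so the variants compare cleanly.
    return {" ".join(joined.split()), " ".join(spaced.split())}


def terms_match(a, b):
    """True if the two terms share at least one normalized form."""
    return bool(variants(a) & variants(b))


print(variants("Wi-fi"))
print(terms_match("Wi-fi", "wifi"))
print(terms_match("Wi-fi", "wi fi"))
print(terms_match("Wi-fi", "wired"))
```

Applying the same normalization at index time and at query time is what lets differently typed versions of a word find each other.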
Some analyzers run only while querying:
- TF-IDF (Term Frequency-Inverse Document Frequency): if a word is repeated a lot across documents it is given much less weight, which helps key words have more prominence in the results
- Fuzzy matching: covers misspellings and matches data even when there are multiple differences. Users are also prompted with a 'Did you mean' suggestion
- The longer the query, the greater the threshold for mistakes
- Mistakes can be any letters that are incorrectly added
- We have gone with the standard fuzzy matching rules of Elasticsearch
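To show how a length-based threshold for mistakes can work, here is a small sketch of fuzzy matching using edit distance. The thresholds mirror Elasticsearch's `AUTO` fuzziness defaults (terms of 1 to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two). Whether Simpplr uses exactly these settings is an assumption, and this sketch uses plain Levenshtein distance, which is a simplification of what Elasticsearch does (it also treats swapping two adjacent letters as a single edit).

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def allowed_edits(term):
    """Number of mistakes tolerated, growing with the length of the term."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def fuzzy_match(query_term, indexed_term):
    """True if the indexed term is within the allowed number of edits."""
    distance = edit_distance(query_term.lower(), indexed_term.lower())
    return distance <= allowed_edits(query_term)


print(fuzzy_match("benifits", "benefits"))  # one wrong letter in a long word
print(fuzzy_match("handbok", "handbook"))   # one missing letter
print(fuzzy_match("hr", "it"))              # short terms must match exactly
```

Short terms get no tolerance because a single changed letter in a two-letter word usually produces a different word entirely.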
If there is only one result from the search, then relevancy is all that's needed. When there are multiple results, we will consider other factors.
As of the 23.06 release, we have improved relevancy of Search results.
- Relevancy ranking is more accurate with a greater likelihood of finding what you need without navigating multiple pages.
- User location is factored into relevancy ranking for finding content with the user’s location in the title.
- When searching for certain users, people from the same department as you are ranked higher in relevancy.
Prefix matching is used to complete a search term by predicting the ending based on the prefix you’ve typed.
Although this is a powerful search feature, it greatly increases the index size:
E.g. To use prefix matching for the term 'Adam', the index needs to store A, Ad, Ada and Adam.
If prefix matching were used on the whole index, it would slow search down considerably and could return very confusing results. Because of this limitation, prefix matching is used sparingly.
The autocomplete function uses prefix matching for all titles:
- Site names
- People names
- Content titles
The global search only uses prefix matching on:
- People names
We added prefixing on people names in the global search because without it, the results seemed odd at times:
- E.g. Typing ‘Jo’ into autocomplete would return the results, ‘Joe’, ‘Jonathan’, ‘Jovita’. But if you then did a global search using ‘Jo’ there would be no results.
- Now that we have prefix matching in the global search this isn’t a problem.
- However for the sake of index size and search speed we made the decision not to include it for site names and content titles in the global search.
Having spent some time experimenting with different combinations of search functions, Simpplr decided that within the confines of an intranet, for the vast majority of content, the most important factor for weighting results (apart from relevancy) is recency.
- If you search for a ‘Company Update’ there may be hundreds of results, but the one you are most likely to want to view is the most recent
- If you search for ‘Benefits’, you want to know it’s the most recent version of the Benefits policy that appears at the top of the results
Although there may be some occasions where the recency of a piece of content is not as important as its popularity, we feel this will be an exception.
The result of this is that when users are searching, the results should generally appear with the most recent content first.
There may be some discrepancies due to one piece of content being more relevant to the initial term than others, so it ends up with a higher weighting.
- E.g. A piece of content contains the initial search term multiple times in the title and summary
Search can find relevant content based on numeric strings you input. For example, if you search 8765, the top results will include content with numeric values containing that string of digits. Often, this can be a policy number you only remember the first few digits of. Search results also include characters found in files uploaded to the intranet, such as PDFs.
‘Did you mean’ suggestions
When no results are found, Simpplr suggests similar content based on your search query. Simpplr uses the phrase suggestion features of Elasticsearch to display more contextual and relevant suggestions.
What's included in global search and auto-complete suggestions?
Global search and auto-complete suggestions include results from the following:
- All tile titles (site and home dashboard). Note that only tiles you have access to as a user will be searched. In other words, if you don't have access to a certain site, you won't see any of that site's tiles when searching.
- Apps tab
Coveo Search integration
Simpplr’s search is integrated with Coveo. Coveo searches for content not just against Simpplr data, but across multiple sources integrated with Coveo. For more information on our Coveo integration, check out this article.
Best practices
Keep Search clean and functional by being specific with your topics. As a general practice, add no more than six topics to any one piece of content. While App managers can always go in and manage topics to keep them from getting overcrowded, this job is easier if you encourage your users to limit the topics they add.
We recommend this because, if you eventually have 100+ topics throughout your intranet, Simpplr Search can get muddy and work less efficiently, since it has to search too many topics for similar content.