Simpplr Search

Overview

Simpplr Search (also known as global search) is one of the most powerful features in Simpplr. It's prominently displayed at the top left of every page because data shows that intranet users want to get in, complete their task, and get out. For this reason we put Search front and center, and we’ve invested heavily in making our search best in class. Search is powerful for a number of reasons: it’s smart, federated, and curated, and its results are faceted. More on this below.

Search features

Smart

Simpplr Search is smart: it's powered by artificial intelligence (AI) and adaptive machine learning (ML). It takes into account your Simpplr profile data, such as department and location, to serve personalized search results. As a result, two users running the same search may see different results.

Simpplr search will also show you recent searches you've made each time you go to search. This makes it easy to find recently used information.

Finally, our auto-suggested results feature will suggest results based on the initial characters you type into the search box. This saves time and allows users to be more efficient.

Federated

Simpplr Search is federated, meaning that in addition to searching content from the entire intranet, it also searches any integrations plugged into Simpplr, such as file repositories like Dropbox, Google MyDrive, or SharePoint. This allows your team members to find content from one centralized location.

Neural

As of the 23.09 release, Search has been enhanced to include Neural search. Note that this feature is being rolled out gradually and will not be immediately available to all customers.

With the introduction of Neural search, the search experience is shifting from traditional keyword matching to a more powerful hybrid search that gets the best of both worlds. It uses AI to determine relationships between data points and converts data to vectors (numerical arrays that capture meaning and similarity), which facilitates speed and flexibility. By using vectors, Neural search goes beyond matching keywords to find results that are more relevant and accurate. It uses natural language processing to understand the user's search intent, particularly for long-tail queries (triggered for queries of two or more words) where keyword/result pairs don’t match perfectly.

For example, if you search for 'product manager in Canada' with Neural search, the product manager based in Canada is listed at the top. The same applies to 'product managers in UK'.
before and after neural search.png
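
As a rough illustration of hybrid scoring, the sketch below blends a simple keyword score with a vector (cosine) similarity. It is not Simpplr's implementation: the three-number vectors, the document titles, and the `alpha` blend weight are all made up for the example, and a real system would produce the vectors with a trained language model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def keyword_score(query, text):
    """Fraction of query words that appear in the text (toy keyword match)."""
    query_words = query.lower().split()
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return sum(1 for w in query_words if w in text_words) / len(query_words)

def hybrid_score(query, query_vec, doc_text, doc_vec, alpha=0.5):
    """Blend keyword and vector similarity; alpha is a hypothetical weight."""
    keyword_part = alpha * keyword_score(query, doc_text)
    vector_part = (1 - alpha) * cosine_similarity(query_vec, doc_vec)
    return keyword_part + vector_part

# Hypothetical query and document vectors, invented for this example.
query = "product manager in Canada"
query_vec = [0.9, 0.8, 0.1]
docs = {
    "Product Manager based in Toronto": [0.9, 0.7, 0.2],
    "Canada holiday calendar": [0.1, 0.9, 0.3],
}
ranked = sorted(
    docs,
    key=lambda title: hybrid_score(query, query_vec, title, docs[title]),
    reverse=True,
)
```

In this toy data the Toronto profile ranks first even though it never contains the word 'Canada', because its vector is close to the query vector. That is the kind of result keyword matching alone would miss.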

Exact match results

By default, global search looks across multiple sources based on the keywords and phrases you enter. The introduction of Neural search reinforces this default method of searching. However, when you know exactly which words or phrases the content contains, you can filter the results down to an 'exact match' with the entered keywords (or phrases) to get to the content quickly.

When you check the Exact match checkbox, the search results are filtered to display only content (including content from third-party sources) that has:

  • All the words typed in the search box present

AND

  • All the words present in the same sequence as in the search phrase.

The 'exact match' checkbox is displayed to all end users/employees after the initial set of results is loaded.

For example, suppose you want to find your company's Employee Handbook, specifically the one for India. You type 'India Employee Handbook', which returns 40+ results without Exact match. Checking the Exact match checkbox filters the results down to 1: the most relevant result, which contains the words "India Employee Handbook" in that sequence.
exact match.png
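
The two Exact match conditions (all words present, in the same sequence) can be sketched as a simple filter. This is only an illustration of the rule described above, not the product's code; the function name and the sample titles are hypothetical.

```python
def is_exact_match(query, text):
    """True if every query word appears in text, consecutively and in order."""
    query_words = query.lower().split()
    text_words = text.lower().split()
    n = len(query_words)
    if n == 0:
        return False
    # Slide a window of n words over the text and compare it to the query.
    return any(
        text_words[i:i + n] == query_words
        for i in range(len(text_words) - n + 1)
    )

# Hypothetical result titles.
results = [
    "India Employee Handbook 2023",
    "Employee Handbook for India",
    "India office opening",
]
exact = [r for r in results if is_exact_match("India Employee Handbook", r)]
```

Here only the first title survives the filter. The second contains all three words but not in the same sequence, and the third is missing two of them.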

How does Simpplr search work?

There are various ways to weight search results. The main types we use are relevancy, recency, popularity, and personalization. We’ll discuss what these mean below.

These can be added cumulatively or work independently. When the different methods are combined, the overall outcome is often difficult to predict.

Search type    Weighting used
Content        Relevancy > Recency
People         Relevancy > Recency
Sites          Relevancy > Recency
Files          Relevancy > Recency
Video          Relevancy > Recency

The fields/values below have a differential weight in Search, making them appear higher or lower in the search results if there is a match with the search keyword.

  • Page title 
  • Page body content (does not include HTML embed code and hyperlinks)
  • Site name
  • Site category 
  • Page category name 
  • Topic names 
  • Question titles 
  • Expertise 
  • File title
  • File body
  • Video title
  • Tile titles

Once the search calculates relevancy scores, the app further tweaks the relevancy using the following 'boosts': 

    • People in the same department - boosted 1.1x
    • Content published more than 18 months ago - boosted 0.1x
    • Page published in the last 12 months - boosted 2.5x
    • User location is in the title or body of content published in the last 12 months - boosted 2x
    • Topics matching for content published in the last 12 months - boosted 2x
    • Page category for content published in the last 12 months - boosted 2x
    • Event name (last 3 months or next 3 months) - boosted 1.1x
    • Album (edited in the last 12 months) - boosted 1.1x
    • Files (published in the last 12 months) - boosted 2.5x
    • Videos (published in the last 12 months) - boosted 2.5x
    • Files (published more than 18 months ago) - boosted 0.1x
    • Videos (published more than 18 months ago) - boosted 0.1x
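
To show how a recency boost can change the ranking, here is a minimal sketch that applies the 2.5x and 0.1x multipliers listed above for pages, files, and videos. It makes several assumptions: the boost is a simple multiplier on the relevancy score, age is counted in whole calendar months, and content between 12 and 18 months old gets no boost (the list does not say what happens in that window). Per the comment thread on this article, only the initial publish date is considered.

```python
from datetime import date

def recency_boost(published, today):
    """Multiplier based on the initial publish date (age in calendar months)."""
    age_months = (today.year - published.year) * 12 + (today.month - published.month)
    if age_months <= 12:
        return 2.5   # published in the last 12 months
    if age_months > 18:
        return 0.1   # published more than 18 months ago
    return 1.0       # assumption: no boost between 12 and 18 months

def boosted_score(relevancy, published, today):
    """Relevancy score after the recency multiplier is applied."""
    return relevancy * recency_boost(published, today)

# Hypothetical scores: an old page with strong relevancy vs. a recent page.
today = date(2024, 6, 1)
old_page = boosted_score(0.9, date(2022, 1, 1), today)
new_page = boosted_score(0.4, date(2024, 1, 15), today)
```

With these made-up numbers the recent page ends up well ahead of the older one, even though the older page matched the query more strongly.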

Note:

Currently Simpplr Search does not support Boolean operators.

 

Native Video and File search results

As of the 23.09 release, Simpplr Native Video content and files uploaded directly to Simpplr can be searched and displayed among the top search results based on relevancy and recency. With most of the intranet content (except feed and apps) now displayed in a single unified, ranked list, there is a greater likelihood of finding what you need without applying additional filters or navigating multiple pages.

The fields below for Video and Files have a differential ‘weight’, making them appear higher or lower in the search results if there is a match with the search keyword:

  • Video Title - 0.8

  • File Title - 0.8

  • File Body - 0.2
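
One way to read these weights is as a score contribution per matching field. The sketch below assumes the weights are simply added for every field that contains the keyword. How the weights are actually combined is not documented here, and the field names are hypothetical.

```python
# Weights taken from the list above; the dictionary keys are hypothetical names.
FIELD_WEIGHTS = {"video_title": 0.8, "file_title": 0.8, "file_body": 0.2}

def weighted_match_score(query, fields):
    """Sum the weights of every field that contains the query (case-insensitive)."""
    q = query.lower()
    return sum(
        FIELD_WEIGHTS.get(name, 0.0)
        for name, text in fields.items()
        if q in text.lower()
    )

title_hit = weighted_match_score(
    "handbook", {"file_title": "Employee Handbook", "file_body": "Welcome aboard"}
)
body_hit = weighted_match_score(
    "handbook", {"file_title": "Onboarding", "file_body": "See the handbook"}
)
```

Under this reading, a file with the keyword in its title scores higher than one that only mentions it in the body.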

Note:

Only Video titles are searched and returned in top search results, not Video transcripts or descriptions. To find specific video transcript content, you'll need to filter the search results to Video only.

Further boosting takes place to get you the most recent content ranked higher.

  • Video added in the last 12 months boosted higher - 2.5x
  • File added in the last 12 months boosted higher - 2.5x
  • Video added more than 18 months ago boosted lower - 0.1x
  • File added more than 18 months ago boosted lower - 0.1x
    Video search results.png

When searching for internally stored files (i.e., files uploaded and stored directly in Simpplr), Simpplr Search looks at the file name and text within the file to determine keyword matching. All text within the file is searched, regardless of length of file text/content. If there is a match in the file name, that will weigh higher than a match in the body of file content. This is applicable to all file types.

File search results appear in a separate tab from the main list of content-related search results. The following file types are not searched: "jpg","gif","png","jpeg","JPG","GIF","PNG","JPEG". 

For third party-connected file storage integration apps, Simpplr will send the vendor the keyword/phrase you're searching, and the vendor provides the results in whatever order they determine. We don't know the logic they're using (i.e., relevancy/weight, etc.). The only exception to this is for the Confluence integration. With Confluence, wherever there's a match in the title of the Confluence doc, it sorts first in the top 10 results.

Relevancy 

Relevancy is the starting point for all searches. This is the process of matching the query to results.

Some analyzers are applied both while indexing documents and while running the query:

  • Lowercase 
    • Words are converted to lowercase to increase the chance of matches
  • Remove special characters 
    • E.g. ‘Wi-fi’ matches ‘wifi’ and ‘wi fi’
  • Stemming 
    • E.g. ‘runs’ matches ‘runs’, ‘running’, ‘runners’
  • Neural search takes 'stop' words like the ones below into account, because it looks at all words to form context. For example, "HR Managers in Canada"
    • a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, such, that, the, their, then, there, these, they, this, to, was, will, with 
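
The first three analyzers in the list above can be pictured with a toy normalizer. This is a deliberately simplified sketch, not the analyzer chain Simpplr actually runs: `normalize` handles only unaccented letters and digits, and `toy_stem` is a hypothetical suffix stripper that is far cruder than a real stemming algorithm.

```python
import re

def normalize(term):
    """Lowercase and drop everything except a-z and 0-9 (including spaces)."""
    return re.sub(r"[^a-z0-9]", "", term.lower())

def toy_stem(word):
    """Very rough suffix stripping, for illustration only (not a real stemmer)."""
    word = word.lower()
    for suffix in ("ers", "ing", "s"):
        # Only strip when at least three letters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Collapse a doubled final consonant left behind, e.g. "runn" -> "run".
    if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word

wifi_forms = {normalize(t) for t in ("Wi-fi", "wifi", "wi fi")}
run_forms = {toy_stem(w) for w in ("runs", "running", "runners")}
```

All three spellings of 'Wi-fi' collapse to one form, and all three 'run' variants collapse to one stem, which is why they can match each other at query time.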

Some analyzers are applied only while querying:

  • TF-IDF (Term Frequency-Inverse Document Frequency) 
    • A word that is repeated across many documents is given much less weight, which gives key words more prominence in the results
  • Mismatched spellings
    • Fuzzy matching covers misspellings and matches data even when there are multiple differences. Users are also prompted with a 'Did you mean' suggestion
    • The longer the query, the greater the threshold for mistakes
    • Mistakes can be any letters that are incorrectly added 
    • We use the standard fuzzy matching rules of Elasticsearch 
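
The two query-time ideas above can be sketched in a few lines. Both halves rest on assumptions. The IDF formula is one common smoothed variant, not necessarily the scoring Simpplr uses. The edit budget assumes that 'standard fuzzy matching rules' refers to Elasticsearch's documented AUTO fuzziness (terms of 1-2 characters must match exactly, 3-5 characters allow one edit, longer terms allow two). The distance here is plain Levenshtein, whereas Elasticsearch by default also counts a swap of two adjacent letters as a single edit.

```python
import math

def idf(term, documents):
    """Inverse document frequency (smoothed): rarer terms get a larger value."""
    containing = sum(1 for doc in documents if term in doc.lower().split())
    return math.log((1 + len(documents)) / (1 + containing)) + 1

def tf_idf(term, document, documents):
    """Term frequency in one document, scaled by how rare the term is overall."""
    words = document.lower().split()
    tf = words.count(term) / len(words) if words else 0.0
    return tf * idf(term, documents)

def edit_distance(a, b):
    """Levenshtein distance: minimum inserts, deletes, or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]

def allowed_edits(term):
    """Edit budget by term length (assumed AUTO fuzziness thresholds)."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2

def fuzzy_match(query_term, indexed_term):
    """True if the indexed term is within the query term's edit budget."""
    q, t = query_term.lower(), indexed_term.lower()
    return edit_distance(q, t) <= allowed_edits(q)

# Hypothetical mini-corpus.
docs = ["the benefits policy", "the company update", "the holiday calendar"]
common = tf_idf("the", docs[0], docs)
rare = tf_idf("benefits", docs[0], docs)
```

In this corpus 'the' appears in every document, so 'benefits' gets the higher weight. Similarly, `fuzzy_match("benifits", "benefits")` succeeds because an eight-letter term is allowed two edits, while a two-letter term such as 'hr' must match exactly.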

If there is only one result from the search, then relevancy is all that's needed. When there are multiple results, we will consider other factors.

As of the 23.06 release, we have improved relevancy of Search results.

  • Relevancy ranking is more accurate with a greater likelihood of finding what you need without navigating multiple pages.
  • User location is factored into relevancy ranking for finding content with the user’s location in the title.
  • When searching for certain users, people from the same department as you are ranked higher in relevancy.

Prefix matching

Prefix matching is used to complete a search term by predicting the ending based on the prefix you’ve typed. 

Although this is a powerful feature in search, it increases the index size significantly:

  • E.g. To use prefix matching for the term 'Adam', the index will need to store A, Ad, Ada and Adam. 

If prefix matching were used on the whole index, it would slow the search function down considerably and potentially return very confusing results. Because of this limitation, prefix matching is used sparingly.
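
The index growth described above is easy to see in a small sketch. The code below stores every prefix of each name, the way the 'Adam' example describes. The function names are hypothetical and the index is a plain in-memory dictionary, not the real search index.

```python
from collections import defaultdict

def edge_ngrams(term):
    """All prefixes of a term, lowercased: 'Adam' -> ['a', 'ad', 'ada', 'adam']."""
    term = term.lower()
    return [term[:i] for i in range(1, len(term) + 1)]

def build_prefix_index(names):
    """Map every prefix to the set of names that start with it."""
    index = defaultdict(set)
    for name in names:
        for prefix in edge_ngrams(name):
            index[prefix].add(name)
    return index

index = build_prefix_index(["Adam", "Joe", "Jonathan", "Jovita"])
jo_matches = sorted(index["jo"])
```

Four names already produce 17 index entries, which is why applying prefix matching to every title and body in the index would be costly. In exchange, a lookup for 'jo' immediately finds 'Joe', 'Jonathan', and 'Jovita'.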

The autocomplete function uses prefix matching for all titles: 

  • Site names 
  • People names 
  • Content titles 

The global search only uses prefix matching on:

  • People names

We added prefixing on people names in the global search because without it, the results seemed odd at times:

  • E.g. Typing ‘Jo’ into autocomplete would return the results, ‘Joe’, ‘Jonathan’, ‘Jovita’. But if you then did a global search using ‘Jo’ there would be no results.
  • Now that we have prefix matching in the global search this isn’t a problem. 
  • However, for the sake of index size and search speed, we decided not to include it for site names and content titles in the global search.

Recency

Having spent some time experimenting with different combinations of search functions, Simpplr decided that within the confines of an intranet, for the vast majority of content, the most important factor for weighting results (apart from relevancy) is recency.

  • Examples:
    • If you search for a ‘Company Update’, there may be hundreds of results, but the one you are most likely to want to view is the most recent
    • If you search for ‘Benefits’, you want to know it’s the most recent version of the Benefits policy that appears at the top of the results

Although there may be some occasions where the recency of a piece of content is not as important as its popularity, we feel this will be an exception. 

The result of this is that when users are searching, the results should generally appear in reverse chronological order (most recent first).

  • There may be some discrepancies due to one piece of content being more relevant to the initial term than others, so it ends up with a higher weighting. 
    • E.g. A piece of content contains the initial search term multiple times in the title and summary 
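
One simple way to picture 'Relevancy > Recency' is as a two-level sort: relevancy decides first, and publish date breaks ties. This is only an interpretation for illustration, since the actual ranking combines several weighted factors. The result records and scores below are made up.

```python
from datetime import date

def rank(results):
    """Sort by relevancy (highest first), then by publish date (newest first)."""
    return sorted(
        results,
        key=lambda r: (r["relevancy"], r["published"]),
        reverse=True,
    )

# Hypothetical results for the query 'Company Update'.
results = [
    {"title": "Company Update March", "relevancy": 1.0, "published": date(2024, 3, 1)},
    {"title": "Company Update May", "relevancy": 1.0, "published": date(2024, 5, 1)},
    {"title": "Company Update: Company Update FAQ", "relevancy": 1.5, "published": date(2023, 11, 1)},
]
ordered = [r["title"] for r in rank(results)]
```

The two equally relevant updates come out newest first, while the older FAQ still leads because it matches the search term more strongly. That is the kind of discrepancy described above.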

Numeric matching

Search can find relevant content based on numeric strings you input. For example, if you search 8765, the top results will include content with numeric values containing that string of digits. Often, this can be a policy number you only remember the first few digits of. Search results also include characters found in files uploaded to the intranet, such as PDFs.

‘Did you mean’ suggestions

Suggestions for similar content based on your search query are made when no results are found. Simpplr uses the phrase suggestion features of Elasticsearch to display more contextual and relevant suggestions.

What's included in global search and auto-complete suggestions?

Global search and auto-complete suggestions include results from the following:

  • All tile titles (site and home dashboard). Note that only tiles you have access to as a user will be searched. In other words, if you don't have access to a certain site, you won't see any of that site's tiles when searching. 
  • Apps tab

Coveo Search integration

Simpplr’s search is integrated with Coveo. Coveo searches for content not just against Simpplr data, but across multiple sources integrated with Coveo. For more information on our Coveo integration, check out this article.

Best practices

Keep Search clean and functional by being specific with your topics. As a general practice, add no more than six topics to any one piece of content. While App managers can always go in and manage topics to keep them from getting overcrowded, this job is easier if you encourage your users to limit their topic additions.

We recommend this because once you have 100+ topics throughout your intranet, Simpplr Search can get muddy and work less efficiently, since it has to search too many topics for similar content. 


Comments

4 comments
  • Hi KB Team!
    Could you please advise if there are any special characters that allow for different ways of searching using the global search?
    i.e. using quotations to define a results list of exact matches for the "keyword"
    or % to search for part of a word to see what results it will pull (i.e. %doc - to see if the page you're looking for had a topic document or documents, etc...)
    those are the two that I know of, but since the Hotaka/Ida release, I'm not sure if these search features have been abandoned or if there are new ones, etc...

  • Hi Aileen. We currently do not support these functionalities. However I've filed an Idea with our Engineering team to take this into consideration.

  • Hey folks, In the section, " How does Simpplr search work?" where there's a list of factors that boost search, is "publish date" referencing initial publish date or does it include published edits?

    Ex. "Content publish date before 18 months - boosted 0.1x" - Is only content that was originally published <18 months ago boosted, or is content that was edited within 18 months also boosted?

    Thank you!

     

  • Hi Maeve. For now, only the initial publish date is considered in the algorithm. 

