Maximum performance with Sitecore Search API

VonRosen Petrov
July 10, 2019

FFW's award-winning Sitecore team shares our insights on refining Sitecore's search functionality to speed up the return of results and enhance the user experience.

The FFW team works with more than just Drupal and WordPress. We've built some award-winning Sitecore solutions as well, and in this Sitecore blog series, we'll be sharing our insights on refining Sitecore's search functionality to enhance the user experience.

In this blog post I’ll share with you 4 tips to make your Sitecore search code run up to 5x faster.

Test Scenario

First, I’ll describe the test scenario I built to exercise the Sitecore Search API and to arrive at the tips that follow. In short, I've created a Sitecore bucketable template called Article with the following fields on it:

  • Article Title (Single Line Text)
  • Article Description (Multiline Text)
  • Field One, Field Two, Field Three, Field Four, Field Five (Single Line Text)

I've also created a Sitecore bucket which contains 2000 articles. Half of the “Article Title” field values have been set to “Even” and the other half to “Odd”. The other fields have been set to random string values. I haven’t created a separate search index but have used the default sitecore_web_index for my tests.

Here is the ArticleSearchResultItem that I’ve built in my code:

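using System.ComponentModel;            // TypeConverter
using System.Runtime.Serialization;     // DataMember
using Sitecore.ContentSearch;           // IndexField
using Sitecore.ContentSearch.Converters; // IndexFieldIDValueConverter
using Sitecore.Data;                    // ID

// Maps only the index fields needed by the tests.
public class ArticleSearchResultItem {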
  [IndexField("_group"), TypeConverter(typeof(IndexFieldIDValueConverter)), DataMember]
  public virtual ID ItemId { get; set; }

  [IndexField("_template")]
  [TypeConverter(typeof(IndexFieldIDValueConverter))]
  public virtual ID TemplateId { get; set; }

  [IndexField("article_title_t")]
  public virtual string ArticleTitle { get; set; }

  [IndexField("article_description_t")]
  public virtual string ArticleDescription { get; set; }

  [IndexField("field_one_s")]
  public virtual string FieldOne { get; set; }

  [IndexField("field_two_s")]
  public virtual string FieldTwo { get; set; }

  [IndexField("field_three_s")]
  public virtual string FieldThree { get; set; }

  [IndexField("field_four_s")]
  public virtual string FieldFour { get; set; }

  [IndexField("field_five_s")]
  public virtual string FieldFive { get; set; }
}

I’ve added a Sublayout on the home page of the website, and on each page load I execute a Sitecore search for Article Title = “Even” AND TemplateID = “Article Template ID”.

Here is my initial search code snippet:

var articleTemplateID = ID.Parse("{F385BEE3-5F8E-4F84-AC77-A625077641DF}");
var text = "Even";

var index = ContentSearchManager.GetIndex("sitecore_web_index");
using (var context = index.CreateSearchContext())
{
  // var watch = System.Diagnostics.Stopwatch.StartNew();

  var resultItems = context.GetQueryable<ArticleSearchResultItem>()
     .Where(p => p.TemplateId == articleTemplateID)
     .Where(p => p.ArticleTitle.Equals(text))
     .ToList();
 
  // watch.Stop();
  // var executionMs = watch.ElapsedMilliseconds.ToString();
}

I’ve left my Stopwatch measurements as comments in the code just for your reference. Each performance test is done “cold”, which means that both Solr and IIS were restarted before each measurement. This ensures that the Solr and Sitecore caches don’t distort the measurements.
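If you want to repeat the measurement for several query variants, the Stopwatch calls can be wrapped in a small helper. The following is only a sketch: SearchTimer and Measure are hypothetical names, not part of the Sitecore API, and the helper simply times whatever delegate it is given.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of Sitecore: times any delegate with a Stopwatch.
public static class SearchTimer
{
    // Runs the query delegate once and reports the elapsed wall-clock time in ms.
    public static T Measure<T>(Func<T> query, out long elapsedMs)
    {
        var watch = Stopwatch.StartNew();
        T result = query();   // the query must be fully executed inside the delegate
        watch.Stop();
        elapsedMs = watch.ElapsedMilliseconds;
        return result;
    }
}
```

To time the initial query, the whole chain including .ToList() would go inside the lambda, so that the time spent fetching and deserializing the results is counted as well.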

TIP 1: Focus on your C# code to improve search performance

After executing my initial search code snippet, I’ve measured two metrics:

  • The C# total time of search call execution (executionMS from the initial code snippet)
  • The time Solr itself needs for the query (QTime), measured by taking the Solr query that the C# code was translated into and running it in the Solr Admin query panel

Initial results:

  • Stopwatch C# API value: executionMS = 1150ms
  • Solr Query Admin panel value (running the same translated query): QTime = 85ms

As we can see from the above results, the actual Solr query takes more than 10x less time than the total search time measured in our application (85ms versus 1150ms). Something other than the searching itself costs most of the processing time. In fact, most of our search time is spent on the result data being transferred, parsed and deserialized. That means that if we want to improve the search performance, we need to carefully control what is transferred, parsed and deserialized during our search requests.

Let’s start tweaking our search code snippet.

TIP 2: Replace Sitecore’s SearchResultItem with a custom one

Does it make any difference whether we inherit from (or directly use) the default Sitecore SearchResultItem or build and use a custom class? I ran my initial search code snippet twice. The only difference was that in the first run ArticleSearchResultItem did not inherit from Sitecore’s SearchResultItem, and in the second run it did. Here are the results:

Custom vs SearchResultItem results:

  • Custom ArticleSearchResultItem: executionMS = 1150ms
  • Inherited from Sitecore SearchResultItem: executionMS = 2190ms

Sitecore’s SearchResultItem maps a lot of properties which we don’t need during our search. That’s why it’s better for performance to create a custom search result item which contains only the properties that you actually care about during searching.

TIP 3: Use Select in your search queries

For this test I kept my custom ArticleSearchResultItem. The only change I made to my initial search code snippet is that I’ve added a Select clause which pulls only the fields that I need from Solr.

Here is how that change looks:

var resultItems = context.GetQueryable<ArticleSearchResultItem>()
   .Where(p => p.TemplateId == articleTemplateID)
   .Where(p => p.ArticleTitle.Equals(text))
   .Select(p => new ArticleSearchResultItem
   {
      ArticleTitle = p.ArticleTitle,
      ArticleDescription = p.ArticleDescription,
      FieldOne = p.FieldOne,
      FieldTwo = p.FieldTwo,
      FieldThree = p.FieldThree,
      FieldFour = p.FieldFour,
      FieldFive = p.FieldFive
   })
   .ToList();

Query Without Select vs Query With Select results:

  • Without Select in the query (the initial search code snippet): executionMS = 1150ms
  • With Select in the query: executionMS = 830ms

We can see that adding Select to our query improved performance by around 320ms. Let’s compare the translated Solr queries and see where the difference is:

  • Without Select: q=(_template:(f385bee35f8e4f84ac77a625077641df) AND article_title_t:(Even))&rows=2147483647&fq=_indexname:(sitecore_web_index)&wt=xml
  • With Select: q=(_template:(f385bee35f8e4f84ac77a625077641df) AND article_title_t:(Even))&rows=2147483647&fl=article_title_t,article_description_t,field_one_t,field_two_t,field_three_t,field_four_t,field_five_t,_uniqueid,_datasource&fq=_indexname:(sitecore_web_index)&wt=xml

When we add Select to our queries, we actually add the fl (Field List) parameter to the Solr query. The fl parameter limits the information included in a query response to a specified list of fields. This means that our performance improvement comes from pulling only the document fields we need from Solr instead of all of them.

Using Select will improve the performance of your queries, especially when you need only a few fields from your search result item. In these scenarios, you could consider using C# anonymous types in your query. Here is an example of C# anonymous type usage which will load only the Article Title from Solr:

var searchResults = context.GetQueryable<ArticleSearchResultItem>()
   .Where(p => p.TemplateId == articleTemplateID)
   .Where(p => p.ArticleTitle.Equals(text))
   .Select(p => new {
      p.ArticleTitle
   });

TIP 4: Paginate instead of fetching all search results

Until now, I was fetching all of the matching results, which in our case is 1000 results. A human can’t process 1000 results at once, so most of the time you will need just a subset of the search result data. It could be the top N results or a single page of a paginated list.

Here, it’s extremely important to limit your search result set as part of the query itself, rather than fetching all of the results and trimming them afterwards with LINQ to Objects.
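To illustrate why the order matters, here is a small sketch in plain LINQ to Objects, with no Sitecore involved. FakeIndex is a hypothetical stand-in for the search index that counts how many documents get materialized; the real search provider is only assumed to behave analogously when paging is part of the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo, not Sitecore code: compares paging after loading with paging while querying.
public static class PagingDemo
{
    // Number of fake documents produced by the last call.
    public static int Materialized;

    // Lazily yields 1000 fake documents and counts each one that is actually produced.
    private static IEnumerable<string> FakeIndex()
    {
        for (var i = 0; i < 1000; i++)
        {
            Materialized++;
            yield return "Article " + i;
        }
    }

    // Anti-pattern: ToList() loads everything first, the page is cut out afterwards.
    public static List<string> PageAfterLoading(int pageIndex, int pageSize)
    {
        Materialized = 0;
        return FakeIndex().ToList()
            .Skip(pageIndex * pageSize).Take(pageSize).ToList();
    }

    // Preferred: Skip/Take are applied before materializing, so enumeration stops early.
    public static List<string> PageWhileQuerying(int pageIndex, int pageSize)
    {
        Materialized = 0;
        return FakeIndex()
            .Skip(pageIndex * pageSize).Take(pageSize).ToList();
    }
}
```

Both methods return the same ten items for the first page, but PageAfterLoading(0, 10) produces all 1000 fake documents to get there, while PageWhileQuerying(0, 10) stops after the first ten.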

So, let’s assume that we have a pager for our Articles that displays 10 results per page and we need to get the first page results. To accomplish that, I’ve extended my query from TIP 3 adding the Page clause. Here is how it looks now:

var searchResults = context.GetQueryable<ArticleSearchResultItem>()
   .Where(p => p.TemplateId == articleTemplateID)
   .Where(p => p.ArticleTitle.Equals(text))
   .Page(0, 10)
   .Select(p => new ArticleSearchResultItem
   {
      ArticleTitle = p.ArticleTitle,
      ArticleDescription = p.ArticleDescription,
      FieldOne = p.FieldOne,
      FieldTwo = p.FieldTwo,
      FieldThree = p.FieldThree,
      FieldFour = p.FieldFour,
      FieldFive = p.FieldFive
   })
   .GetResults();

var resultItems = searchResults.Hits.Select(p => p.Document);
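For a real pager you usually start from a 1-based page number in the URL and also need the total number of pages. A minimal sketch of that arithmetic follows. PagerMath and its methods are hypothetical helpers, and the assumption is that the total hit count is read from the search results object (for example its TotalSearchResults property) and that the first argument of Page is a zero-based page index, as Page(0, 10) for the first page suggests.

```csharp
using System;

// Hypothetical helper for pager arithmetic; not part of the Sitecore API.
public static class PagerMath
{
    // Converts a 1-based page number (as shown to users) to a zero-based page index.
    public static int ToPageIndex(int pageNumber)
    {
        return Math.Max(pageNumber, 1) - 1;
    }

    // Number of pages needed to show totalHits results, rounding up.
    public static int TotalPages(int totalHits, int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        if (totalHits <= 0) return 0;
        return (totalHits + pageSize - 1) / pageSize;
    }
}
```

With the 1000 matching articles from the test scenario and 10 results per page, TotalPages(1000, 10) gives 100 pages, and page number 1 maps to page index 0.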

Paginated search data vs Not-Paginated search data results:

  • Not-Paginated Query (TIP 3 search code snippet): executionMS = 830ms
  • Paginated Query: executionMS = 445ms

We can see that adding pagination to the query made it almost twice as fast in our case!

Summary

By applying these 4 tips to our search queries, we made our code almost 5 times faster!

In comparison:

  • Not optimized query time: 2190ms
    • Using Sitecore’s SearchResultItem, without Select clause, and without pagination.
  • Optimized query time: 445ms
    • Using a custom search result item, using Select clause, and using pagination.
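The contribution of each step can be worked out from the measurements above. The snippet below is just that arithmetic written out; Speedup.Ratio is a hypothetical helper, and the millisecond values are the ones reported in this post.

```csharp
using System;

// Hypothetical helper: speedup factor between two timings, rounded to two decimals.
public static class Speedup
{
    public static double Ratio(double beforeMs, double afterMs)
    {
        return Math.Round(beforeMs / afterMs, 2);
    }
}

// Speedup.Ratio(2190, 1150)  -> 1.9   (custom result item instead of SearchResultItem)
// Speedup.Ratio(1150, 830)   -> 1.39  (adding Select)
// Speedup.Ratio(830, 445)    -> 1.87  (adding pagination)
// Speedup.Ratio(2190, 445)   -> 4.92  (all three combined)
```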

If you enjoyed this blog post, please don’t forget to check the Sitecore Search Series: Your Complete Guide to Performance Improvement blog series for more good tips!

If you need help with a Sitecore site, let us know. And if you have more Sitecore search performance tips, feel free to share them in the comments below.