Prevent Sitecore from interpreting the URL path as a language

This blog post describes a technique to prevent Sitecore CMS from throwing a language/culture exception when it parses the language from the URL in Sitecore's StripLanguage processor.

By default, StripLanguage attempts to interpret the first path segment after your domain name as a language. For example, it treats "anyurl" in http://your.domain.com/anyurl as a language and may throw an exception.

Since every URL passes through StripLanguage, these exceptions accumulate and affect your application's availability, stability and performance.
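To make the idea concrete, here is a minimal sketch, independent of Sitecore, of a guard that treats the first path segment as a language only when it appears in a known set. The names LanguagePrefixGuard, FirstSegment and IsKnownLanguage are hypothetical, and the sketch does not reproduce how Sitecore extracts the language name internally.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of Sitecore.
public static class LanguagePrefixGuard
{
    // Returns the first path segment ("/en/products" -> "en"),
    // or an empty string when the path has no segment.
    public static string FirstSegment(string path)
    {
        if (string.IsNullOrWhiteSpace(path))
        {
            return string.Empty;
        }

        var segments = path.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        return segments.Length > 0 ? segments[0] : string.Empty;
    }

    // True only when the first segment is one of the known language names.
    public static bool IsKnownLanguage(string path, ISet<string> knownLanguages)
    {
        var prefix = FirstSegment(path);
        return prefix.Length > 0 && knownLanguages.Contains(prefix);
    }
}
```

With a case-insensitive set containing "en", the path "/en/products" passes the guard while "/anyurl" does not, so only the former would be handed to the language-stripping logic.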

@SitecoreJohn has already described a solution in his blog post that works around this situation using the web.config file and a custom StripLanguage. As he explains, one drawback of his approach is that every time you create a new site, you need to add its languages to the config file using the patch below.

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <preprocessRequest>
        <processor type="Sitecore.Pipelines.PreprocessRequest.StripLanguage, Sitecore.Kernel">
          <patch:attribute name="type">SitecoreJohn.Pipelines.PreprocessRequest.StripLanguage, SitecoreJohn</patch:attribute>
          <allowedLanguges hint="list:AddValidLanguage">
            <en>en</en>
          </allowedLanguges>
        </processor>
      </preprocessRequest>
    </pipelines>
  </sitecore>
</configuration>

His implementation works this way because StripLanguage, which is a preprocessRequest processor, runs before the Sitecore context is determined (the SiteResolver processor in the httpRequestBegin pipeline fires after the preprocessRequest pipeline).

To make this fix dynamic, I use SiteContextFactory to map the current host and URL to the corresponding Sitecore site context instead of adding languages to the web.config. With this approach, I can access all languages available in the site's database using Database.GetLanguages().

namespace RabehajaLoic.Pipelines.PreprocessRequest
{
    using System.Linq;
    using Sitecore.Data.Managers;
    using Sitecore.Pipelines.PreprocessRequest;
    using Sitecore.Sites;
    using Sitecore.Web;

    public class StripLanguage : Sitecore.Pipelines.PreprocessRequest.StripLanguage
    {
        public override void Process(PreprocessRequestArgs args)
        {
            // Nothing to inspect without an HTTP context.
            if (args == null || args.Context == null)
            {
                return;
            }

            // Sitecore.Context.Site is not resolved yet at this stage,
            // so map the host and path to a site context manually.
            var url = args.Context.Request.Url;
            var sitecoreContext = SiteContextFactory.GetSiteContext(url.Host, url.PathAndQuery);

            if (sitecoreContext == null || sitecoreContext.Database == null)
            {
                return;
            }

            var languages = sitecoreContext.Database.GetLanguages();

            if (!languages.Any())
            {
                return;
            }

            var filePath = args.Context.Request.FilePath;

            if (!string.IsNullOrWhiteSpace(filePath))
            {
                var prefix = WebUtil.ExtractLanguageName(filePath);

                // No language-like prefix in the URL: nothing to strip.
                if (string.IsNullOrWhiteSpace(prefix))
                {
                    return;
                }

                // The prefix is not a language of this site's database: skip the default processor.
                if (!languages.Contains(LanguageManager.GetLanguage(prefix)))
                {
                    return;
                }
            }

            base.Process(args);
        }
    }
}

Once you apply this fix to your solution, you will notice that you can no longer open popups in your back office (the Installation Wizard, the Package Designer and the Content Editor show a "resource not found" error). This happens because back-office requests also go through our custom StripLanguage.

To overcome that, we need to filter which sites are allowed to go through our custom StripLanguage.

I used a Sitecore config patch to add the sites to ignore as comma-separated values in the web.config file. Assuming that these sites will not change in future releases of Sitecore, we can list them in the config.

<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <settings>
      <!-- Sites to ignore by custom processors -->
      <setting name="SitesToIgnoreByCustomProcessors" value="shell,login,admin,service,scheduler,system,publisher,modules_shell,modules_website"/>
    </settings>
  </sitecore>
</configuration>

Now that we have the list of all sites to be ignored, we can read this setting in the custom StripLanguage.

namespace RabehajaLoic.Pipelines.PreprocessRequest
{
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using Sitecore.Configuration;
    using Sitecore.Data.Managers;
    using Sitecore.Pipelines.PreprocessRequest;
    using Sitecore.Sites;
    using Sitecore.Web;

    public class StripLanguage : Sitecore.Pipelines.PreprocessRequest.StripLanguage
    {
        // Lower-cased site names read from the SitesToIgnoreByCustomProcessors setting.
        private static readonly List<string> sitesToIgnoreByCustomProcessors =
            Settings.GetSetting("SitesToIgnoreByCustomProcessors", string.Empty)
                .ToLowerInvariant()
                .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                .ToList();

        public override void Process(PreprocessRequestArgs args)
        {
            // Nothing to inspect without an HTTP context.
            if (args == null || args.Context == null)
            {
                return;
            }

            var url = args.Context.Request.Url;
            var sitecoreContext = SiteContextFactory.GetSiteContext(url.Host, url.PathAndQuery);

            if (sitecoreContext == null)
            {
                return;
            }

            // Do not apply the custom checks if the site name is in the ignore list;
            // fall back to the default StripLanguage behaviour instead.
            if (sitesToIgnoreByCustomProcessors.Contains(sitecoreContext.Name.ToLowerInvariant()))
            {
                base.Process(args);
                return;
            }

            if (sitecoreContext.Database == null)
            {
                return;
            }

            var languages = sitecoreContext.Database.GetLanguages();

            if (!languages.Any())
            {
                return;
            }

            var filePath = args.Context.Request.FilePath;

            if (!string.IsNullOrWhiteSpace(filePath))
            {
                var prefix = WebUtil.ExtractLanguageName(filePath);

                if (string.IsNullOrWhiteSpace(prefix))
                {
                    return;
                }

                if (!languages.Contains(LanguageManager.GetLanguage(prefix)))
                {
                    return;
                }
            }

            base.Process(args);
        }
    }
}
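The static field above lower-cases the setting and splits it on commas, which means a value such as "shell, login" would keep the leading space in " login". As a possible refinement, the following minimal sketch parses the same kind of value into a case-insensitive set and trims each entry. The IgnoredSites class and its Parse method are hypothetical names introduced for illustration; they are not part of Sitecore or of the original code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class IgnoredSites
{
    // Splits a comma-separated setting value into a case-insensitive set of site names.
    public static HashSet<string> Parse(string settingValue)
    {
        var result = new HashSet<string>(StringComparer.OrdinalIgnoreCase);

        if (string.IsNullOrWhiteSpace(settingValue))
        {
            return result;
        }

        foreach (var name in settingValue.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries))
        {
            var trimmed = name.Trim();

            // Skip entries that were only whitespace.
            if (trimmed.Length > 0)
            {
                result.Add(trimmed);
            }
        }

        return result;
    }
}
```

A set built this way can be queried with Contains(siteName) directly, without calling ToLowerInvariant on either side.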

Rather than listing all languages in a config file, you may consider this solution as a dynamic alternative for parsing the language in the URL and preventing Sitecore from throwing exceptions.

An alternative within this approach is to validate the prefix with Language.TryParse instead of checking it against the database languages with

if (!languages.Contains(LanguageManager.GetLanguage(prefix)))
{
    return;
}

The final code would look like this:

namespace RabehajaLoic.Pipelines.PreprocessRequest
{
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using Sitecore.Configuration;
    using Sitecore.Globalization;
    using Sitecore.Pipelines.PreprocessRequest;
    using Sitecore.Sites;
    using Sitecore.Web;

    public class StripLanguage : Sitecore.Pipelines.PreprocessRequest.StripLanguage
    {
        // Lower-cased site names read from the SitesToIgnoreByCustomProcessors setting.
        private static readonly List<string> sitesToIgnoreByCustomProcessors =
            Settings.GetSetting("SitesToIgnoreByCustomProcessors", string.Empty)
                .ToLowerInvariant()
                .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                .ToList();

        public override void Process(PreprocessRequestArgs args)
        {
            // Nothing to inspect without an HTTP context.
            if (args == null || args.Context == null)
            {
                return;
            }

            var url = args.Context.Request.Url;
            var sitecoreContext = SiteContextFactory.GetSiteContext(url.Host, url.PathAndQuery);

            if (sitecoreContext == null)
            {
                return;
            }

            // Do not apply the custom checks if the site name is in the ignore list;
            // fall back to the default StripLanguage behaviour instead.
            if (sitesToIgnoreByCustomProcessors.Contains(sitecoreContext.Name.ToLowerInvariant()))
            {
                base.Process(args);
                return;
            }

            var filePath = args.Context.Request.FilePath;

            if (!string.IsNullOrWhiteSpace(filePath))
            {
                var prefix = WebUtil.ExtractLanguageName(filePath);

                if (string.IsNullOrWhiteSpace(prefix))
                {
                    return;
                }

                // Skip the default processor when the prefix does not parse as a language.
                Language language;
                if (!Language.TryParse(prefix, out language))
                {
                    return;
                }
            }

            base.Process(args);
        }
    }
}
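The custom processor still has to replace the default one in the preprocessRequest pipeline. A patch modeled on the one shown earlier, without the language list, might look like the sketch below. The assembly name RabehajaLoic is an assumption; replace it with the name of the assembly that contains your class.

```xml
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
  <sitecore>
    <pipelines>
      <preprocessRequest>
        <processor type="Sitecore.Pipelines.PreprocessRequest.StripLanguage, Sitecore.Kernel">
          <!-- Assumption: "RabehajaLoic" is a hypothetical assembly name. -->
          <patch:attribute name="type">RabehajaLoic.Pipelines.PreprocessRequest.StripLanguage, RabehajaLoic</patch:attribute>
        </processor>
      </preprocessRequest>
    </pipelines>
  </sitecore>
</configuration>
```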

Conclusion

As I mentioned, this is a dynamic version of @SitecoreJohn's approach. It can still be refactored and improved into a better version.

Resources