Virtual nodes in Umbraco 15

I would like to amend Umbraco’s URL generation for v15 to selectively exclude certain doctypes from the URL structure, for example going from:

/list-page/contents/example-item

to:

/list-page/example-item

In a separate thread on the same topic, another user was advised to port the functionality of a third-party package called “Dotsee.Discipline”, which currently only supports U13; the source for this feature can be found here. I’m trying to do this, but have run into a problem with the URL provider.

When the provider attempts to generate a URL for a page with a virtual node in its path, a stack overflow exception occurs after repeated recursion into base.GetUrl via ConstructUrl. The code I am porting does mention this occurring whenever the Content.Url field is used, but now it appears to happen when GetUrl is called too. Does anybody know more about the latest changes to Umbraco’s internals which may be causing this?

For reference, this is the code I have so far - it’s very similar to the code in the Dotsee package:

URL provider:

public class VirtualNodesUrlProvider : DefaultUrlProvider
{
    private readonly GlobalSettings _globalSettings;
    private readonly IUmbracoContextFactory _umbContextFactory;

    private readonly VirtualNodesHelpers _virtualNodesHelpers;

    public VirtualNodesUrlProvider(IOptionsMonitor<RequestHandlerSettings> requestSettings, ILogger<DefaultUrlProvider> logger, ISiteDomainMapper siteDomainMapper, IUmbracoContextAccessor umbracoContextAccessor,
        UriUtility uriUtility, ILocalizationService localizationService, IOptions<GlobalSettings> globalSettings, IUmbracoContextFactory umbContextFactory, VirtualNodesHelpers virtualNodesHelpers)
        : base(requestSettings, logger, siteDomainMapper, umbracoContextAccessor, uriUtility, localizationService)
    {
        _globalSettings = globalSettings.Value;
        _umbContextFactory = umbContextFactory;
        _virtualNodesHelpers = virtualNodesHelpers;
    }

    public override UrlInfo GetUrl(IPublishedContent content, UrlMode mode, string culture, Uri current)
    {
        //Just in case
        if (content == null)
        { return null; }

        //If this is a virtual node itself, no need to handle it - should return normal URL
        bool hasVirtualNodeInPath = false;
        foreach (IPublishedContent item in content.Ancestors()) //.Union(content.Children())
        {
            if (_virtualNodesHelpers.IsVirtualNode(item))
            {
                hasVirtualNodeInPath = true;
                break;
            }
        }
        using (var umb = _umbContextFactory.EnsureUmbracoContext())
        {
            var _urlInfo = hasVirtualNodeInPath ? ConstructUrl(umb.UmbracoContext, content.Id, current, mode, content, culture) : null;
            return _urlInfo;
        }
    }

    private UrlInfo ConstructUrl(IUmbracoContext umbracoContext, int id, Uri current, UrlMode mode, IPublishedContent content, string culture)
    {
        string path = content.Path;

        //Keep path items in step with the path segments in the url.
        //Content.Path always starts with the root id (-1). If we are hiding the top-level node from the path,
        //we have to skip two path items (root and home). If we are not, we only skip one (the root).
        int pathItemsToSkip = (_globalSettings.HideTopLevelNodeFromPath == true) ? 2 : 1;

        //Get the path ids but skip what's needed in order to have the same number of elements in url and path ids.
        string[] pathIds = path.Split(',').Skip(pathItemsToSkip).Reverse().ToArray();

        //Get the default url 
        //DO NOT USE THIS - RECURSES: string url = content.Url;
        //https://our.umbraco.org/forum/developers/extending-umbraco/73533-custom-url-provider-stackoverflowerror
        //https://our.umbraco.org/forum/developers/extending-umbraco/66741-iurlprovider-cannot-evaluate-expression-because-the-current-thread-is-in-a-stack-overflow-state
        string url = null;
        try
        {
            url = base.GetUrl(content, mode, culture, current).Text;
        }
        catch (NullReferenceException ex)
        {
            return null;
        }
        //If we come from an absolute URL, strip the host part and keep it so that we can append
        //it again when returning the URL.
        string hostPart = "";
        if (url.StartsWith("http"))
        {
            Uri u = new Uri(url);
            url = url.Replace(u.GetLeftPart(UriPartial.Authority), "");
            hostPart = u.GetLeftPart(UriPartial.Authority);
        }

        //Strip leading and trailing slashes 
        if ((url.EndsWith("/")))
        {
            url = url.Substring(0, url.Length - 1);
        }
        if ((url.StartsWith("/")))
        {
            url = url.Substring(1, url.Length - 1);
        }

        //Now split the url. We should have as many elements as those in pathIds.
        string[] urlParts = url.Split('/').Reverse().ToArray();

        //Iterate the url parts. Check the corresponding path id and if the document that corresponds there
        //is of a type that must be excluded from the path, just make that url part an empty string.
        int cnt = 0;
        foreach (string p in urlParts)
        {
            if (cnt + 1 > pathIds.Length)
            {
                cnt++;
                continue;
            }
            IPublishedContent currItem = umbracoContext.Content.GetById(int.Parse(pathIds[cnt]));

            //Omit any virtual node unless it's leaf level (we still need this otherwise it will be pointing to parent's URL)
            if (_virtualNodesHelpers.IsVirtualNode(currItem) && cnt > 0)
            {
                urlParts[cnt] = "";
            }
            cnt++;
        }

        //Reconstruct the url, leaving out all parts that we emptied above. This 
        //will be our final url, without the parts that correspond to excluded nodes.
        string finalUrl = string.Join("/", urlParts.Reverse().Where(x => x != "").ToArray());

        //Just in case - check if there are trailing and leading slashes and add them if not.
        if (!(finalUrl.EndsWith("/")))
        {
            finalUrl += "/";
        }
        if (!(finalUrl.StartsWith("/")))
        {
            finalUrl = "/" + finalUrl;
        }

        finalUrl = string.Concat(hostPart, finalUrl);
        var _urlInfo = new UrlInfo(finalUrl, true, culture);

        //Voila.
        return _urlInfo;
    }
}

Content finder:

public class VirtualNodesContentFinder(IMemoryCache memoryCache, IUmbracoContextAccessor contextAccessor, ILogger<VirtualNodesContentFinder> logger) : IContentFinder
{
    public Task<bool> TryFindContent(IPublishedRequestBuilder request)
    {
        //Exit early if no Umbraco Context
        if (!contextAccessor.TryGetUmbracoContext(out var umbracoContext))
        {
            return Task.FromResult(false);
        }

        //Get a cached dictionary of urls and node ids
        var cachedVirtualNodeUrls = memoryCache.Get<Dictionary<string, int>>("cachedVirtualNodes");

        //Get the request path
        string path = request.AbsolutePathDecoded;

        //If found in the cached dictionary, get the node id from there
        if (cachedVirtualNodeUrls != null && cachedVirtualNodeUrls.ContainsKey(path))
        {
            //That's all folks
            int nodeId = cachedVirtualNodeUrls[path];
            request.SetPublishedContent(umbracoContext.Content?.GetById(nodeId));
            return Task.FromResult(true);
        }

        //If not found on the cached dictionary, traverse nodes and find the node that corresponds to the URL
        IPublishedContent item = null;
        var rootNodes = umbracoContext.Content?.GetAtRoot(request.Culture);
        try
        {
            item = rootNodes
            ?.DescendantsOrSelf<IPublishedContent>(request.Culture)
            ?.Where(x => x.Url(request.Culture) == (path + "/") || x.Url(request.Culture) == path)
            .FirstOrDefault();
        }
        catch (Exception ex)
        {
            logger.LogError(ex, string.Format("Could not get content for URL '{0}'", request.Uri.ToString()));
        }

        //If item is found, return it after adding it to the cache so we don't have to go through the same process again.
        if (cachedVirtualNodeUrls == null)
        { cachedVirtualNodeUrls = new Dictionary<string, int>(); }

        //If we have found a node that corresponds to the URL given
        if (item != null)
        {
            //This check is redundant, but better to be on the safe side.
            if (!cachedVirtualNodeUrls.ContainsKey(path))
            {
                //Add the new path and id to the dictionary so that we don't have to go through the tree again next time.
                cachedVirtualNodeUrls.Add(path, item.Id);
            }

            //Update cache
            memoryCache.Set("cachedVirtualNodes", cachedVirtualNodeUrls, new MemoryCacheEntryOptions
            {
                Priority = CacheItemPriority.High
            });

            //That's all folks
            request.SetPublishedContent(item);
            return Task.FromResult(true);
        }

        //Abandon all hope ye who enter here. This means that we didn't find a node so we return false to let
        //the next ContentFinder (if any) take over.
        return Task.FromResult(false);
    }
}

In your ConstructUrl method, you’re calling base.GetUrl, which, depending on the inheritance and overrides, might eventually call back into your overridden GetUrl method, creating a recursive loop. I’m assuming that’s happening here.

Haven’t looked at your code in depth, but you might be able to detect where it goes into a loop and guard against it.
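For what it's worth, a guard like that could be sketched roughly as below. This is only an illustration of the re-entrancy check, not a confirmed fix for v15: `GuardedProvider` and its `Func<string, string?>` inner call are hypothetical stand-ins for your provider and for `base.GetUrl`, and none of these types are Umbraco APIs. It assumes the call chain is synchronous and stays on one thread, which is why a `[ThreadStatic]` flag is enough here.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for a custom URL provider; not an Umbraco type.
public class GuardedProvider
{
    // True while the current thread is already inside GetUrl.
    [ThreadStatic]
    private static bool _insideGetUrl;

    // Stand-in for base.GetUrl, which may route back into this provider.
    private readonly Func<string, string?> _inner;

    public GuardedProvider(Func<string, string?> inner) => _inner = inner;

    public string? GetUrl(string contentKey)
    {
        // Re-entered from our own inner call: decline by returning null
        // so the caller falls through to its default behaviour.
        if (_insideGetUrl)
        {
            return null;
        }

        _insideGetUrl = true;
        try
        {
            string? url = _inner(contentKey);

            // Placeholder for the real rewriting step (here: drop one segment).
            return url?.Replace("/contents/", "/");
        }
        finally
        {
            // Always clear the flag, even if the inner call throws.
            _insideGetUrl = false;
        }
    }
}
```

To try it, pass an inner delegate that calls back into the same provider and falls back to a default URL when it gets null, for example `key => provider.GetUrl(key) ?? "/list-page/contents/" + key + "/"`. The nested call returns null instead of recursing, and the outer call then rewrites the default URL.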

As a safeguard, I’d probably rely strictly on the path instead, which I think you’re already mostly doing here:

Hi Sebastian,

Your assessment is correct in that the base.GetUrl call is responsible for the infinite recursion. What I’m wondering is what has changed between v13 and v15 of Umbraco to cause this behaviour, especially when the source of the actual Dotsee package I’m cribbing from mentions this as a known issue with content.Url. Either this is a regression, or there is a fundamental change to the underlying behaviour of Umbraco’s routing that prevents this from working in the newer version. My hope was that someone more familiar with this aspect of Umbraco’s internals would be able to point this out.

Highlighting the specific segment of the Dotsee URL provider which notes this:

//Get the default url 
//DO NOT USE THIS - RECURSES: string url = content.Url;
//https://our.umbraco.org/forum/developers/extending-umbraco/73533-custom-url-provider-stackoverflowerror
//https://our.umbraco.org/forum/developers/extending-umbraco/66741-iurlprovider-cannot-evaluate-expression-because-the-current-thread-is-in-a-stack-overflow-state
string url = null;
try
{
 url = base.GetUrl(content, mode, culture, current).Text;
} catch (NullReferenceException ex) {
 return null;
}
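
Following the earlier suggestion to rely strictly on the path, one way to sidestep the base.GetUrl call altogether would be to build the URL from the URL segments of the node and its ancestors. The sketch below only shows that segment logic in isolation. `PathNode` and `VirtualUrlBuilder` are hypothetical names, not Umbraco types, and it assumes you can already obtain the nodes in order from the top-level node down to the target page. It does not handle domains, cultures or absolute URLs.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a published content item; not an Umbraco type.
public record PathNode(string UrlSegment, bool IsVirtual);

public static class VirtualUrlBuilder
{
    // nodes: ordered from the top-level (home) node down to the target page.
    // hideTopLevelNode: mirrors the HideTopLevelNodeFromPath setting.
    public static string Build(IReadOnlyList<PathNode> nodes, bool hideTopLevelNode)
    {
        var segments = new List<string>();

        for (int i = 0; i < nodes.Count; i++)
        {
            bool isLeaf = i == nodes.Count - 1;

            // The top-level node contributes no segment when it is hidden.
            if (i == 0 && hideTopLevelNode)
            {
                continue;
            }

            // Omit virtual nodes unless they are the target page itself,
            // otherwise the page would end up with its parent's URL.
            if (nodes[i].IsVirtual && !isLeaf)
            {
                continue;
            }

            segments.Add(nodes[i].UrlSegment);
        }

        // Leading and trailing slashes, matching the URLs in the question.
        return segments.Count == 0 ? "/" : "/" + string.Join("/", segments) + "/";
    }
}
```

With the nodes home, list-page, contents (virtual) and example-item, and the top-level node hidden, this yields /list-page/example-item/, which is the shape asked for in the original question.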