Examine doesn't escape special characters?

I’m creating an IQuery and I want to make sure that only a subset of documents are searched:
query.Field("path", parent.Path.MultipleCharacterWildcard());

This translates to +(path:-1,1230,1242,1318,1319*) +topicTextContent:"test test" when I call query.ToString().
When I execute the query in code I get 0 results. But when I copy/paste the Lucene query into the Examine tab in the Umbraco backoffice, I get the desired results. The only thing I had to do was add a \ before -1, because the - character needs to be escaped.

I’ve tried to add the backslash before the path so now the Lucene query is exactly the same, but still no results.
query.Field("path", "\\" + parent.Path.MultipleCharacterWildcard());

Did I encounter a bug in Examine or did I do something wrong?
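
For reference, here is a minimal sketch of the backslash escaping described above, written as plain C# with no Examine dependency. The names `LuceneEscaping`, `EscapeTerm` and `BuildPathPrefixQuery` are hypothetical helpers, not part of Examine or Lucene.Net, and the character list assumes the special characters documented for the Lucene classic query parser. Whether the fluent `Field(...)` call escapes or analyzes the value again is not something this sketch settles; it is mainly useful when building a string for `NativeQuery`.

```csharp
using System.Text;

// Hypothetical helper, not part of Examine or Lucene.Net.
public static class LuceneEscaping
{
    // Assumption: the characters the Lucene classic query parser treats as syntax.
    private const string SpecialCharacters = "\\+-!():^[]\"{}~*?|&/";

    // Prefixes every special character in the term with a backslash.
    public static string EscapeTerm(string term)
    {
        var builder = new StringBuilder(term.Length + 4);
        foreach (var character in term)
        {
            if (SpecialCharacters.IndexOf(character) >= 0)
            {
                builder.Append('\\');
            }

            builder.Append(character);
        }

        return builder.ToString();
    }

    // Escapes the path first, then appends the wildcard so the * stays unescaped.
    public static string BuildPathPrefixQuery(string path) =>
        $"path:{EscapeTerm(path)}*";
}
```

With the path from the question, `LuceneEscaping.BuildPathPrefixQuery("-1,1230,1242,1318,1319")` returns `path:\-1,1230,1242,1318,1319*`, which is the form that worked in the backoffice Examine tab.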

You can always ditch the fluent API and add a native query; see “Searching | Examine” in the Examine docs.

query.NativeQuery("path:-1,1230,1242,1318,1319*") (may or may not need the \-1)

(assuming your query's default operation is set to AND and not OR)

or maybe query.NativeQuery($"path:{parent.Path}*") for dynamic?

When I convert my whole query to use NativeQuery it will work. But sadly the whole codebase is using the fluent API so that would require a significant rewrite.

I tried to use NativeQuery for only the path part but that didn’t work.

What was the generated query when you tried? I don’t remember having issues with injecting bits of native query into the fluent API.

I’m not saying you can’t get that to work, but here is an alternative I always use.

You can subscribe to the TransformingIndexValues event to add a new, searchable path field to the index.

The problem with the default path is that it is indexed as a string “-1,1230,1242,1318,1319” so you cannot find based on just the ID you want.

If you instead index it as a multivalue field with a string array, ["-1", "1230", "1242", "1318", "1319"], then each of those values is indexed and you can simply search for an id; all nodes with that id in their path will be returned.
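
Taken in isolation, the split behind that idea can be sketched in plain C# with no Examine types. `PathTokens` is a hypothetical helper used only for illustration; it shows why an exact match on one token finds a node by any id in its path, while a partial id does not match.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper for illustration only.
public static class PathTokens
{
    // "-1,1230,1242" becomes ["-1", "1230", "1242"]: one value per id.
    public static IReadOnlyList<string> Split(string path) =>
        path.Split(',', StringSplitOptions.RemoveEmptyEntries)
            .Select(id => id.Trim())
            .ToList();

    // True when the id appears as a whole token somewhere in the path.
    public static bool ContainsId(string path, int id) =>
        Split(path).Contains(id.ToString(CultureInfo.InvariantCulture));
}
```

For the path above, `PathTokens.ContainsId("-1,1230,1242,1318,1319", 1242)` is true, while `PathTokens.ContainsId("-1,1230,1242,1318,1319", 124)` is false, because only whole ids are compared.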

Here is an example of adding a new “searchablePath” field:

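using Examine;
using Umbraco.Cms.Core.Events;
using Umbraco.Cms.Core.Notifications;
using Umbraco.Cms.Infrastructure.Examine;

// Register in a composer or Program.cs with:
// builder.AddNotificationHandler<UmbracoApplicationStartedNotification, ExternalIndexTransformations>();
public class ExternalIndexTransformations : INotificationHandler<UmbracoApplicationStartedNotification>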
{
    private readonly IExamineManager _examineManager;

    public ExternalIndexTransformations(IExamineManager examineManager)
    {
        _examineManager = examineManager;
    }

    public void Handle(UmbracoApplicationStartedNotification notification)
    {
        if (!_examineManager.TryGetIndex(Umbraco.Cms.Core.Constants.UmbracoIndexes.ExternalIndexName, out var index))
        {
            return;
        }

        index.TransformingIndexValues += IndexOnTransformingIndexValues;
    }

    private void IndexOnTransformingIndexValues(object? sender, IndexingItemEventArgs e)
    {
        if (e.ValueSet.Category != IndexTypes.Media)
        {
            return;
        }

        var valuesDictionary = e.ValueSet.Values.ToDictionary(x => x.Key, x => x.Value.ToList());

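        // Skip documents without a path value instead of throwing a KeyNotFoundException.
        var path = valuesDictionary.TryGetValue("path", out var pathValues)
            ? pathValues.FirstOrDefault()?.ToString()?.Split(',')
            : null;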
        var paths = path?.Cast<object>().ToList();

        if (paths is not null && paths.Count != 0)
        {
            valuesDictionary.Add("searchablePath", paths);
        }

        e.SetValues(valuesDictionary.ToDictionary(x => x.Key, x => (IEnumerable<object>)x.Value));
    }
}

Then you can just add it to your query:

var query = index
            .Searcher.CreateQuery(IndexTypes.Media)
            .Field("searchablePath", node.Id)
            .And()
            .GroupedOr(["__NodeTypeAlias"], "umbracoMediaArticle", "File");

Note: my example is for media, but it works just the same for content; you only have to change IndexTypes.Media to IndexTypes.Content, both in the event handler and in the query.


If you can avoid it, though: isn’t there an overhead, in that the index transformation always happens at startup? There is no check to see whether the distributed cache index id is unchanged, in which case Examine could simply rehydrate the index from the persisted files.
Obviously, depending on the size of your site (content, media and member nodes) and which index you are transforming, the hit on restart may be acceptable.

Also, the original query matches on the whole path, not on an id in the path. I think that is slightly different: matching any node with the id in its path gives the equivalent of descendantsOrSelf, not just descendants.

In fact, re-examining that: would a match on ParentId not be sufficient?
With the caveat that you’d need to do a range query…
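
To make the descendants versus descendantsOrSelf distinction concrete, here is a small sketch in plain C#. It assumes the last segment of an Umbraco path is the node’s own id; `PathRelations` and its methods are hypothetical names used only for illustration.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration only.
public static class PathRelations
{
    // Matches the node itself as well as everything below it.
    public static bool IsDescendantOrSelf(string path, int ancestorId)
    {
        var id = ancestorId.ToString(CultureInfo.InvariantCulture);
        return Array.IndexOf(path.Split(','), id) >= 0;
    }

    // Matches only nodes below the ancestor.
    public static bool IsDescendant(string path, int ancestorId)
    {
        var id = ancestorId.ToString(CultureInfo.InvariantCulture);
        var segments = path.Split(',');

        // Assumption: the last segment is the node's own id, so it is left out of the search.
        return Array.IndexOf(segments, id, 0, segments.Length - 1) >= 0;
    }
}
```

For a node with path "-1,1230,1242", `IsDescendantOrSelf(path, 1242)` is true but `IsDescendant(path, 1242)` is false, so a search on an id in the path would also return the node with that id itself unless it is excluded separately.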

The transforming index values event happens whenever an index document is saved.
So it should not happen for everything on startup unless you have something else triggering a rebuild on startup.

IIRC Umbraco has some startup checks that rebuild the index if it is empty, for example, but in that case you’d need to rebuild anyway.

In my experience using the event is not a noticeable performance hit unless you actually do demanding things within the event. That could depend on the size of the site though.

I’ve never worked with a multivalue field before in Examine. I’ll try this out.

We are already adding extra fields to the index anyway so adding a searchable Path field won’t be a big issue. Using ParentID is not an option because the page could be deeper in the hierarchy.
