How to do a fast atomic update via the API on a lot of nodes?

Hi, I need to update a lot of nodes (about 1000 per batch) via the API as part of a delicate invoice calculation for customers. Today I loop over the nodes and commit each one with SaveAndPublish, but it’s slow and sometimes I get a lock error (-333). Is it possible to use only Save and then refresh the cache in the background with SaveAndPublish, or is there another way? Thanks.

Hi @biapar

You could save each node without publishing and then call PublishBranch (in v13 it is called SaveAndPublishBranch) which will publish your node and all children, but I don’t know internally if this does it in batches to improve performance. Either way, publishing via the API has an overhead as it has to create new versions, run handlers, etc.

Here is the method for v17:

Also, that would publish all children which may not be what you want if you want to be more selective. Maybe worth a try to see if it helps.

Your batch method is probably the only alternative but I would keep the batches small to avoid the risk of locks being held for too long.
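As a rough illustration of keeping batches small, here is a minimal sketch that is independent of Umbraco. The names BatchRunner, RunInBatches and processBatch are hypothetical; in a real site the callback would presumably open a scope and call Save for each node in the batch. A batch that throws (for example on a lock error) is recorded and skipped so the remaining batches still run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchRunner
{
    // Splits nodeIds into batches of batchSize and calls processBatch for each one.
    // If a batch throws, its IDs are collected and the loop continues with the next batch.
    // Returns the IDs of all failed batches so the caller can retry them later.
    public static List<int> RunInBatches(
        IEnumerable<int> nodeIds,
        int batchSize,
        Action<IReadOnlyList<int>> processBatch)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        var failedIds = new List<int>();

        // Enumerable.Chunk requires .NET 6 or later.
        foreach (int[] batch in nodeIds.Chunk(batchSize))
        {
            try
            {
                processBatch(batch);
            }
            catch (Exception)
            {
                // Real code should log the exception here before moving on.
                failedIds.AddRange(batch);
            }
        }

        return failedIds;
    }
}
```

The returned list can be passed back into RunInBatches for a second attempt, possibly with a smaller batch size, instead of losing the whole run to a single lock error.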

If this is data that changes regularly then it may be better to put it in a custom database table outside of the node. That way you would get much better performance (but then it would not be editable in the CMS unless you had a custom property or dashboard).

Justin

Thanks for your reply. I cannot use an external db because I have custom properties and I built a custom plugin to manage the invoices.

So do you suggest calling Save for each node and then, in the background, publishing the branch? The goal is to update the invoice state for authorized invoices only and then export an Excel file. I already do this today, but in a single function: the first loop exports the Excel file and the second changes the state with SaveAndPublish. Sometimes the second loop stops with the lock error. It took 40 minutes to publish 672 nodes.

Try saving each node in turn and then call SaveAndPublishBranch at the end. Not sure if it will help that much unless internally Umbraco performs batch updates, but most likely it just cycles through the nodes in turn.

Would it not be easier to update invoices individually in real time rather than in a batch? Are you saving invoices as content nodes? You could create a handler that does your calculation in response to another event rather than in a batch, but I don’t know your setup and architecture.

It sounds like this data may have been better outside of content nodes initially but that is probably too late to change.

Today, I don’t work in batches. I query the content tree to find all “invoice” nodes under each customer, then loop over the nodes and call SaveAndPublish on each one. It is slow, and sometimes a -333 lock error occurs, which breaks the loop and leaves the cache out of sync. At this stage, updating the cache is not important, but it is important to save the data to the db so that the data stays consistent.

You could try something like this (AI suggestion, so not tested):

using Umbraco.Cms.Core;
using Umbraco.Cms.Core.Models;
using Umbraco.Cms.Core.Scoping;
using Umbraco.Cms.Core.Services;
using Umbraco.Cms.Infrastructure.Examine;
using Examine;

public class BulkUpdateService
{
    private readonly ICoreScopeProvider _scopeProvider;
    private readonly IContentService _contentService;
    private readonly IIndexRebuilder _indexRebuilder;
    private readonly IExamineManager _examineManager;

    public BulkUpdateService(
        ICoreScopeProvider scopeProvider,
        IContentService contentService,
        IIndexRebuilder indexRebuilder,
        IExamineManager examineManager)
    {
        _scopeProvider = scopeProvider;
        _contentService = contentService;
        _indexRebuilder = indexRebuilder;
        _examineManager = examineManager;
    }

    public void DoBulkWork(IEnumerable<int> nodeIds, int? branchRootId = null)
    {
        // 1. Bulk update with notifications suppressed (no indexing, no NuCache refresh)
        using (ICoreScope scope = _scopeProvider.CreateCoreScope())
        using (scope.Notifications.Suppress())
        {
            foreach (var id in nodeIds)
            {
                var content = _contentService.GetById(id);
                if (content is null) continue;

                content.SetValue("someProperty", "newValue");
                _contentService.Save(content);
            }

            scope.Complete();
        }

        // 2. Rebuild the external index
        // useBackgroundThread: false to block, true (default) to fire-and-forget
        if (_examineManager.TryGetIndex(Constants.UmbracoIndexes.ExternalIndexName, out _))
        {
            _indexRebuilder.RebuildIndex(
                Constants.UmbracoIndexes.ExternalIndexName,
                delay: null,
                useBackgroundThread: false);
        }

        // 3. Publish the branch so NuCache + cache refreshers fire properly
        if (branchRootId.HasValue)
        {
            var branchRoot = _contentService.GetById(branchRootId.Value);
            if (branchRoot is not null)
            {
                // v13 modern overload using PublishBranchFilter:
                _contentService.SaveAndPublishBranch(
                    branchRoot,
                    PublishBranchFilter.IncludeUnpublished);

                // Equivalent older overload (still works in v13):
                // _contentService.SaveAndPublishBranch(branchRoot, force: true);
            }
        }
    }
}

Thanks, I’ll try it. I think it could be a good solution. Two questions about this code:

  • With IncludeUnpublished, will the nodes I deliberately set to unpublished for formal reasons, such as the “virtual” deletion of a request, be saved too?
  • Will the nodes that were only saved in the previous loop be saved and published?

Thanks

_contentService.SaveAndPublishBranch(
    branchRoot,
    PublishBranchFilter.IncludeUnpublished);

It will include all unpublished nodes, so if you only want already-published nodes to be saved and published, change the parameters accordingly.

I don’t think you can be selective, it will save and publish all child nodes.

OK. I think I’ll add a flag to the invoice header to emulate logical deletion with ON/OFF.
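A minimal sketch of how such a flag might be used when picking invoices for the state update and export, assuming the flag and the authorization state have already been read from the node into a plain model. The InvoiceNode record and its IsAuthorized and IsLogicallyDeleted members are hypothetical names, not Umbraco APIs.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical projection of an invoice node; the property names are assumptions.
public sealed record InvoiceNode(int Id, bool IsAuthorized, bool IsLogicallyDeleted);

public static class InvoiceSelection
{
    // Returns the IDs of invoices that are authorized and not flagged as deleted.
    // Nodes can stay published in the tree; the flag alone decides whether they count.
    public static IReadOnlyList<int> ActiveAuthorizedIds(IEnumerable<InvoiceNode> invoices) =>
        invoices
            .Where(i => i.IsAuthorized && !i.IsLogicallyDeleted)
            .Select(i => i.Id)
            .ToList();
}
```

With this approach every invoice can remain published, so publishing the whole branch no longer risks reviving a "deleted" invoice; the flag is what excludes it from processing.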