Examine Files keep locking

Not sure if we are there yet, but there is a search abstraction in development for Umbraco, so that you don’t have to use Examine :slight_smile:

Umbraco Search is coming to town — 24 Days In Umbraco
DevRel Deep Dive: Umbraco’s New Search in Alpha
(GitHub - umbraco/Umbraco.Cms.Search: Search abstractions and implementations for Umbraco - The Future of Search)

I guess that won’t be coming in for Umbraco 13 though :frowning:

17 is LTS though now… so we’ve run out of excuses over here… :slight_smile:

That’s what cancellation tokens are for.

Anywhere long-running background tasks are happening, cancellation tokens should be used to cancel them when shutdown is called. Ideally the checks would be a bit finer-grained than this, especially if the PDFs are chunky, but in principle this should happen.

foreach (var pdf in pdfs)
{
    if (cancellationToken.IsCancellationRequested)
    {
        return;
    }

    IndexPdf(pdf);
}
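
To make the "finer-grained" idea concrete, here is a minimal sketch that checks the token between pages rather than only between files. PdfIndexingSketch, PdfDocument and the indexPage callback are hypothetical stand-ins, not part of Examine, Umbraco or any PDF library.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class PdfIndexingSketch
{
    // Hypothetical stand-in for a parsed PDF: a name plus the text of each page.
    public sealed record PdfDocument(string Name, IReadOnlyList<string> Pages);

    // Returns the number of documents whose pages were all handed to indexPage
    // before cancellation was observed.
    public static int IndexAll(
        IEnumerable<PdfDocument> pdfs,
        Action<string, string> indexPage, // (documentName, pageText), hypothetical callback
        CancellationToken cancellationToken)
    {
        var completed = 0;

        foreach (var pdf in pdfs)
        {
            foreach (var page in pdf.Pages)
            {
                // Check between pages so a single large PDF cannot hold up shutdown.
                if (cancellationToken.IsCancellationRequested)
                {
                    return completed;
                }

                indexPage(pdf.Name, page);
            }

            completed++;
        }

        return completed;
    }
}
```

In a hosted app the token could plausibly come from IHostApplicationLifetime.ApplicationStopping, though whether that fits depends on how the indexing is wired up. This only helps where your own code owns the loop; it cannot interrupt a third-party call that is already running.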

Yes, but generally speaking, even with Azure’s shenanigans, the app should be cleanly shutting down. Examine has defensive coding around this either way.

It also has great logs, so it’s usually easy to pinpoint the real problem just by setting some namespaces to “Debug”.
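
As a rough sketch of what that could look like in appsettings.json, Serilog's configuration supports per-namespace overrides under MinimumLevel. The "Examine" prefix here is an assumption about which namespace you want to see; you may need a different or additional prefix depending on where the interesting log events come from.

```json
{
  "Serilog": {
    "MinimumLevel": {
      "Default": "Information",
      "Override": {
        "Examine": "Debug"
      }
    }
  }
}
```

Bear in mind that any sink with its own restrictedToMinimumLevel set above Debug will still drop these events, so that may need lowering as well.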

I can guarantee that UglyToad/PDFPig has nothing using or checking cancellation tokens. Cancellation might be handled perfectly well inside Examine, but this third-party code for indexing PDFs does not shut down nicely and fails easily when indexing files. I don’t know whether that might cause Examine to fail without a nice “shutdown”. But lived experience is that things lock, and we don’t know why. It’s all clutching at straws. A two-location index setup as described above would solve all issues.

And yes, we generally just leave Debug mode on for our smaller sites, with smallish log file sizes and daily rollover, but then the log files are unreadable outside Umbraco, and the log search inside Umbraco isn’t the best, so it’s still difficult. Maybe we’ll do as you say and turn on Debug for only the Examine namespace for a while and see.

There is Compact Log Viewer: Log Viewer | CMS | Umbraco Documentation

But also, why not log to an external store? Seq, Application Insights, or any other available Serilog sink that works for you?

I prefer Seq (and you can limit the Umbraco file-based logging if you aren’t using it, to save disk space :-))
Seq — centralized structured logs
<PackageReference Include="Serilog.Sinks.Seq" Version="7.0.1" /> should, I believe, be compatible with the Serilog version in Umbraco v13.

"Serilog": {
  "MinimumLevel": {
    "Default": "Information"
  },
  "WriteTo": [
    {
      "Name": "Async",
      "Args": {
        "configure": [
          {
            "Name": "Console",
            "Args": {
              "restrictedToMinimumLevel": "Information"
            }
          },
          {
            "Name": "Seq",
            "Args": {
              "serverUrl": "https://localhost:5341",
              "restrictedToMinimumLevel": "Information"
            }
          }
        ]
      }
    },
    {
      "Name": "UmbracoFile",
      "Args": {
        "RestrictedToMinimumLevel": "Fatal"
      }
    }
  ]
},

@bythewiseman Morning! I think a summary is likely to be useful at this point because there are a couple of variations being discussed:

  1. Examine index locking is being experienced intermittently on an Azure App Service production slot whilst the application is up and running.
  2. Examine index locking is being experienced consistently when deploying to a VPS using WebDeploy.
  3. Both scenarios are generating the same exception.
  4. Both scenarios are using the correct site configuration.
  5. Both scenarios have only occurred whilst using v13.
  6. No customization or additional plugins (e.g. PDF indexing) are involved.
  7. Using an index local to the application is not a suitable solution.

@bythewiseman So your issue is the Azure locking whilst running. The trigger is easy to diagnose - Azure will automatically restart instances for a number of reasons including updates to software, after certain GC activity etc. You can avoid the infrastructure-based restarts by using the WEBSITE_ADD_SITENAME_BINDINGS_IN_APPHOST_CONFIG config setting I posted earlier, as that is its purpose. I think this will largely solve your issue.

I don’t have a solution for the actual cause, but I don’t believe it is Examine itself.

On the flip side, I have a solution for my issue of indexes locking whilst deploying. Although I am yet to test it, I am confident it will work. This article here describes a little known feature (re)introduced in .NET called ShadowCopy. This is a fantastic writeup and I seriously suggest people read it.

The key bit for me is here:

… this is due to the application still running pending requests or running some background operations that have not completed and released their background threads. End result: In some cases the IIS application does not unload.

Sometimes you can wait a little bit and try again, but if the application is super busy or has long running requests or background services it might be increasingly difficult to update the application without explicitly shutting down the Web application on the server

This is nothing new - we all know this but I think that the fact that Umbraco runs in a nested app domain and is heavily reliant on background services means that this locking situation isn’t going to have a code-based fix. Maybe running on Linux will be better, without IIS getting in the way?

Personally, I think this is going to remain an unresolved issue with v13 and using the Search provider with v17 is going to be the way forward.

That article is interesting indeed!

I’m not sure that Rick’s approach will work with Umbraco, as IIRC, Umbraco does make assumptions based on Assembly.GetExecutingAssembly().

it might be increasingly difficult to update the application without explicitly shutting down the Web application on the server

It’s important to make sure that Umbraco isn’t still running if you deploy over the top of it.

To that end, some of our CI pipelines have an explicit stop/start call to IIS when deploying to VPSs where app_offline.htm wasn’t effective enough. Though, and I know I sound like a broken record here, if you can work out why a site isn’t cleanly shutting down and fix it, app_offline.htm should be enough on its own.

For Azure Web Apps, deployment slots make this a non-issue as you can deploy to a stopped instance. It’s a much friendlier and more robust experience all round. You can even have your pipeline spin up ephemeral slots - 0% chance of file-locking if you’re deploying to a brand new empty slot.

If you think this is an issue, you can try setting WEBSITE_DISABLE_OVERLAPPED_RECYCLING=1 on your web app. This will ensure that the Web App’s VM is shut down before spinning up a new instance.

“Maybe running on Linux will be better, without IIS getting in the way?”

That’s the thing, we run in Linux on Azure as it’s cheap…

Maybe, maybe not, we’ll see. I’ll report back when I’ve done some experiments.

Correct (we already use this setting) but in @bythewiseman’s case he’s trying to avoid the app shutting down in the first place which is the trigger for the issue. This setting will just cause extra avoidable downtime.

That level of downtime isn’t acceptable to me or my clients. We’re all on the same page though re finding a fix.

However, I don’t believe this is necessarily an Examine, Umbraco or code issue. I believe this might be tied to .NET 8 itself. It’s difficult to prove but my theory is that with the performance increases that came in .NET 8, the shutdown process of Umbraco is hampered by additional overlapping calls to Examine that aren’t prevented in time. I could be wrong and I likely am but as I said earlier I think this is going to remain an unresolved issue with v13.

Cool, that takes IIS out of the equation completely. In which case I would concentrate on what you can control via config. Have you implemented the WEBSITE_ADD_SITENAME_BINDINGS_IN_APPHOST_CONFIG appsetting key? If not, that’s the first place to start. If you are slot swapping, I imagine you already have the WEBSITE_DISABLE_OVERLAPPED_RECYCLING setting set but if you don’t, this should be set. It’s frustrating because with it you no longer benefit from the seamless slot swap but until it is possible to move away from Examine, I have found this to be necessary too.
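
For reference, this is roughly how the two settings might look in the Azure portal's "Advanced edit" view for app settings. The value 1 for WEBSITE_DISABLE_OVERLAPPED_RECYCLING comes from the post above; the value 1 for WEBSITE_ADD_SITENAME_BINDINGS_IN_APPHOST_CONFIG and the slotSetting choices are assumptions, so check them against your own setup.

```json
[
  {
    "name": "WEBSITE_ADD_SITENAME_BINDINGS_IN_APPHOST_CONFIG",
    "value": "1",
    "slotSetting": false
  },
  {
    "name": "WEBSITE_DISABLE_OVERLAPPED_RECYCLING",
    "value": "1",
    "slotSetting": false
  }
]
```

Whether these behave the same way on a Linux plan is not something established in this thread, so treat it as something to test rather than a given.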

Fair enough, but you can’t stop this from being true:

For zero-downtime deployments with Umbraco, the only viable option is Blue-Green deployments. Whether that’s slot swaps or containers or something more manual on a VPS.

Interestingly, Rick Strahl’s approach is basically Blue-Green, but at the filesystem level only :thinking:

A lot of people are using v13 in anger across Azure Web Apps and VPSs without constantly locking up. Right now I don’t even think WEBSITE_DISABLE_OVERLAPPED_RECYCLING is enabled on any of our clients’ production sites…

I feel like there must be something more unusual going on here.