Umbraco v 13 - Azure load balanced site down - how to kill Azure SQL Connections

Hi,

I have a sizeable Umbraco v 13 - Azure site that is load balanced.

This morning it seemed to get into a bad state following a release and/or the subsequent attempt to scale up - it appeared to be stuck, with not much happening.

I stopped all instances and started back up - by the time everything was up I’d had 25 minutes of downtime. Not ideal.

I think the SQL server got into a locked state.

Is there a quick way of killing all connections to an Azure SQL DB so that I can recover more quickly if this happens again? Does anyone have experience of similar?

I do get the occasional downtime with Azure - the startup time of the site is an issue - but testing it with a cloned db and a similar sized web app instance it’s 4mins not 25…

Steve

Hey Steve 👋

It sounds like the Azure SQL instance might be underpowered for the amount of content. Presumably if you look at the database utilization for the time period it’s stuck at 100% DTU for all that time?
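One way to check this outside the portal is to read sys.dm_db_resource_stats, which holds about an hour of 15-second samples for the database. The sketch below is only an illustration: the helper names are made up, and it assumes you fetch the rows yourself with whatever SQL client you use and pass them in as dictionaries. It treats the DTU percentage of a sample as the highest of the CPU, data IO and log write percentages.

```python
from typing import Iterable, Mapping

# T-SQL to run against the user database itself (not master).
RESOURCE_STATS_QUERY = """
SELECT end_time, avg_cpu_percent, avg_data_io_percent, avg_log_write_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;
"""


def dtu_percent(row: Mapping[str, float]) -> float:
    """DTU usage of one sample: the largest of the three resource percentages."""
    return max(
        row["avg_cpu_percent"],
        row["avg_data_io_percent"],
        row["avg_log_write_percent"],
    )


def saturated_fraction(rows: Iterable[Mapping[str, float]], threshold: float = 95.0) -> float:
    """Fraction of samples whose DTU usage is at or above the threshold."""
    samples = list(rows)
    if not samples:
        return 0.0
    hot = sum(1 for row in samples if dtu_percent(row) >= threshold)
    return hot / len(samples)
```

If saturated_fraction comes back close to 1.0 for the outage window, the database was pegged. For incidents older than an hour, the portal metrics or sys.resource_stats in the master database (5-minute samples) would be the place to look instead.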

This is a bit of a red flag. 4 minutes is a lot for a normal startup - throw in a re-index after an upgrade or a rebuild of the SQL cache (especially if you’ve got multiple instances trying to access the same DB in a load balanced environment) and 25 mins doesn’t sound that crazy.

How big are we talking - what tier is the database and how much content are we talking? Also, what plan is the Web App on?

To answer your specific question about SQL though, IIRC you can use the KILL command via TSQL. You’ll want to stop the Umbraco site(s) first.
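A rough sketch of how that could look, assuming you list the sessions from sys.dm_exec_sessions and then issue one KILL per session. The helper name and the way you run the SQL are placeholders, not anything Umbraco or Azure provides, and the login needs permission to kill connections. The Python part only turns session ids into KILL statements, so you can review them before running anything.

```python
from typing import Iterable, List, Optional

# T-SQL to list user sessions on the current database, excluding your own.
SESSIONS_QUERY = """
SELECT session_id
FROM sys.dm_exec_sessions
WHERE database_id = DB_ID()
  AND is_user_process = 1
  AND session_id <> @@SPID;
"""


def build_kill_statements(
    session_ids: Iterable[int], own_session_id: Optional[int] = None
) -> List[str]:
    """Return one 'KILL <id>;' statement per distinct session id.

    own_session_id is skipped so the script does not kill its own connection.
    """
    unique_ids = set()
    for sid in session_ids:
        # KILL cannot take a parameter, so only plain positive integers are accepted.
        if not isinstance(sid, int) or sid <= 0:
            raise ValueError(f"invalid session id: {sid!r}")
        unique_ids.add(sid)
    return [f"KILL {sid};" for sid in sorted(unique_ids) if sid != own_session_id]
```

For example, build_kill_statements([52, 53, 60], own_session_id=53) gives the statements for sessions 52 and 60 only. With the web apps still running, the connection pool will simply reconnect, which is why stopping the sites first matters.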

Hi Jason - thanks for the reply.

The SQL DB is a Standard S6 - 400DTUs

There are two load balanced web apps (we scale this out in busy times)
HaWebv1020231005091658Plan (P2v3: 2)

I would hope, with the Azure costs involved, that this is ample to start up a website!!

I hate trying to get to the logs of these load balanced sites - I’ll have a better look today, but I did a copy with a clone of the DB and a single web app and it was 4 minutes. I think it’s strongly pointing at an issue in the DB in this scenario (for example, during releases the site starts relatively quickly in the deployment slot, and that’s doing the same thing as a cold start).

There is a lot of custom logic on startup (caching of services that use Umbraco data). I’m sure I could make improvements, but my first task is to see what is “normal” - e.g. what is just the Umbraco startup due to the amount of content - and then see what the extra sugar in this project is causing.

Umbraco 13 is not particularly strong when booting because the entire cache gets loaded on boot. The more content you have, the longer it takes. ‘Normal’ websites for us, with a few thousand media nodes and content nodes, usually boot in 30 to 60 seconds. This is on a P1V3 plan and an S1 database.

Now obviously your site needs way more horsepower, either because of the number of visitors and/or the complexity of the data. But let’s see if you have set some things correctly for the fastest boot possible. Maybe you already know all this, but it doesn’t hurt to check.

  • First, run the health check in the Umbraco backoffice and see if there is any configuration issue. Seriously, we’ve had a time where we thought our sites were all a bit sluggish. Turned out we were running in debug mode… Yeah… Yay for production mode now!
  • Second, set the NuCache setting UsePagedSqlQuery to false. This will use more memory, so keep an eye on that, but especially on Linux it helps boot times on larger sites.
  • We recently had issues with sites booting 20+ minutes. It turned out that the media cache wasn’t working correctly because of the order in Program.cs. See this topic: Media cache (Azure) causes very slow boot - #7 by LuukPeters

Just some pointers that might help!

Some great pointers here - I thought I’d replied, sorry!

It wasn’t quite the same issue as yours, but it did highlight a to-do: AddAzureBlobImageSharpCache() was missing on one site, which was potentially causing the problem there - I think this package was split between versions.

Not tried the UsePagedSqlQuery yet - will drip feed the changes to see what works.