Umbraco 13 - Recommended infrastructure sizing

Hi all,

Over the past 6 months we’ve observed a slow but steady increase in memory usage over a 24-hour period, which eventually results in an OOM kill and ECS task replacement.

We are running Umbraco 13 in a high-availability configuration (Delivery/Mgmt) on AWS ECS, with each task currently sized at 1 vCPU / 4 GB RAM.

From our investigation (including repeated memory dumps and runtime monitoring), it appears the majority of memory growth is associated with the published content cache. As pages are accessed and the site is used, memory consumption increases gradually and does not appear to stabilise before hitting the container memory limit.

Some relevant scale details:

  • Hosting 10 websites

  • 650+ published pages in total

  • Multi-site setup with shared content database

  • 50k-100k monthly page views

I understand that Umbraco infrastructure sizing can vary significantly by implementation, and I haven’t been able to find concrete sizing guidance in the documentation for AWS.

Given the above:

  • Is 4 GB RAM per node considered undersized for this scale in Umbraco 13?

  • If so, what would be the recommended infrastructure configuration?

Any guidance on expected memory behavior or recommended sizing for this kind of workload would be greatly appreciated.

Thanks in advance.

Have you tried doing any load testing to see how much your instances can take before they crash?
Also, if possible, you could gain some performance by jumping to Umbraco 17 and taking advantage of the hybrid cache setup.

Hi Steffen,

No, that is on the agenda.

Unfortunately, for our implementation, upgrading to U17 is not a quick and easy process, as we have a lot of custom AngularJS logic in the back office.

There is this issue here: Incorrect usage of IOptionsMonitor causes memory leaks · Issue #20709 · umbraco/Umbraco-CMS · GitHub, which was fixed in 13.2, but I think it may be just a red herring.

Do you have an idea on what the ideal infrastructure sizing is?

Daimen

The stress test will probably give you the right answer, but based on our experience we see around 8 GB of usage when the back office is in use by editors while also serving data through the Delivery API. And our solutions serve closer to 2 million page views a month (though with Cloudflare caching in front as well).

Based on the details of your setup, I always apply the rule of thumb of “better safe than sorry”. We are also in the process of moving our setup from a Windows Server Docker setup to a Kubernetes cluster, and we would rather give it too much power in the beginning and then, after 2-3 weeks, scale down to what fits the project best. I would rather pay extra to ensure good performance than give each node too little and have to scale up.

Hi @daimenrees :slightly_smiling_face:

Just to be sure, you did configure the platform for Production Mode, right?

It has a huge impact on performance if not set up and configured to spec.
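For reference, production mode in Umbraco 13 is controlled by the `Umbraco:CMS:Runtime:Mode` setting, typically in `appsettings.Production.json` (worth double-checking against your own config):

```json
{
  "Umbraco": {
    "CMS": {
      "Runtime": {
        "Mode": "Production"
      }
    }
  }
}
```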

We have a platform with ~23k content nodes that sits at around 4GB memory, reported from the IIS process.
The platform also has Engage and a pretty good amount of custom code and integrations.
So as I see it 4GB should be plenty for a solution of your size.


Unless your content is… weird, 4GB should be perfectly fine. CPU is usually the bottleneck that needs addressing before memory becomes a problem.

It sounds like there’s a memory leak. In Umbraco versions before 15 (NuCache), content is loaded into the cache eagerly at boot, so on its own it shouldn’t “grow” with page views.

I have seen this behaviour before. In our case, a rogue UmbracoHelper was being injected into the DI container and never disposed of. This meant the associated IPublishedSnapshot was never disposed of either so we ended up with multiple copies of content from the cache piling up in memory.
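As a hypothetical illustration of that pattern (the class and method names here are invented, not the actual code from our solution): a service that lives as a singleton but captures a request-scoped `UmbracoHelper`, or stashes `IPublishedContent` in a long-lived field, keeps the whole snapshot pinned in memory:

```csharp
using System.Collections.Generic;
using System.Linq;
using Umbraco.Cms.Core.Models.PublishedContent;
using Umbraco.Cms.Core.Web;
using Umbraco.Cms.Web.Common;

// BAD: if this is registered as a singleton, the captured UmbracoHelper
// (and its IPublishedSnapshot) is never released, and the list of
// IPublishedContent pins snapshot content across requests.
public class LeakyNavigationService
{
    private readonly UmbracoHelper _umbracoHelper;           // captured forever
    private readonly List<IPublishedContent> _cache = new(); // pins snapshots

    public LeakyNavigationService(UmbracoHelper umbracoHelper)
        => _umbracoHelper = umbracoHelper;

    public IEnumerable<IPublishedContent> GetRoots()
    {
        _cache.AddRange(_umbracoHelper.ContentAtRoot()); // grows per call
        return _cache;
    }
}

// SAFER: obtain a context per call via IUmbracoContextFactory, dispose it,
// and hold only plain data (ids, names, urls) in long-lived state.
public class NavigationService
{
    private readonly IUmbracoContextFactory _contextFactory;

    public NavigationService(IUmbracoContextFactory contextFactory)
        => _contextFactory = contextFactory;

    public IReadOnlyList<string> GetRootNames()
    {
        using var contextReference = _contextFactory.EnsureUmbracoContext();
        return contextReference.UmbracoContext.Content!
            .GetAtRoot()
            .Select(content => content.Name)
            .ToList();
    }
}
```

The key difference is lifetime: the second version never stores `IPublishedContent` beyond the scope of the `using` block, so the GC can reclaim retired snapshots.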

Look for any code that’s holding onto references to instances of IPublishedContent or IPublishedSnapshot, either directly or indirectly. Also look out for anything else that’s designed to be request-scoped/short-lived, that you might be holding onto across requests.