Question Around AI Usage in Core Development of the Platform

Hi all,

I want to raise something that may be a little sensitive, but I am doing so in good faith and with genuine curiosity rather than criticism.

Firstly, I want to point out that I work heavily with AI-driven solutions myself across a range of projects, including some fairly complex systems. I am very familiar with what current large language models can and cannot do in real-world production environments.
They are impressive tools in a rapidly growing technology space. I am a big fan of where they genuinely help: acceleration through autocomplete assistance, refactoring support, and certain debugging workflows. Used correctly, they are valuable in our space.

However, they are not autonomous engineers. They still require clear direction, and the person using them must hold most of the understanding in order to prompt and guide the AI towards solutions that are not just working but correct. Their output should always receive careful review.

In my own experience, when AI generated code is used without strong oversight or understanding, certain patterns tend to emerge:

  • Features or methods unintentionally removed while fixing something else

  • Logic rewritten in a way that subtly breaks existing behaviour

  • Previously fixed bugs resurfacing in slightly altered form

  • Code that “looks” structurally correct but does not fully respect the architectural context

  • Inconsistent patterns introduced into otherwise mature codebases

  • And those odd issues that can also happen with human coding but have a distinctly different flavour; I am sure people will know what I mean

We are seeing this on nearly a weekly basis at the moment.
While some companies do not want to admit it, some major patch and update deployments have clearly been AI developed, debugged, and deployed, and have caused a variety of large issues.

Windows is in a shocking mess; it is no secret that bug fixing and patch deployment there are driven by their huge AI push, and it is not good.

We have impressive, improving image and video generation, but it is still limited and flawed in practical use. Coca-Cola spent far too much and had too many people using AI prompting to generate a really bad Christmas advert. The Winter Olympics allowed AI video generation that fundamentally broke the representation of the Olympic rings, and let that roll out the door! It is a case of "well, we spent this much money and time on it, we have got to use it".

Over the past several Umbraco releases, I have noticed what appear to be unusual regressions and reintroductions of issues. Occasionally functionality seems to disappear and later reappear after being reported, and these are exactly the patterns I have started to notice in AI-managed projects.

I have also noticed references to Claude and Claude bot related tooling in the Git repositories.

To be clear, I am not opposed to AI assistance in development. Quite the opposite. An AI assistant helping engineers with legwork, scaffolding, or repetitive tasks is entirely sensible. A faster, more direct assistant aiding debugging is really useful!
What concerns me is when there are signs of over-reliance without rigorous human review and architectural accountability.

An increasing part of my workload over the last year or so has been inheriting projects that were clearly heavily AI generated, and I see consistent patterns in how issues arise. I can also tell when my junior staff have blindly used AI via prompts rather than using it as an aid or doing the work themselves.

Some of the bugs and regressions I have seen myself, and seen reported recently in Umbraco, feel similar to those patterns. That may be coincidence, but it has prompted concerns and questions.

So I am simply asking, openly and respectfully:

  • Is AI tooling being used in core development workflows?

  • If so, what safeguards or review processes are in place?

  • Are others observing similar regression patterns, or am I over-analysing this?

This is not an accusation. It is a genuine question about process and quality control as the industry evolves.

If this is a direction Umbraco is taking I think it is important to have those flags go up. I care about Umbraco as a platform, and I want to understand how these new tools are being integrated into development practices.

Thanks

I also wanted to add some context to what I have covered here.

Local Context Over Global Understanding
LLMs operate on token context. They do not have architectural awareness in the way a human does.
If asked to modify a class to fix one issue, they may reconstruct the class from the prompt and inadvertently omit methods that were not explicitly in view.
It is the same with video and image generation. Everything is "one off" in a sense, so modifying an existing output, expanding it, and so on was initially never a thing.
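As a contrived illustration of that failure mode (the class and method names here are hypothetical, not taken from any real codebase), this is roughly what a regeneration-based rewrite can do to a class when only one method is in the prompt's view:

```python
# Contrived sketch: a class as it exists in the repository.
class MediaPicker:
    def pick(self, media_id):
        """Look up a media item by id."""
        return {"id": media_id}

    def validate(self, media_id):
        """Guard against empty ids -- easy to lose in a rewrite."""
        return bool(media_id)


# The same class as an LLM might regenerate it when asked only to
# "fix pick() to return None for empty ids": the method in view is
# fixed, but validate() was never in the prompt context and is gone.
class MediaPickerRegenerated:
    def pick(self, media_id):
        return {"id": media_id} if media_id else None


# The regression is mechanical to detect: diff the public surface.
lost = set(dir(MediaPicker)) - set(dir(MediaPickerRegenerated))
print(sorted(m for m in lost if not m.startswith("_")))  # ['validate']
```

A simple API-surface diff like the last two lines is one cheap guard rail a review process can apply to regenerated code before it merges.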

More recently things are improving here: reference images, layered editable output, more consistent mapping, and more consistent output overall. In fact I am working on something exciting that uses AI image generation combined with data output based on geolocation information, crypto mining, and blockchain, so having some consistency as well as controlled mutations is a key goal.

But all this is still in its infancy; we really still need to treat things case by case, and handling a bigger scope is something AI is still learning to do better.

Pattern Completion Instead of Intent Preservation
Models predict what things should look like based on the training data they have available, not what is actually required. The output is the statistically most common result.

There is a key reliance on having the right data, and up-to-date data.

Regression Through Regeneration

Something I have mentioned already, but these tools run primarily on regeneration: everything is re-done.
If you ask ChatGPT about something, it may give you chunks of code with bits of text in between, but a lot of that is an illusion created by how the output is presented. In reality, minimal, targeted differencing is not what is happening.

Hallucinated APIs or Simplified Logic
If context is thin, the model may invent helper methods, assume framework behaviour, or simplify conditional flows in ways that pass compilation but break runtime behaviour.
This is where an old developer term seems so relevant today: "garbage in, garbage out".
I am sure some will have experienced the increasing trend of bosses using ChatGPT to try to solve client problems without the knowledge to correctly prompt the platform. Many people do not understand that solutions like ChatGPT respond to appease you and your prompts. Any view or opinion you already hold will often receive a favourable response, which may not be correct.
This, combined with other issues like using older free models versus more up-to-date models, can lead to outdated information and solutions, or in the case of code, inaccurate code.
Again, chat models fake knowledge. Unless they are genuinely up to date, they will pretend to be gathering information from the web or claim "I know .NET 10", but in reality they can spit out code that is simply not accurate in the current context.
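A contrived sketch of that hallucination pattern (the `sort_by` helper is deliberately invented, which is the whole point): the code reads plausibly and the file parses cleanly, but it fails the moment the invented call actually executes:

```python
# Contrived sketch: code in the shape an LLM might emit. `sort_by`
# looks plausible and the module parses fine, but Python lists have
# no such method -- the failure only surfaces at runtime.
def newest_first(items):
    return items.sort_by("date", reverse=True)  # hallucinated helper

records = [{"date": 1}, {"date": 2}]
try:
    newest_first(records)
except AttributeError as exc:
    print("runtime failure:", exc)

# The real API the model was likely pattern-completing toward:
records.sort(key=lambda r: r["date"], reverse=True)
print(records)  # [{'date': 2}, {'date': 1}]
```

This is why "it compiles" or "it parses" is a much weaker signal for generated code than for hand-written code: the plausible-looking surface is exactly what the model optimises for.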

Inconsistent Style Drift
Over multiple AI-assisted commits, code style and patterns can slowly drift because each generation optimises locally, not holistically; responses are statistically calculated for that use case based on those patterns.

There is more nuance to everything, and you can probably notice there is already directly conflicting output here; that is one of the critical issues with the current roadmap and approach to AI.
There is no consistency, and much of what you may think you are seeing is faked. Where those wrong assumptions are made, or where some of these large companies simply do not care and push on, determined to make things work because of their investment… it is not good.

I am sure we will get there on a lot of things, but we are far from ideal solutions yet.
I am also in agreement with real AI experts: the current roadmap that companies like OpenAI believe will lead to true AI will not.
We are in a bubble, and considering how it is also affecting hardware costs, power use, and more, I think something is going to burst.

Hi @thenexus00, this feels like a concern about the software industry in general, beyond the Umbraco specifics, which most of us are still trying to make sense of.

In terms of the usage of AI in Umbraco core CMS development, I can respond to your questions.

Yes (since early December 2025).

As you noticed, many commits are being made by Claude Code, but these are generally done as a co-author, where the developer has used Claude Code to assist in making the commit or creating a pull request. This doesn't always mean that Claude Code (or other AI tooling) has contributed to the code changes, but we can make general assumptions.

GitHub Copilot has been enabled for reviewing pull requests and generating feedback, but it requires manual action to initiate any automated bug fixes/resolutions.

In terms of our core CMS workflows, we've had an approval mechanism in place for all pull requests for a long time, e.g. no one commits directly to the main branch!
If we felt that the standard of development was degrading, we would flag this internally (and deal with it much like any other organisation would).

This also stands for community contributions: we've had several AI-generated pull requests, which we review just as we do human contributions, taking the changes on their own merit. We've accepted some, we've rejected some. It's all on the GitHub repository, open and transparent.

I do feel that you’d need to give more specifics here. We’ve had our fair share of regressions before AI, which we have always been transparent about and look to rectify in a professional manner.

If there is any specific issue you feel has regressed that could be attributed to the use of AI assistance, please do share your feedback.


Hi Lee,
Thanks for the response.
I think there are just too many, to be honest; that is why I am concerned.

A lot of very strange bugs currently in the Umbraco back office, with random new issues as well as issues that have come back.
Features to do with media libraries and pickers have been missing, added, then gone again, and not because they were removed on purpose.
A lot of it is around the JavaScript side and the back office, which is in line with what I am seeing as a whole: as I outlined, the AI is simply dropping features while fixing others and lacks overall scope.

In all honesty, the timing fits. I have felt there is a larger increase in smaller issues occurring, and as I have said, the more obvious observations are things breaking, being fixed, and breaking again.

Again, I am not against this approach at all, as it can be helpful and save time and headaches in a variety of use cases, as long as it is used in that assistant sense with correct human oversight.

Around my circles I am known for breaking stuff and finding things. For example, I believe I have found a number of quirks and issues regarding the new licensing that I have been feeding to the team. I do not believe I am wrong about what I am seeing; I am just trying to be helpful and flag it.

I just want to ensure that it is flagged and that work is done to get on a good path going forward. Otherwise I can quite easily see the "jank" nature of all this, with the constant niggles and recurring issues, building up and becoming detrimental to the platform in the long run.

I have already been a bit frustrated with the back office not being the reliable, solid thing it used to be up to version 13, for example. The shift is a big change, I understand that, but I have posted before about my concerns over the rapid pace of releases, which seems to produce less solid releases than we used to see.

This rapid pace, combined with AI use, is I feel a valid concern, and I believe we are seeing the issues around it.