Teaching an AI Coding Agent to Work Safely with a Legacy InterSystems Caché Application
We didn't start with a big AI strategy.
We had a legacy InterSystems Caché 2018 application, a lot of old business logic, and a practical need: build a new UI and improve code that had been running for years. At first, I thought an AI coding agent would help only with a small part of the work. Maybe some boilerplate, some REST work around the system, and a bit of help reading old ObjectScript.
In practice, it became much more than that.
Once we started using it seriously, we realized it could move across a large codebase, understand patterns, suggest refactors, and help us modernize around Caché much faster than I expected. But that only happened after a frustrating period at the beginning.
The real challenge wasn't getting code suggestions. It was teaching the agent how our Caché environment actually behaved.
Why we were able to use it
Before any technical work, we had to answer a security question.
We were not going to send ERP code and internal business logic directly to a public AI service and hope for the best. That was never going to pass a serious security review.
For us, Amazon Bedrock changed the conversation. Our servers, data, and development environment were already in AWS, so Bedrock fit naturally into a cloud environment we already trusted and governed. We could work with the model while keeping the traffic, access control, and surrounding security controls inside the same AWS framework we were already using.
The model itself was interesting, but what made it usable in a real enterprise environment was the fact that it fit the security model we already had, instead of creating a separate and less controlled path for sensitive development work. Once that became clear, the discussion with security and governance teams became much easier.
What didn't work at first
At the beginning, it was frustrating.
The model knew ObjectScript syntax better than I expected, but it didn't know our version, our conventions, or the old corners of a Caché system that had grown over many years. It also didn't know where the dangerous parts were. And honestly, we didn't yet know how to work with it either.
We had to learn when it made sense to use a stronger model and when a faster one was enough. We also had to learn how to keep sessions focused so token usage wouldn't get out of control. On a large legacy codebase, that becomes important very quickly. In practice, the same discipline that reduced cost also improved the quality of the work — smaller, tighter sessions usually gave better results.
It also took us time to understand why it felt like we were teaching the same things again and again. At first that was one of the most annoying parts of the whole experience. Only later did we understand that if we wanted consistent behavior, we couldn't rely on chat history and hope the model would somehow absorb the project. We had to start giving it structured context through documentation, CLAUDE.md, and hooks.
Once we did that, the work started to become much more effective.
Treat the agent like a new developer
The breakthrough came when we stopped treating the model like a chatbot and started treating it like a new developer joining the team.
That meant onboarding.
We gave it documentation. We explained the patterns in our project. We documented mistakes that had already happened. We wrote down what was standard ObjectScript behavior and what was specific to our environment.
The CLAUDE.md file became the center of this process. It wasn't there to make the writing prettier. It was there to stop the same technical mistakes from happening twice. Over time, it became a knowledge base for the system: what breaks, what is safe, what is misleading, and what must never be done.
One detail mattered more than I expected: wording. "Prefer X" didn't behave the same way as "Always do X. Never do Y." Once the instructions became direct, the model was much less likely to fall back to generic defaults.
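To make that concrete, here is the kind of phrasing that ended up working. The specific rules below are illustrative, not a copy of our actual file:

	Always parenthesize compound expressions; ObjectScript evaluates strictly left to right.
	Never assume operator precedence from C, JavaScript, or Python.
	Always send terminal commands one per line. Never paste multi-line loops into the Caché terminal.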
The Caché terminal was part of the challenge
One of the biggest things the agent had to learn was that deployment in Caché 2018 does not feel like deployment in a modern JavaScript or Python stack.
There is no git push to production. There is the Caché terminal.
In our environment, code often reaches the server through an interactive terminal session. Commands are sent one at a time. Long commands can wrap because of terminal width and turn into broken input. Interactive loops are not something you can rely on the way you would in a normal script. Some operations that look fine in theory become messy once they have to pass through the terminal prompt by prompt.
This created a class of failures that had nothing to do with ObjectScript syntax itself. The generated code could be technically correct and still fail because the way it was being executed on the server was wrong for the environment.
We had to teach that explicitly.
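One concrete rule that came out of this: instead of pasting a multi-line loop into the terminal, put the logic in a class method, load and compile it, and call it with one short command. The path and class name here are illustrative:

	// load and compile a prepared export, then run it in a single short line
	DO $SYSTEM.OBJ.Load("/tmp/App.Fix.xml", "ck")
	DO ##class(App.Fix).Run()

A short command like this survives the terminal; a fifty-line pasted loop often does not.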
Why Atelier still mattered
We also used Atelier as part of the workflow.
Even though the interesting part of the story is the AI agent, the reality is that on an older Caché system you still need traditional tools to understand what is actually happening. Atelier was useful for browsing class definitions, checking compiled behavior, looking at code structure, and running ad hoc queries during development. In practice, the work was a mix: the agent helped us move faster, but Atelier was still part of how we verified things, explored classes, and stayed grounded in the real system.
The agent was not working in a vacuum. It was working inside a very specific InterSystems development reality.
The InterSystems-specific lessons that mattered most
What's interesting is that we didn't really have to teach the agent ObjectScript from zero. It already knew a lot. What we had to teach it was where our Caché environment differed from the safe default assumptions.
One example was ProcedureBlock. In an older codebase, details like that are not academic. They directly affect how methods behave and what kind of QUIT pattern is safe in practice. In our project, that caused real mistakes until we documented it explicitly. The model kept generating code that looked correct if you assumed the standard case, but was wrong for our actual environment.
The lesson is simple: the model may know ObjectScript, but it doesn't know your class configuration unless you tell it.
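A minimal sketch of the difference, with a hypothetical class name. In a ProcedureBlock = 0 class, local variables are not scoped to the method, so anything you don't NEW explicitly stays defined in the caller after QUIT:

	Class App.LegacyCalc [ ProcedureBlock = 0 ]
	{
	ClassMethod Total() As %Integer
	{
	 // ProcedureBlock = 0: locals are not private to this method.
	 // Without NEW, tSum and i would leak into the caller's scope.
	 NEW tSum, i
	 SET tSum = 0
	 FOR i = 1:1:3 { SET tSum = tSum + i }
	 QUIT tSum
	}
	}

Code generated for the standard ProcedureBlock = 1 case quietly skips the NEW, which is exactly the kind of mistake that looks correct in review and misbehaves at runtime.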
Another example was expression evaluation. ObjectScript does not think like C, JavaScript, or Python. If you come from those languages, you naturally assume operator precedence. In ObjectScript, that assumption can become a bug.
WRITE 1+2*3 ; outputs 9, not 7
A model trained mostly on modern languages will often leave compound expressions unparenthesized unless you make the rule explicit. Once we added a simple rule to parenthesize compound expressions, that whole class of mistakes dropped immediately.
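The rule itself is one line of discipline:

	WRITE 1+2*3      ; strict left-to-right: (1+2)*3 = 9
	WRITE 1+(2*3)    ; parentheses restore the intended 7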
The same was true with conditionals. If you come from other languages, it's easy to overlook the difference between operators that always evaluate both sides and operators that short-circuit. In a legacy system, that difference can turn into a real production issue very quickly. Once we documented the pattern clearly, the model became much more reliable.
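The pattern we documented, with hypothetical variable names: single & and ! always evaluate both operands, while && and || short-circuit. The difference matters whenever the second test is only valid if the first one passes:

	// & evaluates both sides: person.Name runs even when person is "",
	// which throws <INVALID OREF>
	IF (person '= "") & (person.Name '= "") { DO ProcessName(person) }
	// && short-circuits, so the guard actually protects the second test
	IF (person '= "") && (person.Name '= "") { DO ProcessName(person) }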
We also taught it patterns that are very natural in Caché once you know them well. A good example is process-private globals. Once we showed the model where and why we use them, it started applying that pattern naturally for temporary working data in longer flows.
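A small sketch of the pattern, with an illustrative global name. The ^|| prefix marks a process-private global: it is visible only to the current process and is removed automatically when the process ends:

	// stage intermediate rows without touching shared globals
	SET ^||Work("rows", 1) = "first intermediate result"
	SET ^||Work("rows", 2) = "second intermediate result"
	// explicit cleanup between runs within the same process
	KILL ^||Work("rows")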
Then there were JSON details. %DynamicObject.%Set() needs an explicit type if you want a real JSON boolean. That sounds minor until an API consumer expects true and gets 1.
SET tObj = {}, isActive = 1
// Returns {"active":1} — the value is stored as a number
DO tObj.%Set("active", isActive)
// Returns {"active":true} — the explicit "boolean" type produces a real JSON boolean
DO tObj.%Set("active", isActive, "boolean")
After one rule and one example, the model started doing it correctly.
And then there was the knowledge that wasn't in any official documentation at all.
In our system, some Hebrew text handling depended not only on the value itself but on which code path produced it. The correct conversion sometimes depended on what happened earlier in the method chain. That kind of knowledge doesn't exist in product manuals. It exists in the history of the application. The model couldn't infer it. We had to write it down. And honestly, writing it down helped us understand it better too.
What changed once we did this properly
After we built the right context, the value became real.
The agent became useful for reading old business logic, suggesting safer first drafts, helping with refactoring, and moving faster on the UI and API work around the system. It didn't replace understanding Caché — if anything, it made our own understanding even more important, because every recurring mistake had to be turned into a rule, and writing the rule forced us to understand the behavior precisely.
Don't ask whether the model knows ObjectScript. Ask whether you've documented enough of your own system for the model to work safely inside it.
Conclusion
If you're working on a long-lived InterSystems application and thinking about using an AI coding agent, don't start with prompts. Start with onboarding.
The model already knows a lot about code. What it doesn't know is your Caché version, your configuration, your terminal-based deployment habits, your patterns, and your legacy behavior. Once we captured that knowledge in documentation, CLAUDE.md, and hooks, the results changed completely. Before that, it was an interesting experiment. After that, it became a real part of how we work.