Lessons learned from shipping a regulated enterprise Healthcare app with Claude Code
In my last post I talked about how AI tools like Claude Code are redefining the sales process and sales cycle expectations....
…. moving from persuasion to proof-based selling changes the conversation from “do we like this?” to “how do we make this real in production?”
This post is all about what actually got built and, more importantly, the lessons learned and key insights gained while building enterprise software with AI coding tools.
First, let me say very emphatically: based on my experience, you are not “vibe coding” real enterprise software. That holds both for building something that needs to run in production in highly regulated industries AND for the deep domain knowledge required to even know what to build and how to build it. I purposely picked this industry and project as the place to start because 1- it’s the most relevant to my own interest (enterprise software); and 2- I’m interested in solving the hardest problems, not the easiest ones. There is nothing more rigorous than selling and shipping into enterprise healthcare, where there are lives on the line. Building something like this requires a deep understanding of things like the clinician workflow, the dominant lab sources, LOINC coding, lab-specific reference ranges, multi-date panel correlation, what a HIPAA auditor actually looks at ... down to “trivial” things like TSH lab values needing two decimal places because that’s clinically meaningful. My perspective on what is going to happen to the enterprise software/SaaS industry is highly informed by this project, but I’ll save that for another day.
So what was actually built? I want to be specific about this because the details are what separate this from a vibe-coded demo. In a nutshell, the system automates the extraction of lab values from the relevant lab sources (Quest, LabCorp, etc.) in the exact manner these clinicians need, saving over 10 hours per week per clinician.
No Vibe Coding Here. Everything on this page (the meat of the product) requires domain knowledge of what to build and how:
Diabetes vs Thyroid Presets. Reference ranges. LOINC codes. MD Summary format (a preference for this particular group). The app can also push right into the EHR, or the clinician can hit the “Copy” button and paste the summary in themselves as they get comfortable with the workflow. This is the human-in-the-loop component, and it’s critical to understand the best way to integrate large changes into a workflow from a human perspective. I know everyone is going crazy for fully automated magic, but knowing when to do that versus when to keep a human in the loop is critical. None of this came from telling Claude Code to build me this app. It came from being in the room with clinicians, understanding the nuances of their specialty, learning their workflow, and then translating that into precise requirements for Claude to build.... the fun stuff! For example, why is it important to display creatinine alongside glucose values? Because diabetes damages kidneys and creatinine measures kidney function.
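To make the presets idea concrete, here is a minimal sketch (not the production code) of how specialty presets and per-analyte display precision might be modeled. The names `Analyte`, `PRESETS`, `format_value` and `summarize`, as well as the analyte lists, units and precisions, are illustrative assumptions. Reference ranges are deliberately left out because they are lab-specific.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class Analyte:
    name: str
    unit: str
    decimals: int  # display precision the clinicians expect


# Hypothetical presets: which analytes each specialty view shows, in order.
PRESETS = {
    "diabetes": [
        Analyte("Glucose", "mg/dL", 0),
        Analyte("Hemoglobin A1c", "%", 1),
        Analyte("Creatinine", "mg/dL", 2),  # kidney function, shown with glucose
    ],
    "thyroid": [
        Analyte("TSH", "mIU/L", 2),  # two decimals are clinically meaningful
        Analyte("Free T4", "ng/dL", 2),
    ],
}


def format_value(analyte: Analyte, raw: str) -> str:
    """Format a raw lab value (kept as a string) at the analyte's precision."""
    quantum = Decimal(1).scaleb(-analyte.decimals)  # e.g. 0.01 for 2 decimals
    value = Decimal(raw).quantize(quantum, rounding=ROUND_HALF_UP)
    return f"{analyte.name}: {value} {analyte.unit}"


def summarize(preset: str, results: dict) -> str:
    """Build a copy-and-paste summary containing only the preset's analytes."""
    lines = [
        format_value(analyte, results[analyte.name])
        for analyte in PRESETS[preset]
        if analyte.name in results
    ]
    return "\n".join(lines)
```

Calling `summarize("thyroid", {"TSH": "2.5"})` yields `TSH: 2.50 mIU/L`. Using `Decimal` instead of floats keeps the trailing zero and avoids binary rounding surprises in displayed values.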
Built-In HIPAA Auditing, all available real-time:
Multi-Cloud OCR with runtime switching between the relevant AWS, GCP and Azure services for lab processing. This was done intentionally so the solution can be sold to customers regardless of which major cloud provider they have settled on, and as a secondary benefit it provides a nice fallback in case one of the services has issues. Note: This particular customer is all in on AWS.
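Runtime switching with fallback can be sketched roughly like this. It is an illustration under assumptions, not the actual implementation: `OcrRouter` is a hypothetical name, and the three provider functions are stand-in stubs rather than real cloud SDK calls (the AWS stub fails on purpose to show the fallback).

```python
from typing import Callable, Dict, List, Tuple

OcrFn = Callable[[bytes], str]


class OcrRouter:
    """Tries OCR providers in priority order; the order can change at runtime."""

    def __init__(self, providers: Dict[str, OcrFn], order: List[str]):
        self.providers = providers
        self.order = list(order)

    def set_primary(self, name: str) -> None:
        """Move one provider to the front without restarting the app."""
        if name not in self.providers:
            raise KeyError(name)
        self.order = [name] + [n for n in self.order if n != name]

    def extract_text(self, document: bytes) -> Tuple[str, str]:
        """Return (provider_name, text) from the first provider that succeeds."""
        errors = []
        for name in self.order:
            try:
                return name, self.providers[name](document)
            except Exception as exc:  # any provider failure triggers fallback
                errors.append(f"{name}: {exc}")
        raise RuntimeError("all OCR providers failed: " + "; ".join(errors))


# Stand-in stubs (assumptions); real code would call each cloud's OCR service.
def aws_ocr(document: bytes) -> str:
    raise TimeoutError("simulated outage")


def gcp_ocr(document: bytes) -> str:
    return "text from gcp"


def azure_ocr(document: bytes) -> str:
    return "text from azure"


router = OcrRouter(
    {"aws": aws_ocr, "gcp": gcp_ocr, "azure": azure_ocr},
    order=["aws", "gcp", "azure"],
)
used, text = router.extract_text(b"%PDF...")  # aws fails, gcp answers
```

Keeping each provider behind the same small function signature is what makes both the per-customer switch and the fallback cheap to add.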
Authentication is done via AWS Cognito with RS256 JWT verification and automatic refresh on key rotation, with appropriate timeouts and lockouts preconfigured. These are not nice-to-haves; these are requirements in healthcare.
The other point I want to make explicitly is that Claude Code didn’t just write the application code. I also wrangled it to handle the entire lifecycle of standing up the production AWS infrastructure (this customer is all in on AWS). Claude Code generated the Terraform config for the entire stack, wrote the Dockerfile, the docker-compose orchestration, and so on. And the entire app can be managed by a few Claude Code skills, without ever having to touch AWS directly:
Everything is managed via Claude Code skills. These skills are infrastructure-as-conversation --> workflow-as-code. This is a different kind of DevOps: no YAML pipelines, no Jenkins, no bash scripts with cryptic flags to maintain. This is using agents to do the deploys. Made changes and want to test locally? Just type “/local”. Ready to deploy to prod? Just type “/deploy”. This is just one way of handling this, YMMV, but I thought it was pretty cool and efficient. There are of course pros and cons, but the main takeaway is that we need to completely reimagine how software is built, deployed and managed from first principles. Just think about this: Claude Code wrote the app, stood up the production environment, configured HIPAA-compliant cloud services and automated the deployment. Just mind-blowing stuff.
One other interesting thing to mention is that this was originally designed to use AWS Textract AND Amazon Comprehend Medical. After some testing, we ended up removing Comprehend Medical from the pipeline because, for this particular use case, we were able to find everything deterministically and do it faster and at lower cost. Comprehend Medical is truly an awesome service, but for this use case with bounded inputs, standardized formats, etc., the deterministic strategy won on every metric. There is an entire lesson here on making deterministic vs probabilistic design decisions and tradeoffs, which I’ll save for another post. It’s truly fascinating and I think it’s one of the key skills that needs to be developed as we go forward. We evaluated 4 levels of agentic architecture and rejected all of them in favor of a deterministic pipeline.
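For a sense of what “deterministic” can mean with bounded inputs, here is a minimal sketch that pulls known analytes out of OCR text using a fixed alias table and one regular expression. Everything in it is an assumption for illustration: the line layout, the `ALIASES` table, and the names `LabResult` and `extract`. Real lab reports are messier, and the LOINC codes shown should be verified against the LOINC database before use.

```python
import re
from typing import List, NamedTuple


class LabResult(NamedTuple):
    loinc: str
    name: str
    value: str  # kept as text so the reported precision is preserved
    unit: str


# Hypothetical alias table: OCR label -> (LOINC code, display name).
ALIASES = {
    "GLUCOSE": ("2345-7", "Glucose"),
    "CREATININE": ("2160-0", "Creatinine"),
    "HEMOGLOBIN A1C": ("4548-4", "Hemoglobin A1c"),
    "TSH": ("3016-3", "TSH"),
}

# Assumed line layout: "<label> <numeric value> <unit>", one result per line.
LINE = re.compile(
    r"^\s*(?P<label>[A-Za-z][A-Za-z0-9 ]*?)\s+"
    r"(?P<value>\d+(?:\.\d+)?)\s+(?P<unit>\S+)\s*$"
)


def extract(ocr_text: str) -> List[LabResult]:
    """Return results for known analytes; every other line is ignored."""
    results = []
    for line in ocr_text.splitlines():
        match = LINE.match(line)
        if not match:
            continue
        key = match.group("label").strip().upper()
        if key in ALIASES:
            loinc, name = ALIASES[key]
            results.append(
                LabResult(loinc, name, match.group("value"), match.group("unit"))
            )
    return results


sample = "TSH 2.50 mIU/L\nGLUCOSE 105 mg/dL\nPatient: REDACTED\nHemoglobin A1c 6.1 %"
parsed = extract(sample)
```

The same input always produces the same output, there is no model call to pay for or wait on, and an unrecognized line is skipped rather than guessed at.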
A few key takeaways for me....
1- The human role didn’t shrink, it elevated. What to build. How to build it. Deep domain expertise. Understanding tradeoffs. This is a move towards architecture, domain modeling, judgement and taste.
2- Vibe coding produces nice demos. Domain expertise + AI produces real solutions.
3- The right way to think about this is that you are the brain and Claude Code is your hands.
4- Think very hard about where your software is deterministic vs probabilistic (i.e. LLM usage).
5- Using Claude Code as my DevOps engineer was unexpected, and awesome.
6- You cannot outsource your thinking. You need to steer these tools. Think of it like you are molding clay.
Most “I built X with AI” posts are dashboards, chatbots, or the like. This is a HIPAA-compliant app that’s multi-cloud, deployed on AWS with audit logging, JWT auth, PHI redaction and EHR integration. Anyone who says you cannot use AI to build in regulated industries needs to just give it a try. It’s possible, but not by vibe coding. I’m truly blown away by what is possible with these tools. Remember, this was all done in weeks, not months. This is all enabled by proof-over-persuasion selling.
Happy building and selling!





