Government Technology

47 Users, 47 Engineers

Government contracting has a 100x overbuilding problem. Internal apps serving dozens of users get architected like they're competing with Amazon. The taxpayer pays for it. The mission suffers.

Intelligrit LLC | March 2026

Bottom Line Up Front

Your internal systems take 18 months instead of 60 days because they're overbuilt by 100x. Teams architect for millions of users when the app will serve dozens. They choose complex deployment schemes that turn every change into a multi-team coordination exercise. And every unnecessary layer — each microservice, each orchestrator, each environment — multiplies the security reviews, ATO documentation, and operational burden that already make government delivery slow.

This is a solved problem. A single compiled binary with an embedded database, cache, and storage handles 20,000+ real-world users on one server. Deploys take two commands. Backups are a file copy. Scaling to hundreds of thousands is a config change. The article below explains how, with benchmarks.

The Question Nobody Asks

Before the first sprint planning session, before the CI/CD pipeline is designed, there's one question that should come first:

How many people will actually use this?

In 2014, Mike Acton gave a talk at CppCon called "Data-Oriented Design and C++". The core idea: programmers systematically avoid understanding the actual problem they're solving. They build abstractions on top of abstractions, optimize for imaginary scenarios, and never look at the data. His point was simple: "The best way to solve problems you don't have is to not solve them." Solving imaginary problems doesn't just waste time — it creates concrete problems you now definitely do have. Every unnecessary abstraction, every speculative scaling layer, every "just in case" architecture decision introduces real complexity, real bugs, and real maintenance burden. In highly regulated environments like government, where every component must be documented, audited, and authorized, each imaginary problem you pre-solve multiplies into very real paperwork, very real deploy complexity, and very real security surface area.

This doesn't mean you go in blind. You won't always know exact user counts up front — even government agencies often don't. But you can understand how to structure data without painting yourself into corners. You can make sane technology choices that scale naturally. You can design interfaces that don't assume a specific backend. The goal isn't clairvoyance about your user base — it's discipline about not pre-building for problems that haven't materialized.

Most internal government applications have small user bases. A case management tool might serve 30 people. A reporting dashboard, 75. A workflow system, 200. Some serve a few thousand — and yes, some government systems genuinely serve tens of thousands of users. The point isn't that government apps are always small. The point is that the architecture shouldn't assume millions when the reality is hundreds, or even thousands.

And the number that matters is concurrent users — people simultaneously making requests at any given moment. An app with 10,000 registered users where each person uses it 30 minutes per day, with all usage falling between 10am and 3pm? That's about 1,000 concurrent users at peak. (30 minutes of use in a 300-minute window = 10% concurrency.) On the right stack, a single server handles that comfortably — and handles 20,000+ real-world users before you even think about scaling. That's not optimistic. That's what a compiled binary with an embedded database and in-process cache actually does on commodity hardware.
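The back-of-envelope math above can be sketched in a few lines of Go (the function name and numbers are ours, chosen to match the example):

```go
package main

import "fmt"

// peakConcurrent estimates peak concurrent users from registered users,
// average minutes of use per day, and the length of the peak-usage
// window in minutes: concurrency fraction = minutes used / window size.
func peakConcurrent(registered int, minutesPerDay, windowMinutes float64) int {
	return int(float64(registered) * (minutesPerDay / windowMinutes))
}

func main() {
	// 10,000 registered users, 30 min/day each, all between 10am and 3pm
	fmt.Println(peakConcurrent(10000, 30, 300)) // 1000
}
```

Run it against your own user counts before designing anything; the answer usually lands well inside single-server territory.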

The 47-User Architecture

Yet here's what typically gets built for a 47-user internal app:

  • Kubernetes cluster across multiple availability zones
  • Helm charts, ArgoCD, GitOps pipeline
  • Container registry, image scanning, promotion workflows
  • Service mesh for inter-service communication
  • Separate microservices for auth, API gateway, business logic, notifications
  • Multi-stage CI/CD with 15+ pipeline steps
  • Secrets management vault with dynamic credential rotation
  • Three environments mirroring production exactly
  • Multi-region failover with automated health checks
  • Centralized logging, distributed tracing, custom dashboards

That's not a system designed to serve 47 users. That's a system designed to employ 47 engineers.

Enterprise Architecture: $4.7M/yr
47 engineers, K8s cluster, service mesh, 15-step CI/CD, three environments

Right-Sized Architecture: $800/mo
Single binary, embedded database, 2-command deploy, one server

The Downstream Explosion

Each architectural layer multiplies work across every compliance dimension

Every unnecessary architectural layer multiplies the work across every dimension of the project. It's not just that the build is more complex — everything connected to it becomes more complex.

Security reviews expand to cover every component. A Kubernetes deployment means documenting container security, network policies, service-to-service auth, ingress configuration, secrets management, image provenance, and cluster hardening. For a monolith on a VM behind Caddy, you document the binary and the reverse proxy.
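For illustration, "the reverse proxy" in that sentence can be a two-line Caddyfile (the hostname and port here are hypothetical placeholders); Caddy provisions and renews TLS certificates automatically, so this fragment is the entire edge configuration to document:

```
app.internal.agency.example {
    reverse_proxy localhost:8080
}
```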

ATO documentation balloons. Every microservice is a separate component in the system security plan. Each one needs its own boundary analysis, data flow diagram, and control implementation. A system with eight microservices doesn't have 8x the documentation — it has closer to 20x, because you also document every interaction between them.

Deploy windows become coordination exercises. When a deploy touches multiple services, registries, and environments with promotion gates, a change that should take minutes takes days of scheduling and cross-team communication. Developers sit idle waiting for deploy approvals on systems that serve three dozen people.

Incident response becomes archaeology. When something breaks in a distributed system, finding the root cause means correlating logs across services, checking network policies, verifying container health, and tracing requests through multiple hops. In a single binary, you read one log file.

Compliance requirements don't mandate any of this. FedRAMP, FISMA, Section 508, NIST 800-53 — these require that your system is secure, documented, and accessible. A single binary behind a TLS-terminating reverse proxy meets those requirements just as well as an eight-service Kubernetes deployment — and the security boundary is dramatically easier to audit. A lot of what passes for "compliance-driven architecture" is process theater: complexity dressed up as thoroughness.

Why does it happen? Complex architectures are self-justifying. They require large teams. Large teams require large contracts. Nobody gets fired for overbuilding. The architect who proposes an "enterprise-grade" solution looks thorough. The one who proposes a single binary on a VM looks naive — even if it would serve the user base perfectly and ship in a quarter of the time.

The Five Nines Delusion

99.999% uptime means less than 5.3 minutes of downtime per year. Amazon engineers for this because every minute costs millions and affects hundreds of millions of users.

Your internal case management tool is not Amazon. If it goes down for 30 minutes on a Tuesday, 47 people switch to email until it comes back. The engineering cost of preventing those 30 minutes — multi-region failover, redundant replicas, load balancer chains — vastly exceeds any business impact.

If AWS us-east-1 goes down, is it acceptable for your internal app to also be down? For almost every internal tool, the answer is yes. Single-region, single-server deployment isn't cutting corners. It's appropriate.

What Right-Sized Actually Looks Like

A series of deliberate technology choices, each individually straightforward, compound into a system that handles enormous workloads on minimal hardware. It's not magic. It's paying attention to what modern hardware can actually do.

Compiled languages with native concurrency. Go compiles to a single static binary and uses goroutines — lightweight threads multiplexed across all CPU cores, handling millions of concurrent operations with minimal overhead. The TechEmpower Framework Benchmarks consistently show Go handling hundreds of thousands of requests per second on commodity hardware.
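A minimal sketch of the cost model (the function and counts are ours, one goroutine standing in for one request): spawning 100,000 goroutines is routine, because each starts with a few kilobytes of stack and the runtime schedules them across all cores.

```go
package main

import (
	"fmt"
	"sync"
)

// countConcurrently launches n goroutines, one per simulated request,
// and waits for all of them. A mutex guards the shared counter.
func countConcurrently(n int) int64 {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int64
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			total++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return total
}

func main() {
	fmt.Println(countConcurrently(100000)) // 100000
}
```

The same pattern is what net/http does for you: one goroutine per request, no thread pool to tune.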

Embedded analytical databases. DuckDB runs in-process — no separate server, no network round trips, no connection pool management. On the H2O.ai database benchmarks, DuckDB on a single machine routinely outperforms distributed clusters running on 30x the hardware. ClickBench tells the same story. Computers are remarkably fast when you stop making them wait on the network.

In-process caching and graph analysis. A cache in the same process means a cache hit is a memory read — nanoseconds, not the milliseconds of a network hop to Redis. gonum provides 40+ graph algorithms running in-memory, directly on the data — no graph database to manage, no separate service to deploy.
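To make "a cache hit is a memory read" concrete, here is a deliberately minimal in-process cache (our own sketch, not Ristretto, which adds eviction and cost-based admission on top of this idea): a hit is a map lookup in the same address space, with no network hop and no serialization.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a minimal thread-safe in-process cache.
type Cache struct {
	mu sync.RWMutex
	m  map[string]string
}

func NewCache() *Cache { return &Cache{m: make(map[string]string)} }

func (c *Cache) Set(k, v string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[k] = v
}

func (c *Cache) Get(k string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.m[k]
	return v, ok
}

func main() {
	c := NewCache()
	c.Set("user:42", "Alice")
	v, _ := c.Get("user:42")
	fmt.Println(v) // Alice
}
```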

S3-compatible local file storage. The same API that works with the local filesystem today works with MinIO or AWS S3 tomorrow. Same interface, different backend, zero code changes.

Together these compound: no network hops between components, no serialization overhead, no distributed coordination. And the escape hatch is built in — DuckDB swaps to PostgreSQL, in-memory cache swaps to Valkey, local storage swaps to S3 — all via config changes. As Amazon's own Prime Video team demonstrated in 2023, moving from distributed microservices to a monolith cut their costs by 90%. If Amazon is simplifying, maybe the 47-user internal app doesn't need the complexity either.
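What "all via config changes" might look like, sketched as a hypothetical config.toml (the keys and URI schemes here are illustrative, not a real product's schema): each stage transition edits values in this file and nothing in the code.

```toml
# Hypothetical config.toml: the only thing that changes between stages.

# Stage 1 (everything embedded, zero external deps):
database = "duckdb:///var/lib/app/app.duckdb"
cache    = "memory"
storage  = "file:///var/lib/app/blobs"

# Stage 2/3 swap values, not code:
#   database = "postgres://app@db.internal:5432/app"
#   cache    = "valkey://cache.internal:6379"
#   storage  = "s3://agency-app-blobs"
```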

Proof: ACT-IAC Hackathon, 2024

In two weeks, a two-person team built a real-time provider data processing engine: 1.8 million healthcare providers across 550+ million rows of data, with measurable results in a live demo. Seventeen teams competed — most of them companies with larger teams and bigger budgets. We won. The engine was a Go binary with an embedded database. The language, the database, and the algorithms were chosen for the actual problem — and the actual problem didn't need distributed infrastructure.

A note on the diagram below

"Concurrent users" means simultaneously active users — people making requests at the same moment. For a moderate-use app (30 min/day per user, peak hours 10am-3pm), about 10% of your user base is concurrent at any given moment. The capacity numbers below assume this pattern. Lighter usage or more spread-out access pushes them higher.

How We Actually Scale: Three Stages

[Diagram: the same application code at three scaling stages; infrastructure changes, not rewrites. Stage 1: one Go binary on one server, with DuckDB, gonum, Ristretto, and local S3-compatible storage all embedded; zero external dependencies; ~2,000 concurrent (~20,000 real-world) users; deploy is scp plus systemctl restart, backup is a file copy or filesystem snapshot. Stage 2: the same binary as multiple instances behind a load balancer, with PostgreSQL replacing DuckDB via a config change and zero code changes; one external dependency; ~10,000 concurrent (~100,000 real-world) users. Stage 3: add Valkey for distributed sessions and caching; two external dependencies; ~50,000+ concurrent (~500,000+ real-world) users. Most internal government apps never need to leave Stage 1: start here, scale when evidence demands it.]
How We Actually Scale — same application code at every stage

Stage | Infrastructure | Embedded Components | External Dependencies | Deploy Method | Concurrent Users | Real-World Users (30 min/day)
Stage 1: Single Server | One Go binary on one VM | DuckDB (database), gonum (graph analysis), Ristretto (cache), Local FS (S3-compatible storage), Web UI (go:embed) | 0 | scp binary, systemctl restart | ~2,000 | ~20,000
Stage 2: Multi-Server | Multiple Go binaries behind load balancer | gonum (graph analysis), Ristretto (cache), MinIO/S3 (storage) | 1 (PostgreSQL) | Config change: swap connection string | ~10,000 | ~100,000
Stage 3: High Scale | Many Go binaries behind load balancer | gonum (graph analysis), MinIO/S3 (storage) | 2 (PostgreSQL, Valkey) | Config change: add Valkey for sessions and cache | ~50,000+ | ~500,000+
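The Stage 1 deploy story (scp the binary, restart the service) rests on nothing more than a systemd unit like this sketch; the paths, user, and service name are hypothetical:

```
[Unit]
Description=Internal app (single Go binary)
After=network.target

[Service]
ExecStart=/opt/app/app
User=app
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

With that in place, a release is literally two commands: copy the new binary over the old one, then systemctl restart the unit.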

The AI-First Engineering Bonus

A single binary is a known, tractable problem for AI agents. The agent builds it, runs make verify, and knows — with certainty — whether the code works. The embedded database, cache, and file storage are all exercised by the same test suite in the same process. When all tests pass, that's a real integration test, not just unit tests that happen to be green.

Go's idioms reinforce this. The language culture favors small, explicit functions — typically under 25 lines of business logic. Each function does one thing, takes explicit inputs, returns explicit outputs. There's no hidden inheritance, no implicit middleware chains, no decorator magic. An AI agent can reason about a 20-line Go function completely. It can hold the entire function, its inputs, its outputs, and its tests in context at once. That's not true of a 200-line method buried in a class hierarchy.
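Here's the style being described, as a hypothetical example of our own (the function, domain, and names are invented for illustration): explicit inputs, explicit outputs, no hidden state, short enough that a reviewer or an agent can verify it completely in isolation.

```go
package main

import (
	"errors"
	"fmt"
)

// AssignCase returns the first worker in the queue whose current
// caseload is below the maximum, or an error if everyone is at capacity.
func AssignCase(queue []string, caseload map[string]int, max int) (string, error) {
	for _, worker := range queue {
		if caseload[worker] < max {
			return worker, nil
		}
	}
	return "", errors.New("all workers at capacity")
}

func main() {
	worker, err := AssignCase(
		[]string{"alice", "bob"},
		map[string]int{"alice": 5, "bob": 2},
		5,
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(worker) // bob (alice is already at the cap of 5)
}
```

Everything the function can do is visible in its signature and its dozen lines of body; the test for it fits in the same mental frame.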

Distributed systems break all of this. AI agents can't easily spin up Kubernetes clusters, verify service mesh routing, or confirm that a Helm chart deploys cleanly across three environments. Every layer of infrastructure you add is a layer your agent can't test. A single binary with small, explicit functions closes that verification gap — and in a world where AI agents are writing most of the code, that gap is the difference between shipping daily and shipping monthly.

Why Are We Sharing This?

Everything in this article — compiled binaries, embedded databases, in-process caching — is publicly available technology. DuckDB is open source. Go has been production-ready for over a decade. The ideas aren't new. Most of the thinking behind our approach can be learned from people like Mike Acton: his CppCon 2014 talk on Data-Oriented Design, his Solving the Right Problems for Engine Programmers, and his You're Responsible talk. The philosophy is out there for anyone willing to absorb it.

What makes it work in practice is decades of engineering experience at ultra-high scale — top-30 worldwide traffic sites, petabytes of data, systems where a wrong architectural choice costs millions per month. That experience taught us how critical simplicity is. Not simplicity as laziness — simplicity as discipline. Knowing which layers to remove requires understanding what every layer does. Knowing when a single server is enough requires having operated systems where it wasn't. It requires taste, judgment, and a set of experiences that you can't get from talks alone — you get them from years of operating systems where the consequences of overbuilding are measured in dollars per minute.

Imagine if the internal app your organization needs could be in production in 60 days instead of 18 months. What problems could you solve? What mission-critical work is stalled right now, waiting on a system that hasn't shipped because it's buried under infrastructure nobody asked for?

That's the real cost of overbuilding. Not just the money — the time. The problems that don't get solved. The mission that waits.