- Christos Panagiotidis
- Aug 29
In the golden age of arcade cabinets, when your Pac-Man machine got too popular, you pushed another cabinet onto the floor and split the line. Cosmos DB for MongoDB (vCore) just gave you the cloud version of that move: you can add physical shards to a cluster and rebalance data (currently in preview), so your workload scales horizontally without taking the lights down. It’s the feature a lot of teams have been waiting for, because it bridges the gap between “scale up” and “time to re-architect everything” with a button that feels satisfyingly clicky. Microsoft Learn
Let’s decode it. The vCore flavor of Cosmos Mongo has always supported logical sharding at the collection level, mapping shard keys to logical partitions while the service handles the heavy lifting under the hood. What’s new is the ability to increase the number of physical shards in your cluster and then trigger a data rebalance so those logical shards spread out onto the new hardware without downtime. If you’ve ever lived through a manual shard expansion on DIY infra, you know this is less “drag crates behind the venue” and more “roadies did it before soundcheck.” The docs call out that this capability is in preview, with no SLA, so save it for non-production or for carefully controlled rollouts where preview is acceptable. You enable the rebalancer, add a shard, and kick off the balancer commands—then you watch storage skew converge while your app keeps serving. That’s the mixtape flipping sides without missing a beat. Microsoft Learn
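If the logical-versus-physical distinction feels abstract, a toy model helps. The sketch below is not the service's algorithm, just a greedy illustration of the idea: logical shards live on physical shards, and a rebalance moves some of them onto the newcomer until the counts are roughly even. The function name and the shard labels are made up for the example.

```python
def rebalance(placement: dict[str, str], physical_shards: list[str]) -> dict[str, str]:
    """Return a new logical-to-physical placement with roughly even counts.

    Toy greedy model: move one logical shard at a time from the fullest
    physical shard to the emptiest until the counts differ by at most one.
    """
    by_physical: dict[str, list[str]] = {name: [] for name in physical_shards}
    for logical, physical in sorted(placement.items()):
        by_physical[physical].append(logical)

    while True:
        fullest = max(physical_shards, key=lambda s: len(by_physical[s]))
        emptiest = min(physical_shards, key=lambda s: len(by_physical[s]))
        if len(by_physical[fullest]) - len(by_physical[emptiest]) <= 1:
            break
        by_physical[emptiest].append(by_physical[fullest].pop())

    return {logical: physical
            for physical, logicals in by_physical.items()
            for logical in logicals}


# Eight logical shards on two physical shards, then a third one is added.
before = {f"logical-{i}": f"physical-{i % 2}" for i in range(8)}
after = rebalance(before, ["physical-0", "physical-1", "physical-2"])

moved = sorted(k for k in before if before[k] != after[k])
print("moved to new hardware:", moved)
```

Notice that only a couple of logical shards move; everything else stays put, which is the intuition behind an online rebalance that doesn't reshuffle the whole dataset.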
This isn’t just a checkbox for capacity nerds. It’s a design unlock. Teams that started vertically—bigger compute tier, larger disks—now have a sanctioned path to go horizontal when single-shard limits or write throughput say, “enough.” And because vCore decouples compute and storage to a degree, you can still make smart vertical moves where it pays while reserving horizontal expansion for the moment data or IOPS demands it. The broader vCore guidance emphasizes that vertical scaling is great for read-heavy workloads where memory helps and for avoiding the modeling gymnastics of sharding too early; but eventually the party gets too big for one room. That’s the moment to add a shard, light up the rebalancer, and spread the groove. Microsoft Learn
Operationally, treat preview like you’re driving your DeLorean on fresh snow. Test first. Start with a staging cluster that mirrors production scale closely enough to be meaningful. Add a single physical shard; the service only allows increasing shard count one at a time anyway, so bake that cadence into your runbook. Enable the “Rebalancer for multishard clusters” feature flag in the portal, confirm it’s on, then step through the Mongo shell commands to start and observe the balancer. While the rebalancer runs, keep an eye on CPU, disk IOPS, and the spread of storage across shards. The docs reassure that the operation is online, but every dataset is its own animal, so use Application Insights and your own telemetry to watch p95 latencies and error rates like a hawk. When you’re done, write down what you saw and what you’d change. That’s how you turn a preview into muscle memory. Microsoft Learn
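"Watch storage skew converge" is easier to put in a runbook when it's a number. Here is a minimal sketch, assuming you can already pull per-shard storage figures from your own telemetry. The function names, the 1.2 threshold, and the sample values are all invented for illustration.

```python
def storage_skew(shard_storage: dict[str, float]) -> float:
    """Ratio of the largest shard's storage to the mean; 1.0 is perfectly even."""
    if not shard_storage:
        raise ValueError("need at least one shard")
    mean = sum(shard_storage.values()) / len(shard_storage)
    if mean == 0:
        return 1.0  # an empty cluster is trivially balanced
    return max(shard_storage.values()) / mean


def has_converged(samples: list[dict[str, float]], threshold: float = 1.2) -> bool:
    """True once the most recent sample's skew is at or below the threshold."""
    return bool(samples) and storage_skew(samples[-1]) <= threshold


# Made-up readings in GiB: right after adding shard-2, then later on.
samples = [
    {"shard-0": 600, "shard-1": 600, "shard-2": 0},
    {"shard-0": 420, "shard-1": 410, "shard-2": 370},
]
print("initial skew:", storage_skew(samples[0]))
print("converged:", has_converged(samples))
```

Pick a threshold that matches your own tolerance, and record the skew alongside your p95 and error-rate readings so the rehearsal notes have numbers in them.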
Choosing or revisiting your shard key is the place where architecture meets art. Good keys spread writes, avoid hot partitions, and line up with your most common query filters. If you’re migrating an unsharded collection that’s grown unwieldy, consider whether your access patterns have drifted since you launched. A key that was fine at 10 TB might be a bottleneck at 25 TB. If you can model with a compound key that balances cardinality with locality, you’ll help the balancer do its job without creating a constellation of micro-hotspots. Meanwhile, remember that vCore gives you big disks—up to tens of terabytes per shard—so you can use sharding to manage throughput and parallelism even when disk headroom looks generous. The point isn’t just “fit”; it’s “flow.” Microsoft Learn
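You can get a feel for cardinality before touching a real cluster. The sketch below assumes hash-style placement and uses a stand-in hash over eight hypothetical buckets; the service's real placement logic is its own business. It compares a low-cardinality key against a compound one on synthetic documents. The field names and bucket count are assumptions for the example.

```python
import hashlib
from collections import Counter


def bucket_for(key: tuple, n_buckets: int) -> int:
    """Map a shard key value to a bucket with a stable hash (illustrative only)."""
    digest = hashlib.sha256(repr(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def hottest_share(keys: list[tuple], n_buckets: int = 8) -> float:
    """Fraction of documents that land in the single busiest bucket."""
    counts = Counter(bucket_for(k, n_buckets) for k in keys)
    return max(counts.values()) / len(keys)


# Synthetic documents: 3 regions, 500 customers.
docs = [{"region": ("eu", "us", "apac")[i % 3], "customer_id": i % 500}
        for i in range(3000)]

by_region = hottest_share([(d["region"],) for d in docs])
by_compound = hottest_share([(d["region"], d["customer_id"]) for d in docs])
print(f"region only: {by_region:.2f}, region + customer_id: {by_compound:.2f}")
```

With only three distinct values, the region-only key can never touch more than three buckets, so at least a third of the data piles into one of them. The compound key has enough distinct values to spread out. Run the same check against a sample of your real key values before committing.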
Capacity planning gets more fun with options. Picture a growth curve where you scale compute for read latency today, then scale storage next quarter, then add a shard the quarter after that when your write volume turns into a stadium tour. With vCore, you can stage those moves instead of doing a forklift migration that takes your weekend hostage. In practice, that can look like autoscaling your app tier to protect the hot path while you spread data on the backend—then stepping down compute once distribution levels out. If your latency SLOs are tight, schedule shard adds during off-peak windows and throttle client traffic if you must. You’ll find your groove quickly after the first rehearsal.
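One way to keep those staged moves honest is to write the decision down as code. This is a rough sketch with placeholder thresholds, not guidance from the docs. The metric names, the cutoffs, and the priority order are all assumptions you should replace with your own SLOs.

```python
from dataclasses import dataclass


@dataclass
class ClusterMetrics:
    read_p95_ms: float          # observed read latency
    disk_used_fraction: float   # 0.0 to 1.0 of provisioned storage
    write_cpu_fraction: float   # 0.0 to 1.0, CPU under sustained write load


def next_scaling_move(m: ClusterMetrics) -> str:
    """Suggest one move at a time; thresholds are placeholders, not guidance."""
    if m.write_cpu_fraction > 0.80:
        return "add-shard"       # write pressure: go horizontal
    if m.disk_used_fraction > 0.75:
        return "scale-storage"   # disk headroom is running out
    if m.read_p95_ms > 50:
        return "scale-compute"   # read latency: more memory and CPU can help
    return "hold"


today = ClusterMetrics(read_p95_ms=72, disk_used_fraction=0.40, write_cpu_fraction=0.35)
print(next_scaling_move(today))
```

Returning a single move per evaluation mirrors the one-step-at-a-time cadence: make the change, let things settle, measure again.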
And because this is 2025, let’s talk automation. Treat shard management like anything else in infra: define it as code, review it with peers, run it in pipelines, and capture every change in a runbook. When you add a shard and run a rebalance, tag the cluster with metadata so six months from now you can answer, “Why did we go from four to five?” without digging through detective novels. While preview lasts, put guardrails around who can flip the switch. As GA approaches, you’ll already have the muscle memory—and the receipts.
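For the "why did we go from four to five?" question, a small helper that builds the tag set keeps the record consistent from one change to the next. This is a sketch under assumptions: the tag names are invented, the 256-character cap is an assumed limit you should verify for your platform, and applying the tags to the cluster is left to whatever tooling you already use.

```python
from datetime import datetime, timezone
from typing import Optional

MAX_TAG_VALUE = 256  # assumed tag value length limit; verify for your platform


def shard_change_tags(old_count: int, new_count: int, reason: str,
                      approved_by: str,
                      when: Optional[datetime] = None) -> dict[str, str]:
    """Build string-only tags that describe a shard addition."""
    if new_count <= old_count:
        raise ValueError("this record is only for shard additions")
    when = when or datetime.now(timezone.utc)
    return {
        "shard-count-previous": str(old_count),
        "shard-count-current": str(new_count),
        "shard-change-reason": reason[:MAX_TAG_VALUE],
        "shard-change-approved-by": approved_by,
        "shard-change-date": when.date().isoformat(),
    }


tags = shard_change_tags(
    4, 5,
    reason="sustained write pressure on orders collection",
    approved_by="platform-team",
    when=datetime(2025, 8, 29, tzinfo=timezone.utc),
)
print(tags)
```

Generating the tags in the same pipeline that performs the change means the receipt gets written at the moment of the move, not reconstructed afterwards.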
Most importantly, don’t confuse horizontal headroom with a license to ignore data design. Indexes still matter. Query patterns still matter. TTLs, archiving, and back-pressure strategies still matter. Shards amplify both the good and the bad in your model. The new feature is a turbo button, not a fix-all. Use it right and your cluster scales like an anthem chorus, each shard singing the same hook in harmony. Use it wrong and you’ll be chasing echoes in an empty arena. The capability is here, the path is clear, and the preview label just means you get to be early to the party. Bring your shades; the shards are bright.
