Supercharged Data Migration With Native Scripting

In 2024, I worked on a large user-account migration (~1.6M accounts) off a game-services cloud provider (PlayFab), where much of our critical identity and commerce data lived:
- authentication
- game ownership data
- storefront entitlements
The “easy” part was moving the data. The hard part was moving it safely, with minimal downtime, against a fragmented legacy data model that had drifted for years.

While planning, I ran into a severe throughput constraint: certain user data could only be extracted iteratively, one user at a time, from PlayFab. Our first pass was a Node.js script, but no matter which concurrency strategy I tried, it fell far short of the throughput we needed; the extraction step alone was projected to take 6+ hours of downtime, which was unacceptable.

So I rewrote the extraction pipeline in C++ with libcurl, pushing concurrency hard: opening as many sockets as the kernel would allow and saturating outbound requests right up to the rate limit. The difference was night and day: the 1.6M-user extraction completed in ~35 minutes.
“Make it faster” isn't always the right answer, but in this case it bought us many extra hours, which we spent smoke-testing the migrated data in the new system.