Our Adobe DRM server cluster experienced degraded performance and, ultimately, an outage. A massive number of large files were submitted for encryption, which caused out-of-memory errors on all servers.
To address this issue, we have put together short-term and long-term plans.
In the short term, we are increasing the number of server instances in service, as well as the memory and CPU capacity of each instance. This will substantially increase redundancy and failover capacity, which in turn will minimize the risk of similar outages in the future.
In the long term, we will begin processing files asynchronously instead of immediately on receipt, which will allow us to manage server resources better. We will share more information in the coming weeks as we finalize the development and deployment roadmap.
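
To illustrate why asynchronous processing helps with resource management, here is a minimal Python sketch of the general idea. It is not our implementation and the roadmap is not yet final. The function name `process_async`, the `handler` callback, and the `workers` and `max_pending` parameters are all hypothetical, chosen only for this example. Incoming jobs wait in a bounded queue and a fixed number of workers handle them, so the amount of work in progress at any moment is capped no matter how many files arrive.

```python
import queue
import threading


def process_async(jobs, handler, workers=2, max_pending=4):
    """Run handler(job) for each job on a fixed pool of worker threads.

    Hypothetical sketch: at most `workers` jobs are in progress and at most
    `max_pending` jobs wait in the queue, so resource use stays bounded.
    Jobs must be hashable (for example, file names).
    """
    pending = queue.Queue(maxsize=max_pending)
    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            job = pending.get()
            if job is None:  # sentinel: no more work for this worker
                return
            try:
                outcome = handler(job)
            except Exception as exc:  # record the failure, keep the worker alive
                outcome = exc
            with lock:
                results[job] = outcome

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for job in jobs:
        pending.put(job)  # blocks while the queue is full (backpressure)
    for _ in threads:
        pending.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Stand-in for the real encryption step, which is not shown here.
    def fake_encrypt(name):
        return name + ".encrypted"

    print(process_async(["a.epub", "b.epub", "c.epub"], fake_encrypt))
```

In this sketch a submitter waits when the queue is full rather than pushing more work onto the servers, which is the trade-off asynchronous processing makes: files may not be handled instantly, but memory use no longer grows with the number of files delivered.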