Self-hosting your database

There was a discussion recently on Hacker News on whether you self-host your database, and if so, why. The common practice currently is to host your database with a service like RDS, so anyone hosting it themselves would be doing so for a good reason. The most common reason people shared was price. And everyone agreed, in general, that it is better to host any production database on the cloud.

This reminded me of the time when I opposed moving our company’s production database to the cloud. Although I lost the argument eventually, I did argue rather strongly. And when I look back now, my stand does not seem quite that unreasonable.

We were hosting a MySQL database on a VM, the same VM that hosted the application code as well (this was a rather common setup in those days). The business was getting bigger, and it was time to move to a better setup, better in terms of both reliability and scale. We would need proper backups, monitoring and eventually replication. My plan was to move the database to another VM, and then add these things one step at a time (we had backups already).

The CEO [1] wanted the database to be hosted on the cloud. We were using Azure, and these were the early days of Azure. AWS RDS was there, but moving to AWS was not a viable option. This was a healthcare startup, so data security was becoming progressively more important. HIPAA compliance was not a compulsion (in India) yet, but it could become one any day.

Azure didn’t have a managed database offering in those days. There was one company, called ClearDB, that used to offer managed MySQL on Azure. I was asked to conduct a technical evaluation. I ran some performance tests on a small instance, and things seemed generally okay. The only problem was the price - all the reasonable plans were 200-400 USD a month or more. I mean, this was essentially a VM within an Azure datacentre on which they would install MySQL plus a few more things (for monitoring, replication etc. - things I thought I could do as well), and charge ten times the money. And it wasn’t even auto-scaling - you decided on the instance size during signup, and hoped for the best. And (IIRC) they would start dropping writes when we exceeded the storage limits. It felt quite ridiculous actually.
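To put rough numbers on that complaint, here is a minimal sketch in Python. It takes the "ten times" figure literally as a price ratio, which is only a rough estimate and not a quoted price, and the function name `yearly_premium` is made up for illustration.

```python
# Figures from the text: managed plans at 200-400 USD/month, roughly ten
# times the cost of a comparable plain VM. The 10x ratio is an assumption
# based on a rough estimate, not on a published price list.
MARKUP = 10


def yearly_premium(managed_monthly_usd: float, markup: float = MARKUP) -> float:
    """Extra yearly spend of a managed plan over the implied bare-VM price."""
    vm_monthly_usd = managed_monthly_usd / markup
    return (managed_monthly_usd - vm_monthly_usd) * 12


for plan in (200, 400):
    print(f"{plan} USD/month managed: about {yearly_premium(plan):.0f} USD/year extra")
```

Under that assumption the premium works out to a couple of thousand dollars a year, which is a noticeable sum for a small startup, though it does not account for the time spent running the database yourself.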

The ClearDB price chart today. There are community and developer plans too.

My CEO liked everything about them, and especially the master-master replication feature. That, he thought, would take care of disaster recovery (say, from accidental deletions) too. And of course there was this notion:


Don’t be maintaining servers,
focus on your application

Price was not a big concern.

My argument was based on two factors. One was price. The other was maintainability. I argued that MySQL was a fairly easy database to maintain. I had maintained MySQL databases for a few years by then, and I had never had any problems [2]. It was mostly set up and forget. When I think of maintainability, my impression is based on the maintainability of application software. I could never feel the same way about a Ruby or Python or WordPress application: these need far more involvement. Not so with database software. (There were some flaws in my reasoning; I’ll come to those.)

Now here’s my policy when it comes to technical disagreements:

  • Share opinions as objectively as possible.
  • Let the business/product owner have the last word.

The second point is the more obvious one; any professional should do that. But the first needs more elaboration. Often my technical views contain more opinion than objective fact. I read tech news and opinions to keep myself up-to-date with multiple viewpoints, but staying a hundred percent objective is difficult. Only when I put my points before colleagues is the current validity of those opinions challenged. Sometimes a simple common-sense argument from a colleague beats loads of wisdom acquired on the internet.

In this case, however, I did not feel very convinced, but decided to continue with the cloud plan because the CEO was so excited about it. We were just about to migrate to ClearDB when news came out that Azure would offer its own managed database service (similar to RDS).

The service was Azure Database for MySQL [3]. There were months between announcement and availability, and you need to keep tabs on multiple announcement channels to find out when such a service launches. Azure too sensed this need, which is why there was a separate Twitter account for the service. That’s an important catch while dealing with the cloud, especially in India - services and features exist but are not available in your region. At one point we had to connect with a PM within Azure to check if they were ever going to launch. Anyway, it got launched eventually, we set it up as required by the business, and moved on.

This was back in 2017, I think. Now, in 2021, it’s hard to find anyone running a self-hosted production database. A lot of hosting providers have come up with managed database services (including DigitalOcean recently). Prices have come down and it makes much more sense than before.

My own technical worldview has changed quite a bit. Now that I have been part of less funded projects that need high scalability, a lot of popular adages make sense. It seems correct that I need to spare time from system operations so I can focus on the application code, even if that means a higher cost of infrastructure. But then there are several nuances to this that the promoters do not mention.

First off, planning and calibration are still my job. Cloud providers give us a dozen switches and levers, and I need to figure out which ones should be turned on. How big should the instance be? Do I need IOPS provisioning? What about backups, encryption, performance insights, monitoring, logging, maintenance upgrades and so on? Many of those settings can’t be changed once provisioning is done. And each of them has some price implication or other. To make informed choices, you need to read the documentation. It’s safe to say there’s a learning curve.


This is the form you fill in while provisioning an RDS database.

Second, there are quite a few things that the cloud provider does not provide out of the box, and this is not immediately apparent. For example, since RDS is a managed database service, you’d imagine they’ll apply security fixes on their own. But in reality, the best they do is inform you of well-known vulnerabilities and ask you to upgrade, and even that not for all bugs. A friend recently hit a rather unique kind of bug in PostgreSQL, and he had to find the patch and apply it on RDS himself. And then, all these advanced features - replication etc. - also need to be configured by you.

Third, price is still an important consideration. If you think spending hundreds of dollars on a database is too expensive, the cloud is not for you. And if we add serverless into the mix, bills can get out of control very fast.

In conclusion, I think it makes sense to use a cloud-hosted database as your default choice. Whether or not that is the optimal choice depends on the needs of the project, both technical and financial. A little research before settling on the final choice never hurts. Often our choices are based on our opinions, prejudices and current trends. Some among us are able to set aside their prejudices before deciding - I’m surely not one of them yet.




1. It was a startup, so CEO == CTO == Product Manager (the wearing multiple hats thing).
2. Though none of them were large scale by any means.
3. I know naming things is hard, but still, one of the biggest software companies in the world came up with this name for its MySQL offering: ‘Azure Database for MySQL’. It was nearly impossible to reach from search engines.

© 2023