
This is something I've been thinking about a lot lately, since I recently saw a project that didn't care about this in the slightest and used vendor-specific features of MS SQL all over the place, which brought real advantages in terms of performance optimizations.

Basically everyone advises you to write your backend generically with technologies like ODBC, JDBC, Hibernate, ... and to never use anything vendor-specific like stored procedures, vendor-specific datatypes or meta queries, the argument being that you can later switch your DBMS without much hassle.
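
To make the trade-off concrete, here's a rough sketch of the kind of choice I mean, written against plain JDBC. The orders table, column names and method names are all made up for illustration: one version sticks to standard SQL plus JDBC's own row cap, the other leans on a T-SQL-specific shortcut.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class LatestOrder {
    // Portable-leaning: standard SQL plus a JDBC-level row cap.
    // Runs against any engine, but the engine may still sort more rows than it returns.
    static String portable(Connection conn, long customerId) throws SQLException {
        String sql = "SELECT id FROM orders WHERE customer_id = ? ORDER BY created_at DESC";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setMaxRows(1); // cap results at the driver level instead of in the SQL dialect
            ps.setLong(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("id") : null;
            }
        }
    }

    // Vendor-specific: T-SQL's TOP lets MS SQL plan around returning a single row,
    // but the query now only runs on SQL Server.
    static String msSqlOnly(Connection conn, long customerId) throws SQLException {
        String sql = "SELECT TOP 1 id FROM orders WHERE customer_id = ? ORDER BY created_at DESC";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("id") : null;
            }
        }
    }
}
```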

I really wonder whether this actually happens in the real world with production software, or whether it's just advice that sounds good on the surface but never pans out in practice. I personally haven't seen any large piece of software switch to a different DBMS, even when there would be long-term advantages to doing so, because the risk and the work to retest everything would be far too great.

The only examples I know of (like SAP) were really part of a much larger rewrite or update rather than "just" switching DBMS.

otl@apubtest2.srcbeat.com 4 points 7 months ago

Great question. Short answer: yes!

Long answer: I did this on a production system about 2 years ago.

The system used MySQL, served from 3 virtual machines. Nobody took responsibility for that MySQL cluster, so outages and crazy long maintenance windows were the norm, especially as there was no DB admin expertise around. The system had been hobbling along like that for 3 years regardless.

One day the company contracting me asked for help migrating some applications to a new disaster recovery (DR) datacentre. One by one, I patched codebases to make them more portable; I even had to remove hard-coded IP addresses and paths provided by NFS mounts! Finally I got to the system that used the MySQL cluster. After some digging I discovered:

  1. The system was only ever configured to connect to one DB host
  2. There were no other apps connecting to the DB cluster
  3. It all ran on "classic" Docker Swarm (not even the last released version)

My ex-colleague, who I got along with really well, wrote 90% of the system. They used a SQL query builder and never used any DB engine-specific features. Thank you, ex-colleague! I realised I could scrap this insane, not-actually-highly-available architecture and use SQLite instead, all on a single virtual machine with 512MB of memory and 1 vCPU. SQLite was perfect for the job: the system consisted of a single reader and a single writer, and the DB was only used for record-keeping of other long-running jobs.
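
To show what that discipline bought us, here's a minimal sketch of what engine-agnostic record-keeping code can look like. The real system used a query builder rather than hand-written JDBC, so the jobs table, the class and the exact calls below are my own illustrative assumptions; the point is that, with standard SQL and no vendor datatypes, switching MySQL for SQLite is mostly a different driver and connection URL.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class JobLog {
    private final String jdbcUrl;

    // The only engine-specific bit is the URL (and the driver on the classpath), e.g.:
    //   MySQL:  "jdbc:mysql://db-host/jobs"
    //   SQLite: "jdbc:sqlite:/var/lib/app/jobs.db"
    JobLog(String jdbcUrl) {
        this.jdbcUrl = jdbcUrl;
    }

    // Record that a long-running job finished; standard SQL, no engine-specific types.
    void recordJob(String jobId, String status) throws SQLException {
        String sql = "INSERT INTO jobs (id, status, finished_at) VALUES (?, ?, CURRENT_TIMESTAMP)";
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, jobId);
            ps.setString(2, status);
            ps.executeUpdate();
        }
    }
}
```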

Swapping it over took about 3 days, mostly testing. No more outages, no more working around shitty network administration, no more "how does the backup work again?" when DB dumps failed, no more complex config management to bring DB clusters up and down. The ability to migrate DB engines led to a significant simplification of the overall system.

But is this general advice? Not really. Programming with portability in mind is super important, but overly generic programs can also be a pain to work with. Hibernate, JDBC et al. don't come for free; their convenience comes at the cost of complexity. Honestly, I'm a relational database noob (I'm more of an SRE), so my approach is to try to understand the specific system/project and go from there. For example:

I recently saw a project that didn't care about this in the slightest and used vendor-specific features of MS SQL all over the place, which brought real advantages in terms of performance optimizations.

Things I would want to learn more about:

  • were those performance optimisations essential?
  • if so, is the database the best place to optimise? e.g. smarter queries versus fronting with a dumb cache (rough sketch after this list)
  • are there database experts who can help out later? do we want to manage a cache?
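
For what it's worth, by "dumb cache" I mean something like the sketch below: a tiny in-process, TTL-based read-through cache in front of the expensive query, instead of tuning the query itself with engine-specific features. Everything here (names, the fixed TTL) is illustrative rather than from any particular project.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// A deliberately dumb read-through cache: keep a query result for a fixed TTL
// rather than optimising the underlying query with vendor-specific tricks.
public class DumbCache<K, V> {
    private record Entry<T>(T value, long expiresAtMillis) {}

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public DumbCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public V get(K key, Supplier<V> loadFromDatabase) {
        Entry<V> cached = entries.get(key);
        if (cached != null && cached.expiresAtMillis() > System.currentTimeMillis()) {
            return cached.value(); // fresh enough: skip the database entirely
        }
        V value = loadFromDatabase.get(); // e.g. run the slow report query
        entries.put(key, new Entry<>(value, System.currentTimeMillis() + ttlMillis));
        return value;
    }
}
```

Usage is just `cache.get(reportKey, () -> runSlowQuery(reportKey))`. Serving slightly stale results is often an easier trade than vendor-specific query tuning, but only if the data can tolerate staleness, and someone still has to own the cache (hence the last bullet).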

Basically everyone advises you to write your backend generically with technologies like ODBC, JDBC, Hibernate, ... and to never use anything vendor-specific like stored procedures, vendor-specific datatypes or meta queries, the argument being that you can later switch your DBMS without much hassle.

Things I would want to learn more about:

  • how many stored procedures would we end up managing? 1, 100, 1000? it may not be worth taking on a dependency just to avoid writing, like, 3 stored procedures (rough sketch of the trade-off after this list)
  • is that tooling depended on by other projects already?
  • how much would the vendor-specific datatype be used? one column in one table? everywhere?
  • does using vendor-specific features make the code easier to understand? or just easier to write? big difference!
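
To make the stored-procedure question concrete, here's a rough sketch of both sides of that trade-off in plain JDBC. The get_order_total procedure and the orders table are hypothetical; the point is only that the procedure version moves logic into the database, which becomes a second thing to version, deploy and eventually port.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class OrderTotals {
    // Vendor-leaning: the logic lives in the database as a stored procedure.
    // Easy to call, but now there is DB-side code to version and deploy separately.
    static double totalViaProcedure(Connection conn, long customerId) throws SQLException {
        try (CallableStatement cs = conn.prepareCall("{call get_order_total(?)}")) {
            cs.setLong(1, customerId);
            try (ResultSet rs = cs.executeQuery()) {
                return rs.next() ? rs.getDouble(1) : 0.0;
            }
        }
    }

    // Portable-leaning: the same logic as an inline standard-SQL query,
    // versioned alongside the application code.
    static double totalInline(Connection conn, long customerId) throws SQLException {
        String sql = "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer_id = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setLong(1, customerId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getDouble(1) : 0.0;
            }
        }
    }
}
```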

My shitty conclusion: it depends.
