#4 is a good thing. ORMs do not make queries better or safer, they make them easier for devs that don't learn SQL or safe calls. In some cases, they have been shown to cause slowdowns.
A program that HR had built so that all employees could they their payment receipts online
The username was the companies' email address, the password was a government personal id code that you can lookup online, a don't change, and you can't update the password to something else.
So I told the director of HR this was a bad idea. She told me I was overreacting until I showed her her own receipt, then she finally understood that this is a really fucking bad idea.
Okay, so now she out me in charge of debugging that program.
So I setup a meeting with the director of the company they hired, he came by with the developer: a 21 yo girl who I think hadn't finished college yet. Great start! Apparently it was her idea to do the authentication like that so that explains a few things.
So we dive in to the code.
First of all, the "passwords" were stored in blank, no hashing, no encryption, nothing. That wasn't the worst.
For the authentication she made a single query to check if the user email existed. Of that was true, then step two was a second query to see if the password existed. If that were true, the email had been authenticated.
So let's say, hypothetically, that they had actual passwords that people could change... I could still login with the email from anyone, and then use MY OWN password to authenticate.
This just blew my mind so hard that I don't think I ever fully recovered, I still need treatment. The stupidity hurts
I wouldnt blame that on stupidity as much as on ignorance and naivety. Many people simply don't think about anybody deliberately misusing their design. The idea that somebody could even want to access somebody elses receipts didn't occur to them. And if they were still doing their studies they might not have known that you can "combine" SQL queries and ask for two things at once.
I don't blame the girl, but whoever chose her to design a system with sensitive information.
I don't blame a girl for doing a job that lands her food on the table. I blame the guy employing her because she's the cheapest option
Having said that, this design was so bad that she should not have been doing any of this. If you don't know that SQL allows you to select multiple columns then by all means, do a tutorial, it's not that hard.
If you don't even know what encryption is, that passwords need hashing and what not, then you should really question what you're doing
OPs question was about the worst code I've seen, that was the worst I've seen
If you don’t even know what encryption is, that passwords need hashing and what not, then you should really question what you’re doing
I agree with your point, but I would phrase it more generally: when we're assigned a task in a problem space we are unfamiliar with, we should always take some time to research that space before designing our solution.
After all, if we don't know what encryption or password hashing are, how could we know that we need to learn about them first? But spending just a couple hours one morning reading about password and authentication management would have given the developer a good sense of best practices.
So she either, A) didn't think to familiarize herself with a new topic prior to working on it, or B) did read about it and ignored general industry guidance. Both of those options are more problematic to me than simply not knowing specific things. Those are process problems that need to be addressed to build her skills as a developer.
But ultimately, in my opinion, this is really all the fault of the cheapass director who didn't want to pay any experienced professionals to handle the task.
It wouldn't take much google-fu to get a worked example of good authentication in whatever language. She can't have tried, she must have just gone "programming 104 covered how to SQL, I can use that"
There was something like
# sleep for about a second on modern processors
math.factorial(10000)
After it was found we left it in the code but commented out along with a sleep(1) for posterity.
XML-DOM page templates stored in a database, line by line.
So rendering a page started with:
select * from pages
where page_id = 'index'
order by line_number asc;
Each line of XML from each record was appended into a single string. This string was then XSLT transformed to HTML, for every page load.
This has to be one of the worst ways to reinvent a filesystem that I've ever heard. At the very least, storing static data in an relational database at this scale should be a slappable offense.
The session data, that would have been fantastic to have in a relational, queryable, reliable and trustable format was stored as a single giant string of PHP pickled data structure in a session file associated with the users cookie id.
The architect sending a pointer over an API, in hexadecimal string format. char *c = "71E4F33B" just cast it on the right structure bro.
Just to add, we only did C/C++, on windows mfc, in a monolithic software.
I spent quite some time assuring myself that I was not the insane person before bringing it up with him.
A memory pointer? So it must have been a program sending a pointer using an API to itself so it ends up in the same process again?
A raw memory pointer.
So this is not as bad as some of the other stories I've seen, but I'll bite.
It was an old .NET Framework MVC app. Some internal product management system or something. There was a need to do a PDF export in one of the use cases, so someone implemented it. It wasn't a good implementation: one big controller, mixing UI and business logic, etc. However, it basically came down to a single private method in a specific controller for a page.
Now time passes and lo and behold, we need a PDF export in another page for a different use case. "No problem," - same dev, probably - "I already solved this problem. I'll just reuse the PDF generation logic."
Now, any sane person would probably try to refactor the code responsible for PDF stuff into a separate service (class) and reuse it. A less sane, but somewhat, acceptable approach would have been to just copy paste the thing into another controller and call it a day.
Ha! No no no no no no… Copy pasting is bad, code should be reused…
The end solution: REFLECTION. So the dev decided that the easiest way to make it work was to: 1) use reflection to inject one controller into another; 2) then use reflection again to get access and call that private method for PDF rendering into a stream.
Fortunately I didn't have to fix that fragile mess. But I did my fair share of DevExpress corpse hacking and horrible angular "server side rendering" workarounds.
Private key for a third-party API hard-coded into the front-end web app
Oh boy, this one was a doozy...
Was working at a very big company named after a rainforest on smart home products with integrations for a certain home assistant...
New feature was being built that integrates the aforementioned home assistant with customer's printers so they can ask the assistant to print stuff for them.
The initial design lands from our partner team with a Java backend service fairly nicely integrated with some CUPS libraries for generating the final document to be sent to the customer's printer. All good.
They are about to launch when... uh oh... the legal team notices an AGPL licensed package in one of the CUPS library's dependencies that was absolutely required for the document format needed by the project and the launch is cancelled.
So the team goes off in a panic looking for alternatives to this library and can't find any replacements. After a month or two they come back with their solution...
Instead of converting the document directly in the backend service with the linked CUPS library (as AGPL is a "forbidden license" at this company) the backend uploads the initial document to an S3 bucket, then builds a CUPS document conversion bash shell script using some random Java library, the shell script is then sent (raw) to a random blank AWS host that comes prepackaged with CUPS binaries installed (these hosts were not automated with CI/CD / auto updates as was usually mandated by company practice because updating them might remove the CUPS binaries, so they required a ton of manual maintenance over the service's lifetime...), the bash shell script is then executed on that "clean" host, downloading the document from S3, converting it via the CUPS command line binary, then reuploading it to another S3 bucket where the Java backend picks it up and continues the process of working the document through the whole backend pipeline of various services until it got to the customer's printer.
This seemed to satisfy the legal team at the very least, and I have no doubt is probably still in production today...
The kicker though? After all those months of dev work from a whole team (likely all on 6 figure salaries), and all the time spent by various engineers including myself on maintenance and upkeep on that solution after it was transferred to us?
An alternative, completely unrestricted corporate license was available for the package in question for about $100 per year so long as you negotiated it with the maintainers.
But that was a completely unacceptable and avoidable cost according to upper management...
For info for those not familiar with minor open source licences: AGPL is descended from GPL 3, and is properly copyleft. No surprise it is poison to Amazon
I don't have any specific examples, but the standard of code is really bad in science. I don't mean this in an overly judgemental way — I am not surprised that scientists who have minimal code specific education end up with the kind of "eh, close enough" stuff that you see in personal projects. It is unfortunate how it leads to code being even less intelligible on average, which makes collaboration harder, even if the code is released open source.
I see a lot of teams basically reinventing the wheel. For example, 3D protein structures in the Protein Database (pdb) don't have hydrogens on them. This is partly because that'll depend a heckton on the pH of the environment that the protein is. Aspartic acid, for example, is an amino acid where its variable side chain (different for each amino acid) is CH2COOH in acidic conditions, but CH2COO- in basic conditions. Because it's so relative to both the protein and the protein's environment, you tend to get research groups just bashing together some simple code to add hydrogens back on depending on what they're studying. This can lead to silly mistakes and shabby code in general though.
I can't be too mad about it though. After all, wanting to learn how to be better at this stuff and to understand what was best practice caused me to go out and learn this stuff properly (or attempt to). Amongst programmers, I'm still more biochemist than programmer, but amongst my fellow scientists, I'm more programmer than biochemist. It's a weird, liminal existence, but I sort of dig it.
Lots. But one that springs to mind is a custom CMS where a new dev decided to print out the sql generated for a particular content type on paper. He took it to the CTO without comment.
What was wrong?
It was 12 pages.
For anyone who knows and understands Android development, process death, and saved state...
The previous dev had no understanding of any of it, and had null checks with returns or bypassing important logic littered all over the app, everywhere.
I could only assume he didn't understand how all these things were randomly null or why it was crashing all the time so he thought oh, i'll just put a check in.
Well, you minimize that app for a little bit, reopen it, and every screen was fucked visually and unusable, or would outright crash. It was everywhere. This was before Google introduced things like view models which helped but even then for awhile weren't a full solution to the problem.
It was many many months of just resolving these problems and rewriting it the correct way to not have these problems.
Oh I remember. There are tons of events and associated handlers. Even just switching to landscape view stops and restarts an android view I think. Friends at uni handled that problem by disallowing landscape view instead of handling it hahah
The C++ code went something like this:
- Conver pointer to int
- Serialize the int over IPC to self using Linux Message Queues
- Delete/free the pointer
- Read the int from the queue
- Convert to pointer
- "Use" the pointer
Ok so this one is someone trying to move to "the cloud."
They had a database they used. It was on a server in the office. We were tasked to clone the db server to a hosted VM. Due to order of creation this got put on a new host without anything yet on it.
They needed a site to site VPN to keep privacy, that was all fine. However after the clone and during testing, their guy there said that this one part was really slow. We take a look and everything is good with performance of the server and of the VPN. I have to pop on to take a look.
It was in an office app and written in VB. (I forgot which one.) It was indeed slower on the hosted server. So I took a look at the function (he got it up for me) and I could instantly tell the issue.
This part was a lookup page that searched for you input. The function retrieved the entire table, then filtered the results in the client. I explained that transferring the whole table over the internet would be slower than on the local lan.
This guy said he originally wrote this, but "forgot VB."
In the end they decided not to update the app or keep the server in the office, but instead they rented some VDIs in the same data centre as the db.
There was a website where users could request something or other, like a PDF report. Users had a limited number of tokens per month.
The client would make a call to the backend and say how many tokens it was spending. The backend would then update their total, make the PDF, and send it.
Except this is stupid. First of all, if you told it you were spending -1 tokens, it would happily accept this and give you a free token along with your report.
Second of all, why is the client sending that at all? The client should just ask and the backend should figure out if they have enough credit or not.
This one is funny because it 100% still exists somewhere, but I haven't had the chance to verify it again.
Okay so basically its a data recorder box (ex: brainbox) that connects to a bunch of industrial sensors and sends the data over the network with your preferred method.
Builtin firmware gives you an HTTP webui to login and configure the device, with a user # and password.
I think the user itself had a builtin default admin which was #0, which everyone uses since there wasn't really much use for other users.
Anyway, I was looking at the small JS code for the webui and noticed it had an MD5 hashing code that was very detailed with comments. It carefully laid out each operation, and explained each step to generate a hash, and then even why hashes should be used for passwords.
Here's the kicker: It was all client side JS, so the login page would take your password, hash it, and then send the hash over plaintext HTTP POST to the server, where it would be authenticated.
Meaning you could just mitm the connection to grab the hash, and then login with the hash.
I sat there for like 10 minutes looking at the request over and over again. Like someone was smart enough to think "hey let's use password hashing to keep this secure" and then proceeded to use it in the compleltly wrong way. And not even part of like a challenge/handshake where the server gives you a token to hash with. Just straight up MD5(password).
It was so funny because there were like a hundred of these on a network, so getting a valid hash was laughably easy.
I never got to check if this was fixed in a newer firmware version.
I think the worst software-gore I remember seeing was a web app that dumped all the data to the browser as a huge XML file and then had JavaScript translate the contents of the xml into views. That probably wouldn’t even sound that far off the reservation now if it was JSON, thanks to the sleepless efforts of the JavaScript industrial complex, but back then you’d just render pages and return them.
A registration form and backend that would return the error "please choose more unique password" if you choose a password that was already stored (in plain text) in the database against another username.
I shit you not.
Create a moderately ok password, hash it, use the hash as your nice unique password, as a private joke for when the database leaks and yours is the only password that's hashed and you start getting spam saying they know your password hunter2 (because they incorrectly dehashed the password) or 2ab96390c7dbe3439de74d0c9b0b1767 (md5 sum of hunter2; because they correctly read it as plain text)
First of all, lack of ORM isn’t bad. It’s not a good or bad thing to use them out not use them. What’s bad is not sanitizing your query inputs and you don’t need an ORM to do that.
I think the worst thing I’ve seen is previous devs not realize there’s a cost to opening a DB connection. Especially back when DBs were on spinning rust. So the report page that ran one query to get the all the items to report on, then for each row ran another individual query to get that row’s details was probably one of the slowest reports I’ve ever seen. Every DB round trip was at minimum 0.1 seconds just to open the connection, run the query, send back the data, then close the connection. So 10 rows per second could be returned. Thousands of rows per page has people waiting several minutes, and tying up our app server. A quick refactor to run 2 queries instead of hundreds to thousands and I was a hero for 10 min till everyone forgot how bad it was before I fixed it.
Long time ago, but by far the worst for me was when I inherited some code that a previous programmer had done. Every variable was a breakfast item. So if biscuit>bacon then scrambledeggs=10. Shit like that. It was a nightmare and luckily I only had to deal with it infrequently.
Floats for currency in a payments platform.
The system will happily take a transaction for $121.765, and every so often there's a dispute because one report ran it through round() and another through floor().
We had some super old code in our company monorepo that was written by someone who became the CTO, there was a comment forbidding people from writing private methods in the code base because "we aren't babies". It explained so much about the awful code and why everything was crazy.
One time, I had to request firewall access for a machine we were deploying to, and they had an Excel sheet to fill in your request. Not great, I figured, but whatever.
Then I asked who to send the Excel file to and they told me to open a pull request against a Git repo.
And then, with full pride, the guy tells me that they have an Ansible script, which reads the Excel files during deployment and rolls out the firewall rules as specified.
In effect, this meant:
- Of course, I had specified the values in the wrong format. It was just plaintext fields in that Excel, with no hint as to how to format them.
- We did have to go back and forth a few times, because their deployment would fail from the wrong format.
- Every time I changed something, they had to check that I'm not giving myself overly broad access. And because it's an Excel, they can't really look at the diff. Every time, they have to open it and then maybe use the Excel version history to know what changed? I have no idea how they actually made that workable.
Yeah, the whole time I was thinking, please just let me edit an Ansible inventory file instead. I get that they have non-technical users, but believe it or not, it does not actually make it simpler, if you expose the same technical fields in a spreadsheet and then still use a pull request workflow and everything...
So, this is completely off topic, but some of the comments here reminded me of it:
An elderly family friend was spending a lot of her time using Photoshop to make whimsy collages and stuff to give as gifts to friends and family.
I discovered that when she wanted to add text to an image, she would type it out in Microsoft Word, print it, scan the printed page, then overlay the resulting image over the background with a 50% opacity.
I showed her the type tool in Photoshop and it blew her mind.
A bit late to the party on this one, but Facepunch just opensourced a bunch of their code, I nominate that.
Here is my story:
There were console outputs after nearly every line. I asked about them: "Oh, I couldn't get the debugger to work, so I print everything to the console"
This was everywhere. The whole program was like this. On a standard Linux machine. It wasn't even remote debugging or something. Just a local C++ program.
The filenames where written in 8+3. Again, on a modern Linux machine. His answer? "You never know where we'll port this software to"
Onto computers that were outdated decades ago? To embedded systems? Of course he had no answer for this except "just in case..."
I could tell you more, that software was the stuff for nightmares.
I found code that calculated a single column in an HTML table. It was “last record created on”.
The algorithm was basically:
foreach account group
foreach account in each account group
foreach record in account.records
if record.date > maxdate
max = maxdate
It basically loaded every database record (the basic unit of record in this DATA COLLECTION SYSTEM) to find the newest one.
Customers couldn’t understand why the page took a minute to load.
It was easily replaced with a SQL query to get the max and it dropped down to a few ms.
The code was so hilariously stupid I left it commented out in the code so future developers could understand who built what they are maintaining.
Programmer Humor
Welcome to Programmer Humor!
This is a place where you can post jokes, memes, humor, etc. related to programming!
For sharing awful code theres also Programming Horror.
Rules
- Keep content in english
- No advertisements
- Posts must be related to programming or programmer topics
