
"No Duh," say senior developers everywhere.

The article explains that vibe code is often close to functional, but not quite, requiring developers to go in and find where the problems are - resulting in a net slowdown of development rather than a productivity gain.

[-] Baguette@lemmy.blahaj.zone -1 points 7 months ago

I'd be inclined to try using it if it were smart enough to write my unit tests properly, but it's great at double-inserting the same mock and having 0 working unit tests.

I might try using it to generate some javadoc though... then when my org inevitably starts polling how much AI I use, I won't be in the gutter lol

[-] sugar_in_your_tea@sh.itjust.works 1 points 7 months ago

I personally think unit tests are the worst application of AI. Tests are there to ensure the code is correct, so ideally the dev would write the tests to verify that the AI-generated code is correct.

I personally don't use AI to write code, since writing code is the easiest and quickest part of my job. I instead use it to generate examples of using a new library, give me comparisons of different options, etc, and then I write the code after that. Basically, I use it as a replacement for a search engine/blog posts.

[-] MangoCats@feddit.it -1 points 7 months ago

Ideally, there are requirements before anything, and some TDD types argue that the tests should come before the code as well.

Ideally, the customer is well represented during requirements development - ideally, not by the code developer.

Ideally, the code developer is not the same person that develops the unit tests.

Ideally, someone other than the test developer reviews the tests to assure that the tests do in fact provide requirements coverage.

Ideally, the modules that come together to make the system function have similarly tight requirements, unit tests, and reviews, and the whole thing runs CI/CD to notify developers of any regressions/bugs within minutes of code check-in.

In reality, some portion of that process (often, most of it) is short-cut for one or many reasons. Replacing the missing bits with AI is better than not having them at all.

[-] Nalivai@lemmy.world 1 points 7 months ago

Replacing the missing bits with AI is better than not having them at all.

Nah, bullshit tests that pretend to be tests but are essentially "if true == true then pass" are significantly worse than no test at all.

[-] MangoCats@feddit.it 0 points 7 months ago

bullshit tests that pretend to be tests but are essentially “if true == true then pass” are significantly worse than no test at all.

Sure. But unsupervised developers who write the code, write their own tests, and change companies every 18 months are even more likely to pull BS like that than AI is.

You can actually get some test validity oversight out of AI review of the requirements and tests - not perfect, but better than self-supervised new hires.

[-] sugar_in_your_tea@sh.itjust.works 1 points 7 months ago

Ideally, the code developer is not the same person that develops the unit tests.

Why? The developer is exactly the person I want writing the tests.

There should also be integration tests written by a separate QA, but unit tests should 100% be the responsibility of the dev making the change.

Replacing the missing bits with AI is better than not having them at all.

I disagree. A bad test is worse than no test, because it gives you a false sense of security. I can identify missing tests with coverage reports; I can't easily identify bad tests. If I'm working in a codebase with poor coverage, I'll be extra careful to check for any downstream impacts of my change because I know the test suite won't help me. If I'm working in a codebase with poor tests but high coverage, I may assume a test pass indicates that I didn't break anything else.

If a company is going to rely heavily on AI for codegen, I'd expect tests to be manually written and have very high test coverage.

[-] MangoCats@feddit.it 2 points 7 months ago

but unit tests should 100% be the responsibility of the dev making the change.

True enough

A bad test is worse than no test

Also agree. If your org has trimmed to the point that you're just making tests to say you have tests, with no review of their efficacy, they'll be getting what they deserve soon enough.

If a company is going to rely heavily on AI for anything, I'd expect a significant traditional human employee backstop to the AI until it has a track record. Not a "buckle up, we're gonna try somethin'" track record - more like two or three full business cycles before starting to divest of the human capital that built the business to where it is today. Though if your business is on the ropes and likely to tank anyway... why not try something new?

There was a story about IBM letting thousands of workers go, replacing them with AI... then hiring even more workers in other areas with the money saved from the AI retooling. Apparently they let a bunch of HR and other admin staff go and beefed up sales and product development. There are some jobs where you'd rather have predictable algorithms than potentially biased people, and HR seems like an area with a lot of those.

[-] Nalivai@lemmy.world -1 points 7 months ago

Why? The developer is exactly the person I want writing the tests.

It's better if it's a different developer: they don't know the nuances of your implementation, so they test the functionality only, which avoids some mistakes. You're correct on all the other points.

[-] MangoCats@feddit.it 1 points 7 months ago

I'm mixed on unit tests - there are some things the developer will know (white box) about edge cases etc. that others likely wouldn't, and they should definitely have input on those tests. On the other hand, independence of review is a very important aspect of "harnessing the power of the team."

If you've got one guy who gathers the requirements, implements the code, writes the tests, and declares the requirements fulfilled, that better be one outstandingly brilliant guy with all the time on his hands he needs to do the jobs right. If you're trying to leverage the talents of 20 people to make a better product, having them all be solo-virtuoso actors working independently alongside each other is more likely to create conflict, chaos, duplication, and massive holes of missed opportunities and unforeseen problems in the project.

[-] sugar_in_your_tea@sh.itjust.works 1 points 7 months ago

I really disagree here. If someone else is writing your unit tests, that means one of the following is true:

  • the tests are written after the code is merged - there will be gaps, and the second dev will be lazy in writing those tests
  • the tests are written before the code is worked on (TDD) - everything would take twice as long because each dev essentially needs to write the code again, and there's no way you're going to consistently cover everything the first time

Devs should write their tests, and reviewers should ensure the tests do a good job covering the logic. At the end of the day, the dev is responsible for the correctness of their code, so this makes the most sense to me.

[-] themaninblack@lemmy.world -1 points 7 months ago

Saved this comment. No notes.

[-] Baguette@lemmy.blahaj.zone -1 points 7 months ago* (last edited 7 months ago)

To preface, I don't actually use AI for anything at my job, which might be a bad metric, but my workflow is 10x slower if I even try using it.

That said, I want AI to be able to do unit tests in the sense that I can write some starting ones, then have it infer which branches aren't covered and help me fill in the rest.

Obviously it's not smart enough, and honestly I highly doubt it ever will be, because that's the nature of LLMs. But my peeve with unit tests is that testing branches usually entails just copying the exact same test but changing one field to be an invalid value, or a dependency to throw. It's not hard, just tedious. Branch coverage is already enforced, so you should know when you forgot to test a case.

Edit: my vision would be an interactive version rather than my company's current one, which just generates whatever it wants instantly. I'd want something to prompt me saying this branch is not covered, and then tell me how it will try to cover it. That eliminates the tedious work but still lets the dev know what they're doing.

I also think you should treat AI code as a pull request and actually review what it writes. My coworkers who do use it don't really proofread, so it ends up having some bad practices and code smells.

[-] sugar_in_your_tea@sh.itjust.works 0 points 7 months ago

testing branches usually entails just copying the exact same test but changing one field to be an invalid value, or a dependency to throw

That's what parameterization is for. In unit tests, most dependencies should be mocked, so expecting a dependency to throw shouldn't really be a thing much of the time.
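For example, in pytest, something like this (a minimal sketch; validate_age is a hypothetical function) covers all those near-duplicate cases in one parameterized test:

import pytest

# hypothetical function under test
def validate_age(age):
    if age < 0:
        raise ValueError("negative age")
    return age >= 18

# each tuple is one of the "same test, one changed field" cases
@pytest.mark.parametrize("age,expected", [
    (17, False),  # just below the threshold
    (18, True),   # boundary
    (40, True),   # typical valid value
])
def test_validate_age(age, expected):
    assert validate_age(age) is expected

def test_validate_age_rejects_negatives():
    with pytest.raises(ValueError):
        validate_age(-1)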

I’d want something to prompt me saying this branch is not covered, and then tell me how it will try to cover it

You can get the first half with coverage tools. The second half should be fairly straightforward, assuming you wrote the code. If a branch is hard to hit (i.e. it happens if an OS or library function fails), either mock that part or don't bother with the test. I ask my team to hit 70-80% code coverage because that last 20-30% tends to be extreme corner cases that are hard to hit.
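For instance, pytest --cov=mypkg --cov-report=term-missing (via the pytest-cov plugin) prints exactly which lines aren't hit. And for those extreme corner cases, coverage.py lets you exclude them outright instead of chasing them with contrived tests - a sketch with a hypothetical load_config:

def load_config(path):
    try:
        with open(path) as f:
            return f.read()
    except OSError:  # pragma: no cover - OS failure path, not worth a contrived test
        return ""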

My coworkers who do use it don’t really proofread, so it ends up having some bad practices and code smells.

And this is the problem. Reviewers only know so much about the overall context and often do a surface-level review unless you're touching something super important.

We can make conventions all we want, but people will be lazy and submit crap, especially when deadlines are close.

[-] Baguette@lemmy.blahaj.zone 0 points 7 months ago

The issue with my org is that the push to be CI/CD means 90% line and branch coverage, which ends up meaning you spend just as much time writing tests as actually developing the feature. And that's already on an accelerated schedule, because my org makes promises that turn into ridiculous deadlines, like a 2-month project becoming a 1-month deadline.

Mocking is easy; almost everything in my team's codebase is designed to be mockable. The only stuff I can think of that isn't mocked is usually just clocks, which you could mock, but I actually like using fixed clocks for unit testing most of the time. But mocking is also tedious. Lots of mocks end up being:

  1. Change the expected test constant, which usually ends up being almost the same input with just one changed field.
  2. Change the response returned from the mock.
  3. Given the response, expect the result to be x or some exception y.

Chances are, if you wrote it, you already know what branches are there. It's just translating that to actual unit tests that's a pain. Branching logic should be easy to read as well; if I read a nested if statement, chances are there's something that could be redesigned better.

I also think that 90% of actual testing should be done through integ tests. Unit tests to me help validate what you expect to happen, but expectations don't necessarily equate to real dependencies and inputs. But that's a preference, mostly because our design philosophy revolves around dependency injection.

[-] sugar_in_your_tea@sh.itjust.works 0 points 7 months ago* (last edited 7 months ago)

I also think that 90% of actual testing should be done through integ tests

I think both are essential, and they test different things. Unit tests verify that individual pieces do what you expect, whereas integration tests verify that those pieces are connected properly. Unit tests should be written by the devs and help them prove their solution works as intended, and integration tests should be written by QA to prove that user flows work as expected.

Integration test coverage should be measured in terms of features/capabilities, whereas unit tests are measured in terms of branches and lines. My target is 90% for features/capabilities (mostly miss the admin bits that end customers don't use), and 70-80% for branches and lines (skip unlikely errors, simple data passing code like controllers, etc). Getting the last bit of testing for each is nice, but incredibly difficult and low value.

Lots of mocks end up being

I use Python, which allows runtime mocking of existing objects, so most of our mocks are like this:

from unittest.mock import patch

@patch.object(Object, "method", return_value=value)

Most tests have one or two lines of this above the test function. It's pretty simple and not very repetitive at all. If we need more complex mocks, that's usually a sign we need to refactor the code.
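A fuller sketch of the pattern (all names here are hypothetical, just to show the shape):

from unittest.mock import patch

class ApiClient:  # hypothetical dependency
    def get_user(self):
        raise RuntimeError("real network call")

def fetch_username(client):  # hypothetical code under test
    return client.get_user()["name"]

@patch.object(ApiClient, "get_user", return_value={"name": "alice"})
def test_fetch_username(mock_get_user):
    # ApiClient.get_user is replaced only for the duration of this test
    assert fetch_username(ApiClient()) == "alice"
    mock_get_user.assert_called_once()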

dependency injection

I absolutely hate dependency injection, most of the time. 99% of the time, there are only two implementations of a dependency: the standard one and a mock.

If there's a way to patch things at runtime (e.g. Python's unittest.mock lib), dependency injection becomes a massive waste of time with all the boilerplate.

If there isn't a way to patch things at runtime, I prefer a more functional approach that works off interfaces, where dependencies are simply passed in as data where needed. That way you avoid the boilerplate and still get the benefits of DI.
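Something like this sketch (a hypothetical example) is what I mean - the "dependency" is just a parameter, so production passes the real thing and tests pass a stub:

from typing import Callable

def greeting(name: str, now_hour: Callable[[], int]) -> str:
    # the clock dependency is a plain callable - no injector, no interface class
    period = "morning" if now_hour() < 12 else "afternoon"
    return f"Good {period}, {name}"

# production: greeting("alice", lambda: datetime.now().hour)
# test:       assert greeting("alice", lambda: 9) == "Good morning, alice"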

That said, dependency injection has its place if a dependency has several implementations. I find that's pretty rare, but maybe it's more common in your domain.

[-] FishFace@lemmy.world -1 points 7 months ago

The reason tests are a good candidate is that there's a lot of boilerplate and no complicated business logic. It can be quite a time saver. You probably know some untested code in some project - you could get an LLM to write some tests that would at least poke some key code paths, which is better than nothing. If the tests are wrong, it's barely worse than having no tests.

[-] theolodis@feddit.org 1 points 7 months ago

Wrong tests will make you feel safe. And in the worst case, the next developer who goes to port the code will think that somebody wrote those tests with intention, and potentially write broken code to make the tests green.

[-] sugar_in_your_tea@sh.itjust.works 1 points 7 months ago

Exactly! I've seen plenty of tests where the test code was confidently wrong, and it was obvious the dev just copied the output into the assertion instead of asserting what they expected the output to be. In fact, when I joined my current org, most of the tests were snapshot tests, which automated that process. I've pushed to replace them with better tests, and we caught bugs in the process.

[-] sugar_in_your_tea@sh.itjust.works 1 points 7 months ago

better than nothing

I disagree. I'd much rather have a lower coverage with high quality tests than high coverage with dubious tests.

If your tests are repetitive, you're probably writing your tests wrong, or at least focusing on the wrong logic to test. Unit tests should prove the correctness of business logic and calculations. If there's no significant business logic, writing a test is a low priority.

[-] Draces@lemmy.world -1 points 7 months ago

What model are you using? I've had such a radically different experience but I've only bothered with the latest models. The old ones weren't even worth trying with

[-] jjjalljs@ttrpg.network 0 points 7 months ago

One of the guys at my old job submitted a PR with tests that basically just mocked everything, tested nothing. Like,

from unittest.mock import patch

# mocks the very function under test, so these asserts prove nothing
with patch("something.whatever", return_value=True):
    assert whatever(0) is True
    assert whatever(1) is True

Except it was a few dozen lines of that, with names that made it look like they were doing something useful.

He used AI to generate them, of course. Pretty useless.

[-] MangoCats@feddit.it 1 points 7 months ago

We have had guys submit tests like that, long before AI was a thing.
