How to properly document code? (lemmy.world)

submitted 2 years ago* (last edited 2 years ago) by Asudox@lemmy.world to c/programming@programming.dev

I just recently started documenting my code because it helps me. Though I feel like my documentation is a bit too verbose and probably unneeded on the obvious parts of my code.

So I started putting a comment above every few lines of code that explains in a short sentence what I do or why I do it, then leaving a blank line under that block so it is easier to read.

What do you think about this?

Edit: real code example from one of my projects:

async def discord_login_callback(request: HttpRequest) -> HttpResponse:
    async def exchange_oauth2_code(code: str) -> tuple[dict | None, dict | None]:
        data = {
            'grant_type': 'authorization_code',
            'code': code,
            'redirect_uri': OAUTH2_REDIRECT_URI
        }
        headers = {
            'Content-Type': 'application/x-www-form-urlencoded'
        }
        async with httpx.AsyncClient() as client:
            # get user's access and refresh tokens
            response = await client.post(f"{BASE_API_URI}/oauth2/token", data=data, headers=headers, auth=(CLIENT_ID, CLIENT_SECRET))
            if response.status_code == 200:
                tokens = response.json()
                access_token, refresh_token = tokens["access_token"], tokens["refresh_token"]

                # get user data via discord's api
                user_data = await client.get(f"{BASE_API_URI}/users/@me", headers={"Authorization": f"Bearer {access_token}"})
                user_data = user_data.json()
                user_data.update({"access_token": access_token, "refresh_token": refresh_token}) # add tokens to user_data

                return user_data, None
            else:
                # if any error occurs, return error context
                context = generate_error_dictionary("An error occurred while trying to get user's access and refresh tokens", f"Response Status: {response.status_code}\nError: {response.content}")
                return None, context

    code = request.GET.get("code")
    user, context = await exchange_oauth2_code(code)

    # login if user's discord user data is returned
    if user:
        discord_user = await aauthenticate(request, user=user)
        await alogin(request, user=discord_user, backend="index.auth.DiscordAuthenticationBackend")
        return redirect("index")
    else:
        return render(request, "index/errorPage.html", context)

[-] zarlin@lemmy.world 46 points 2 years ago

The code already describes what it does, your comments should describe why it does that, so the purpose of the code.

[-] Carighan@lemmy.world 21 points 2 years ago

Yeah, my general rule of thumb is that the following 4 things should be in the documentation:

1. Why?
2. Why not? IMO this is often more important, as you might know a few pitfalls of things people might want to try but that aren't being done for good reasons.
3. Quirks and necessities of parameters and return values. This ensures that someone doesn't need to skim your code just to use it.
4. If applicable, context for the code's existence. This is often helpful years down the line when trying to refactor something.
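
As a rough illustration of those four points, here is a minimal Python sketch of a docstring that covers each of them. The function `parse_retry_after`, its behavior, and the scenario described in the docstring are all hypothetical, made up for this example rather than taken from anyone's code in the thread.

```python
from typing import Optional


def parse_retry_after(header_value: Optional[str], default_seconds: int = 5) -> int:
    """Return how many seconds to wait before retrying a request.

    Why: (hypothetical) the upstream service sometimes omits the header,
    so callers need one place that decides on a fallback.

    Why not: HTTP-date values are deliberately not parsed here. In this
    made-up scenario the service only ever sends whole seconds, and date
    parsing would add time zone handling for no benefit.

    Quirks: a missing, empty, or non-numeric value returns
    ``default_seconds``. Negative values are clamped to 0.

    Context: (hypothetical) added after retries hammered the service
    during an outage.
    """
    if not header_value:
        return default_seconds
    try:
        seconds = int(header_value.strip())
    except ValueError:
        return default_seconds
    return max(seconds, 0)
```

Note that the body itself carries no comments at all; everything a caller needs is in the docstring.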

[-] MagicShel@programming.dev 6 points 2 years ago

Yep. I mostly document why the obvious or best practice solution is wrong. And the answer is usually because of reliance on other poorly written code - third party or internal.

[-] CookieOfFortune@lemmy.world 19 points 2 years ago

Your code should generally be self documenting: Have variable and method names that make sense.

Use comments when you need to explain something that might not be obvious to someone reading the code.

Also have documentation for your APIs: The interfaces between your components.
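
A small before/after sketch of what "self documenting" can mean in practice. Both functions are invented for illustration (nothing here comes from the original post); they compute the same thing, and only the names differ.

```python
# Before: the names say nothing, so a comment has to carry the meaning.
def calc(a, b):
    # apply the discount to the price
    return a - a * b


# After: the names carry the meaning and the comment is no longer needed.
def apply_discount(price: float, discount_rate: float) -> float:
    return price - price * discount_rate
```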

[-] Aurenkin@sh.itjust.works 7 points 2 years ago* (last edited 2 years ago)

One interesting thing I read was that commenting code can be considered a code smell. It doesn't mean it's bad, it just means if you find yourself having to do it you should ask yourself if there's a better way to write the code so the comment isn't needed. Mostly you can but sometimes you can't.

API docs are also an exception imo especially if they are used to generate public facing documentation for someone who may not want to read your code.

Agree with you though, generally people should be able to understand what's going on by reading your code and tests.

[-] MajorHavoc@lemmy.world 4 points 2 years ago

Great points. I'm a huge advocate for adding comments liberally, and then treating them as a code smell after.

During my team's code reviews, anything that gets a comment invariably raises a "could we improve this so the comment isn't needed?" conversation.

Our solution is often an added test, because the comment was there to warn future developers not to make the same mistake we did.
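
One way this could look, as a hedged sketch: a warning comment is replaced by a test that fails if someone makes the mistake. The function `unique_in_order` and the warning it replaces are hypothetical examples, not code from this team.

```python
import unittest


def unique_in_order(items):
    """Remove duplicates while keeping the first occurrence of each item."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


class UniqueInOrderTest(unittest.TestCase):
    def test_keeps_first_occurrence_order(self):
        # This test stands in for a warning comment such as
        # "do not simplify to list(set(items)), callers rely on the order".
        self.assertEqual(unique_in_order([3, 1, 3, 2, 1]), [3, 1, 2])
```

Unlike the comment, the test keeps complaining if a later refactor breaks the ordering guarantee.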

[-] silas@programming.dev 6 points 2 years ago

I know there are documentation generators (like JSDoc in JavaScript) where you can literally write documentation in your code and have a documentation site auto-generated at each deployment. There are definitely mixed views on this, though.

[-] CookieOfFortune@lemmy.world -3 points 2 years ago

To my knowledge that just formats existing comments. With LLMs you could probably do 95% of the actual commenting.

[-] superb@lemmy.blahaj.zone 10 points 2 years ago

Useful comments should provide context or information not already available in the code. There is no LLM that can generate good comments from the source alone

[-] silas@programming.dev 3 points 2 years ago* (last edited 2 years ago)

Codium does surprisingly well at generating JSDoc, and it processes your code within the context of your entire codebase. Still not quite there yet, but you might be surprised

[-] CookieOfFortune@lemmy.world 1 points 2 years ago

Why wouldn't it be able to? It can link similar code structure to data in its training set. Maybe the current ones aren't at that level yet, but it's hardly a stretch to make these inferences. Most of the code you write is hardly novel.

[-] superb@lemmy.blahaj.zone 6 points 2 years ago

If it’s not exactly novel, how many comments do you really need?

An LLM is just gonna describe the code it sees. Good comments should include information and context that is not already in the source.

[-] CookieOfFortune@lemmy.world 1 points 2 years ago

I'm mostly talking about when you need to use the JSDoc format, which is usually for interfaces, so it's usually just a chore for humans.

Probably harder to get good comments inside code, but it might still be possible.

[-] xmunk@sh.itjust.works 14 points 2 years ago

Good comments describe the "why" or rationale. Not the what. This function doesn't need any comments at all... but it needs a far better name like logAndReturnSeed. That said, depending on what specifically you're doing I'd probably advocate for not printing the value in this function because it feels weird so I'd probably end up writing this function like

import random

def rollD10() -> int:
    return random.randint(1, 10)

And I, as a senior developer, think that level of comments is great.

You mentioned that this is a trivial example but the main skill in commenting is using it sparingly when it adds value - so a more realistic example might be more helpful.

[-] DroneRights@lemm.ee 13 points 2 years ago

The big problem in your code is that the function name isn't descriptive. If I'm 500 lines down seeing this function called, how do I know what you're trying to do? I'm going to have to scroll up 500 lines to find out. The function name should be descriptive.

[-] jeffhykin@lemm.ee 8 points 2 years ago* (last edited 2 years ago)

I expected this comment section to be a mess, but actually it's really good:

"why not what"
"as self-documenting as possible"

If you want an example, look at the Atom codebase. It is incredibly well done.

[-] MajorHavoc@lemmy.world 4 points 2 years ago

Great summary. The only thing I would add is that when we say "Answer Why?" we're implicitly including "WTF?!". It's the one version of "what" that's usually worth the window line space it costs, usually with a link to the unsolved upstream bug report at the heart of the mess.

[-] eclipse@lemmy.world 3 points 2 years ago

I'm curious about people's thoughts on documenting intent, which in my opinion crosses over with the "what".

Regarding self documenting: I agree, but I also think that means potentially using 5 lines when 1 would do if it makes maintenance more straightforward. This crazy Perl one-liner makes perfect sense today but not in 3 years.
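
A small Python sketch of that trade-off (the data and variable names are invented for illustration). Both versions group words by length; the second takes more lines but names each step.

```python
words = ["pear", "fig", "apple", "kiwi", "plum"]

# Dense version: correct, but the reader has to unpack it in their head.
dense = {n: sorted(w for w in words if len(w) == n) for n in {len(w) for w in words}}

# Spelled-out version: more lines, but each step has a name.
by_length = {}
for word in words:
    by_length.setdefault(len(word), []).append(word)
spelled_out = {length: sorted(group) for length, group in by_length.items()}
```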

[-] heeplr@feddit.de 8 points 2 years ago

I find too verbose comments less annoying than no comments.

Try to describe the bigger picture. Good comments allow understanding the current portion of the code without reading other code.

Also add comments later if you find yourself having to read other code to understand the code you're currently looking at.

Comments are also a good place to write out abbreviations/acronyms.

Never optimize for source code size.

[-] shrugal@lemm.ee 7 points 2 years ago* (last edited 2 years ago)

I like to do two kinds of comments:

Summarize and explain larger parts of code at the top of classes and methods. What is their purpose, how do they tackle the problem, how should they be used, and so on.
Add labels/subtitles to smaller chunks of code (maybe 4-10 lines) so people can quickly navigate them without having to read line by line. Stuff like "Loading data from X", "Converting from X to Y", "Handling case X". Occasionally I'll slip in a "because ..." to explain unusual or unexpected circumstances, e.g. an API doesn't follow expected standards or its own documentation. Chunks requiring more explanation than that should probably be extracted into separate methods.

There is no need to explain what every line of code is doing, coders can read the code itself for that. Instead focus on what part of the overall task a certain chunk of code is handling, and on things that might actually need explaining.
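
A minimal sketch of those two kinds of comments side by side, assuming a made-up function (`summarize_scores` and its input format are hypothetical): a summary at the top, short labels on the chunks, and one "because" where the behavior might surprise a reader.

```python
def summarize_scores(raw_lines: list[str]) -> dict[str, float]:
    """Compute the average score per player from "name,score" lines.

    Malformed lines are skipped instead of raising, because (in this
    made-up scenario) the input is hand-edited and often messy.
    """
    # Parsing "name,score" lines
    totals: dict[str, list[float]] = {}
    for line in raw_lines:
        name, sep, score_text = line.partition(",")
        if not sep:
            continue
        try:
            score = float(score_text)
        except ValueError:
            continue
        totals.setdefault(name.strip(), []).append(score)

    # Converting score lists to averages
    return {name: sum(scores) / len(scores) for name, scores in totals.items()}
```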

[-] GravitySpoiled@lemmy.ml 4 points 2 years ago* (last edited 2 years ago)

Write comments for functions

"Function x creates a number and prints it to the console"

"Function x fetches new content from the fediverse"

Commenting print("hello world") with "print hello world" doesn't make too much sense

[-] Asudox@lemmy.world 1 points 2 years ago

Yeah, I know that. I wrote that just as an attempt to show what it looks like. I won't document a print statement in my code.

[-] Synthead@lemmy.world 3 points 2 years ago* (last edited 2 years ago)

Imagine your "code" as English sentences. If it is hard to read, you might rephrase it. If something is getting long and drawn out, use paragraphs (methods and functions). At the end of the day, the easier it is to read, the better, unless there's a performance cost that's worthy of considering.

Like the top-level comment suggests, you should comment your methods. I would go one step further and use a standard comment format. I like Ruby, so immediately, I think YARDoc. With a YARDoc comment, you define what it does, the parameter types and descriptions, what it returns, possible exceptions that could be raised, etc.

Even better, by using standardized comments, not only does this make it easier to read by you and others, but most of the time, you get documentation rendered for free. For example, here is a library I wrote:

https://github.com/synthead/timex_datalink_client

And here is the automatically-generated HTML documentation:

https://www.rubydoc.info/gems/timex_datalink_client/0.12.3

More specifically, here's some YARDoc for a method:

https://github.com/synthead/timex_datalink_client/blob/263bffa8a71a0af792c46bea2ae69ab2a015f670/lib/timex_datalink_client/protocol_3/time.rb#L38-L46

And here is the generated documentation from this comment:

https://www.rubydoc.info/gems/timex_datalink_client/0.12.3/TimexDatalinkClient/Protocol3/Time#initialize-instance_method

This style of auto-generated documentation is available for pretty much all mature languages, and I highly recommend that you hit the ground running with them 👍
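
Since the original post is in Python, here is a rough Python counterpart to the YARDoc idea: a structured docstring with sections for arguments, return value, and exceptions. The function is a made-up example, and the section layout shown is one common docstring convention among several, not the only option.

```python
import inspect


def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from Celsius to Fahrenheit.

    Args:
        celsius: Temperature in degrees Celsius.

    Returns:
        The temperature in degrees Fahrenheit.

    Raises:
        ValueError: If ``celsius`` is below absolute zero (-273.15).
    """
    if celsius < -273.15:
        raise ValueError("temperature below absolute zero")
    return celsius * 9 / 5 + 32


# The docstring is available at runtime, which is what documentation tools read.
print(inspect.getdoc(celsius_to_fahrenheit))
```

The built-in `help(celsius_to_fahrenheit)` shows the same text in an interactive session.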

[-] Asudox@lemmy.world 3 points 2 years ago

Thanks. It seems interesting and useful.

[-] theherk@lemmy.world 4 points 2 years ago

I rarely read comments in code, that is from within source code anyway. I of course write comments explaining the behavior of public facing interfaces and otherwise where they serve to generate documentation, but very rarely otherwise. And I use that generated documentation. So in a roundabout way I do read comments but outside of the code base.

For instance I might use godoc to get a general idea of components but if I’m in the code I’ll be reading the code instead.

As others have said, your code generally but not always should clearly express what it does. It is fine to comment why you have decided to implement something in a way that isn’t immediately clear.

I’m not saying others don’t read comments in code; some do. I just never find myself looking at docs in code. The most important skill I have cultivated over the decades has been learning to read and follow the actual code itself.

[-] lemmyvore@feddit.nl 3 points 2 years ago

There are several types of documentation:

Line or block comments. Reserved for when you're doing something non-obvious, like a hack, a workaround because of a bug that can't be fixed yet etc. Designed to help other programmers (or yourself a few months later) to understand what's going on. Ideally you shouldn't have any of these but life ain't perfect.
If parts of your code are intended to be used as libraries, modules, APIs etc. there are standard methods of documenting those and extracting the documentation automatically in a readable format — like JavaDoc, Swagger etc. Modern IDEs will generate interface hints on the fly so most people nowadays rely on those, but they're not a 100% substitute for the human-written description next to a class or method.
Unit tests describe the intent for a piece of code and offer concrete pass/fail instructions. Same goes for other types of tests, like end to end tests, regression tests etc. All tests come with specific frameworks, which have their own methods of outlining specifications.
Speaking of specifications those are also a very important type of documentation. Usually provided by the product owner and fleshed out by technical people like architects or team leads, they're documented in tools like JIRA as part of the development process. They are at the core of the work done by programmers and testers.
Speaking of processes and procedures, it helps everybody if they're documented as well, usually in a wiki. They help a new hire get up to speed faster and they explain how the toolchains are set up for development, testing, deployment and bug fixing.
The human interfaces are a particularly interesting and important aspect and they're usually modeled and shared in specific tools by UX people.
Last but not least the technical as well as business designs should be documented as well. These usually circulate as PDF, DOC, Excel, PPT over email and file shares. Typically made and contributed to by business analysts and software architects.

[-] aluminium@lemmy.world 3 points 2 years ago* (last edited 2 years ago)

For new code I'm writing I'm using mostly JSDoc function headers on public methods of classes and exported functions, with one or two sentences explaining what the function does.

Also try to explain what to expect in edge cases, like when you pass an empty string, null, ... stuff of that nature - for which I then create unit tests.
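
In Python, one way to keep the documented edge cases and their tests together is a doctest. The sketch below is only an illustration with a made-up function (`initials`); the examples in the docstring double as runnable tests.

```python
import doctest


def initials(full_name: str) -> str:
    """Return the upper-case initials of a name.

    Pure: does not modify its argument or any outside state.

    Edge cases:
        An empty or whitespace-only string returns an empty string.

    >>> initials("ada lovelace")
    'AL'
    >>> initials("")
    ''
    >>> initials("   ")
    ''
    """
    return "".join(word[0].upper() for word in full_name.split())


if __name__ == "__main__":
    doctest.testmod()
```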

I also always mention if a function is pure or not or if a method changes the state of its object. On a sidenote I find it odd that almost no language has a keyword for pure functions or readonly methods.

If I add a big new chunk of code that spans multiple files but is somewhat closed off, I create an md file explaining the big picture. For example, I recently added my own closed off library to my Angular frontend that handles websocket stuff like subscribing, unsubscribing, buffering, pausing, ... for which I created an md file explaining it.

[-] TellusChaosovich@lemmy.world 3 points 2 years ago

What is a pure function? Never heard that before.

[-] aluminium@lemmy.world 4 points 2 years ago* (last edited 2 years ago)

Essentially a function that doesn't produce side effects, like modifying variables outside of its scope or modifying the function parameters. This is something you should always try to incorporate into your code, as it makes it much easier to test and makes the function's use less risky since you don't rely on external unrelated values.

To give you an example in JavaScript, here are two ways to replace certain numbers in a list of numbers with the number 0.

First, a way to do it with a non-pure function:

let bannedNumbers = [4,6]

const nums = [0,1,2,3,4,5,6,7,8,9]

function replaceWithZero(nums){
    for (let i = 0 ; i < nums.length; i++){
        if (bannedNumbers.includes(nums[i])){
            nums[i] = 0
        }
    }
}
replaceWithZero(nums)
console.log("numbers are : ", nums)

Here the function replaceWithZero does two things that make it impure. First, it modifies its parameter. This can lead to issues, for example if the caller still needs the original list afterwards. Second, it uses a non-constant variable outside of its scope (bannedNumbers), which is bad because if someone changes bannedNumbers somewhere else in the code, the behavior of the function changes.

A proper pure implementation could look something like this:

const nums = [0,1,2,3,4,5,6,7,8,9]
function replaceWithZero(nums){
    const bannedNumbers = [4,6]
    const result = []
    for(const num of nums){
        result.push(bannedNumbers.includes(num) ? 0 : num)
    }
    return result
}
const replaced = replaceWithZero(nums)
console.log("numbers are : ", replaced)

Here we are not modifying anything outside of the function's scope or its parameters. This means that no matter where, when and how often we call this function it will always behave the same when given the same inputs! This is the whole goal of pure functions.
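
For readers who work in Python (the language of the original post), the same contrast could be sketched roughly as follows. The names `replace_with_zero_in_place`, `replace_with_zero`, and `BANNED_NUMBERS` are made up for this sketch and only mirror the JavaScript example.

```python
BANNED_NUMBERS = [4, 6]


def replace_with_zero_in_place(nums: list[int]) -> None:
    """Impure: mutates its argument and reads a module-level list."""
    for i, num in enumerate(nums):
        if num in BANNED_NUMBERS:
            nums[i] = 0


def replace_with_zero(nums: list[int], banned: tuple[int, ...] = (4, 6)) -> list[int]:
    """Pure: builds a new list and depends only on its parameters."""
    return [0 if num in banned else num for num in nums]


original = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
replaced = replace_with_zero(original)
print("numbers are:", replaced)
print("original is unchanged:", original)
```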

Obviously in practice you can't make everything 100% pure, for example when making an HTTP request you are always dependent on external factors. But you can try to minimize external factors by making the HTTP request, and then running the result only through pure functions.
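
A hedged Python sketch of that split: one small impure function does the network call, and everything after it is pure and can be tested with plain strings. The function names and the JSON shape are assumptions invented for this example.

```python
import json
from urllib.request import urlopen


def fetch_json_text(url: str) -> str:
    """Impure: talks to the network, so it can fail or change between calls."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def active_user_names(json_text: str) -> list[str]:
    """Pure: the same text always produces the same list."""
    users = json.loads(json_text)
    return sorted(user["name"] for user in users if user.get("active"))


# The pure part can be exercised without any network access.
sample = '[{"name": "bo", "active": true}, {"name": "al", "active": true}, {"name": "cy", "active": false}]'
print(active_user_names(sample))
```

In real use the two would be chained, with the result of `fetch_json_text` passed straight into `active_user_names`.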

[-] Miaou@jlai.lu 2 points 2 years ago

I really wouldn't call anything that hits the network pure, because errors are quite likely. But I guess we all put the bar at a different level, I would not count logging as a side effect yet I've been bitten by overly verbose logs in hot loops.

const-ness gives a mini version of purity, although nothing prevents someone from opening /etc/lol in a const function... I think GCC has a pure attribute but I don't think it's enforced by the compiler, only used for optimizations

this post was submitted on 22 Nov 2023

25 points (90.3% liked)
