Jenga Stack (lemmy.blahaj.zone)
[-] chellomere@lemmy.world 10 points 1 day ago* (last edited 1 day ago)

HarfBuzz does OpenType shaping, that is, transforming strings of Unicode characters into lists of glyphs with positioning. The significance of this can be hard to understand for someone used to the Latin script, as that needs very little shaping: kerning is often the only thing that's absolutely necessary.

But in complex scripts, most notably the Indic ones, there's a lot going on. Several Unicode characters can merge into one glyph under certain circumstances, or one character can split into several glyphs, and relative positioning along both the x and y axes is imperative.

A reason that OpenType shaping is complex is that part of the rules for what to do will be found in the font, and part will need to be hard-coded in the code implementing it.

If you're going to roll your own text renderer, you'll have to care about the following areas:

  • Rasterization/rendering to bitmaps, including hinting (notoriously difficult, old-style TrueType hinting instructions are bytecode, so you'll be writing a tiny VM for this)
  • Shaping (Kerning at a minimum, full OpenType shaping for international support)
  • BiDi (for full international support, primarily Hebrew and Perso-Arabic)
  • A caching system for rendered text glyphs and shaped text runs, as it will be too slow to redo this work each time you want to render some text

Let's just say that I do not recommend going this route unless you're prepared to spend a lot of time on it.
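
To make the caching point in the list above a bit more concrete, here's a minimal sketch of a least-recently-used cache in Python. The class name `LruCache`, the `get_or_compute` method and the key layouts in the comments are all made up for illustration; a real renderer would likely also track memory use and pack glyph bitmaps into an atlas.

```python
from collections import OrderedDict
from typing import Callable, Hashable


class LruCache:
    """Small least-recently-used cache for expensive text-rendering results.

    Hypothetical key layouts (assumptions, not from any library):
      rasterized glyph: (font_id, glyph_id, size_px)
      shaped run:       (font_id, size_px, text)
    """

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries: "OrderedDict[Hashable, object]" = OrderedDict()

    def get_or_compute(self, key: Hashable, compute: Callable[[], object]) -> object:
        """Return the cached value for key, calling compute() only on a miss."""
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        value = compute()
        self._entries[key] = value
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return value

    def __len__(self) -> int:
        return len(self._entries)
```

You'd typically keep one cache for glyph bitmaps and a separate one for shaped runs, since the two have very different key shapes and entry sizes.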

[-] calcopiritus@lemmy.world 2 points 1 day ago* (last edited 1 day ago)

I've got all that. I just needed to convert a string of characters into a list of glyph IDs.

For context, I'm doing a code editor.

I don't use harfbuzz for shaping or whatever, since I planned on rendering single lines of monospaced text. I can do everything except string->glyphs conversion.

Just trying to implement basic features such as ligatures is incredibly hard, since there's almost no documentation. Therefore you can't make assumptions that are necessary to take shortcuts and make optimizations. I don't know if harfbuzz uses a source of documentation that I haven't been able to find, or maybe they are just way smarter than me, or if fonts are made in a way that they work with harfbuzz instead of the other way around.

As someone trying to have as few dependencies as possible, it is a struggle. But at the same time, harfbuzz saved me so much time.

EDIT: I don't do my own glyph rasterization, but that's because I haven't gotten to it yet, so I do use a library. I don't know if it's going to be harder than string->glyphs, but I doubt it.

[-] chellomere@lemmy.world 3 points 1 day ago* (last edited 1 day ago)

It would make sense that a code editor could use a more limited subset of text rendering that could be more optimized.

Perhaps a bit surprisingly, Microsoft actually has pretty good documentation on OpenType. Here's info on what shaping applies to "standard" scripts:

https://learn.microsoft.com/en-us/typography/script-development/standard

And here's the landing page for the latest OpenType spec:

https://learn.microsoft.com/en-us/typography/opentype/spec/

Specifically for ligatures, you're looking for the liga feature which is specified in the font's GSUB table.
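
To give a rough idea of what a ligature substitution boils down to once the tables are parsed, here's a simplified Python sketch. In GSUB, ligature substitution is lookup type 4: ligatures are grouped by their first glyph, and each entry lists the remaining component glyphs plus the replacement glyph, in the font's order of preference. The glyph IDs and the `CMAP`/`LIGATURES` tables below are invented for illustration, and the sketch skips script/language selection, lookup flags and ignored marks. Real fonts may also implement their ligatures through other features such as `calt`.

```python
from typing import Dict, List, Tuple

# Hypothetical data for illustration; real values come from the font's
# cmap and GSUB tables.
CMAP: Dict[str, int] = {"-": 10, ">": 11, "=": 12, "a": 13}
NOTDEF = 0  # glyph 0 is the font's "missing glyph"

# first glyph -> [(remaining component glyphs, ligature glyph), ...]
# Entries are tried in order, so longer ligatures are listed first here.
LIGATURES: Dict[int, List[Tuple[Tuple[int, ...], int]]] = {
    10: [((10, 11), 100), ((11,), 101)],  # "-->" and "->"
    12: [((11,), 102)],                   # "=>"
}


def map_chars(text: str) -> List[int]:
    """Nominal one-to-one character -> glyph mapping (the cmap step)."""
    return [CMAP.get(ch, NOTDEF) for ch in text]


def apply_ligatures(glyphs: List[int]) -> List[int]:
    """Replace matching glyph sequences with their ligature glyph."""
    out: List[int] = []
    i = 0
    while i < len(glyphs):
        for tail, ligature in LIGATURES.get(glyphs[i], []):
            end = i + 1 + len(tail)
            if tuple(glyphs[i + 1:end]) == tail:
                out.append(ligature)
                i = end  # consume every component of the ligature
                break
        else:
            out.append(glyphs[i])  # no ligature starts at this glyph
            i += 1
    return out


print(apply_ligatures(map_chars("a->")))  # [13, 101]
```

For a monospaced code editor, the thing to watch is that the output list can be shorter than the input, so you need to keep track of which source characters each glyph covers for cursor placement and selection.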

this post was submitted on 20 Nov 2025
538 points (97.2% liked)
