[-] Corbin@programming.dev 1 points 3 days ago

If you want to know how Google specifically does things, search for "TeraGoogle"; it's not a secret name, although I don't think it has a whitepaper. The core insight is that there are tiers of search results. When you search for something popular that many other people are searching for, your search is handled by a pop-culture tier which is optimized for responding to those popular topics. The first and second pages of Google results are served by different tiers; on YouTube, the first few results are served from a personalized tier which (I expect) has cached your login and knows what you like, and the rest of the results are from a generalist tier. This all works because searches, video views, etc. are Pareto-distributed; most of the searches are for a tiny amount of cacheable content.
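To make the tiering idea concrete, here is a toy sketch in Python. It is not how Google does it; the class name, the tier labels, and the query log are all made up for illustration. It only shows why a small precomputed tier absorbs most traffic when queries are heavily skewed.

```python
from collections import Counter


class TieredSearch:
    """Toy two-tier search: a small precomputed tier in front of a full index."""

    def __init__(self, index, hot_queries):
        self.index = index  # hypothetical "generalist" tier: query -> results
        # Hypothetical "popular" tier: answers precomputed for common queries.
        self.hot = {q: index.get(q, []) for q in hot_queries}

    def search(self, query):
        """Return (tier name, results) so the caller can see who answered."""
        if query in self.hot:
            return "popular", self.hot[query]
        return "generalist", self.index.get(query, [])


def hit_rate(queries, hot_queries):
    """Fraction of queries that the popular tier can answer."""
    hot = set(hot_queries)
    return sum(q in hot for q in queries) / len(queries)


# Made-up, heavily skewed query log: one query dominates.
log = ["cats"] * 80 + ["dogs"] * 15 + ["rare googlewhack"] * 5
top = [q for q, _ in Counter(log).most_common(1)]

index = {"cats": ["cat video"], "dogs": ["dog video"]}
engine = TieredSearch(index, top)
```

With this log, caching a single query is enough for the popular tier to answer 80 of the 100 searches; the rare query falls through to the generalist tier and comes back empty.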

There's also a UX component. Suppose that you dial Alice's server and Alice responds with a Web app that also fetches resources from Bob's server. This can only be faster for you in the case where Bob is so close to you (and so responsive) that you can dial Bob and get a reply faster than Alice finishes sending her app. But Alice and Bob are usually colocated in a datacenter, so Alice will always be closer to Bob than you are. This suggests that if Alice wants to incorporate content from Bob then Alice might as well dial Bob herself and not tell you about Bob at all. This is where microservices shine. When you send a search to Google, YouTube, Amazon, or other big front pages, you're receiving a composite result which has queries from many different services mixed in. For the specific case of Google, when you connect to google.com, you're connecting to a machine running GWS, and GWS connects to multiple search backends on your behalf.
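A rough sketch of that fan-out pattern in Python, with the caveat that the backend names and functions here are hypothetical stand-ins and say nothing about how GWS is actually built. The front end dials every backend itself and hands the client one merged response.

```python
from concurrent.futures import ThreadPoolExecutor


def web_backend(query):
    """Hypothetical backend; a real one would be a network call."""
    return [f"web result for {query!r}"]


def video_backend(query):
    """Hypothetical backend; a real one would be a network call."""
    return [f"video result for {query!r}"]


BACKENDS = {"web": web_backend, "video": video_backend}


def front_end(query):
    """Query every backend concurrently and merge the answers into one reply."""
    with ThreadPoolExecutor(max_workers=len(BACKENDS)) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in BACKENDS.items()}
        # The client never learns that separate backends exist.
        return {name: future.result() for name, future in futures.items()}
```

The client makes one round trip to the front end, and the backend round trips happen inside the datacenter where they are cheap.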

Finally, how typical of a person are you? You might not realize how often your queries are handled by pop-culture tiers. I personally have frequent experiences where my search turns up zero documents on DDG or Google, where there are no matching videos on YouTube, etc. and those searches take multiple seconds to come up empty. If you're a weird person who constantly finds googlewhacks then you're not going to perceive these services as optimized for you, because they cannot optimize for the weird.

[-] Corbin@programming.dev 34 points 9 months ago

The author would do well to look up SGML; Markdown is fundamentally about sugaring the syntax for tag-oriented markup and is defined as a superset of HTML, so mistaking it for something like TeX or Word really demonstrates a failure to engage with Markdown per se. I suppose that the author can be forgiven somewhat, considering that they are talking to writers, but it's yet another example of how writers really only do research up to the point where they can emit a plausible article and get paid.

It’s worth noting that Microsoft bought PowerPoint, GitHub, LinkedIn, and many other things—but it did in fact create Word and Excel. Microsoft is, in essence, a sales company. It’s not too great at designing software.

So close to a real insight! The correct lesson is that Microsoft, like Blizzard, is skilled at imitating what's popular in the market; like magpies, they don't need to have a culture of software design as long as they have a culture of software sales. In particular, Microsoft didn't create Word or Excel, but ripped off WordPerfect and Lotus 1-2-3.

[-] Corbin@programming.dev 25 points 1 year ago

Yeah, writing your own squeeblerizer sucks, but there's no better option. GNU Scrimble can be used off-the-shelf as a passthrough, so the only real tasks are implementing Squeeb's algorithm and a sprongler; then, your entire pipeline is merely something like:

$ gscrimble --passthrough --args -- ./your_squeeb | ./your_sprongler

Edit: Whoops! Forgot to mention, GNU Scrimble also has Snorble support out-of-the-box, and Scrimble clients have content auto-negotiation, so your_squeeb can just take JSON on stdin. GNU Scrimble is really nice for this sort of thing, just...big.

And if you want to sprongle directly into a database, etc., then you can write your_sprongler to taste. Full disclosure: I have a fairly fast implementation of Squeeb's algorithm in rpypkgs. However, I'd really recommend writing your own; it's like twenty lines of code you can copy from Wikipedia and it'll give you a good basis for extending it with your own desired changes later.

You can read snorblite's code if you need to figure out a specific sprongling technique, but it's way easier to just go look up the original SprongCode from SprongReg. Use a search engine to get around the university's paywall. This gets you the SprongCode UUID and you don't have to read code written by a batshit fascist.

[-] Corbin@programming.dev 31 points 1 year ago

It's because the Booleans sometimes are flipped in display-server technology from the 1980s, particularly anything with X11 lineage, and C didn't have Boolean values back then. More generally, sometimes it's useful to have truthhood be encoded low or 0, as in common Forths or many lower-level electrical-engineering protocols. The practice died off as popular languages started to have native Boolean values; today, about three quarters of new developers learn Python or ECMAScript as their first language, and FFI bindings are designed to paper over such low-level details. You'll also sometimes see newer C/C++ libraries depending on newer standards which add native Booleans.
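As a small illustration of the "truth encoded low" case, here is a Python sketch of an active-low flag. The helper names are made up, and this is not taken from any particular library or protocol; it just shows why a binding layer has to translate explicitly instead of relying on the host language's truthiness.

```python
def from_active_low(level: int) -> bool:
    """Decode a signal where 0 means asserted (true) and 1 means deasserted."""
    return level == 0


def to_active_low(flag: bool) -> int:
    """Encode a native Boolean as an active-low level."""
    return 0 if flag else 1


# Naive truthiness gets the asserted level backwards:
# bool(0) is False in Python, but an active-low 0 means "true".
asserted_level = to_active_low(True)
```

A binding that wraps such an interface would call helpers like these at the boundary, so callers only ever see native Booleans.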

As a fellow vim user with small hands, here are some tricks. The verb gU will uppercase letters but not underscores or hyphens, so sentences like gUiw can be used to uppercase an entire constant. The immediate action ~ which switches cases can be turned into a verb by :set tildeop, after which it can be used in a similar way to gU. If constants are all namespaced with a prefix followed by something unique like an underscore, then the prefix can be left out of new sections of code and added back in with a macro or a :%s replacement.

[-] Corbin@programming.dev 18 points 1 year ago

Well, I don't want to pull the kernel-hacker card, but it sounds like you might not have experienced being yelled at by Linus during a kernel summit. It's not fun and not worth the money. Also it's well-known that LF can't compete with e.g. Collabora or Red Hat on salary, so the only folks who stick around and focus on Linux infrastructure for the sake of Linux are bureaucrats, in the sense of Pournelle's Iron Law of Bureaucracy.

[-] Corbin@programming.dev 37 points 1 year ago

Sounds like it's time to start training code-writing models on leaked Microsoft source code. Don't worry, it's not like it'll "emit memorized code".

[-] Corbin@programming.dev 38 points 2 years ago

Because frankly, Ronald (the current maintainer, not the original author) is very competent. I say this as somebody who has personally been yelled at by Ronald at a kernel summit; I didn't deserve it, but none of his technical points were wrong. I like to think of myself as the kind of person that, given enough time and documentation, can maintain anything; I think it'd still take three of me to do Ronald's job. (Well, "job." I think he technically works for Red Hat or something?) Not to excuse his conduct, just to explain why he's not been replaced yet.

[-] Corbin@programming.dev 27 points 2 years ago

Other way around, actually; C was one of several languages proposed to model UNIX without having to write assembly on every line, and has steadily increased in abstraction. Today, C is specified relative to a high-level abstract machine and doesn't really resemble any modern processing units' capabilities.

Incidentally, coming to understand this is precisely what the OP meme is about.

[-] Corbin@programming.dev 22 points 2 years ago

Direct rendering infrastructure in Linux predates widespread use of "digital rights management" as a term of art by about two or three years. "We were here first," as the saying goes. That said, the specific concept of direct rendering managers is a little newer, and probably was a mistake on its own merits, regardless of the name.

[-] Corbin@programming.dev 23 points 2 years ago

It's because most of the hard questions and theorems can be phrased over the Booleans. Lawvere's fixed-point theorem, which has Turing's theorem and Rice's theorem as special cases (see also Yanofsky 2003), applies to Boolean values just as well as to natural numbers.
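For a feel of why Booleans are enough, here is the diagonal construction in Python. This is only the Cantor-style special case, not a proof of Lawvere's theorem, and the names are mine: since negation on Booleans has no fixed point, any family of Boolean-valued predicates indexed by its own inputs misses the "flipped diagonal" predicate.

```python
def diagonal(family):
    """Given family: index -> (index -> bool), build a predicate that
    disagrees with the i-th member of the family at input i."""
    return lambda i: not family(i)(i)


def always_true(i):
    return True


def always_false(i):
    return False


def is_even(i):
    return i % 2 == 0


# A made-up finite family, indexed 0..2.
rows = [always_true, always_false, is_even]


def family(n):
    return rows[n]


g = diagonal(family)
```

Here `g` differs from row `n` at input `n` for every `n`, so `g` cannot be any of the rows.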

That said, you're right to feel like there should be more explanation. Far too many parser papers are written with only recognition in mind, and the actual execution of parsing rules can be a serious engineering challenge. It would be nice if parser formalisms were described in terms of AST constructors and not mere verifiers.
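To show the gap between a recognizer and a parser, here is a minimal Python sketch over a toy grammar, `digit ('+' digit)*`. The grammar and the `Num`/`Add` node names are invented for the example. The recognizer only answers yes or no; the parser has to decide things the recognizer never faces, such as associativity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Add:
    left: object
    right: object


def recognize(text):
    """Verifier only: does text match  digit ('+' digit)* ?"""
    parts = text.split("+")
    return all(len(p) == 1 and p in "0123456789" for p in parts)


def parse(text):
    """Same grammar, but each rule application builds an AST node.
    Choosing left-associativity is a decision the recognizer never makes."""
    if not recognize(text):
        raise ValueError(f"not in the language: {text!r}")
    numbers = [Num(int(p)) for p in text.split("+")]
    tree = numbers[0]
    for number in numbers[1:]:
        tree = Add(tree, number)
    return tree
```

For example, `parse("1+2+3")` builds `Add(Add(Num(1), Num(2)), Num(3))`, while `recognize("1+2+3")` only reports `True`.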


The abstract:

This paper presents μKanren, a minimalist language in the miniKanren family of relational (logic) programming languages. Its implementation comprises fewer than 40 lines of Scheme. We motivate the need for a minimalist miniKanren language, and iteratively develop a complete search strategy. Finally, we demonstrate that through sufficient user-level features one regains much of the expressiveness of other miniKanren languages. In our opinion its brevity and simple semantics make μKanren uniquely elegant.
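For readers who want a feel for the core before opening the paper, here is a rough Python sketch of the same ingredients: logic variables, unification, and the goal constructors for equality, fresh variables, disjunction, and conjunction. It is not the paper's Scheme code. It swaps the paper's streams for Python generators, uses dicts for substitutions, and the names (`eq`, `call_fresh`, `disj`, `conj`) are my own spellings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    index: int


def walk(term, subst):
    """Follow variable bindings until reaching a non-variable or unbound variable."""
    while isinstance(term, Var) and term in subst:
        term = subst[term]
    return term


def unify(u, v, subst):
    """Return an extended substitution, or None if u and v cannot be unified."""
    u, v = walk(u, subst), walk(v, subst)
    if isinstance(u, Var) and u == v:
        return subst
    if isinstance(u, Var):
        return {**subst, u: v}
    if isinstance(v, Var):
        return {**subst, v: u}
    if isinstance(u, tuple) and isinstance(v, tuple) and len(u) == len(v):
        for a, b in zip(u, v):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return subst if u == v else None


def eq(u, v):
    def goal(state):
        subst, counter = state
        extended = unify(u, v, subst)
        if extended is not None:
            yield (extended, counter)
    return goal


def call_fresh(f):
    def goal(state):
        subst, counter = state
        yield from f(Var(counter))((subst, counter + 1))
    return goal


def disj(g1, g2):
    def goal(state):
        # Alternate between both branches so neither starves the other.
        streams = [g1(state), g2(state)]
        while streams:
            stream = streams.pop(0)
            try:
                item = next(stream)
            except StopIteration:
                continue
            streams.append(stream)
            yield item
    return goal


def conj(g1, g2):
    def goal(state):
        for intermediate in g1(state):
            yield from g2(intermediate)
    return goal


empty_state = ({}, 0)
answers = list(call_fresh(lambda q: disj(eq(q, 5), eq(q, 6)))(empty_state))
values = [walk(Var(0), subst) for subst, _ in answers]
```

Running the query at the bottom gives two answers for the fresh variable, 5 and then 6. Note that `conj` here is a plain nested loop, so this sketch does not reproduce the paper's search guarantees.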


Everybody's talking about colored and effectful functions again, so I'm resharing this short note about a category-theoretic approach to colored functions.

[-] Corbin@programming.dev 26 points 2 years ago

Most consumer-grade NICs have a default MAC address which is retrievable with device drivers, but delegate (Ethernet) packet assembly to the OS. If the OS asks the NIC to emit a packet, then the NIC often receives the packet as a blob, DMA'd from main memory, and emits the bytes as octets. Other NICs do manage packet assembly, but allow overwriting the default MAC address. By the time I was learning Linux, we had GNU MAC Changer available in userland with the macchanger command, and many distros have configuration for randomizing or hardcoding MAC addresses upon boot.
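To illustrate why the source MAC is just data when the OS assembles the packet, here is a minimal Python sketch that builds an Ethernet II header by hand. The helper names and the addresses are made up; it skips minimum-length padding and the frame check sequence, and it does not actually send anything.

```python
import struct


def parse_mac(text):
    """Turn 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in text.split(":"))


def build_frame(dst, src, ethertype, payload):
    """Assemble destination MAC, source MAC, EtherType, then payload.
    No padding and no FCS; this is only the part the software chooses."""
    header = struct.pack("!6s6sH", parse_mac(dst), parse_mac(src), ethertype)
    return header + payload


# The source address is whatever the software writes into bytes 6..11.
# 02:... has the locally administered bit set, i.e. not a vendor-burned address.
frame = build_frame("ff:ff:ff:ff:ff:ff", "02:00:00:00:00:01", 0x0800, b"hello")
```

Nothing in the blob ties the source field to the NIC's default address, which is why tools like macchanger can work at all on hardware that takes frames this way.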

I want to say that this is all because olden corporate network management policies could require a technician to replace a NIC without changing the MAC address, but more likely it is because framing and packet assembly were not traditionally handed to a second controller, and were instead bit-banged or MMIO'd by the CPU.

[-] Corbin@programming.dev 19 points 2 years ago

Hi! Please don't link anything from this subdomain again. It was considered a plague back on Reddit, and this sort of content-free post shouldn't be encouraged here either.


Corbin

joined 2 years ago