Ruminations on excerpts of research papers, blogs and books

Types of codebases/software

In my short time learning and loving programming, I have come to view it as a tool. There are people who love and cherish their tool, those who simply use it to get work done, those who want to master it, those who want to automate its job, and so on. This tool has given rise to enormous amounts of economic value, entertainment and livelihoods. It has led us to build projects big and small, simple and complex. Here, I am writing about certain types of projects one might encounter on GitHub. Of course, not every project will fit into the categories I present, but I implicitly tend to sort the projects I encounter into some category, so I might as well write about them here.

Small scale business applications: most programmers starting on their journey to become an SWE begin here. These projects often serve a simple, small business need, and usually amount to a simple CRUD app with a db (in any of the various stacks). Note that it is the scale, not the stack, that makes these projects easy to get into while still serving a real-life need. Examples could be your To-Do lists or smaller scale e-commerce sites.

Small libraries: not a lot of programmers venture out to build libraries, but a fair number still do. These libraries generally serve a single purpose and do it well. They are generally written by a person (or a few people) who came across a unique problem and, being the great programmers that they are, implemented a solution and gifted it to the world. Some good examples are MiniSearch, an in-memory text search library in JavaScript, and HNSW in Go, an implementation of the HNSW vector index. Small, elegant, yet powerful.

Large scale applications: one normally gets here by scaling up the business applications mentioned above and eventually running into unique and hard problems. These problems are worked on and solved by some of the best programmers in the world, and the applications often have a user base in the millions. The usual examples (at least among the open-source ones) are Telegram, Taiga and one of my favourites, Tldraw. (Is Tldraw a legit business application? I don't know, but I love the repository.)

Large/Complex libraries: these probably contain some of the best (if not THE best) code written by us. Larger libraries and large business applications differ in their goals and in how they plan on achieving them: the applications are profit oriented, while the libraries are more or less scientific. I am not sure this distinction holds for all the examples I am about to give, but nonetheless these massive projects are all open source, and hence make for excellent codebases to read through. Some favourites: Postgres, Linux, Torchlib, Glasgow Haskell Compiler, LLVM, React, NumPy, FFmpeg, SQLite, V8 and finally GCC.

Why more examples of larger libraries? Because they more or less drive the software world ahead and have stood the test of time, with relentless innovation and selfless contributions from the open source community.

Hosted on streams.place.