Lines of Code

Mar 28, 2020

Lines of Code is a lousy metric — at least on a small scale. I accept that it may be useful to compare big projects or repositories, but most often, when I encounter it, it is just meaningless.

The Big Scale

It may be valuable on a big scale.

Google Stores Billions of Lines of Code in a Single Repository

A billion is a big number. This is meaningful. We can state things and ask questions about it.

  • Google has a huge repository. Maybe even the biggest one in the world.
  • Did they modify git/svn/Perforce to make it possible?
  • My workflow is git-centric. How does this repository affect their workflow? How is it different from mine?

The Linux kernel has around 27.8 million lines of code in its Git repository

  • The Linux kernel is way smaller than the Google codebase.
  • It’s still a tremendous effort.
  • But hey. Isn’t it too big for a “kernel”? Oh! The drivers are there. Makes sense!

Developers seem to care about lines of code much more than I’d expect.

Why care about LoC?

I heard a story about a programmer paid by lines of code. Hilarious. I hope it was a joke. This is like paying a construction worker by the materials he uses! Lines of code have no intrinsic value. They are a cost we pay, ballast.

We can use it as an approximation of the effort we have already invested. Better yet, we can take the numbers of added and deleted lines over a period of time and compare them to the LoC of the entire repository, to get an idea of how the project is changing over time. I’d argue that a changelog or a list of work items (Jira tickets, PRs merged?) gives us more information.
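Here is a minimal sketch of that comparison. It assumes you paste in the text printed by something like `git log --since="1 month ago" --numstat --format=` and that you already know the total LoC of the repository. The names `sumNumstat` and `churnRatio` and the sample data are made up for illustration.

```javascript
// Sum the added and deleted lines from `git log --numstat` output.
// Each numstat row is "<added>\t<deleted>\t<path>"; binary files show "-" for both counts.
function sumNumstat(numstatText) {
  let added = 0;
  let deleted = 0;
  for (const line of numstatText.split("\n")) {
    const match = /^(\d+)\t(\d+)\t/.exec(line);
    if (!match) continue; // blank lines, binary files, anything unexpected
    added += Number(match[1]);
    deleted += Number(match[2]);
  }
  return { added, deleted };
}

// Hypothetical helper: lines touched in the period relative to the repository size.
function churnRatio(numstatText, totalLoc) {
  const { added, deleted } = sumNumstat(numstatText);
  return (added + deleted) / totalLoc;
}

// Made-up sample: two text files and one binary file.
const sample = "12\t3\tsrc/app.js\n-\t-\tlogo.png\n\n5\t0\tREADME.md\n";
console.log(sumNumstat(sample)); // { added: 17, deleted: 3 }
console.log(churnRatio(sample, 2000)); // 0.01
```

Even then, the ratio only says how much text moved, not why.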

On its own, the LoC of a file, for example, is just a number. Sad and lonely.

150 🍎 < 75 🍊

Can we say that a 500 line long file is long? Or short? Or is it just perfect?

We can’t, and trying would be pointless. Don’t let any lint rule tell you otherwise. We need more information. Knowing the language isn’t enough. Let’s take modern JavaScript.

Is it declarative, imperative, or functional? The answer may differ between files in the same repository, and when we’re actually close enough to read what’s inside the file, we may form better models to think about it than LoC.

Let me continue with the declarative. Declarative is easier to read than imperative, right? We don’t have to think as much while reading, because declarative code just is, while imperative code does. For example, HTML is easier to read than JavaScript of the same length.

In a React class component, 150 lines of JSX in a render function is “less code” than 75 lines of class logic.

I will go further. 150 lines of JSX in a function component is “less code” than 75 lines of hooks: useState, useEffect, useLayoutEffect, useSelector, useMachine. A lot happens there. The difference may not be as big as when comparing declarative UI composition with imperative code in lifecycle hooks, but I’d argue that it still holds. We have fewer things to comprehend in JSX because much of it is self-explanatory. (Go away <Fetch /> component, you’re the outlier.)

This is all JavaScript, but aren’t we comparing apples to oranges?

There are different kinds of code.

  • Declarative but non-functional code (HTML, CSS, SQL, GraphQL) may be verbose, but it’s usually trivial to read.
  • Imperative code will certainly be harder to read, and way harder to maintain.
  • Functional code doing the same thing may be more concise and easier to debug, but require detailed reading at first.
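To make the last two bullets concrete, here is a small sketch of the same task written both ways. The `orders` data and both function names are invented for this example.

```javascript
// Hypothetical data for illustration.
const orders = [
  { total: 40, paid: true },
  { total: 25, paid: false },
  { total: 10, paid: true },
];

// Imperative: we spell out how to walk the array and mutate an accumulator.
function paidRevenueImperative(list) {
  let sum = 0;
  for (let i = 0; i < list.length; i++) {
    if (list[i].paid) {
      sum += list[i].total;
    }
  }
  return sum;
}

// Functional: we describe the result as a pipeline of expressions.
const paidRevenueFunctional = (list) =>
  list
    .filter((order) => order.paid)
    .reduce((sum, order) => sum + order.total, 0);

console.log(paidRevenueImperative(orders)); // 50
console.log(paidRevenueFunctional(orders)); // 50
```

The second version is less than half as long, yet a reader who has never met `reduce` will probably spend more time on it. LoC captures none of that.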

We can divide it in more ways! In the same language, the same codebase, there will be some important code and some cheap code. The text doesn’t hold this information.

Better Metrics

We would like to measure things that matter, obviously.

Ease of comprehension is a good one. It is a very soft thing, though. Can we measure something easier and assume it’s correlated with cognitive complexity?

Enter cyclomatic complexity, a measure of the number of linearly independent paths in a program’s control flow graph.

The more paths we have, the more we need to think about and, more importantly, the more we have to test.
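For a single function, this usually works out to the number of decision points plus one. Below is a deliberately naive sketch that counts those decision points with a regular expression. It is an approximation I made up for this post, not how real tools do it: they walk a parsed syntax tree, while this one is fooled by keywords inside strings and comments, and it ignores ternaries. The two snippets are hypothetical.

```javascript
// Rough estimate: 1 + number of decision points found by a regex.
// This is not a parser: keywords inside strings or comments are counted too.
const DECISION_POINTS = /\b(?:if|for|while|case|catch)\b|&&|\|\|/g;

function estimateComplexity(source) {
  const matches = source.match(DECISION_POINTS);
  return 1 + (matches ? matches.length : 0);
}

// Plain line count, for comparison.
const lines = (source) => source.trim().split("\n").length;

// Two hypothetical snippets with the same number of lines.
const flat = `
function label(user) {
  const first = user.firstName;
  const last = user.lastName;
  const full = first + " " + last;
  return full.trim();
}`;

const branchy = `
function label(user) {
  if (!user) return "anonymous";
  if (user.nickname && user.preferNickname) return user.nickname;
  for (const name of user.names) if (name) return name;
  return "unnamed";
}`;

console.log(lines(flat), estimateComplexity(flat)); // 6 1
console.log(lines(branchy), estimateComplexity(branchy)); // 6 6
```

Both snippets have the same LoC, but the second one has six paths to think about and to test.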
