February 16, 2020

Reverse Engineering GitHub’s Color Contrast Algorithm

Previously published at https://zephraph.svbtle.com/getting-a-visually-distinct-set-of-colors

I’m working on a tool called autobot that works alongside auto to help library authors automate deployments on PR merges. For one feature, I needed to render a replica of GitHub’s labels in the body of a PR.

For such a simple problem, there are a ton of challenges. In this post, I’m going to talk about one challenge in particular.

GitHub labels have a configurable background color. Internally GitHub uses a color contrast algorithm to determine what the text color of the label should be for it to be legible.

Example labels from GitHub

The GitHub user @rtsao had built a service similar to the one I was trying to build, and it included an algorithm that he pulled directly from GitHub’s JS bundle.

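function isDarkColor(color) {
  // color.values holds the [r, g, b] channels, each in the range 0-255
  const [r, g, b] = color.values;
  // Perceived brightness using the YIQ luma weights
  const yiq = (r * 299 + g * 587 + b * 114) / 1000;
  // Note: the value 150 is hardcoded into GitHub
  return yiq < 150;
}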

I thought this solved all my problems… but GitHub had changed their color contrast algorithm.

According to GitHub’s announcement, they changed their contrast algorithm to match the WCAG guidelines on contrast (which you can read here).

From my own observation of labels, I’ve only ever seen black or white text. Given that, I found a package on NPM that gives a contrast score for two colors, and I used it to pick whichever of the two text colors has the higher contrast score against the background.

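import { rgb } from "wcag-contrast";

// RGBColor is assumed to expose its channels as `values`, an [r, g, b] array
const calcFontColor = (color: RGBColor) => {
  // Contrast of the background against black and against white text
  const blackContrastScore = rgb(color.values, [0, 0, 0]);
  const whiteContrastScore = rgb(color.values, [255, 255, 255]);

  // Pick whichever text color contrasts more with the background
  return blackContrastScore >= whiteContrastScore ? "#000" : "#FFF";
};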

This worked out pretty well and I thought it was good enough. I didn’t really have any info to back it up, but I shared it back with @rtsao just to be a good OSS citizen.
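For readers curious what a WCAG contrast score involves, here is a minimal sketch of the contrast ratio as WCAG 2.x defines it, written without any package. The names `linearize`, `relativeLuminance`, `contrastRatio`, and `pickTextColor` are my own for illustration, and I haven’t checked that wcag-contrast computes its score in exactly this way.

```typescript
type RGB = [number, number, number];

// Linearize one 8-bit sRGB channel, following the WCAG 2.x definition
// of relative luminance.
function linearize(channel: number): number {
  const s = channel / 255;
  return s <= 0.03928 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
}

// Relative luminance: 0 for black, 1 for white.
function relativeLuminance(color: RGB): number {
  return (
    0.2126 * linearize(color[0]) +
    0.7152 * linearize(color[1]) +
    0.0722 * linearize(color[2])
  );
}

// Contrast ratio between two colors, ranging from 1 (identical) to 21 (black on white).
function contrastRatio(a: RGB, b: RGB): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// Hypothetical helper: same decision rule as calcFontColor, black wins ties.
function pickTextColor(background: RGB): "#000" | "#FFF" {
  const black = contrastRatio(background, [0, 0, 0]);
  const white = contrastRatio(background, [255, 255, 255]);
  return black >= white ? "#000" : "#FFF";
}
```

With this, `pickTextColor([255, 255, 255])` returns `"#000"` and `pickTextColor([0, 0, 0])` returns `"#FFF"`.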

He responded with another algorithm he found in their JS bundle (different from the one he had used previously) and a URL that GitHub uses to render label previews on the labels page. Since we know GitHub uses that URL as its source of label colors, that’s our source of truth.


Now that we have a way of correctly determining label colors, we can test the three algorithms to determine which one is correct (or at least the closest to being correct).

Easy, right?… well… maybe not. What colors do we choose to test? There are 16,777,216 colors that can be represented by a hex color code. If we requested all 16M colors from GitHub’s preview link, we’d definitely hit an abuse rate limit. If we added some time between each request to avoid triggering the abuse limit, say 250ms, it’d take about 4,194,304 seconds… or 1,165 hours. I don’t know about you, but that’s more time than I have to wait.
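The arithmetic behind that estimate, as a quick sketch (the variable names are just for illustration):

```typescript
// Every value from #000000 to #FFFFFF, inclusive.
const totalColors = 0xffffff + 1; // 16,777,216

// Pause between requests, in milliseconds.
const delayMs = 250;

// Time spent waiting alone, ignoring how long each request takes.
const totalSeconds = (totalColors * delayMs) / 1000;
const totalHours = totalSeconds / 3600;

console.log(totalSeconds, Math.floor(totalHours));
```

That works out to 4,194,304 seconds, or a little over 1,165 hours, which is more than 48 days.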

Thinking about it though, surely most colors are incredibly similar. If it’s hard for a person to recognize the difference between two colors, does it really matter to test both colors? I didn’t think so, so I set out to find an algorithm to tell me if two colors aren’t visually distinct.

Diving down this rabbit hole I stumbled upon a metric called delta E. GitHub user @zschuessler has an excellent resource which describes delta E as…

The measure of change in visual perception of two given colors

That sounded like exactly what I needed. Given an algorithm that can tell me when two colors have a low delta E (meaning they’re visually very similar), I can just discard one of them.

There are several algorithms that have been produced over the years to calculate this value, but the latest is called dE00 (for delta E 2000, the year it was made). Again, for more information about these algorithms, check out @zschuessler’s introduction to delta E.

I found a package on NPM called delta-e with the algorithm I wanted. I took an extremely naive approach to narrow down the color set. I simply iterated through all colors from #000000 to #FFFFFF (or 0 to 16,777,215) and dropped any colors that were similar to the current color. Colors would continue to be dropped until a visually distinct enough color was found and then that became the new current color. (Demo below)
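Here is a rough, self-contained sketch of that filtering loop. To keep it free of dependencies it swaps in the much simpler CIE76 formula (plain Euclidean distance in Lab space) instead of the dE00 calculation from delta-e, so it will not reproduce my numbers. The `threshold` and `step` parameters and all function names are hypothetical; `step` only exists so the demo can skip through the color space quickly.

```typescript
type Lab = { L: number; A: number; B: number };

// Convert a 24-bit integer color (0x000000..0xFFFFFF) to CIE L*a*b* (D65 white point).
function intToLab(color: number): Lab {
  const lin = (c: number): number => {
    const s = c / 255;
    return s <= 0.04045 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  };
  const r = lin((color >> 16) & 0xff);
  const g = lin((color >> 8) & 0xff);
  const b = lin(color & 0xff);
  // Linear sRGB -> XYZ, normalized by the D65 reference white.
  const x = (0.4124 * r + 0.3576 * g + 0.1805 * b) / 0.95047;
  const y = 0.2126 * r + 0.7152 * g + 0.0722 * b;
  const z = (0.0193 * r + 0.1192 * g + 0.9505 * b) / 1.08883;
  const f = (t: number): number =>
    t > 0.008856 ? Math.pow(t, 1 / 3) : 7.787 * t + 16 / 116;
  return { L: 116 * f(y) - 16, A: 500 * (f(x) - f(y)), B: 200 * (f(y) - f(z)) };
}

// CIE76 delta E: a stand-in for dE00, NOT the formula used in the post.
function deltaE76(a: Lab, b: Lab): number {
  const dL = a.L - b.L;
  const dA = a.A - b.A;
  const dB = a.B - b.B;
  return Math.sqrt(dL * dL + dA * dA + dB * dB);
}

// Walk the colors in numeric order and keep a color only once it is at least
// `threshold` away from the last kept color, which then becomes the new reference.
function distinctColors(threshold: number, step: number = 1): number[] {
  const kept: number[] = [0x000000];
  let current = intToLab(0x000000);
  for (let c = step; c <= 0xffffff; c += step) {
    const lab = intToLab(c);
    if (deltaE76(current, lab) >= threshold) {
      kept.push(c);
      current = lab;
    }
  }
  return kept;
}
```

Note that each color is only compared against the most recently kept color, not against every kept color, so two colors in the result can still look alike if they are far apart in numeric order.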

This had a pretty surprising impact. The color set went from 16,777,216 to 383,254, which is about a 98% decrease! Roughly 400k colors is still too many to hit GitHub’s API with, though. I decided I only wanted to make 5k calls (which is the hourly API rate limit for most of their endpoints). 383,254 / 5000 is about 77, so I took every 77th color from the resulting list and used that as the set of colors to test.
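The downsampling step can be sketched like this. `strideFor` and `sampleEveryNth` are hypothetical names, and starting from the first element is an assumption on my part; starting from any other offset would work just as well.

```typescript
// Smallest stride that keeps the sample within the request budget.
function strideFor(total: number, budget: number): number {
  return Math.ceil(total / budget);
}

// Keep every `stride`-th item, starting with the first one.
function sampleEveryNth<T>(items: T[], budget: number): T[] {
  const stride = strideFor(items.length, budget);
  return items.filter((_, index) => index % stride === 0);
}

console.log(strideFor(383254, 5000));
```

Rounding up rather than down keeps the sample size at or under the budget.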

This is an extremely naive approach, but efficiency is better left to code that’ll actually run more than once.

I wrote a little CodeSandbox example to demonstrate which colors would be tested.

From this point I took the resulting list of colors and wrote a webdriver.io script to fetch all the colors from GitHub. That script essentially calls the URL with a given color, grabs the CSS color and background-color properties from the preview label, and adds those colors to a file.

The only thing left to do at this point was to clean up the colors and write a test to find out the results.
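The test itself boils down to counting how often an algorithm disagrees with what GitHub rendered. Below is a minimal sketch of that idea, not my actual test. The `Sample` shape, the function names, and the two sample rows are placeholders I made up for illustration (they are not scraped data), and mapping `isDarkColor` to white text on dark backgrounds is my assumption about how it was used.

```typescript
type RGB = [number, number, number];
type TextColor = "#000" | "#FFF";
type Sample = { background: RGB; expectedText: TextColor };
type Algorithm = (background: RGB) => TextColor;

// Number of samples where the algorithm disagrees with the recorded text color.
function countInvalid(samples: Sample[], algorithm: Algorithm): number {
  return samples.filter((s) => algorithm(s.background) !== s.expectedText).length;
}

// The old YIQ check, assuming a "dark" background gets white text.
const yiqAlgorithm: Algorithm = (background) => {
  const yiq = (background[0] * 299 + background[1] * 587 + background[2] * 114) / 1000;
  return yiq < 150 ? "#FFF" : "#000";
};

// Placeholder samples only; real samples would come from the scraped file.
const placeholderSamples: Sample[] = [
  { background: [255, 255, 255], expectedText: "#000" },
  { background: [0, 0, 0], expectedText: "#FFF" },
];

const invalid = countInvalid(placeholderSamples, yiqAlgorithm);
console.log(`${invalid}/${placeholderSamples.length} invalid`);
```

Running each candidate algorithm through `countInvalid` against the same sample file gives directly comparable counts.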

Here’s what the final results were:

  • WCAG method I provided: 29/3901 invalid (99% success rate)
  • Old method (isDarkColor): 2903/3901 invalid (26% success rate)
  • Algorithm pulled from GitHub js bundle: 1049/3901 invalid (73% success rate)

In the end, the method I wrote to determine contrast wasn’t perfect, but it did a pretty good job!

All in all this was a fun learning experience. I hope you learned something along with me. I was pretty open about what I was learning in the original issue so check that out for more context.

Side note: If anyone from GitHub is reading this I’d really love it if you could share what algorithm you actually use with the community!

Thanks all for reading.

Be well.