How GIMP’s Color-to-Alpha Tool Works

Jul 28, 2023

GIMP is well known for being one of the best free pieces of software for image editing and manipulation. Nearly as powerful as Adobe’s Photoshop, it boasts a ton of features and scripts/plugins for experimentation as well as practical use.

We’re going to recreate one of my favorite GIMP tools, color-to-alpha, in Python. We will also use Streamlit to create an interactive page where users can upload photos, choose a color, experiment with options, and see the effects in real time.

About Color-to-Alpha

Color-to-alpha (CTA henceforth) is a simple algorithm that takes a color and makes the pixels of that color transparent in an image. CTA also adjusts the alpha levels of nearby colors, so the user gets a feathering effect where edges are soft. Using this tool, the user can place another background under an image, and pixels that are partially transparent will allow the new background to show through.

Picture with white background, background removed, background replaced with green — Using CTA to replace a background

CTA was used here to remove the white background on this 32x32 image of a smiley face I quickly made. The result is placed above a mint green background. Importantly, there are a few notable differences between the images besides just the background color that I will address later.

How the Algorithm Works

GIMP’s documentation tells us exactly how the algorithm works:

At the risk of being a bit technical, this can be visualized by thinking of the RGB cube. The background color is a point within the cube, and the transparency and opacity thresholds are two sub-cubes centered around the background color. Everything inside the transparency-threshold cube becomes fully transparent, everything outside the opacity-threshold cube remains fully opaque, and everything in between gradually transitions from transparent to opaque

RGB Cube visualization — RGB cube. Courtesy of Wikimedia, SharkD, CC BY-SA 4.0 <https://creativecommons.org/licenses/by-sa/4.0>, via Wikimedia Commons

Since each pixel in an image has a red, green, and blue channel, we can visualize all possible colors in RGB-space as a cube where each dimension corresponds to a channel. The algorithm takes in a transparency and opacity threshold used to determine the alpha levels of colors within the bounds. To explain, let’s consider the following slice of the RGB cube generated by some simple Python code:

from PIL import Image

img = Image.new('RGB', (256, 256), (0, 0, 0))
for h in range(256):
    for w in range(256):
        img.putpixel((w, h), (w, 200, h))

Let us also consider a pixel in this image. The transparency and opacity thresholds can be visualized as concentric squares around the pixel, where the transparency square is always smaller than the opacity square. Pixels within the blue transparency square will turn transparent, while pixels outside the purple opacity square will remain untouched. Pixels between the squares will be assigned an alpha value depending on where they fall in the region.

color to alpha applied onto an RGB-square — CTA applied on an RGB-square
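
To make the three regions concrete, here is a minimal sketch that rebuilds the same slice with NumPy and counts how many of its pixels land in each region. The target pixel and both thresholds are arbitrary values chosen for illustration, not the values used in the figure.

```python
import numpy as np

# Rebuild the slice from above: red varies with x, green is fixed at 200, blue varies with y
w, h = np.meshgrid(np.arange(256), np.arange(256))
slice_rgb = np.stack([w, np.full_like(w, 200), h], axis=2)

# Hypothetical target pixel and thresholds, chosen only for illustration
target = np.array([128, 200, 128])
transparency_threshold = 20
opacity_threshold = 60

# Largest per-channel difference, which is what makes the regions square
distances = np.abs(slice_rgb - target).max(axis=2)

transparent = distances <= transparency_threshold
opaque = distances >= opacity_threshold
in_between = ~transparent & ~opaque

print(transparent.sum(), in_between.sum(), opaque.sum())
```

Every pixel falls into exactly one of the three masks, so the three counts add up to the 256 x 256 pixels in the slice.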

Beginning the Implementation

Let’s use the following 3x2 image as a simple example:

3x2 test Image — Test Image with 6 different colors

We will use Python’s PIL and NumPy libraries to convert the image into a 3D array of pixel values as follows:

from PIL import Image
import numpy as np

img = Image.open('32test.png')
np.array(img)

> array([[[134, 134, 134, 255],
          [ 90, 137, 255, 255],
          [255,  90,  90, 255]],
  
         [[255, 255, 255, 157],
          [  0,   0,   0, 255],
          [255, 255, 255, 255]]], dtype=uint8)

The array has shape (2, 3, 4) since the image is 3 pixels wide and 2 pixels tall (NumPy lists rows first) and there are 4 channels (RGBA). We will disregard the alpha channel and work only in RGB-space.
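
As a quick check of the shape, the sketch below rebuilds the array from the printed values (so no image file is needed) and slices off the alpha channel. NumPy reports the shape as (rows, columns, channels).

```python
import numpy as np

# The test image's pixel values, copied from the output above
pixels = np.array([[[134, 134, 134, 255],
                    [ 90, 137, 255, 255],
                    [255,  90,  90, 255]],
                   [[255, 255, 255, 157],
                    [  0,   0,   0, 255],
                    [255, 255, 255, 255]]], dtype=np.uint8)

print(pixels.shape)   # 2 rows, 3 columns, 4 channels

# Keep only the red, green, and blue channels
rgb = pixels[:, :, :3]
print(rgb.shape)      # same rows and columns, 3 channels
```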

CTA measures distance in RGB-space with an orthogonal metric (the Chebyshev distance): it takes the absolute difference in each channel and returns the largest of the three. We can also opt to use the Euclidean (straight-line) distance between two colors, which means the transparency and opacity thresholds become concentric spheres around our color instead of cubes. GIMP only uses the orthogonal distance between pixels, so this is an opportunity to add options and functionality to the CTA algorithm:

def rgb_distance(pixels: np.ndarray, color: np.ndarray, shape='cube'):
    '''
    If shape is 'cube', the radius is the maximum orthogonal distance between the two colors
    If shape is 'sphere', the radius is the distance between the two colors in 3D space

    Returned value is between 0 and 255 for 'cube' and between 0 and about 442 for 'sphere'
    '''

    # Ensure parameters are RGB (three channels) and use a signed integer type
    # so that subtracting from uint8 pixels can never wrap around
    diff = pixels[:, :, :3].astype(int) - np.asarray(color).astype(int)

    # Take advantage of numpy's vectorization here
    if shape == 'cube':
        return np.amax(np.abs(diff), axis=2)
    elif shape == 'sphere':
        return np.linalg.norm(diff, axis=2)
    raise ValueError(f"shape must be 'cube' or 'sphere', got {shape!r}")

When we run the function with white as our target color, we get the following:

pixels = np.array(img)
white = np.array([255,255,255])
rgb_distance(pixels, white, shape='cube')

> array([[121, 165, 165],
         [  0, 255,   0]])
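
For comparison, here is a sketch of the Euclidean ('sphere') distance on the same six pixels. It is written out directly with NumPy rather than calling `rgb_distance`, so it runs on its own; the pixel values are copied from the array printed earlier.

```python
import numpy as np

rgb = np.array([[[134, 134, 134], [ 90, 137, 255], [255,  90,  90]],
                [[255, 255, 255], [  0,   0,   0], [255, 255, 255]]])
white = np.array([255, 255, 255])

diff = rgb - white
cube = np.abs(diff).max(axis=2)             # largest single-channel difference
sphere = np.sqrt((diff ** 2).sum(axis=2))   # straight-line distance in RGB-space

print(cube)
print(np.round(sphere, 1))
```

The Euclidean distance is never smaller than the per-channel maximum, so for the same thresholds the sphere option treats fewer colors as close to the target.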

Getting the alpha (transparency) values is also not difficult, since they depend solely on where the pixel falls between the two thresholds:

threshold_difference = opacity_threshold - transparency_threshold
alpha = (distances - transparency_threshold) / threshold_difference
alpha = np.clip(alpha, 0, 1)

We can run this code with a transparency threshold of 100 and an opacity threshold of 200 to get our alpha array:

array([[0.21, 0.65, 0.65],
       [0.  , 1.  , 0.  ]])
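
Wrapped into a small function, the alpha calculation might look like the sketch below. The name `alpha_from_distances` is hypothetical, and the sketch assumes the opacity threshold is strictly larger than the transparency threshold so that the division is well defined.

```python
import numpy as np

def alpha_from_distances(distances, transparency_threshold, opacity_threshold):
    """Map distances to alpha in [0, 1] with a linear ramp (hypothetical helper)."""
    if opacity_threshold <= transparency_threshold:
        raise ValueError('opacity_threshold must exceed transparency_threshold')
    threshold_difference = opacity_threshold - transparency_threshold
    alpha = (distances - transparency_threshold) / threshold_difference
    # Anything below the transparency threshold becomes 0, anything above the opacity threshold 1
    return np.clip(alpha, 0, 1)

# Distances from white for the six test pixels, copied from the earlier output
distances = np.array([[121, 165, 165],
                      [  0, 255,   0]])
alpha = alpha_from_distances(distances, 100, 200)
print(alpha)
```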

Working with Semitransparency

The most difficult part of the implementation is determining the RGB values for the pixels that are semi-transparent. It may seem correct to simply apply the alpha value onto the pixels within the thresholds to get the transparency effect. However, we must remember that an idealized output from the CTA algorithm would not allow us to recover any information about the target color; all traces of it would have been removed. Thus, it is correct to discard the RGB values of those pixels.

bar gradients of colors showing to transparent and to color — A gradient split into a foreground to transparent and a background

Consider the above image. A gradient (bottom) is split into a foreground-to-transparent gradient layered on top of the background. In the original gradient, the RGB value of the pixels is constantly changing in RGB-space. In the top gradient, however, the RGB values are constant and only the alpha value changes. In this case, it is impossible to determine from the top gradient alone that the original gradient faded to blue. This is our goal.

Let’s look at a few example points to learn more about the implementation:

the RGB color space with annotations and markup

In the picture above, we can see the effects of our parameters on the color space we have selected. The target color is shown in green, and the transparency and opacity thresholds are blue and pink respectively. The points contained within the blue square will become completely transparent, and the point located outside of the pink square will remain unaltered.

Since the other 3 points are between the two thresholds, we will need to extrapolate along the lines passing through them until we reach the opacity threshold to get the new color information to use for those pixels. The alpha values for the pixels will be approximately 0.45, 0.33, and 0.8 (top to bottom).

The extrapolation of the final point simply needs to be clamped to the RGB space to make it produce a valid color.
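
To see why extrapolating is the right move, the sketch below works through a single pixel by hand. The pixel, target color, and opacity threshold are made-up values, and the transparency threshold is assumed to be 0 so that the linear alpha ramp is exactly distance / opacity threshold. Under that assumption, compositing the result back over the target color reproduces the original pixel.

```python
import numpy as np

# Made-up values for illustration
color = np.array([255.0, 255.0, 255.0])   # target color (white)
pixel = np.array([205.0, 230.0, 255.0])   # a light blue close to white
opacity_threshold = 100.0                 # transparency threshold assumed to be 0

# Cube distance: largest per-channel difference
distance = np.abs(pixel - color).max()

# Linear alpha with a transparency threshold of 0
alpha = distance / opacity_threshold

# Push the pixel away from the target color until it sits on the opacity threshold
extrapolated = color + (pixel - color) * (opacity_threshold / distance)
extrapolated = np.clip(extrapolated, 0, 255)

# Alpha-composite the new pixel over the target color
recovered = alpha * extrapolated + (1 - alpha) * color

print(distance, alpha, extrapolated, recovered)
```

With a nonzero transparency threshold, or when the extrapolated color has to be clamped, the round trip is only approximate.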

Creating the Algorithm

We will begin by disregarding all alpha information from the original image and work with a copy of the original pixels with no alpha. We will also apply our distance function to get the distance of each pixel in the image from our target color.

# Make new pixels and add a 4th channel for RGBA
pixels = pixels[:,:,:3]
new_pixels = np.copy(pixels)
new_pixels = np.append(new_pixels, np.zeros((new_pixels.shape[0], new_pixels.shape[1], 1), dtype=np.uint8), axis=2)

# Get the distance matrix
distances = rgb_distance(pixels, color, shape=shape)

We will also need to create masks to tell the program which pixels are completely transparent and which are completely opaque. These will be Boolean arrays with the same height and width as the original image.

# Create masks for pixels that are transparent and opaque
transparency_mask = distances <= transparency_threshold
opacity_mask = distances >= opacity_threshold

After calculating the alpha values as described previously, we can change how they are interpolated if we wish. GIMP interpolates linearly, so this is another opportunity to expand upon the original. Since alpha is always in the interval [0, 1], we can use any function f that passes through (0, 0) and (1, 1) and satisfies 0 ≤ f(alpha) ≤ 1 on that interval:

# Different interpolation functions, applied element-wise to a scalar or a NumPy array
def interpolate(x, interpolation=None):
    if interpolation == 'power':
        return x**2
    elif interpolation == 'root':
        return np.sqrt(x)
    elif interpolation == 'smooth':
        return (np.sin(np.pi/2*x))**2
    elif interpolation == 'inverse-sin':
        return np.arcsin(2*x-1)/np.pi + 0.5
    else:
        return x

# Interpolate based on method provided
alpha = interpolate(alpha, interpolation=interpolation)
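
A quick way to sanity-check candidate curves is to evaluate them at the endpoints and confirm they stay inside [0, 1]. The sketch below restates the same formulas in a dictionary (the name `curves` is just for illustration) so that it runs on its own.

```python
import numpy as np

# Same formulas as in interpolate(), keyed by option name
curves = {
    'linear': lambda x: x,
    'power': lambda x: x ** 2,
    'root': np.sqrt,
    'smooth': lambda x: np.sin(np.pi / 2 * x) ** 2,
    'inverse-sin': lambda x: np.arcsin(2 * x - 1) / np.pi + 0.5,
}

x = np.linspace(0, 1, 11)
for name, f in curves.items():
    y = f(x)
    # Every curve should start at 0, end at 1, and stay within [0, 1]
    starts_at_zero = bool(np.isclose(y[0], 0))
    ends_at_one = bool(np.isclose(y[-1], 1))
    in_range = bool(y.min() >= -1e-12 and y.max() <= 1 + 1e-12)
    print(name, starts_at_zero, ends_at_one, in_range)
```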

To extrapolate onto the opacity threshold, we determine how far the pixel is from the target color as a proportion of the opacity threshold, and divide the displacement vector by that proportion. Division by zero will happen here for any pixel that exactly matches the target color, but we have already determined that such a pixel will become completely transparent.

# Extrapolate along line passing through color and pixel onto the opacity threshold
# This is the RGB value that will be used for the pixel
proportion_to_opacity = distances / opacity_threshold
extrapolated_colors = (pixels - color) / proportion_to_opacity[:, :, np.newaxis] + color

As good practice, we will use NumPy’s nan_to_num function to replace any NaN values before we round and clamp the extrapolated pixels to [0, 255].

extrapolated_colors = np.nan_to_num(extrapolated_colors, nan=0)
extrapolated_colors = np.clip(np.around(extrapolated_colors), 0, 255).astype(np.uint8)

Finally, we will reassign the color values of only the semitransparent pixels. This is where the masks come in. We will also set the alpha values of all pixels (no masks are needed here since alpha is already 0 where transparent and 1 where opaque).

# Reassign color values of intermediate pixels
new_pixels[~transparency_mask & ~opacity_mask, :3] = extrapolated_colors[~transparency_mask & ~opacity_mask]
# Assign the alpha values of all pixels (already 0 where transparent and 1 where opaque)
new_pixels[:, :, 3] = alpha * 255

After returning the updated pixel matrix, we have successfully finished the algorithm!
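
One way to assemble the snippets above into a single function is sketched below. The function name, signature, and default values are assumptions made for illustration (the defaults mirror the slider defaults used in the app), only linear interpolation is included to keep it short, and it assumes 0 <= transparency threshold < opacity threshold. Unlike the snippets above, it rounds the alpha values instead of truncating them.

```python
import numpy as np

def color_to_alpha(pixels, color, transparency_threshold=18,
                   opacity_threshold=193, shape='cube'):
    """Sketch of the full pipeline; returns an RGBA uint8 array."""
    # Work in floats so uint8 arithmetic cannot wrap around
    rgb = pixels[:, :, :3].astype(float)
    color = np.asarray(color, dtype=float)
    diff = rgb - color

    # Distance of every pixel from the target color
    if shape == 'cube':
        distances = np.abs(diff).max(axis=2)
    elif shape == 'sphere':
        distances = np.linalg.norm(diff, axis=2)
    else:
        raise ValueError(f'unknown shape: {shape!r}')

    transparency_mask = distances <= transparency_threshold
    opacity_mask = distances >= opacity_threshold
    between = ~transparency_mask & ~opacity_mask

    # Linear alpha ramp between the two thresholds
    alpha = (distances - transparency_threshold) / (opacity_threshold - transparency_threshold)
    alpha = np.clip(alpha, 0, 1)

    # Extrapolate only the in-between pixels, which always have a nonzero distance
    new_rgb = rgb.copy()
    scale = opacity_threshold / distances[between]
    new_rgb[between] = color + diff[between] * scale[:, np.newaxis]
    new_rgb = np.clip(np.around(new_rgb), 0, 255)

    # Stack RGB and the rounded alpha channel into one RGBA array
    rgba = np.dstack([new_rgb, np.around(alpha * 255)])
    return rgba.astype(np.uint8)

# The six-pixel test image from earlier, with white as the target color
pixels = np.array([[[134, 134, 134, 255], [ 90, 137, 255, 255], [255,  90,  90, 255]],
                   [[255, 255, 255, 157], [  0,   0,   0, 255], [255, 255, 255, 255]]],
                  dtype=np.uint8)
result = color_to_alpha(pixels, [255, 255, 255], 100, 200)
print(result)
```

Restricting the extrapolation to the in-between pixels sidesteps the division by zero entirely, at the cost of one extra boolean index.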

Deploying as a Web App

We can use Streamlit to take our Python script and deploy it as a simple interactive web app. I won’t be going into great detail on the creation of the app, but I will review a few of the most important parts.

We can make a file uploader in the side panel with the following code:

import streamlit as st
from PIL import Image

st.sidebar.title('Upload an Image')
file = st.sidebar.file_uploader('Original image', type=['png', 'jpg', 'jpeg', 'webp'])
if file is None:
    img = Image.open('colExp.png')
else:
    img = Image.open(file)

If the user has not yet uploaded a file, we will use an example file as the default. We will add some sliders and selectors to change the settings of the function:

# User settings
st.sidebar.title('Settings :gear:')
shape = st.sidebar.selectbox('Shape (used for calculating distance in RGB-space)', ['sphere', 'cube'])
interpolation = st.sidebar.selectbox('Interpolation', ['linear', 'power', 'root', 'smooth', 'inverse-sin'])

top_threshold_bound = 255 if shape == 'cube' else 442
transparency_threshold = st.sidebar.slider('Transparency Threshold', 0, top_threshold_bound, 18)
opacity_threshold = st.sidebar.slider('Opacity Threshold', 0, top_threshold_bound, 193)

Keep in mind that if the user chooses the Euclidean (spherical) distance metric, the furthest distance between two points in RGB-space is 255 × √3, or approximately 442, which is why the upper bound of the sliders depends on the shape. To help the user select an appropriate color to remove, we will recommend the top colors in the image with the following auxiliary function:

# To get the top colors in the image
def get_pixel_distribution(img):
    width, height = img.size
    counter = dict()

    # Count the number of pixels of each color
    for h in range(height):
        for w in range(width):
            pix = img.getpixel((w, h))
            if pix not in counter:
                counter[pix] = 1
            else:
                counter[pix] += 1

    # Sort the dictionary by value, highest to lowest
    counter = sorted(counter.items(), key=lambda x: x[1], reverse=True)
    return counter
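
Looping over every pixel with `getpixel` can be slow on large images. As an alternative sketch (not the approach used in the app), the same counts can be computed with `np.unique`. The helper name `top_colors` is hypothetical, and it assumes an array with a channel axis, such as `np.array(img)` for an RGB or RGBA image.

```python
import numpy as np

def top_colors(pixels, n=5):
    """Return the n most common colors as (color_tuple, count) pairs."""
    # Flatten to one row per pixel, keeping the channel axis
    flat = pixels.reshape(-1, pixels.shape[-1])
    colors, counts = np.unique(flat, axis=0, return_counts=True)
    # Sort by count, highest first, and keep the top n
    order = np.argsort(counts)[::-1][:n]
    return [(tuple(int(c) for c in colors[i]), int(counts[i])) for i in order]

# Small synthetic image: 4x4 white with three red pixels in the first row
pixels = np.full((4, 4, 3), 255, dtype=np.uint8)
pixels[0, :3] = (255, 0, 0)

print(top_colors(pixels))
```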

Finally, to finish the process and the web app, we will save the resulting image to a BytesIO buffer and serve its bytes through the download button.

from io import BytesIO

# For downloading the new image
def convert_image(img):
    buf = BytesIO()
    img.save(buf, format='PNG')
    byte_im = buf.getvalue()
    return byte_im


# Add download button
col2.download_button('Download Image', convert_image(cta_img), file_name='color_to_alpha.png', mime='image/png')
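
The snippet above assumes `cta_img` is already a PIL image. A minimal sketch of the step in between, turning an RGBA array like the one produced by the algorithm into an image and then into PNG bytes, might look like this. The small array here is a stand-in, not real output.

```python
from io import BytesIO

import numpy as np
from PIL import Image

# Stand-in for the array returned by the algorithm: 2x2 RGBA
rgba = np.zeros((2, 2, 4), dtype=np.uint8)
rgba[:, :, 0] = 255          # red everywhere
rgba[0, :, 3] = 255          # top row opaque, bottom row fully transparent

# A uint8 array of shape (height, width, 4) is interpreted as RGBA
cta_img = Image.fromarray(rgba)

# Same idea as convert_image(): write PNG bytes into an in-memory buffer
buf = BytesIO()
cta_img.save(buf, format='PNG')
png_bytes = buf.getvalue()

# Reading the bytes back should give the same pixels, alpha included
restored = np.array(Image.open(BytesIO(png_bytes)))
print(cta_img.mode, np.array_equal(restored, rgba))
```

PNG is a sensible format here because it is lossless and stores the alpha channel, which JPEG cannot do.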

Takeaways and Learnings

Creating the web app was a much easier task than coding the CTA algorithm. During my first attempt to make this, I used two for loops to traverse each pixel in the image. This was a valid method since the new RGBA values for each pixel are independent of the other pixels in the image and are only a function of the pixel’s own value and the user settings. Put another way, the output is consistent with respect to the parameters (color, thresholds, interpolation, shape).

But I quickly learned how to make use of NumPy’s vectorization abilities and worked with the ndarrays themselves rather than each of their values individually. This was the first time I found that feature to be more of a necessity than a luxury (for aesthetic and efficiency purposes), and I now feel confident in my ability to vectorize multidimensional data with NumPy.
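
To illustrate the difference, the sketch below computes the cube distance for a small random image twice, once with two nested loops and once with a single vectorized expression, and confirms that the results match. The random image and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(8, 8, 3))
color = np.array([255, 255, 255])

# Loop version: visit each pixel individually
looped = np.zeros((8, 8), dtype=int)
for h in range(pixels.shape[0]):
    for w in range(pixels.shape[1]):
        looped[h, w] = max(abs(int(c) - int(t)) for c, t in zip(pixels[h, w], color))

# Vectorized version: one expression over the whole array
vectorized = np.abs(pixels - color).max(axis=2)

print(np.array_equal(looped, vectorized))
```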

Right now the algorithm will affect all pixels of a given color no matter where they are in the image. Maybe CTA could be combined with a background/edge detector to help feather the edges around foreground objects when removing their backgrounds.

But in the meantime, feel free to experiment with the algorithm here. Happy editing!