It’s always nice to understand how things work behind the scenes, especially if your job revolves around that field. We know little of the hard labor our computers really handle while we pull all the easy strings. Game Developers, Graphic Designers, Photo Editors, Social Media Specialists: in fact, any field that involves editing an image relies on what is referred to as Pixel Manipulation, or, if you really want to get technical, Kernel Convolution.
Let’s start off by understanding filters. We use these a lot on social media, and most people imagine a filter as an overlay image, but it’s actually a combination of low-level processing of various types: blurs, sharpening, contrast changes, color changes, edge detection, etc. That is what a filter really is: it takes an image, processes it, and returns an output based on the inputs.
Pixels & Binary
If you have ever looked close enough, you may have noticed squares bunched up together that end up creating an image. These squares are called pixels, and numerous image editors for both desktop & mobile exist today that let anyone manipulate the pixels in a photo, so it’s very likely you have seen the letters RGB before.
These letters stand for Red, Green, Blue, the three base colors mixed to create all the shades on the color scale. A fun fact to help remember them: our human eyes also have three types of color receptors, roughly tuned to red, green, and blue light, and we blend those signals much like computers do so we can perceive all the different colors the world has to offer.
In order for our devices to communicate color, it needs to be translated into binary, so each of a pixel’s color channels is stored in 8 bits, also called 1 byte. Each channel holds a value from 0–255, a gradient of shades for that color, and the three channels blend to produce the final color, so a full RGB pixel takes 3 bytes (24 bits).
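As a quick sketch of the idea (the channel values here are arbitrary), this shows how each 0–255 channel fits into exactly one byte of binary:

```python
# One pixel's color as three 8-bit channel values (R, G, B).
# format(value, "08b") renders the 8 bits (1 byte) each channel occupies.
r, g, b = 200, 120, 40
print(format(r, "08b"), format(g, "08b"), format(b, "08b"))
# 11001000 01111000 00101000
```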
A stands for alpha, which controls transparency; this can be used in numerous scenarios, both 2D and 3D.
Hexadecimal is used in web development and other programming contexts as a quick reference to color: instead of having to remember separate R, G & B values, you only need one value, such as #FF0000 for pure red.
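A minimal sketch of that conversion, packing the three 0–255 channel values into one hex string (the helper name and sample colors are just for illustration):

```python
# Convert an RGB triple (0-255 per channel) to a "#RRGGBB" hex color.
# Each channel becomes two hex digits, so three bytes -> six digits.
def rgb_to_hex(r, g, b):
    return "#{:02X}{:02X}{:02X}".format(r, g, b)

print(rgb_to_hex(255, 0, 0))    # #FF0000 -- pure red
print(rgb_to_hex(72, 61, 139))  # #483D8B -- a dark slate blue
```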
Images are filled with pixels, and these pixels are what make up a complete image once you zoom out. Based on its color, each pixel is given numbers in the 0–255 range corresponding to its shade; in order to edit an image, we first need to find out what numbers our image is composed of.
This is where Pixel Manipulation/Kernel Convolution bakes the cake and gets the job done: a small 3x3 “brush” (the kernel) filled with our inputs passes over the image and mathematically generates new colors for our desired effect, based on what we put into the Kernel. The brush size also depends on the desired effect.
In order to manipulate an image, we create the “brush” we mentioned earlier, where we dictate a center and affect the outer perimeter based on our inputs; remember, each number refers to the color of a pixel.
We scan over our image and modify the numbers based on our inputs. For a blur, every kernel entry is 1, so we multiply each overlaid pixel by 1 and add up all the numbers inside our Kernel (ours, for example, sum to 37), then divide that result by the total number of entries, which is 9, giving 4.11111; rounded to the nearest whole number, that is 4.
That number replaces our center pixel: it was 2 and is now 4. We continue to scan over every pixel in this manner, resulting in a blurred image.
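The steps above can be sketched in a few lines. The 3x3 neighborhood values below are made up to match the walkthrough: they sum to 37, the center pixel is 2, and 37 divided by 9 rounds to 4.

```python
# A minimal sketch of kernel convolution at a single pixel of a
# grayscale image (nested lists of 0-255 intensities). With a kernel
# of all 1s and division by the total weight (9), this is a box blur.
def convolve_pixel(image, x, y, kernel):
    total = 0
    weight = 0
    for ky in range(3):
        for kx in range(3):
            total += image[y + ky - 1][x + kx - 1] * kernel[ky][kx]
            weight += kernel[ky][kx]
    return round(total / weight)

box_blur = [[1, 1, 1],
            [1, 1, 1],
            [1, 1, 1]]

# Neighborhood from the walkthrough: sums to 37, center pixel is 2.
patch = [[5, 4, 6],
         [3, 2, 7],
         [4, 1, 5]]

print(convolve_pixel(patch, 1, 1, box_blur))  # 4
```

A full blur would repeat this call for every pixel (handling image borders, e.g. by clamping or padding, which is omitted here).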
Edge detection is when we try to find the regions in an image where there is a sharp change in intensity or a sharp change in color. A high value indicates a steep change and a low value indicates a shallow change. We can detect edges using something called the “Sobel Operator”, which is a very common method for detecting edges.
Our goal is to measure the difference along the X & Y axes of our image, scanning for sharp changes along both vertical and horizontal lines. We do the same thing we did for our blur, except our inputs have changed: we multiply each pixel by one set of kernel values to find the Y-axis difference first, then switch to a second kernel for the X-axis.
This second kernel detects horizontal change and returns our X-axis response; remember, these are the inputs in our Kernel as it scans over our pixels and multiplies the values.
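As a sketch, here are the two standard Sobel kernels and how they respond to a hard vertical edge (the patch values are made up: dark on the left, bright on the right):

```python
# The two standard Sobel kernels: sobel_x responds to horizontal
# intensity change (vertical edges), sobel_y to vertical change.
sobel_x = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]

sobel_y = [[-1, -2, -1],
           [ 0,  0,  0],
           [ 1,  2,  1]]

def apply_kernel(patch, kernel):
    # Multiply each pixel by the kernel value on top of it and sum.
    return sum(patch[y][x] * kernel[y][x]
               for y in range(3) for x in range(3))

# A hard vertical edge: dark (0) on the left, bright (255) on the right.
edge = [[0, 0, 255],
        [0, 0, 255],
        [0, 0, 255]]

print(apply_kernel(edge, sobel_x))  # 1020 -- steep horizontal change
print(apply_kernel(edge, sobel_y))  # 0    -- no vertical change
```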
If we now output each axis as a grayscale image, we will have a mostly gray image with black on one side of our detected edges & white on the other. What we really want, though, is a mixture of both these results, and we can get it by blending our X & Y values.
We use Pythagoras’s Theorem to calculate our blend, and might I add that Babylonian tablets from around 1900 B.C. already show knowledge of it! For example, let’s say our Y value is 54 and our X value is 23; what is our blend?
C = Final Blend
A = Y Axis
B = X Axis
C² = A² + B² = 54² + 23² = 2,916 + 529 = 3,445
C = √3,445 ≈ 58.69412
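The same arithmetic as a one-function sketch, using the example values from the text:

```python
import math

# Blend the two axis responses into one edge strength per pixel using
# the Pythagorean theorem: magnitude = sqrt(gy**2 + gx**2).
def gradient_magnitude(gy, gx):
    return math.sqrt(gy * gy + gx * gx)

# The worked example from the text: Y = 54, X = 23.
print(round(gradient_magnitude(54, 23), 5))  # 58.69412
```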
Our final output should be something like a black-and-white image.
We can also calculate angles by taking the arctangent of (Axis Y / Axis X), which returns a value in degrees telling us the orientation of that particular pixel; this can be useful for finding structures or objects.
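A sketch of that direction calculation, reusing the same example values; `math.atan2` is used rather than a plain arctangent so the X-axis value may safely be zero:

```python
import math

# Gradient direction from the two axis responses, in degrees.
# atan2 handles all four quadrants and a zero X response safely.
def gradient_direction(gy, gx):
    return math.degrees(math.atan2(gy, gx))

# Same example values as before: Y = 54, X = 23.
print(round(gradient_direction(54, 23), 2))  # 66.93
print(gradient_direction(1, 0))              # 90.0 -- straight vertical change
```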
It’s unusual to run the Sobel Operator on color images, so your image will need to be converted to grayscale before any edge detection; really, what we are doing is manipulating pixel intensity to find edges.
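A minimal sketch of that grayscale conversion per pixel. The weights below are the common ITU-R BT.601 luminance coefficients (an assumption on my part; a plain average of the three channels also works):

```python
# Collapse an RGB pixel to a single 0-255 intensity before edge
# detection, weighting green highest because our eyes are most
# sensitive to it (BT.601 luminance coefficients).
def to_gray(r, g, b):
    return round(0.299 * r + 0.587 * g + 0.114 * b)

print(to_gray(255, 255, 255))  # 255 -- white stays white
print(to_gray(255, 0, 0))      # 76  -- pure red is fairly dark in gray
```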
Another thing to mention is that the Sobel Operator is quite noisy in the way it detects edges. For HD images it’s good practice to blur the image before looking for edges, as this method can pick up sharp changes in a photo that are not really edges: crisp folds, creases, etc.
This is the basic idea of how filters are created: each has its own inputs, based on mathematical equations, that in the end give us our desired effect.