StickerBox: Why We Failed When We Tried to Segment Faces With Algorithms

5 min readAug 8, 2018

The idea of StickerBox came to us when we found out that our corporate messenger Slack supports custom-made emojis. A nice way to cheer up boring corporate chats, isn’t it?

Creating our own emojis for Slack became our favorite corporate pastime. Most of all, we enjoyed making emojis out of our teammates’ photos — the reaction was always priceless. Soon we figured: if we loved it so much, other people would also.

That’s how StickerBox, a mobile app allowing users to transform selfies or friends’ photos into stickers and share them across social media, was officially born.

The main issue we had to resolve in order to let StickerBox see the light of day was face segmentation. We needed technology that would find a face and hair in a photo, separate them from the rest of the body, and cut them out from the background for further use. In this series of articles, we describe what we did to resolve the task step-by-step. Stay with us if you want to know the outcome!

Step 1: Defining object contours

What we used. We started with Canny Edge Detector, a mathematical algorithm that detects the edges of an object in an image. The idea was to use this algorithm so that StickerBox could define the contours of the face and hair in an uploaded image and further use them to create a sticker.

Did it work?

Original picture / Picture with detected edges after Canny Edge was applied

Not as we expected. As you can see in the picture above, the algorithm helped us reduce the amount of data to be processed by StickerBox as it removed colors and noise from the images. However, it couldn’t cut the face and hair out from the background and remove other objects from an image.

Step 2: Recognizing the face boundaries

What we used. Now that we could detect the contours of the face and hair, we needed to remove the background and the rest of the objects from an image. To do that, we used an algorithm that detects facial landmarks: the nose, mouth, and eyes. Taking them as starting points, the algorithm defined the potential contours of a face and drew a “donut” (as we call it) around them, while the other objects remained unidentified and were automatically removed from the image.

Did it work?

Finding the face landmarks (nose, eyes, mouth, ears) and creating the “donut”, which predicts the area that should contain the face boundaries

Telling the face edges apart from the other edges (by finding the intersection between edges found by the Canny Edge Detector and the head area predicted by the “donut”) and applying the mask to the original image

The red line in the above picture displays the actual result the algorithm produced in most cases — a face with distinct contours and no redundant objects in the background.

However, another problem emerged. Occasionally, the algorithm might leave contours not connected at random spots (see the picture below). As a result, when we applied filling to the face, it was applied to the unnecessary parts in the image as well.

Step 3: Connecting contour ends. Method A

What we used. First, we tried the superpixels method. The algorithm worked simply: it detected the places where the face contours were not connected and enlarged the radius of pixels there to connect the contour ends together.

Did it work?

Not in 100 percent of the cases, as seen above. If the gaps between the contour ends were too big, the pixels also became too large and the final picture looked a bit crooked.

Step 3: Still connecting the contour ends. Method B

What we used. The Graham algorithm was another option we resorted to when trying to connect face contours. The algorithm took the existing face contours as a baseline and wrapped them around with another line, thus removing gaps between unconnected pixels.

Did it work?

In most cases, yes, when it did not occasionally include irrelevant pixels to an image, as you can see in the picture above (outlined with the blue line).

A pivot in the approach: Trying to detect face color

What we used. At this stage, we decided to change the method. We used Color Blob Detection, an algorithm detecting the color of objects rather than their contours. The algorithm took a sample of face and hair color and then applied it to the entire area.

Did it work?

Outlining the predicted face area and recognizing face color pallette

Applying the Color Blob Detection method with the palette that we extracted

Not always. Sometimes, there were problems with defining hair color, especially when the hair was too short or dyed several colors. The same problem occurred with beards. Moreover, the face filling was sometimes also automatically applied to necks, obviously because they are the same color as faces, as you can judge from the picture above.

Sometimes, the hair was not included because of the color difference

Conclusion

The result that was provided with the help of algorithms described above was not bad, but not good enough and quite unstable. Well, at least, we learned a lesson. Not everything in software can be done algorithmically; sometimes we need something “cleverer.”

In the second article, we are going to tell you how neural networks helped us with face segmentation. Stay tuned!

Originally published at https://yellow.systems.

StickerBox: Why We Failed When We Tried to Segment Faces With Algorithms

Step 1: Defining object contours

Step 2: Recognizing the face boundaries

Step 3: Connecting contour ends. Method A

Step 3: Still connecting the contour ends. Method B

A pivot in the approach: Trying to detect face color

Conclusion

Written by Yellow

No responses yet