Crazy Blind Daters' identities not so hidden.

January 21, 2013

Programming

Last weekend, OkCupid has launched CrazyBlindDate, which is a website that allows you to go on blind dates with people in your area. There have already been a few discussions about it on Hacker News and other websites, so I'm not going to discuss the web site itself because most of those conversations have already been talked about. I checked out the site after getting an invitation in my inbox. On the landing page it presents you a list of people that you could go on a blind date with, which showed their first name, date and time of the date, a location, and a scrambled picture of the person. The reason a date is considered blind is because you don't know anything about your date, or what they look like. Providing minimal details about them is a good way to do this, and having some sort of picture of them to see that they are a real person.

Love is Blind

At first glance, I thought the picture was generated from multiple pictures and selected for parts of the picture that had the person's face on it somehow. Upon closer inspection, I realized that the sub-pictures seemed to be connected somehow, and suspected that it was probably a form of a 15-puzzle and it would be pretty easy to unscramble. After trying it out manually in paint.net, I was able to unscramble one of the pictures within a few minutes. Most of the time was spent trying to figure out how to get it to split the image into 16 layers that I could individually drag around.

Single Scrambled

At this point, since I'm a programmer, I wondered how hard it would be to write a program automatically figure out how to arrange them. I fired up Visual Studio and started a new project. The first step would be to get the images and split them into their 16 separate image parts. This ended up being fairly simple after figuring out how to use the imaging library in WPF since I haven't used it in a while.

var bmpImg = new BitmapImage(new Uri(baseFileName, UriKind.Absolute));
var baseImage = BitmapFactory.ConvertToPbgra32Format(bmpImg);
int pieceHeight = (int)(baseImage.Height / 4);
int pieceWidth = (int)(baseImage.Width / 4);
for (int x = 0; x < 4; x++) {
    for (int y = 0; y < 4; y++) {
        var bmp = BitmapFactory.New(pieceHeight, pieceWidth);
        bmp.Blit(
            new Rect(0, 0, pieceWidth, pieceHeight),
            baseImage,
            new Rect(x * pieceWidth, y * pieceHeight,
                pieceWidth, pieceHeight));
        imagePieces.Add(bmp);
    }
}

The next step was to figure out how related each tile was to the other ones. To do this, I took each pair of images and calculated the average color difference between the pixels that would be adjacent to each other if it was the correct placement. In the code snippet below, a is to the left of b, so using the far right pixels of a and the far left pixels of b. The idea for this is that if you took adjacent pixels in the original image, they should be close in color for the most part, unless they were part of a sharp edge, which shouldn't happen too often.

foreach (var a in Enumerable.Range(0, imagePieces.Count)) {
    foreach (var b in Enumerable.Range(0, imagePieces.Count)) {
        var aImg = imagePieces[a];
        var bImg = imagePieces[b];
        leftRightScores[new Tuple(a, b)] =
            Enumerable.Range(0, pieceHeight)
                .Average(i => ColorDiff(aImg.GetPixel(pieceWidth - 1, i),
                bImg.GetPixel(0, i)));
    }
}

After this was done, the goal is now to place the 16 tiles in the correct arrangement, which would be the one without any errors caused by having a tile next to one that wasn't its correct neighbor. I thought about using brute force to figure the best score from all of the possibilities, but realized that there were 16! ~= 20,900,000,000,000 combinations, which is more than is feasible to do. This means I need to turn to a heuristic. My first try was to be greedy and place a random tile initially in the middle of a larger grid (9x9), and then pick the tile that matches best with that one next to it, and then repeat until all the tiles have been placed. Unfortunately, that didn't work very well. My next idea was to use a recursive search algorithm that was able to prune large subsets of the search tree out, like the common solution to the Eight Queen's Problem we had to do in one of the computer science classes I took at college. Basically, you recursively try to place each piece in the next available slot, and stop trying if there are too many errors. Some simplified code for this is below (note that this is missing some variables that are passed in to make it simpler in this post)

private double RecursiveSolver(IList remaining, int[,] fullMap) {
    var score = Evaluate(fullMap);
    if (score < THRESHOLD !remaining.Any()) return score;
    double alpha = double.MaxValue;
    int[,] best = null;
    foreach (var next in remaining) {
        var mapCopy = (int[,])fullMap.Clone();
        mapCopy[x, y] = next;
        double s = RecursiveSolver(remaining.Where(i => i != next)
            .ToList(), mapCopy);

        if (s < alpha) {
            alpha = s;
            best = mapCopy;
        }
    }
    
    for (int xx=0; xx<4; xx++)
        for (int yy=0; yy<4; yy++)
            fullMap[xx, yy] = best[xx, yy];
    
    return alpha;
}

The threshold variable above is the only thing that needed tweaking. I tried doing it manually by hand, but if it was too large, then it would never finish since there were so many paths to take. If the number was too low, then it would finish really fast but with no solutions. To fix this, I dynamically determine what the threshold should be by starting with a small threshold, and gradually increase it until the algorithm finishes. I got the idea for this from iterative deepening, but it's not quite the same process.

double beta = 0.000001;
do {
    fullMap = new int[4, 4] { ... }; // initialized to -1

    RecursiveSolver(Enumerable.Range(0, 16).ToList(),
        fullMap,
        beta);
    
    beta *= 1.05;
} while (!FullyDone(fullMap));

And finally, with all of this, I have some reconstructed images.

scrambled

As you can see, some of them didn't work right, since the images had a solid colored border, which the algorithm thought were perfect matches. If I were going farther, I would try to fix this, but as you can see, it does a pretty good job of reconstructing the images. Note that I've blurred out their faces to protect their identities. I'm not entirely sure what I think about having done this, but it does make me personally feel less likely to use the service since I know that I can see their pictures, and that kind of defeats the purpose of it being a blind date if I can use their picture to select which ones I am interested in. The source code is available on github for those that are interested.