Friday, October 22, 2010

Calculating the number of repeating objects in an image

Recently I was asked to help solve the problem of counting the seats in an airline seating layout. An example of such a layout is shown below (click to view the bigger image).



I decided to use the method I knew best: cross-correlation. The idea is to cross-correlate a template of the seat (i.e., an image of a single seat) with the airline layout image at every pixel. The template (whose coordinate origin is its center) is moved to a particular pixel in the image, the normalized cross-correlation coefficient is calculated, and that value is used as the intensity of the corresponding pixel in a new image. This is repeated for every pixel in the airline layout image. Pixels at which the template matches the layout image perfectly will have a correlation coefficient close to 1.
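The sliding-template idea can be sketched in plain Python. This is only an illustration under simplifying assumptions, not the code I actually ran: the function names `ncc` and `correlation_map` are made up for this example, images are small nested lists, and the template is anchored at its top-left corner and only placed where it fits entirely inside the image (so there is no padding at the borders).

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation coefficient of two equal-sized 2D lists."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    p_mean = sum(p) / len(p)
    t_mean = sum(t) / len(t)
    # Covariance-like numerator and the product of the two spreads
    num = sum((a - p_mean) * (b - t_mean) for a, b in zip(p, t))
    den = math.sqrt(sum((a - p_mean) ** 2 for a in p) *
                    sum((b - t_mean) ** 2 for b in t))
    # A flat patch or template has no defined coefficient; report 0 here
    return num / den if den else 0.0

def correlation_map(image, template):
    """Evaluate ncc at every position where the template fits in the image."""
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    out = []
    for r in range(ih - th + 1):
        row = []
        for c in range(iw - tw + 1):
            patch = [image_row[c:c + tw] for image_row in image[r:r + th]]
            row.append(ncc(patch, template))
        out.append(row)
    return out

# Toy data: the 2x2 template appears twice in the 3x4 image
template = [[0, 9], [9, 0]]
image = [[0, 9, 0, 0],
         [9, 0, 0, 9],
         [0, 0, 9, 0]]
cmap = correlation_map(image, template)
```

In this toy example the two positions where the template appears exactly get a coefficient of 1, and the remaining positions score well below that.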


Above: A hard-to-see template of the seat.

The result of the cross-correlation is shown below (click to view the bigger image).

The bright spots in this image are the points with the highest correlation. It then becomes a simple process of segmenting the high-intensity pixels.

To perform these operations, my natural choice was MATLAB and its Image Processing Toolbox.

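% Read the seating layout and convert it to grayscale
im = imread('airline_seating.jpg');
im = rgb2gray(im);

% Read the single-seat template and convert it to grayscale
im_template = imread('template.jpg');
im_template = rgb2gray(im_template);

% Normalized cross-correlation; the template is the first argument
C = normxcorr2(im_template, im);
% Keep only pixels whose correlation coefficient exceeds 0.7
C1 = C > 0.7;

% Each connected region of C1 is counted as one seat
stat = regionprops(C1);
noofseats = numel(stat);
disp(['Number of seats = ', num2str(noofseats)]);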


The first four statements read the airline layout image and the template image and convert both to grayscale. I did not have to write my own correlation function, because MATLAB already provides one: normxcorr2, which takes the template matrix and the airline layout matrix, in that order. Once the correlation image is obtained, we segment it on the logic that any pixel with a value greater than 0.7 corresponds to the center of a seat. Since the center of each seat did not segment as a single pixel, I could not use the number of pixels as the number of seats. Instead, I counted the connected regions using regionprops, which returns them as a structure array. The number of elements in that structure array is the number of seats.
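To see why counting regions rather than pixels matters, here is a small Python sketch of the thresholding and region-counting step. It is not what regionprops does internally; `count_regions` is a hypothetical helper written for this example, the correlation values are invented, and treating diagonal neighbours as connected (8-connectivity) is an assumption of the sketch.

```python
def count_regions(corr, threshold=0.7):
    """Count 8-connected groups of values above the threshold."""
    rows, cols = len(corr), len(corr[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if corr[r][c] <= threshold or seen[r][c]:
                continue
            # Found an unvisited bright pixel: start a new region
            regions += 1
            stack = [(r, c)]
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                # Visit all eight neighbours of the current pixel
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not seen[ny][nx]
                                and corr[ny][nx] > threshold):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return regions

# Invented correlation values: two bright blobs made of five pixels in total
corr = [[0.1, 0.80, 0.9, 0.1, 0.10],
        [0.1, 0.75, 0.2, 0.1, 0.95],
        [0.1, 0.10, 0.1, 0.1, 0.80]]
bright_pixels = sum(v > 0.7 for row in corr for v in row)
seats = count_regions(corr)
```

Here five pixels pass the threshold, but they form only two connected blobs, so counting pixels would overestimate the number of seats.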

Wednesday, October 6, 2010

Spamming of HTML forms - one case

Recently I found that a newspaper had switched the CAPTCHA in its online edition from an image-based system to a mathematical puzzle, in order to prevent computer programs from spamming its comment section. A screen capture of the puzzle can be found below.



The problem with such a system is that the puzzle can easily be solved by a computer, which defeats the purpose of using it to tell humans and computers apart. To test my own skill, I wanted to write a program that could download the page, read it, and solve the puzzle. Using the information obtained, I could then post comments without human intervention.

To accomplish this task, I used the usual suspects: Python and the HTML parser BeautifulSoup. BeautifulSoup reads a string of HTML or XML and converts it to a tree. Using the tree, it is easy to navigate through the tags or search for a particular one based on id or name. It is also powerful enough to differentiate tags based on the CSS class in HTML tags.


1. import urllib
2. from BeautifulSoup import BeautifulSoup
3. import string,re

4. doc = urllib.urlopen('http://www.somesight.com/comment/reply/1854565').read()
5. soup = BeautifulSoup(''.join(doc))

6. a = soup.findAll("span",{"class":"field-prefix"})
7. b = a[0].contents[0].split("=")[0].split("+")
8. c = [int(bs) for bs in b]
9. captcha_response = sum(c)
10. print a,captcha_response

11. token1 = soup.findAll("input",id="edit-captcha-token")
12. token1_val = token1[0]['value']
13. print token1,token1_val


The two important pieces of information that I need are the captcha_response, which is the solution to the mathematical problem, and the captcha_token, a hidden HTML field in the webpage. Line #6 searches for span tags with the class field-prefix. This tag contains the string for the mathematical puzzle that needs to be solved. In Line #7, I take the contents of this tag and split them in order to obtain the individual numbers in a list. Finally, I convert those numbers from strings to integers in Line #8 and sum them in Line #9.
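The splitting and summing can be tried on its own, without downloading anything. The sketch below assumes the puzzle text has the form "a + b =" (which is what the splitting on "=" and "+" implies); the function name `solve_puzzle` and the sample strings are made up for illustration.

```python
def solve_puzzle(text):
    """Return the sum for a puzzle string such as '3 + 4 = '."""
    # Keep only what is left of the equals sign, e.g. '3 + 4 '
    left_side = text.split("=")[0]
    # int() ignores the surrounding spaces of each term
    return sum(int(term) for term in left_side.split("+"))

answer = solve_puzzle("3 + 4 = ")
```

A puzzle with more than two terms, or with no spaces at all, is handled by the same function.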

Line #11 searches for the hidden captcha token, stored in the input tag with id="edit-captcha-token". Line #12 reads the token from that tag's value attribute.
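For readers without BeautifulSoup, the same lookup can be sketched with the HTML parser from the Python 3 standard library. This is an alternative illustration, not the program I ran: the class name `TokenFinder` is hypothetical, and the sample markup and token value are invented; only the id edit-captcha-token comes from the page.

```python
from html.parser import HTMLParser

class TokenFinder(HTMLParser):
    """Remember the value of the input tag with id 'edit-captcha-token'."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        attributes = dict(attrs)
        if tag == "input" and attributes.get("id") == "edit-captcha-token":
            self.token = attributes.get("value")

# Invented sample markup standing in for the downloaded page
sample = ('<form><input type="hidden" id="edit-captcha-token" '
          'value="abc123" /></form>')
finder = TokenFinder()
finder.feed(sample)
token_value = finder.token
```

If the page has no such input tag, `token` simply stays `None`, which is easy to check before going any further.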

Armed with these two pieces of information, we can post any name and comment to the form. The comments were moderated, but it would still require a lot of human intervention to clear the spam.

I informed the webmaster of this issue. They have since moved to an image-based system. I have removed all references to the site from this blog post and the program in order to preserve their anonymity.