Many sites use a CAPTCHA to prevent bots or automation from using functions like email for… nefarious purposes.
However, there’s no single standard. Nothing stops people from using home-brewed solutions, or ones that have vulnerabilities or can be bypassed.
Image recognition and OCR are not the only ways to get past a CAPTCHA. If the software you test uses one of these weaker solutions, this is a fun way to find a bug that really oughta be fixed.
Today I’ll demonstrate a method using DOM manipulation.
Here We Go!
Step 1: Find a wild CAPTCHA:
This code is mAYr3QK. We’ll use it later on.
Step 2: Right-click it and select “Inspect Element” in the menu (Chrome and Firefox):
Then take a look at the code driving the CAPTCHA:
Try this a few times and notice what changes and what doesn’t. In this case, there are four pieces of info that look like they’re being dynamically generated:
- the values of the two hidden inputs, captcha_sid and captcha_token
- the sid and ts parameters in the image’s src URL
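If eyeballing the inspector gets tedious, one way to make the comparison less error-prone is to diff two saved copies of the form markup. This is only a sketch: the helper names captcha_fields and dynamic_fields are hypothetical, every value in the snapshots is invented except the sid and ts in the first image URL, and the regexes assume attributes appear in the order shown here.

```ruby
require "uri"

# Hypothetical helper: collect hidden input values and the image URL's
# query parameters from a chunk of form HTML. Regex-based, so it is only
# suitable for quick inspection, not for parsing arbitrary markup.
def captcha_fields(html)
  fields = {}
  html.scan(/<input[^>]*type="hidden"[^>]*name="([^"]+)"[^>]*value="([^"]*)"/) do |name, value|
    fields[name] = value
  end
  if (src = html[/<img[^>]*src="([^"]+)"/, 1])
    query = URI.parse(src).query.to_s
    URI.decode_www_form(query).each { |key, value| fields["src:#{key}"] = value }
  end
  fields
end

# Return only the field names whose values differ between two snapshots.
def dynamic_fields(first_html, second_html)
  first = captcha_fields(first_html)
  second = captcha_fields(second_html)
  (first.keys | second.keys).select { |key| first[key] != second[key] }
end

# Made-up snapshots of the same form, captured on two page loads.
first_snapshot = <<~HTML
  <input type="hidden" name="captcha_sid" value="11345264">
  <input type="hidden" name="captcha_token" value="aaa111">
  <input type="hidden" name="form_id" value="contact_form">
  <img src="/en/image_captcha?sid=11345264&ts=1433615246" alt="Image CAPTCHA">
HTML
second_snapshot = <<~HTML
  <input type="hidden" name="captcha_sid" value="11345270">
  <input type="hidden" name="captcha_token" value="bbb222">
  <input type="hidden" name="form_id" value="contact_form">
  <img src="/en/image_captcha?sid=11345270&ts=1433615301" alt="Image CAPTCHA">
HTML

puts dynamic_fields(first_snapshot, second_snapshot)
```

Anything that shows up in the output is a candidate for the swap described later; fields like form_id that stay the same can be ignored.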
Pieces of data like this are generated as a result of making the CAPTCHA. There’s an algorithm that reads them back in and rebuilds the CAPTCHA string, then compares it to what you entered. This data is sent out in the HTML code, so that it can be picked back up. It’s kind of like a secret decoder ring. It also helps the server to not have to remember what it sent to what user.
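To make the decoder-ring idea concrete, here is a toy sketch of a stateless CAPTCHA check. It is not the code behind the site above: the secret, the derive_code and valid_answer? helpers, the token strings, and the HMAC-based derivation are all assumptions chosen for illustration. It shows why an old sid/token pair, replayed together with its known answer, can still pass validation.

```ruby
require "openssl"

SERVER_SECRET = "example-secret" # hypothetical; known only to the server

# Toy derivation: the expected answer is recomputed from the values that
# travel through the page, so the server stores nothing per user.
def derive_code(sid, token)
  digest = OpenSSL::HMAC.hexdigest(OpenSSL::Digest.new("SHA256"), SERVER_SECRET, "#{sid}:#{token}")
  digest[0, 7]
end

# The server reads the hidden fields back in and compares the answers.
def valid_answer?(sid, token, answer)
  derive_code(sid, token) == answer
end

# First visit: the tester reads the image and learns the answer.
old_sid, old_token = "11345264", "token-from-first-page"
known_answer = derive_code(old_sid, old_token)

# Later visit: a fresh challenge is issued, but the tester swaps the old
# values back into the form and submits the old answer.
new_sid, new_token = "11345270", "token-from-second-page"
puts valid_answer?(old_sid, old_token, known_answer) # replayed pair is accepted
puts valid_answer?(new_sid, new_token, known_answer) # fresh pair is expected to be rejected
```

Because nothing ties a sid/token pair to a single use, the server in this sketch cannot tell a replay from a first attempt.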
But, since we have that data in our hands, we can change it in the browser. So let’s try it.
Get another CAPTCHA generated (either by refreshing the page, or clicking to get a new one, if available), and repeat the steps above to view its markup:
At this point, we’ll change the captcha_sid value, the captcha_token value, and the src attribute to match the ones that we got the first time.
To change an attribute, double-click it in your browser’s inspector and edit it however you want. If you changed the values to match those of the first CAPTCHA from Step 1, then your markup should look like this:
/en/image_captcha?sid=11345264&ts=1433615246" width="252" height="60" alt="Image CAPTCHA" title="Image CAPTCHA">
Since this is the exact same setup you had when you got the first CAPTCHA, the image on the screen should change to the code that you already know about.
From there, try entering the known code and see if you can fool the site into doing whatever the CAPTCHA was meant to protect.
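You can also script the swap instead of doing it by hand. Here is a sketch using a browser driver held in $b (Watir’s execute_script, for example) that overwrites the hidden captcha_sid field:
# known_sid: captcha_sid value recorded from the first CAPTCHA (assumed variable)
$b.execute_script("var element = document.evaluate('//input[@name=\"captcha_sid\"]', document, null, XPathResult.ANY_UNORDERED_NODE_TYPE, null).singleNodeValue;
if (element != null) { element.value = arguments[0]; }", known_sid)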
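The snippet above only touches one field. A fuller sketch might build the JavaScript for all three swaps in Ruby and hand it to the same execute_script call. The helper name build_swap_script is hypothetical, the selectors are guesses based on the markup shown earlier (hidden inputs named captcha_sid and captcha_token, and an image with alt="Image CAPTCHA"), the token is a placeholder, and the captcha_sid value is assumed to match the sid in the image URL.

```ruby
require "json"

# Hypothetical helper: returns JavaScript that overwrites the two hidden
# inputs and the image src with values saved from an earlier CAPTCHA.
# JSON encoding keeps the saved values safely quoted inside the script.
def build_swap_script(sid:, token:, src:)
  <<~JS
    var sid = document.querySelector('input[name="captcha_sid"]');
    var token = document.querySelector('input[name="captcha_token"]');
    var image = document.querySelector('img[alt="Image CAPTCHA"]');
    if (sid !== null) { sid.value = #{sid.to_json}; }
    if (token !== null) { token.value = #{token.to_json}; }
    if (image !== null) { image.src = #{src.to_json}; }
  JS
end

script = build_swap_script(
  sid: "11345264",            # assumed to equal the sid in the image URL
  token: "PLACEHOLDER_TOKEN", # the captcha_token saved from the first page
  src: "/en/image_captcha?sid=11345264&ts=1433615246"
)
puts script
# With a live browser session in $b: $b.execute_script(script)
```

Building the script as a string keeps the example runnable without a browser; in a real test you would pass it to the driver and then submit the form with the known code.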
If this works against your software, the CAPTCHA can be bypassed automatically. Probably not good. Here are some possible ways to keep this from happening to you:
- If you’re using a custom CAPTCHA that you built, consider having something in the hidden values that contains what time the CAPTCHA was generated. If you get back values that indicate a creation longer ago than seems reasonable for a person sitting there filling it in, reject the answer.
- If you have the horsepower, you could feasibly remember all the CAPTCHA challenge strings sent to an IP address. If a new one is requested, drop the previous one and store the new one. That approach would prevent any data getting sent out that someone could use to bypass your challenge.
- Try a stronger type. reCAPTCHA is pretty good (also helps OCR software for eBooks know what certain words are that it has trouble with!).
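The first two ideas above can be combined in a small server-side store. The following is a rough sketch under stated assumptions, not a production design: the class name ChallengeStore, the 120-second limit, the in-memory hash, the example IP address, and the answer strings are all made up for illustration. The answer never leaves the server, each client keeps only its latest challenge, and every challenge expires and can be checked only once.

```ruby
require "securerandom"

# Sketch of a server-side challenge store. The page carries only an opaque
# id; nothing that could rebuild the answer is sent to the browser.
class ChallengeStore
  MAX_AGE_SECONDS = 120 # assumed limit for a human to solve the CAPTCHA

  def initialize
    @challenges = {}
  end

  # Issue a new challenge for a client key (e.g. an IP address),
  # dropping any earlier one for that client.
  def issue(client_key, answer, now: Time.now)
    @challenges.delete_if { |_id, c| c[:client] == client_key }
    id = SecureRandom.hex(16)
    @challenges[id] = { client: client_key, answer: answer, issued_at: now }
    id
  end

  # A challenge can be checked once; stale or unknown ids are rejected.
  def verify(id, answer, now: Time.now)
    challenge = @challenges.delete(id)
    return false if challenge.nil?
    return false if now - challenge[:issued_at] > MAX_AGE_SECONDS
    challenge[:answer] == answer
  end
end

store = ChallengeStore.new
start = Time.now
first_id = store.issue("203.0.113.5", "abc1234", now: start)
second_id = store.issue("203.0.113.5", "xyz9876", now: start + 10)

puts store.verify(first_id, "abc1234", now: start + 20)  # false: replaced by a newer challenge
puts store.verify(second_id, "xyz9876", now: start + 20) # true: current challenge, right answer
puts store.verify(second_id, "xyz9876", now: start + 21) # false: already used
```

With this shape, swapping old values back into the page gains nothing, because the old id no longer maps to anything on the server.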
And, if you have a nice streak, and find this tactic working for a site you like, consider taking the time to explain to its owners how you got past it. You can even link to this post.
Have fun, and happy testing!