Rucaptcha.com-client

CAPTCHAFORUM

Administrator

The ruCaptcha.com service solves, as already mentioned, a wide variety of captchas, from reCaptcha of all types and versions to keyCaptcha, hCaptcha and FunCaptcha. As the basis for the experiment, we will take what is probably the most popular solution on the web right now, reCaptcha v2, using the official demo from Google:

URL_RECAPTCHA = 'https://www.google.com/recaptcha/api2/demo'

By the way, instead of google.com we could just as easily substitute the address of another page, for example my blog, built on the Ruby on Rails framework, where reCaptcha works through the ambethia/recaptcha gem; everything would turn out the same. I will not claim it is always like this: for example, reCaptcha on the pages of the blog you are currently reading is invoked by the scripts of K2, a Joomla component, where things are probably less straightforward. But more often than not the described scenario will work, which we will now check.

We read the ruCaptcha.com API documentation: first of all, we need the data-sitekey value from the captcha page. OK, let's parse the HTML and quickly find what we need:

Code:
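  def data_sitekey
    # Fetch the page and extract the data-sitekey of the reCaptcha widget.
    # Needs: require 'open-uri' and require 'nokogiri'.
    url = URL_RECAPTCHA
    html = URI.open(url) # Kernel#open no longer opens URLs in Ruby 3+
    doc = Nokogiri::HTML(html)
    # Return the attribute value as a plain String (nil if the page has none)
    doc.at_xpath('//@data-sitekey')&.value
  end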

Let me explain for those taking their first steps in OOP (object-oriented programming): a call to the data_sitekey method now returns the desired data-sitekey value, which you could also find by hand in the source of the page containing reCaptcha. Not difficult at all, is it? That is the magic of Nokogiri.
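For anyone who wants to see what the method is looking for without touching the network, here is a minimal sketch on a made-up HTML fragment. Everything in it is an assumption for illustration: SAMPLE_HTML, the placeholder site key and the extract_sitekey helper are hypothetical, and a regular expression stands in for Nokogiri only to keep the example free of dependencies. It is not a robust way to parse HTML.

```ruby
# Hypothetical fragment of a page with a reCaptcha v2 widget;
# the site key below is a made-up placeholder, not a real one.
SAMPLE_HTML = <<~HTML
  <form action="/submit" method="post">
    <div class="g-recaptcha" data-sitekey="SITEKEY_PLACEHOLDER_123"></div>
    <input type="submit" value="Send">
  </form>
HTML

# Returns the first data-sitekey value as a String, or nil if none is found.
# A regular expression is used only to keep the sketch dependency-free;
# the method in the post does the same job with Nokogiri and XPath.
def extract_sitekey(html)
  match = html.match(/data-sitekey=["']([^"']+)["']/)
  match && match[1]
end

puts extract_sitekey(SAMPLE_HTML)
```

On a real page, the Nokogiri version is the safer choice, since it copes with attribute order, whitespace and malformed markup.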

Now, having obtained the reCaptcha data-sitekey, we can form the first request to the ruCaptcha.com API; in response we will receive the identifier of the task assigned to the RuCaptcha.com service ...

Code:
  def first_request
    target = 'https://rucaptcha.com/in.php'
    params = {
      key: APIKEY,
      method: 'userrecaptcha',
      googlekey: data_sitekey,
      pageurl: URL_RECAPTCHA
    }
    request(target, params)
  end

... which we implement with another method, which I have called request, and to which we pass target and params. The composition of params should now be fairly transparent: the access key issued by the service at registration, the method we are calling (described in the documentation), the data-sitekey value just obtained, and the address of the captcha page.

Something like this:
Code:
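  def request(target, params)
    # Needs: require 'open-uri' (it adds the open method to URI objects)
    uri = URI.parse(target)
    # Append params as a URL-encoded query string
    uri.query = URI.encode_www_form(params)
    # Perform the GET request and return the response body as a String
    uri.open.read
  end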
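To make it clearer what actually goes over the wire, the sketch below builds the same URI without sending it and splits a reply of the form OK|id into its parts. The helper names build_uri and parse_response, the API key, the site key and the numeric id are all placeholders invented for illustration; only the OK| prefix and the parameter names come from the code above.

```ruby
require 'uri'

# Builds the full request URI the same way the request method does,
# but stops before any network call is made.
def build_uri(target, params)
  uri = URI.parse(target)
  uri.query = URI.encode_www_form(params)
  uri
end

# Splits a "STATUS|payload" reply into its two parts.
# A reply without "|" (for example CAPCHA_NOT_READY) has a nil payload.
def parse_response(body)
  status, payload = body.strip.split('|', 2)
  [status, payload]
end

params = {
  key: 'YOUR_API_KEY',                  # placeholder, not a real key
  method: 'userrecaptcha',
  googlekey: 'SITEKEY_PLACEHOLDER_123', # placeholder site key
  pageurl: 'https://www.google.com/recaptcha/api2/demo'
}

puts build_uri('https://rucaptcha.com/in.php', params)
p parse_response('OK|2122988149')       # the id here is invented
```

Printing the URI before sending it is a cheap way to check that pageurl and the other values were escaped as expected.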

Having received the ruCaptcha.com API response, which contains the ID and thereby confirms that the task has been received and accepted for work, we wait 10 seconds and send the second request. If the answer to it does not include OK (most often CAPCHA_NOT_READY is returned, meaning you need to wait a little longer), we repeat the request at the same interval, again and again, until the token we are looking for is finally returned:

Code:
    target = 'https://rucaptcha.com/res.php'
    params = {
      key: APIKEY,
      action: 'get',
      # answer is the reply to the first request, e.g. "OK|<id>"
      id: answer.sub('OK|', '')
    }

    begin
      sleep 10
      response = request(target, params)
      # Anything without OK (usually CAPCHA_NOT_READY) means: ask again
      raise unless response.include? 'OK'
    rescue StandardError
      retry
    end
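The loop above retries forever, including when the service answers with something other than "not ready yet". As a hedged alternative, here is a sketch of a bounded version that gives up after a fixed number of attempts and stops on any unexpected reply. The name poll_for_token, its keyword arguments and the stubbed replies are assumptions for illustration and are not part of the original client; fetch is any callable that returns the raw response body, and in the client it would wrap request(target, params).

```ruby
# Polls until the service answers "OK|<token>" and returns the token.
# fetch is any callable returning the raw response body.
def poll_for_token(fetch, interval: 10, max_attempts: 30)
  max_attempts.times do
    sleep interval
    body = fetch.call.to_s.strip
    return body.sub('OK|', '') if body.start_with?('OK|')
    # Keep waiting only while the task is still in progress;
    # any other reply is treated as a failure instead of retrying forever.
    raise "captcha service error: #{body}" unless body == 'CAPCHA_NOT_READY'
  end
  raise "no token after #{max_attempts} attempts"
end

# Usage with a stub in place of the real HTTP call:
replies = ['CAPCHA_NOT_READY', 'CAPCHA_NOT_READY', 'OK|TOKEN_PLACEHOLDER']
fake_fetch = -> { replies.shift }
puts poll_for_token(fake_fetch, interval: 0)
```

Passing the fetcher in as a lambda keeps the waiting logic testable without a real API key or network access.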

What to do with the received answer? Well, that is a matter of taste. One option is to put it into the hidden field on the page with the id g-recaptcha-response, which is what solving reCaptcha implies. But that is an entirely different story ... drop by, and we will continue.
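As a rough illustration of that option, the sketch below prepares (but does not send) a form POST that carries the token in a g-recaptcha-response field. This is only an assumption about how a target form might be submitted: build_form_post, the example.com URL, the extra name field and the token value are all hypothetical, and the real action URL and field set depend entirely on the page in question.

```ruby
require 'net/http'
require 'uri'

# Builds (but does not send) a form POST carrying the solved token.
# form_url and extra_fields are hypothetical: the real action URL and
# the other form fields depend entirely on the target page.
def build_form_post(form_url, token, extra_fields = {})
  post = Net::HTTP::Post.new(URI.parse(form_url))
  post.set_form_data(extra_fields.merge('g-recaptcha-response' => token))
  post
end

post = build_form_post('https://example.com/submit', 'TOKEN_PLACEHOLDER',
                       'name' => 'test')
puts post.body
```

Sending it would then be a matter of passing the prepared object to Net::HTTP, for example inside a Net::HTTP.start block via http.request(post).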

Documentation: https://github.com/cmirnow/rucaptcha-client