Sunday, October 4, 2009

Scrubyt in the house

Okay, I figured out the problem with Scrubyt.  And guess what, it was a bad gem version.  You know, I'm starting to hate all this stuff already.  I don't want to mess with gem package versions, and it looks like I'm running into this too often nowadays.

Whatever, let's go through the problem.

Here is my code:

  1 require 'rubygems'
  2 require 'scrubyt'
  3 
  4 google_data = Scrubyt::Extractor.define do
  5   fetch          'http://www.google.com/ncr'
  6   fill_textfield 'q', 'ruby'
  7   submit
  8 
  9   result 'Ruby Programming Language'
 10 end
 11 
 12 google_data.to_xml.write($stdout, 1)
 13 Scrubyt::ResultDumper.print_statistics(google_data)

I was running mechanize-0.9.3 (I think it was the most recent one).  The problem was solved by uninstalling 0.9.3 and installing 0.8.5:

white@iwhite:~> sudo gem install mechanize -v=0.8.5 
Successfully installed mechanize-0.8.5
1 gem installed
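
An alternative to uninstalling the newer gem, untested on this setup, is to pin the version from inside the script: RubyGems lets you call gem 'mechanize', '= 0.8.5' before the require. The sketch below only shows how that requirement string is evaluated against the two versions mentioned above; version_ok? and PINNED are made-up names for illustration, not part of Scrubyt or Mechanize.

```ruby
require 'rubygems'

# Requirement string of the kind you would pass to Kernel#gem.
PINNED = '= 0.8.5'

# Hypothetical helper: true when `version` satisfies `requirement`.
def version_ok?(version, requirement)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

puts version_ok?('0.8.5', PINNED)  # => true  (the version that worked)
puts version_ok?('0.9.3', PINNED)  # => false (the version that broke)

# In the scraper itself the pin would go before the require
# (assumption: both mechanize versions are installed side by side):
#   gem 'mechanize', '= 0.8.5'
#   require 'scrubyt'
```

With the pin in place, a newer mechanize installed later should not get picked up by accident.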

And now let's try the script:

white@iwhite:~> ruby s.rb
  <root>
    <result>Ruby Programming Language</result>
    <result>Download Ruby</result>
    <result>Ruby - The Inspirational Weight Loss Journey on the Style Network ...</result>
    <result>Ruby (programming language) - Wikipedia, the free encyclopedia</result>
    <result>Ruby - Wikipedia, the free encyclopedia</result>
    <result>Ruby on Rails</result>
    <result>Ruby&amp;#39;s Diner - rubys.com</result>
    <result>Ruby Central</result>
    <result>Ruby Annotation</result>
    <result>[Ruby-Doc.org: Documenting the Ruby  Language]</result>
    <result>Blog posts about ruby</result>
/Library/Ruby/Gems/1.8/gems/activesupport-2.3.4/lib/active_support/dependencies.rb:440:in `load_missing_constant': uninitialized constant Scrubyt::ResultDumper (NameError)
        from /Library/Ruby/Gems/1.8/gems/activesupport-2.3.4/lib/active_support/dependencies.rb:80:in `const_missing'
        from s.rb:13
  </root>

Wow, it works.

Well, we do have an error with ResultDumper (line 13), but that one is easy and can be ignored.  ResultDumper was indeed dropped from Scrubyt; it should be rewritten and come back in the future.  Not a big deal, though.

I also know that some people were able to tweak line 4 of the code:
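
If you want to keep the statistics call around for when ResultDumper comes back, one way is to guard it with defined?. This is just a sketch: print_statistics_if_available is a name I made up, and I'm assuming that defined? on a missing constant returns nil rather than raising, which is standard Ruby behavior.

```ruby
# Hypothetical helper: call Scrubyt::ResultDumper only when the installed
# Scrubyt release actually defines it; otherwise warn and carry on.
def print_statistics_if_available(data)
  if defined?(Scrubyt::ResultDumper)
    Scrubyt::ResultDumper.print_statistics(data)
    true
  else
    warn 'Scrubyt::ResultDumper is not available, skipping statistics'
    false
  end
end

# In s.rb, line 13 would then become:
#   print_statistics_if_available(google_data)
```

The return value just says whether the statistics were printed, so the script can finish cleanly either way.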

  4 google_data = Scrubyt::Extractor.define :agent => :firefox do

and the requests would be made through Firefox, but it didn't work for me:

white@iwhite:~> ruby s.rb 
/Library/Ruby/Gems/1.8/gems/firewatir-1.6.2/lib/firewatir/firefox.rb:271:in `set_defaults': Unable to connect to machine : 127.0.0.1 on port 9997. Make sure that JSSh is properly installed and Firefox is running with '-jssh' option (Watir::Exception::UnableToStartJSShException)
        from /Library/Ruby/Gems/1.8/gems/firewatir-1.6.2/lib/firewatir/firefox.rb:161:in `initialize'
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/navigation/agents/firewatir.rb:17:in `new'
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/navigation/agents/firewatir.rb:17:in `included'
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/navigation/agents/firewatir.rb:16:in `module_eval'
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/navigation/agents/firewatir.rb:16:in `included'
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:20:in `include'
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:20:in `define'
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:19:in `class_eval'
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:19:in `define'
        from s.rb:4

But this is a different problem, and I never actually tried to hunt it down.  I think it's an easy fix anyway. :)
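
The error message itself gives the first thing to check: whether anything is listening on 127.0.0.1 port 9997, which is where FireWatir expects JSSh. Here is a small sketch using only the standard library; port_open? is a made-up helper name, and it only tells you that the port accepts connections, not that JSSh is set up correctly.

```ruby
require 'socket'
require 'timeout'

# Hypothetical helper: true if a TCP connection to host:port succeeds
# within `seconds`, false if it is refused, unreachable, or times out.
def port_open?(host, port, seconds = 2)
  Timeout.timeout(seconds) do
    TCPSocket.new(host, port).close
    true
  end
rescue Errno::ECONNREFUSED, Errno::EHOSTUNREACH, SocketError, Timeout::Error
  false
end

if port_open?('127.0.0.1', 9997)
  puts 'Something is listening on port 9997'
else
  puts 'Nothing on port 9997; is Firefox running with -jssh?'
end
```

If the port is closed, the fix is most likely on the Firefox side (the JSSh extension and the -jssh option), as the exception says.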

Have fun!

# Posted via email from opportunity__cost