Thursday, October 1, 2009

Problem with Scrubyt

While working on one of my projects, I came across Scrubyt.  The idea wasn't very new to me, as we played with mashups for a while at Atomkeep and Everytalks.  But I really loved the execution.  It's simple.  It's Ruby.  It's a bundle of web scraping tools.


Well, in theory at least.  In practice, it never worked for me.

I tried to run the sample code:

white@iwhite:~> cat scr.rb 
require 'rubygems'
require 'scrubyt'

google_data = Scrubyt::Extractor.define do

   #Perform the action(s)
   fetch 'http://www.google.com/ncr'
   fill_textfield 'q', 'ruby'
   submit

   #Construct the wrapper
   link "Ruby Programming Language" do
      url "href", :type => :attribute
   end

   next_page "Next", :limit => 2
end

puts google_data.to_xml

And, guess what, it doesn't run.

white@iwhite:~> ruby scr.rb 
/Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/navigation/agents/mechanize.rb:178:in `fill_textfield': undefined method `[]=' for nil:NilClass (NoMethodError)
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/navigation/navigation_actions.rb:27:in `eval'
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/navigation/agents/mechanize.rb:178:in `fill_textfield'
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/navigation/navigation_actions.rb:27:in `fill_textfield'
        from scr.rb:8
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:75:in `instance_eval'
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:75:in `initialize'
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:32:in `new'
        from /Library/Ruby/Gems/1.8/gems/scrubyt-0.4.06/lib/scrubyt/core/shared/extractor.rb:32:in `define'
        from scr.rb:4
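
Reading the trace, the `[]=' call on nil suggests that whatever fill_textfield looked up on the page (presumably the form holding the 'q' field) came back as nil, and the value was then assigned into it anyway.  Here is a minimal plain-Ruby sketch of that failure mode.  It is not Scrubyt's code: the forms are plain hashes, and the helper names (find_form_with_field, fill_unguarded, fill_guarded) are made up for illustration.

```ruby
# Hypothetical stand-in for a page's forms: each form is a Hash of
# field name => value. This is NOT how Scrubyt or Mechanize model forms.

# Returns the first form that has the given field, or nil if none does.
def find_form_with_field(forms, field_name)
  forms.find { |form| form.key?(field_name) }
end

# Unguarded version: if the lookup returns nil, the assignment raises
# NoMethodError for `[]=' on nil, the same kind of error as in the trace.
def fill_unguarded(forms, field_name, value)
  form = find_form_with_field(forms, field_name)
  form[field_name] = value
  form
end

# Guarded version: fails early with a message that names the missing field.
def fill_guarded(forms, field_name, value)
  form = find_form_with_field(forms, field_name)
  raise ArgumentError, "no form with a field named '#{field_name}'" if form.nil?
  form[field_name] = value
  form
end

# A page whose only form has no 'q' field.
forms = [{ 'search' => '' }]

begin
  fill_unguarded(forms, 'q', 'ruby')
rescue NoMethodError => e
  puts "unguarded: #{e.class}"
end

begin
  fill_guarded(forms, 'q', 'ruby')
rescue ArgumentError => e
  puts "guarded: #{e.message}"
end
```

If that reading is right, the real question is why the lookup finds nothing on the fetched page, for example because the form or field is named differently from what the sample expects.  That part is a guess on my side.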

I've found several error reports similar to this one, but none of them was answered or even discussed.  Any ideas?

P.S. And yes, I tried other code samples too.  The fetch step goes through fine, but it never gets as far as a working submit.

# Posted via email from opportunity__cost