PlanningAlerts.org.au

New Scraper for Lane Cove

Details

  • Type: New Feature
  • Status: Open
  • Priority: Major
  • Resolution: Unresolved
  • Affects Version/s: None
  • Fix Version/s: None
  • Component/s: Scraper
  • Labels: None

Description

Another new scraper, this one for Lane Cove, based on a copy/paste of the Woollahra scraper. Both councils use a similar backend (from Civica), but the Lane Cove scraper uses a date-based search instead, so there isn't much in the way of common scraper code.

Activity

Matthew Landauer added a comment -

Hi Roger,

When I test the scraper I get this:

$ ./scraper_output.rb lane_cove
WARNING: Nokogiri was built against LibXML version 2.7.6, but has dynamically loaded 2.7.7
./scrapers/lane_cove_scraper.rb:16:in `applications': undefined method `value=' for nil:NilClass (NoMethodError)
from (eval):12:in `form_with'
from ./scrapers/lane_cove_scraper.rb:15:in `applications'
from ./lib/scraper.rb:57:in `results'
from ./lib/scraper.rb:52:in `results_as_xml'
from ./scraper_output.rb:38

The scraper_output.rb is a very useful little program for testing the result of the scraper.
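
The trace above points at a call to `value=` on nil inside a `form_with` block, which usually means the form field being looked up was not found on the page. The scraper source is not shown in this ticket, so the following is only a minimal sketch of one way to make that failure easier to diagnose. It assumes the scraper looks fields up with Mechanize's `field_with`; the helper `set_field!`, the stub classes, and the field names `dateFrom` and `dateTo` are all hypothetical.

```ruby
# Hypothetical helper, not part of planningalerts-parsers.
# Works with any form object that responds to field_with(:name => ...)
# and fields, as Mechanize::Form does.
def set_field!(form, name, value)
  field = form.field_with(:name => name)
  if field.nil?
    # Report which fields do exist instead of failing on nil.value=
    available = form.fields.map { |f| f.name }.join(", ")
    raise ArgumentError,
          "No form field named #{name.inspect}; available fields: #{available}"
  end
  field.value = value
end

# Minimal stand-ins so the sketch runs without Mechanize or network access.
StubField = Struct.new(:name, :value)

class StubForm
  attr_reader :fields

  def initialize(fields)
    @fields = fields
  end

  # Mimics the lookup-by-name behaviour assumed above: nil when nothing matches.
  def field_with(criteria)
    @fields.find { |f| f.name == criteria[:name] }
  end
end

form = StubForm.new([StubField.new("dateFrom", nil)])
set_field!(form, "dateFrom", "21/12/2010") # field exists, value is set

begin
  set_field!(form, "dateTo", "21/12/2010") # field is missing
rescue ArgumentError => e
  puts e.message
end
```

With a guard like this, a run against a page whose form differs from the one seen on the chosen test date would name the missing field and list the ones that are present, rather than stopping with a bare NoMethodError.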

Roger Barnes added a comment -

I've been using scraper_output.rb for testing. It looks like I need to be more thorough and make the scraper resilient to situations that don't occur on my chosen test date. Sorry about the premature commit; I shall rectify this and Willoughby asap.

Matthew Landauer added a comment -

@Roger - Absolutely no worries at all. Yeah, writing scrapers is a bit of a trial-and-error process. When you're ready, just upload an updated patch and set the ticket to "Merge Required" again. Thanks!

Roger Barnes added a comment -

@Matthew

I found and fixed an unrelated problem where no results are returned for a given date for Lane Cove. Having done that, I couldn't reproduce the errors you noted in the tickets for both Wollongong and Lane Cove for a range of dates. Perhaps we have different library versions?

roger@tomato:/usr/local/src/planningalerts-parsers$ ruby --version
ruby 1.8.7 (2010-01-10 patchlevel 249) [i486-linux]

roger@tomato:/usr/local/src/planningalerts-parsers$ gem list --local

*** LOCAL GEMS ***

builder (2.1.2)
htmlentities (4.2.1)
mechanize (1.0.0)
nokogiri (1.4.3.1)

These all worked...
roger@tomato:/usr/local/src/planningalerts-parsers$ for i in 2010-08-01 "" 2010-08-20 2010-08-1{6,7,8,9} 2010-01-01 2010-07-20; do echo "Testing $i..."; ./scraper_output.rb wollongong $i; ./scraper_output.rb willoughby $i; ./scraper_output.rb lane_cove $i; done;
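
To compare environments when one person can reproduce a failure and another cannot, the version details listed above can also be gathered from inside Ruby. This is only a sketch: `environment_report` is a hypothetical helper, and it assumes a RubyGems release that provides `Gem::Specification.find_all_by_name`, which the RubyGems shipped alongside Ruby 1.8.7 may not.

```ruby
require "rubygems"

# Hypothetical helper: collects the interpreter version and the installed
# versions of the named gems so two environments can be compared.
def environment_report(gem_names)
  report = { "ruby" => "#{RUBY_VERSION} (#{RUBY_PLATFORM})" }
  gem_names.each do |name|
    # find_all_by_name returns an empty array when the gem is not installed
    versions = Gem::Specification.find_all_by_name(name).map { |spec| spec.version }
    report[name] = versions.empty? ? "not installed" : versions.sort.map { |v| v.to_s }.join(", ")
  end
  report
end

# The gem names are the ones listed in the comment above.
environment_report(%w[builder htmlentities mechanize nokogiri]).each do |name, version|
  puts "#{name}: #{version}"
end
```

Pasting the output from each machine into the ticket would show at a glance whether the Mechanize or Nokogiri versions differ.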

Peter McC added a comment -

Hi there.
Any news on this being fixed/tested/merged so that it works for Lane Cove?
I'd try and help but I don't know anything about Ruby...
Peter

Roger Barnes added a comment -

The Lane Cove one as submitted still works for me, so I think it's just a matter of working through why it doesn't work for Matthew. He was quite busy last time I checked in, but I'll nudge him again.

Roger Barnes added a comment -

Hi Matthew,

I'm not sure if you're the best one to ask about this, so do point me in another direction if necessary.

I just retested the Lane Cove patch and still can't reproduce the issue you found. Any ideas?

- Roger

Henare Degan added a comment -

Here are my test results:

$ git apply ~/Desktop/0002-Added-Lane-Cover-scraper.patch
/home/henare/Desktop/0002-Added-Lane-Cover-scraper.patch:41: trailing whitespace.

/home/henare/Desktop/0002-Added-Lane-Cover-scraper.patch:29: new blank line at EOF.
+
warning: 1 line applied after fixing whitespace errors.
$ git status

# On branch master
# Changed but not updated:
#   (use "git add <file>..." to update what will be committed)
#   (use "git checkout -- <file>..." to discard changes in working directory)
#
#	modified:   scraper_factory.rb
#
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#
#	scrapers/lane_cove_scraper.rb
no changes added to commit (use "git add" and/or "git commit -a")
$ ./scraper_output.rb lane_cove
./scrapers/lane_cove_scraper.rb:16:in `applications': undefined method `value=' for nil:NilClass (NoMethodError)
from (eval):12:in `form_with'
from ./scrapers/lane_cove_scraper.rb:15:in `applications'
from ./lib/scraper.rb:57:in `results'
from ./lib/scraper.rb:52:in `results_as_xml'
from ./scraper_output.rb:38
$ ./scraper_output.rb lane_cove 2010-12-21
./scrapers/lane_cove_scraper.rb:16:in `applications': undefined method `value=' for nil:NilClass (NoMethodError)
from (eval):12:in `form_with'
from ./scrapers/lane_cove_scraper.rb:15:in `applications'
from ./lib/scraper.rb:57:in `results'
from ./lib/scraper.rb:52:in `results_as_xml'
from ./scraper_output.rb:38
$

I can't imagine why you're getting different results - could you have a different revision history to upstream? i.e. some patches that haven't been integrated are making your code base work.

Roger Barnes added a comment -

I haven't forgotten about this, but haven't spent much time on it. I did get as far as updating from upstream and testing it again. There is a bug in there for some use cases, but not the one you guys are seeing. Will have another crack at it soon.

