Web::Scraper with XPath and the dangers of Chrome

This was my first time submitting a question to Stack Overflow. What a great site!

I’d already written the SNMP discovery tool, which I was pretty proud of. But some information was still only available through the device’s web interface: log in with basic auth, navigate to a page, grab two strings, then visit five other pages and grab the same two strings from each.

Sounds like a job for a script!

Everything was going well: I decided to use Web::Scraper, wrote my program, and then … nothing. It wouldn’t work, and I didn’t know enough about XPath to even know where to start. Plus, the Web::Scraper module was a bit scary to read.

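Long story short, the XPath that Chrome gave me didn’t match the HTML my script was actually receiving. In the developer tools, you can right-click an element in the Elements panel and select “Copy XPath”. What you might not know is that this path describes Chrome’s parsed DOM, not the page source, and Chrome inserts elements into the DOM that aren’t in the source HTML. In my case, it was the extra <tbody> elements that get added inside tables, so every path that went through a table pointed at nothing.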
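To make the mismatch concrete, here is a minimal sketch in Python using only the standard library. It is not Web::Scraper code, and the sample markup is made up for illustration (and kept well-formed so an XML parser can read it). It just shows how a path that includes a browser-added <tbody> finds nothing in the source as served, while the same path without it does.

```python
import xml.etree.ElementTree as ET

# Hypothetical page source: note there is no <tbody> in the markup.
SOURCE_HTML = """
<html>
  <body>
    <table>
      <tr><td>model</td><td>X-1000</td></tr>
      <tr><td>serial</td><td>ABC123</td></tr>
    </table>
  </body>
</html>
""".strip()

root = ET.fromstring(SOURCE_HTML)  # root is the <html> element

# Path in the style of a browser's "Copy XPath": it includes the <tbody>
# that exists only in the browser's DOM.
browser_path = "./body/table/tbody/tr/td[2]"

# Path that matches the markup as it was actually served.
source_path = "./body/table/tr/td[2]"

# A looser path that does not care whether a <tbody> is present.
tolerant_path = ".//tr/td[2]"

browser_matches = [td.text for td in root.findall(browser_path)]
source_matches = [td.text for td in root.findall(source_path)]
tolerant_matches = [td.text for td in root.findall(tolerant_path)]

print(browser_matches)   # []
print(source_matches)    # ['X-1000', 'ABC123']
print(tolerant_matches)  # ['X-1000', 'ABC123']
```

If a copied path returns nothing, removing the tbody step or switching to a looser descendant path like the last one is a cheap first thing to try.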

Oh, and learning the XPath syntax helped too!

2 thoughts on “Web::Scraper with XPath and the dangers of Chrome”

  1. Knowing XPath is important, but I recommend using CSS selectors with HTML; XPath is for XML. Web::Scraper accepts CSS selector syntax as well as XPath, and it’s generally easier on the eyes.
