SOS

Shane O'Sullivan's technical blog… really ties the room together

Archive for January, 2009

Dojo & GreaseMonkey == DaftMonkey

Posted by Shane O'Sullivan on January 26, 2009

Given that I’m looking for a new apartment, and I live in Ireland, I use the property search website Daft.ie.  Everyone does.  However, I wasn’t very happy with how slow it is to scan through the many results that match my meagre budget.  I realised that this could be readily fixed with GreaseMonkey, using the Dojo Ajax Toolkit to make life easier when it comes to parsing the page, adding effects etc.

The result is DaftMonkey.

I wasn’t even sure if Dojo could be used from within a GreaseMonkey script, as GreaseMonkey sandboxes away the custom script code.  However, with a little hackery it was (more or less) possible.  The steps I took were:

  1. Set up the djConfig parameter in the host window to tell Dojo that the page has already loaded, using unsafeWindow.djConfig = {afterOnLoad: true};.  unsafeWindow is what GreaseMonkey calls the normal, non-sandboxed window.
  2. Add the <script> tag for dojo.js to the head of the document.  In this case I used the dojo.js file hosted on AOL’s CDN servers – see http://dev.aol.com/dojo .
  3. Wait for Dojo to load.  This can be done with a simple setInterval function call, checking whether unsafeWindow.dojo exists.  (Update: thanks to a comment from James, this has been changed to use the djConfig.addOnLoad function.)
  4. Once Dojo is loaded, call a function that kicks off whatever it is that your script is supposed to do.  In this case, I wanted to add a bunch of DOM nodes to the page (which you can do without Dojo), and add some cool effects, so I also included the dojo.fx bundle.
  5. Copy the dojo variable back into the sandbox window using var dojo = unsafeWindow.dojo, otherwise you’ll have to refer to it as unsafeWindow.dojo all the time.
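
The steps above can be sketched roughly as follows.  This is a minimal sketch and not the actual DaftMonkey source: loadDojo and its parameter names are hypothetical, the location of dojo.js is left as a parameter, and the window is passed in as an argument so that unsafeWindow can be supplied from inside GreaseMonkey.

```javascript
// Hypothetical helper (not from the DaftMonkey source): injects dojo.js into
// the host page and calls onReady once Dojo reports that it has loaded.
function loadDojo(hostWindow, dojoUrl, onReady) {
  // Step 1: tell Dojo the page has already loaded.  Step 3: register a
  // callback through djConfig.addOnLoad instead of polling with setInterval.
  hostWindow.djConfig = {
    afterOnLoad: true,
    addOnLoad: function () {
      // Steps 4 and 5: hand the page's dojo object to the script's own code.
      onReady(hostWindow.dojo);
    }
  };

  // Step 2: add the <script> tag for dojo.js to the head of the document.
  var doc = hostWindow.document;
  var script = doc.createElement("script");
  script.type = "text/javascript";
  script.src = dojoUrl;
  doc.getElementsByTagName("head")[0].appendChild(script);
  return script;
}

// Inside a GreaseMonkey script this would be called along these lines
// (dojoUrl stands for wherever dojo.js is hosted):
//
//   loadDojo(unsafeWindow, dojoUrl, function (dojo) {
//     // dojo is a local variable here, so no unsafeWindow.dojo prefix
//   });
```

The sketch assumes a GreaseMonkey version that lets the script assign a plain object to unsafeWindow.djConfig, which is what step 1 relies on.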

Screen Scraping With dojo.query

A lot of the features of DaftMonkey rely on asynchronously fetching remote HTML pages and scraping the required data from them.  The approach I used for this was:

  1. Perform a remote request using GreaseMonkey’s native Ajax function GM_xmlhttpRequest.  This works more or less the same as dojo.xhrGet, and I saw no reason not to use it.
  2. When the text is returned, create a DIV, and absolutely position it far to the left.  Fix its size to just one pixel so it doesn’t mess with the scroll bars.
  3. Set the innerHTML of the DIV to the text you have retrieved.  Congratulations, you can now use dojo.query to find whatever nodes you need.  e.g. to find all images inside anchor tags, use dojo.query("a img", tempDiv).  Note the second parameter: this tells Dojo to only search inside the temporary DIV we created, and not the whole document.
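
As a rough illustration of these three steps, here is a minimal sketch.  The helper name createScratchDiv, the exact offset of -10000px and the overflow setting are my own assumptions, not taken from the DaftMonkey source; the document is passed in as a parameter so the helper does not depend on any globals.

```javascript
// Hypothetical helper: parks fetched HTML in an off-screen, one pixel DIV so
// that it can be searched with dojo.query without disturbing the page layout.
function createScratchDiv(doc, html) {
  var div = doc.createElement("div");
  div.style.position = "absolute";
  div.style.left = "-10000px";   // assumed offset; anything far off-screen works
  div.style.width = "1px";
  div.style.height = "1px";
  div.style.overflow = "hidden"; // assumption: keeps the content from spilling out
  div.innerHTML = html;
  doc.body.appendChild(div);
  return div;
}

// Inside a GreaseMonkey script (url is a placeholder for the page to scrape):
//
//   GM_xmlhttpRequest({
//     method: "GET",
//     url: url,
//     onload: function (response) {
//       var tempDiv = createScratchDiv(document, response.responseText);
//       var images = dojo.query("a img", tempDiv);
//     }
//   });
```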

Some other site-specific things were required as part of the screen scraping process.  Many of the sites had iframes included, and as soon as you add those to the temporary DIV, they start loading another page.  This was a nasty performance hit, so I had to remove them from the HTML string before setting the innerHTML of the temporary DIV.
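
One simple way to drop the iframes from the HTML string is a regular expression, sketched below.  The function name stripIframes is hypothetical and this is not necessarily how DaftMonkey does it; a regular expression is only a rough tool for HTML, but it is usually enough for throwing away whole iframe elements.

```javascript
// Hypothetical helper: removes iframe elements from an HTML string before it
// is assigned to innerHTML, so the browser never starts loading them.
function stripIframes(html) {
  return html
    // Remove complete <iframe ...>...</iframe> elements, case-insensitively.
    .replace(/<iframe\b[^>]*>[\s\S]*?<\/iframe\s*>/gi, "")
    // Remove any leftover opening tags that were never closed.
    .replace(/<iframe\b[^>]*\/?>/gi, "");
}
```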

Problems

One problem I found is that calling dojo.declare didn’t work from inside a GreaseMonkey script.  I don’t know why.  Therefore widgets had to be defined the old fashioned way.
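
By “the old fashioned way” I mean a plain constructor function with methods on its prototype.  A minimal sketch, using a made-up ResultPanel widget that is not part of DaftMonkey:

```javascript
// Hypothetical widget defined without dojo.declare: a plain constructor
// function plus methods attached to its prototype.
function ResultPanel(title) {
  this.title = title;
  this.expanded = false;
}

// Flips the expanded state and returns the new value.
ResultPanel.prototype.toggle = function () {
  this.expanded = !this.expanded;
  return this.expanded;
};

// Returns a short text description of the panel's current state.
ResultPanel.prototype.describe = function () {
  return this.title + (this.expanded ? " (expanded)" : " (collapsed)");
};
```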

A second problem was more related to the website I was writing the script for, Daft.ie.  The entire site is laid out using TABLES!  Seriously, there are barely one or two DIVs on the page, with practically no CSS either.  This makes it quite difficult and brittle to screen scrape using dojo.query, as there are really no classes to match.  Still, it was possible, but it could break relatively easily if the site layout is changed.

Get the Source

You can get the entire source for the script at http://userscripts.org/scripts/show/41105 .

To read a bit more about DaftMonkey, I’ve put up a page about it at http://www.chofter.com/apps?n=daftmonkey .

Posted in Ajax, Dojo, dojo.query, greasemonkey, Javascript, open source, search, widget | 8 Comments »

Broadband in Dublin: O2 Admits Defeat

Posted by Shane O'Sullivan on January 13, 2009

I’ve been documenting my issues with broadband in Dublin (Ireland for the geographically challenged) on and off for a while – see here for previous rants.  I’ve been using mobile broadband from O2 for almost a year now, and have seen it get slower and slower as the year progressed.

Now they’ve officially told me to piss off and stop complaining about my pathetic speeds, as they have no intention of fixing the problem.

In November it started averaging about 5kB/s between 5pm and 10pm every weekday and on Saturday mornings.  Yes, the same speed that I got back in ’94 when I discovered the wonders of IRC channels and bulletin boards.  I reported this pathetic fact to O2 (see here, here, here, and here; OK, so I bugged them...) and they replied that a mast in my area (a few minutes’ walk from the centre of the capital of our wonderful Knowledge Economy) was being upgraded in November and this would solve all my problems.

This being Ireland, it wasn’t upgraded until the end of December.  My speeds did indeed improve: they now average around 7kB/s.

Upon reporting this to O2, I was sent an email with this single line:

You have been taken off contract, and can cancel your subscription at any time

They simply admitted that if you live in the city centre, they either cannot or will not invest in sufficient infrastructure to overcome the contention issues.  So please piss off and annoy someone else with your talk of “getting what you pay for”.

O2 still continue to advertise the product as a 3 Megabit service.  I have never gotten this speed, and have not gotten over 1Mbit for 6 months.  I am sure something in there is illegal… any ideas?

P.S.  I’ll of course be cancelling my contract with them very soon.

Posted in broadband, Ireland, O2 | Leave a Comment »

Deleting DOM nodes with Dojo

Posted by Shane O'Sullivan on January 2, 2009

This is a quick and simple tip for deleting multiple DOM nodes in JavaScript using the Dojo Ajax Toolkit.

A common pattern in non-trivial Ajax applications is the need to delete many nodes that may or may not have been created.  The code to do this without Dojo (or a similar library such as jQuery) would look something like the following code (assuming nodes are identified by ID):

var nodeIDs = ["node1", "node2", "node3"];
for (var i = 0; i < nodeIDs.length; i++) {
 var node = document.getElementById(nodeIDs[i]);
 // Verify that the node actually exists, otherwise an error occurs
 if (node) {
  node.parentNode.removeChild(node);
 }
}

This is of course a bit messy.  It would be infinitely messier if you needed to match nodes not by ID, but by another method such as by class name, relative position to its parent or sibling nodes etc.

To achieve the same in Dojo, use this simple code.

var nodeIDs = ["#node1", "#node2", "#node3"];
dojo.query(nodeIDs.join(",")).orphan();

Update: There is a bug in Dojo that means that if the node with ID "node2" does not exist, "node3" will not be deleted.  (Thanks to Pete Higgins for letting me know.)  Instead, replace the second line with the following code:
dojo.forEach(nodeIDs, "dojo.query(item).orphan();");

Very simple, and much smaller and cleaner than the previous example.  It also scales very well to match nodes using CSS selectors, not just by ID.  For example, to delete nodes both by ID and class name, use the following code.

var nodeIDs = ["#node1", "#node2", ".className1", ".parentClass .childClass"];
dojo.query(nodeIDs.join(",")).orphan();

Update: as with the first example, replace the second line with:
dojo.forEach(nodeIDs, "dojo.query(item).orphan();");
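
The string passed to dojo.forEach in the updates above is shorthand: Dojo treats it as the body of a function that receives each array entry as item.  If you prefer an explicit function, the same loop can be sketched as below.  The wrapper name removeAll is hypothetical, and dojo is passed in as a parameter only to keep the sketch self-contained.

```javascript
// Hypothetical wrapper: runs one dojo.query per selector, so a selector that
// matches nothing cannot stop the remaining nodes from being removed.
function removeAll(dojo, selectors) {
  dojo.forEach(selectors, function (selector) {
    dojo.query(selector).orphan();
  });
}

// Usage on a page where Dojo is loaded:
//
//   removeAll(dojo, ["#node1", "#node2", ".className1"]);
```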

For more info on the cool features of dojo.query, see the docs at http://docs.dojocampus.org/dojo/query .

Posted in Ajax, Dojo, dojo.query, Javascript, open source | 14 Comments »

 