| Randal | step 3 - write like mad. step 4 - produce and ship book. we haven't done step 1 yet. :) |
| revdiablo | Maybe "Learning Perl 6 (kinda)" |
| Randal | "Learning Perl 5.99" |
| Dodde | I use Mechanize to fetch a webpage, but imagine each webpage is 50 KB in size. is there a way to check the existence of a webpage without actually retrieving it, and thus cause less server load and bandwidth? imagine 1 million pages being checked, so in total it makes a significant difference... does anyone have an idea? |
| revdiablo | The new "Learning X (kinda)" series of O'Reilly books |
| Randal | sometimes HEAD works instead of GET |
| alester | yes, use head() |
| Randal | it doesn't necessarily, though. if you do that to most of my pages, you'll still get the whole page. saves me nothing. :) |
| jagerman | Even worse, sometimes HEAD returns a 200 when a GET returns a 404. |
| QtPlatypus | And while I'm learning Haskell, I'm not yet good enough to help Pugs. And as for my C and C++ skills, they have atrophied. |
| Randal | yeah, I keep *starting* to learn haskell. then I don't grok a particular "obviously..." example, and then it doesn't get any easier. :) people who grok haskell are Smarter Than Me |
| Dodde | hmm so what does head() actually check for? |
| Randal | head() invokes HEAD. it's up to the server how that differs: HEAD /some/url vs GET /some/url |
| QtPlatypus | Randal: So I guess I will not be seeing a "Learning Haskell" book anytime soon? |
| Randal | totally server-side |
| Dodde | it's for a wikipedia bot... so it's checking wikipedia pages |
| revdiablo | Dodde: TIAS |
| Randal | is that within the terms of service? |
| jagerman | In theory, HEAD is meant to return just the header of the same GET request. In practise, ignorant Windows server users break it all the time. |
| Randal | I know they're not too happy with automated hits |
| QtPlatypus | Dodde: Isn't there a web service gateway to wikipedia? |
| Dodde | QtPlatypus: I am not sure what you mean |
| action | QtPlatypus thought that there was a SOAP interface or a REST interface to wikipedia... but I could be wrong. |
| Dodde | Randal: how you mean within terms of service? |
| Randal | heh. HEAD on wikipedia returns 403 forbidden on a page that's definitely there. so yeah, it won't work on wikip |
| Dodde | Randal: it is meant to check the existence of translations on foreign wiktionaries... based on the result, the bot will save the info to the entry checked... so yes, it's a service |
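
The HEAD-versus-GET point in the discussion above can be sketched roughly as follows. This is a minimal illustration in Python using only the standard library, not the Mechanize `head()` call mentioned in the log. The names `fetch_status` and `page_exists` are hypothetical. It assumes that a 200, 404, or 410 answer to HEAD can be trusted, which, as noted above, some servers violate, and it falls back to GET for anything else, such as the 403 reported for Wikipedia.

```python
import urllib.error
import urllib.request


def fetch_status(method, url, timeout=10.0):
    """Return the HTTP status code for `method` on `url`; the body is not read here."""
    request = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; the code is still the answer.
        return err.code


def page_exists(url, status_of=fetch_status):
    """Try HEAD first; fall back to GET when HEAD gives no clear answer."""
    head = status_of("HEAD", url)
    if head in (200, 404, 410):
        # Assumption: the server answers HEAD the same way it would answer GET.
        # The discussion above points out that this is not always the case.
        return head == 200
    # Anything else (403, 405, 501, ...) may only mean that HEAD is refused.
    return status_of("GET", url) == 200
```

Passing the status lookup in as `status_of` keeps the decision logic separate from the network call, so it can be tried against a stub before pointing it at a real site. Whether automated requests at that volume are acceptable to the site is a separate question, as raised in the log.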