I'd never played much with asynchronous page loads before, and then it came up three times this month.
First, I saw this post on AnyEvent, so I gave that a try:
#!/usr/bin/env perl
use v5.14;
use warnings;

use AnyEvent;
use AnyEvent::HTTP;
use Time::HiRes qw(time);

my $cv = AnyEvent->condvar( cb => sub { warn "done"; } );

my @urls = (
    "https://www.google.com",
    "http://www.windley.com/",
    "https://www.bing.com",
    "http://www.example.com",
    "http://www.wetpaint.com",
    "http://www.uh.cu",
);

my $start = time;
my $result;
$cv->begin(sub { shift->send($result) });
foreach my $url (@urls) {
    $cv->begin;
    my $now = time;
    my $request;
    $request = http_request(
        GET     => $url,
        timeout => 2,    # seconds
        sub {
            my ($body, $hdr) = @_;
            my $u   = sprintf "%-25s", $url;
            my $et  = sprintf "%5.3f", time - $now;
            my $len = sprintf "%8s", $hdr->{'content-length'};
            push @$result, ($hdr->{Status} =~ /^2/)
                ? "$u has length $len and loaded in $et s"
                : "Error for $u: ($hdr->{Status}) $hdr->{Reason}";
            undef $request;
            $cv->end;
        }
    );
}
$cv->end;

warn "End of loop\n";
my $foo = $cv->recv or die;
say for @$foo;
my $tet = sprintf "%5.3f", time - $start;
say "Total elapsed time: $tet s";
$ ./anyevent.pl
End of loop
done at ./anyevent.pl line 11.
https://www.bing.com      has length    31711 and loaded in 0.826 s
http://www.example.com    has length     2966 and loaded in 0.854 s
https://www.google.com    has length    32384 and loaded in 1.767 s
http://www.windley.com/   has length    91898 and loaded in 2.120 s
http://www.wetpaint.com   has length    93698 and loaded in 2.506 s
http://www.uh.cu          has length    18493 and loaded in 4.269 s
Total elapsed time: 4.284 s
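The begin/end calls on the condvar form a reference count: each begin increments a counter, each end decrements it, and when it reaches zero the condvar fires and recv returns. Here's a minimal pure-Perl sketch of just that counting idea, with no AnyEvent required; the Counter class is made up for illustration:

```perl
#!/usr/bin/env perl
use v5.14;
use warnings;

# Toy stand-in for a condvar's begin/end reference counting.
package Counter {
    sub new   { my ($class, $cb) = @_; bless { n => 0, cb => $cb }, $class }
    sub begin { $_[0]{n}++ }
    sub end   { my $self = shift; $self->{cb}->() if --$self->{n} == 0 }
}

my @done;
my $cv = Counter->new(sub { push @done, "all finished" });

$cv->begin;                # guard, like the $cv->begin before the loop
$cv->begin for 1 .. 3;     # one begin per "request"
$cv->end   for 1 .. 3;     # each callback calls end when it completes
say "still waiting" unless @done;
$cv->end;                  # release the guard; counter hits zero, callback fires
say for @done;
```

The extra guard begin/end pair around the loop is what keeps the counter from hitting zero early if the first few requests finish before the loop has submitted the rest.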
Jesse Shy saw that as well, which resulted in this Mojolicious solution:
#!/usr/bin/env perl
use v5.14;
use warnings;

use Mojo::UserAgent;
use Mojo::IOLoop;
use Time::HiRes qw(time);

my @urls = (
    "https://www.google.com",
    "http://www.windley.com/",
    "https://www.bing.com",
    "http://www.example.com",
    "http://www.wetpaint.com",
    "http://www.uh.cu",
);

my $ua = Mojo::UserAgent->new;

my $start = time;
foreach my $u (@urls) {
    my $now = time;
    $ua->get($u => sub {
        my ($ua, $tx) = @_;
        if (my $res = $tx->success) {
            my $url = sprintf "%-25s", $u;
            my $et  = sprintf "%5.3f", time - $now;
            my $len = sprintf "%8s", length($tx->res->body);
            say "$url has length $len and loaded in $et s";
        }
        else {
            my ($message, $code) = $tx->error;
            say "Error for $u : ($code) $message";
        }
        Mojo::IOLoop->stop;
    });
    Mojo::IOLoop->start;
}
my $tet = sprintf "%5.3f", time - $start;
say "Total elapsed time: $tet s";
$ ./anymojo.pl
https://www.google.com    has length    11287 and loaded in 0.521 s
http://www.windley.com/   has length    91898 and loaded in 0.590 s
https://www.bing.com      has length        0 and loaded in 0.117 s
http://www.example.com    has length        0 and loaded in 0.079 s
http://www.wetpaint.com   has length    93698 and loaded in 1.489 s
http://www.uh.cu          has length    18530 and loaded in 2.993 s
Total elapsed time: 5.792 s
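One thing worth noticing about that run: Mojo::IOLoop->start is called inside the foreach and each callback stops the loop, so every get effectively blocks until its own response arrives and the requests run one at a time. A quick sanity check with core List::Util, plugging in the individual times reported above: if the requests were truly overlapping, the total would be close to the slowest request rather than the sum.

```perl
#!/usr/bin/env perl
use v5.14;
use warnings;
use List::Util qw(sum max);

# Individual load times reported by the Mojo run above, in seconds.
my @load = (0.521, 0.590, 0.117, 0.079, 1.489, 2.993);

printf "serial total (start/stop the loop per request): %5.3f s\n", sum @load;
printf "parallel total (one loop for all requests):     %5.3f s\n", max @load;
```

The serial sum comes out to 5.789 s, which lands almost exactly on the 5.792 s total reported, suggesting the requests did not overlap at all.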
Then, by coincidence, Naveed Massjouni mentioned HTTP::Async, so I had to try that as well to compare:
#!/usr/bin/env perl
use v5.14;
use warnings;

use HTTP::Async;
use HTTP::Request;
use Time::HiRes qw(time);

my @urls = (
    "https://www.google.com",
    "http://www.windley.com/",
    "https://www.bing.com",
    "http://www.example.com",
    "http://www.wetpaint.com",
    "http://www.uh.cu",
);

my $start = time;
my $async = HTTP::Async->new;

# These ids are just one-up numbers, starting at 1? Why not 0?
my @ids = $async->add( map { HTTP::Request->new(GET => $_) } @urls );

my $now = time;
while (my ($response, $id) = $async->wait_for_next_response) {
    my $url = sprintf "%-25s", $urls[$id - 1];
    my $len = sprintf "%8s", length $response->content;
    my $et  = sprintf "%5.3f", time - $now;
    say "$url has length $len and loaded in $et s";
    $now = time;
}
my $tet = sprintf "%5.3f", time - $start;
say "Total elapsed time: $tet s";
$ ./httpasync.pl
https://www.bing.com      has length    32157 and loaded in 0.157 s
http://www.example.com    has length     2966 and loaded in 0.106 s
http://www.windley.com/   has length    83915 and loaded in 0.154 s
https://www.google.com    has length    32379 and loaded in 0.052 s
http://www.uh.cu          has length    18593 and loaded in 1.464 s
http://www.wetpaint.com   has length    92333 and loaded in 0.556 s
Total elapsed time: 3.800 s
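A subtlety with the timing here: $now is reset after every response, so each $et measures the gap since the previous response arrived, not that URL's own load time. One way to time each request individually is to record a start time per id in a hash when the requests are submitted. Here's a pure-Perl sketch of that pattern, with fake out-of-order "responses" standing in for $async->wait_for_next_response:

```perl
#!/usr/bin/env perl
use v5.14;
use warnings;
use Time::HiRes qw(time sleep);

# Record each request's start time keyed by id, so elapsed time is
# measured from when *that* request began, not from the last response.
my %started = map { $_ => time } 1 .. 3;   # all "requests" start together

my @times;
for my $id (3, 1, 2) {                     # responses arrive out of order
    sleep 0.05;                            # pretend network latency
    my $et = time - $started{$id};         # per-request elapsed time
    push @times, $et;
    printf "request %d took %5.3f s\n", $id, $et;
}
```

Since all three start together and the responses arrive one after another, each reported time is larger than the last, which is exactly the "they add up to more than the total" shape the AnyEvent run shows.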
All three seem to work, but the AnyEvent timings are the only ones that really make sense. I'm not quite sure what to do about that.
Update: In addition to Blipsofadoug's improvements to the Mojo solution and Dmitry Karasik's IO::Lambda solution in the comments below, don't miss ttjjss's Perl 6 solution over here!
At the very least, the Mojo example doesn't make any sense, as it starts and stops the event loop, making it act as if it were not using an event loop. I changed it to match the AnyEvent example here: https://gist.github.com/0dbc430a67f28bac0b39
Posted by: Blipsofadoug | 03/10/2012 at 07:30 PM
Here's one with IO::Lambda:
https://gist.github.com/2021875
Posted by: Dmitry Karasik | 03/12/2012 at 06:23 AM
Why do you think the HTTP::Async approach does not make sense? One advantage of HTTP::Async is that it uses proper HTTP::Request and HTTP::Response objects, which have interfaces that are familiar to users of the standard LWP module. AnyEvent::HTTP seems to have re-invented those wheels.
Posted by: Naveed Massjouni | 03/13/2012 at 04:05 PM
No, no. I just mean the timings. The AnyEvent timings are very satisfying because the individual load times seem reasonable and they all add up to more than the total, which clearly illustrates things were done in parallel. I don't think I'm timing the right things in the HTTP::Async example (or any of the other examples, including the new ones in the comments).
I agree, HTTP::Async was very easy to use.
Posted by: oylenshpeegul | 03/13/2012 at 05:32 PM