argra****@users*****
Mon Mar 21 04:05:04 JST 2011
Index: docs/modules/libwww-perl-5.813/lwptut.pod diff -u docs/modules/libwww-perl-5.813/lwptut.pod:1.1 docs/modules/libwww-perl-5.813/lwptut.pod:1.2 --- docs/modules/libwww-perl-5.813/lwptut.pod:1.1 Fri Mar 11 00:53:12 2011 +++ docs/modules/libwww-perl-5.813/lwptut.pod Mon Mar 21 04:05:04 2011 @@ -27,13 +27,10 @@ LWP ("Library for WWW in Perl" ÌZk`) ÍAWeb ãÌf[^É ANZX·é½ßÌñíÉL¼È Perl W [QÅ·B -Like most Perl -module-distributions, each of LWP's component modules comes with -documentation that is a complete reference to its interface. However, -there are so many modules in LWP that it's hard to know where to start -looking for information on how to do even the simplest most common -things. -(TBT) +ÙÆñÇÌ Perl W [zzƯlALWP ÌR|[lgW [Ì +»ê¼êÉÍ®SÈC^[tF[XÌt@X¶ª¯«³êĢܷB +µ©µALWP Éͽ³ñÌW [ª éÌÅAÅàPÅÅàêÊIÈ +±Æð·é½ßÌîñÅ·çADZ©çT¹Îæ¢Ì©ªí©èɢŷB =begin original @@ -44,11 +41,10 @@ =end original -Really introducing you to using LWP would require a whole book -- a book -that just happens to exist, called I<Perl & LWP>. But this article -should give you a taste of how you can go about some common tasks with -LWP. -(TBT) +LWP Ìg¢ûðྷéÉÍ{ 1 ûªKvÅ· -- ½Ü½Ü I<Perl & LWP> Æ¢¤ +{ª èÜ·B +µ©µA±ÌLÍ LWP Ţ©ÌêÊIÈìÆð·éû@̳íèð +ྵܷB =head2 Getting documents with LWP::Simple @@ -61,9 +57,8 @@ =end original -If you just want to get what's at a particular URL, the simplest way -to do it is LWP::Simple's functions. -(TBT) +PÉÁèÌ URL Ìàeðæ¾µ½¢ÈçAÅàÈPÈû@Í +LWP::Simple ÌÖðg¤±ÆÅ·B =begin original @@ -73,10 +68,10 @@ =end original -In a Perl program, you can call its C<get($url)> function. It will try -getting that URL's content. If it works, then it'll return the -content; but if there's some error, it'll return undef. 
-(TBT) +Perl vOÅA±ÌW [Ì C<get($url)> ÖðÄÑoµÜ·B +±êÍwè³ê½ URL Ìàeðæ¾µæ¤ÆµÜ·B +¤Üs¯ÎAàeðԵܷ; µ©µAൽ©G[ªN±êÎA +¢è`lðԵܷB my $url = 'http://freshair.npr.org/dayFA.cfm?todayDate=current'; # Just an example: the URL for the most recent /Fresh Air/ show @@ -102,10 +97,10 @@ =end original -The handiest variant on C<get> is C<getprint>, which is useful in Perl -one-liners. If it can get the page whose URL you provide, it sends it -to STDOUT; otherwise it complains to STDERR. -(TBT) +ÅàÖÈ C<get> ÌoG[VÍ C<getprint> ÅAPerl 1sìYÅ +LpÅ·B +wèµ½ URL ©çy[Wðæ¾Å«êÎAàeð STDOUT Éo͵ܷ; +³àȯêÎ STDERR ÉG[ðo͵ܷB % perl -MLWP::Simple -e "getprint 'http://cpan.org/RECENT'" @@ -118,11 +113,10 @@ =end original -That is the URL of a plaintext file that lists new files in CPAN in -the past two weeks. You can easily make it part of a tidy little -shell command, like this one that mails you the list of new -C<Acme::> modules: -(TBT) +±êÍ CPAN àÌß 2 TÔÌVKt@CÌêÌv[eLXgt@CÌ +URL Å·B +±êÉæÁÄ¿åÁƵ½VFR}hÌêƵÄg¦Ü·; +á¦Î Vµ¢ C<Acme::> W [Ìêð[·éÉÍ: % perl -MLWP::Simple -e "getprint 'http://cpan.org/RECENT'" \ | grep "/by-module/Acme" | mail -s "New Acme modules! Joy!" $USER @@ -138,13 +132,11 @@ =end original -There are other useful functions in LWP::Simple, including one function -for running a HEAD request on a URL (useful for checking links, or -getting the last-revised time of a URL), and two functions for -saving/mirroring a URL to a local file. See L<the LWP::Simple -documentation|LWP::Simple> for the full details, or chapter 2 of I<Perl -& LWP> for more examples. 
-(TBT) +LWP::Simple Éͻ̼ÉàÖÈÖª èÜ·; URL É HEAD NGXgð +éÖ (NÌ`FbNâA é URL ÌÅIXVúÌæ¾ÉÖÅ·) âA +URL Ìàeð[Jt@CÉÛ¶/~[·é½ßÌñÂÌÖÈÇÅ·B +®SÈÚ×É¢ÄÍ L<the LWP::Simple documentation|LWP::Simple> A +XÈéáÉ¢ÄÍ I<Perl & LWP> Ìæ 2 ÍðQƵľ³¢B =for comment ########################################################################## @@ -164,13 +156,11 @@ =end original -LWP::Simple's functions are handy for simple cases, but its functions -don't support cookies or authorization, don't support setting header -lines in the HTTP request, generally don't support reading header lines -in the HTTP response (notably the full HTTP error message, in case of an -error). To get at all those features, you'll have to use the full LWP -class model. -(TBT) +LWP::Simple ÌÖÍPÈóµÅÍÖÅ·ªA±ÌÖÍNbL[âFØÉ +εĢܹñµAHTTP NGXgÌwb_sÌÝèÉàεܹñµA +êÊIÉÍ HTTP X|XÌwb_sÌÇÝÝ(ÁÉAG[Ì +®SÈ HTTP G[bZ[W)àεĢܹñB +±êçÌ@\SÄðg¤ÉÍA®SÈ LWP NXfðg¤Kvª èÜ·B =begin original @@ -182,12 +172,11 @@ =end original -While LWP consists of dozens of classes, the main two that you have to -understand are L<LWP::UserAgent> and L<HTTP::Response>. LWP::UserAgent -is a class for "virtual browsers" which you use for performing requests, -and L<HTTP::Response> is a class for the responses (or error messages) -that you get back from those requests. -(TBT) +LWP Í\ÌNXÅ\¬³êĢܷªAð·éKvª éåÈñÂÌàÌÍ +L<LWP::UserAgent> Æ L<HTTP::Response> Å·B +LWP::UserAgent ÍNGXgðÀs·éÆ«Ég¤u¼zuEUvÅA +L<HTTP::Response> Í»ÌNGXg©çÔ³ê½X|X( é¢Í +G[bZ[W) ̽ßÌNXÅ·B =begin original @@ -196,9 +185,8 @@ =end original -The basic idiom is C<< $response = $browser->get($url) >>, or more fully -illustrated: -(TBT) +î{Iȵp@Í C<< $response = $browser->get($url) >> ÅAव +®Sɦ·Æ: # Early in your program: @@ -239,12 +227,11 @@ =end original -There are two objects involved: C<$browser>, which holds an object of -class LWP::UserAgent, and then the C<$response> object, which is of -class HTTP::Response. 
You really need only one browser object per -program; but every time you make a request, you get back a new -HTTP::Response object, which will have some interesting attributes: -(TBT) +ñÂÌIuWFNgªÖíÁĢܷ: C<$browser> Í LWP::UserAgent NXÌ +IuWFNgðÛµAC<$response> IuWFNgÍ HTTP::Response NXÅ·B +{ÉKvÈuEUIuWFNgÍ 1 vOÉ꾯ŷ; +µ©µNGXgðo·ÉVµ¢ HTTP::Response IuWFNgªÔ³êA +±êÉ͢©̻¡[¢®«ð۵Ģܷ: =over @@ -258,10 +245,8 @@ =end original -A status code indicating -success or failure -(which you can test with C<< $response->is_success >>). -(TBT) +¬÷©¸s©ð¦µÄ¢éXe[^XR[h(C<< $response->is_success >> Å +eXgÅ«Ü·)B =item * @@ -274,11 +259,9 @@ =end original -An HTTP status -line that is hopefully informative if there's failure (which you can -see with C<< $response->status_line >>, -returning something like "404 Not Found"). -(TBT) +¸sµ½Æ«ÌîñÉÈé©àµêÈ¢ HTTP Xe[^Xs +(C<< $response->status_line >> Å©é±ÆªÅ«A"404 Not Found" Ì +æ¤Èà̪ԳêÜ·)B =item * @@ -290,10 +273,8 @@ =end original -A MIME content-type like "text/html", "image/gif", -"application/xml", etc., which you can see with -C<< $response->content_type >> -(TBT) +"text/html", "image/gif", "application/xml" Ìæ¤È MIME Reg^Cv; +C<< $response->content_type >> Å©é±ÆªÅ«Ü·B =item * @@ -306,11 +287,9 @@ =end original -The actual content of the response, in C<< $response->decoded_content >>. -If the response is HTML, that's where the HTML source will be; if -it's a GIF, then C<< $response->decoded_content >> will be the binary -GIF data. -(TBT) +C<< $response->decoded_content >> É éX|XÌÀÛÌàeB +X|Xª HTML ÌêA±±ª HTML \[XªüÁÄ¢éêÅ·; +GIF ÌêAC<< $response->decoded_content >> Í GIF f[^oCiÅ·B =item * @@ -322,10 +301,9 @@ =end original -And dozens of other convenient and more specific methods that are -documented in the docs for L<HTML::Response>, and its superclasses -L<HTML::Message> and L<HTML::Headers>. 
-(TBT) +»µÄ½³ñ̻̼ÌÖÅæèÁLÌ\bhÍA +L<HTML::Response> ¨æÑ»ÌX[p[NXÅ é +L<HTML::Message> Æ L<HTML::Headers> ̶Ŷ»³êĢܷB =back @@ -334,6 +312,8 @@ =head2 Adding Other HTTP Request Headers +(»Ì¼Ì HTTP NGXgwb_ðÇÁ·é) + =begin original The most commonly used syntax for requests is C<< $response = @@ -343,11 +323,10 @@ =end original -The most commonly used syntax for requests is C<< $response = -$browser->get($url) >>, but in truth, you can add extra HTTP header -lines to the request by adding a list of key-value pairs after the URL, -like so: -(TBT) +NGXg̽ßÌÅàêÊIÈg¢û̶@Í +C<< $response = $browser->get($url) >> Å·ªAÀÛÌAȺÌæ¤ÉA +URL ÌãÉL[/lÌgÌXgðÇÁ·é±ÆÅÇÁÌ HTTP wb_ð +ÇÁÅ«Ü·: $response = $browser->get( $url, $key1, $value1, $key2, $value2, ... ); @@ -358,9 +337,8 @@ =end original -For example, here's how to send some more Netscape-like headers, in case -you're dealing with a site that would otherwise reject your request: -(TBT) +á¦ÎANetscape Ìwb_ȵÅÍNGXgðÛ·éTCg𵤽ßÉ +»Ìæ¤Èwb_ðÇÁ·éÉÍ: my @ns_headers = ( 'User-Agent' => 'Mozilla/4.76 [en] (Win98; U)', @@ -379,8 +357,7 @@ =end original -If you weren't reusing that array, you could just go ahead and do this: -(TBT) +zñðÄpµÈ¢ÈçAPÉȺÌæ¤ÉÅ«Ü·: $response = $browser->get($url, 'User-Agent' => 'Mozilla/4.76 [en] (Win98; U)', @@ -397,10 +374,9 @@ =end original -If you were only ever changing the 'User-Agent' line, you could just change -the C<$browser> object's default line from "libwww-perl/5.65" (or the like) -to whatever you like, using the LWP::UserAgent C<agent> method: -(TBT) +ൠ'User-Agent' s¾¯ðÏX·éÈçALWP::UserAgent Ì C<agent> \bhð +gÁÄAC<$browser> IuWFNgÌftHgsÅ é +"libwww-perl/5.65" ( é¢Í½æ¤ÈàÌ) ©çDÝÌàÌÉÏXÅ«Ü·: $browser->agent('Mozilla/4.76 [en] (Win98; U)'); @@ -424,15 +400,14 @@ =end original -A default LWP::UserAgent object acts like a browser with its cookies -support turned off. There are various ways of turning it on, by setting -its C<cookie_jar> attribute. 
A "cookie jar" is an object representing -a little database of all -the HTTP cookies that a browser can know about. It can correspond to a -file on disk (the way Netscape uses its F<cookies.txt> file), or it can -be just an in-memory object that starts out empty, and whose collection of -cookies will disappear once the program is finished running. -(TBT) +ftHgÌ LWP::UserAgent IuWFNgÍANbL[ÎðItɵ½ +uEUÌæ¤ÉUé¢Ü·B +C<cookie_jar> ®«ðÝè·é±ÆÅLøÉ·é¢Â©Ìû@ª èÜ·B +uNbL[eív("cookie jar") ÍAuEUªmÁÄ¢éSÄÌ HTTP +NbL[Ìf[^x[Xð\»·éIuWFNgÅ·B +±êÍfBXNãÌt@C (Netscape ª F<cookies.txt> t@CÅ +gÁÄ¢éû@)©APÉó©çJnµÄvOI¹ÉÁ¦ÄµÜ¤ +ãÌIuWFNgÉγ¹é±ÆªÅ«Ü·B =begin original @@ -441,9 +416,8 @@ =end original -To give a browser an in-memory empty cookie jar, you set its C<cookie_jar> -attribute like so: -(TBT) +ãÉóÌNbL[eíðuEUÉÝè·éÉÍAȺÌæ¤É +C<cookie_jar> ®«ÉÝèµÜ·: $browser->cookie_jar({}); @@ -455,10 +429,8 @@ =end original -To give it a copy that will be read from a file on disk, and will be saved -to it when the program is finished running, set the C<cookie_jar> attribute -like this: -(TBT) +fBXNãÌt@C©çÇÝñ¾f[^ðwèµÄAvOI¹É +Û¶·é½ßÉÍAC<cookie_jar> ®«ðȺÌæ¤ÉÝèµÜ·: use HTTP::Cookies; $browser->cookie_jar( HTTP::Cookies->new( @@ -476,10 +448,9 @@ =end original -That file will be an LWP-specific format. If you want to be access the -cookies in your Netscape cookies file, you can use the -HTTP::Cookies::Netscape class: -(TBT) +±Ìt@CÍ LWP ÅLÌ`®Å·B +Netscape ÌNbL[t@CÌNbL[ðANZX·éæ¤É·éÉÍA +HTTP::Cookies::Netscape NXðg¦Ü·: use HTTP::Cookies; # yes, loads HTTP::Cookies::Netscape too @@ -497,10 +468,9 @@ =end original -You could add an C<< 'autosave' => 1 >> line as further above, but at -time of writing, it's uncertain whether Netscape might discard some of -the cookies you could be writing back to disk. 
-(TBT) +ãqÌæ¤É C<< 'autosave' => 1 >> sðÇÁ·é±ÆàÅ«Ü·ªA +«ÝÉ Netscape ªfBíÉ«ß»¤Æµ½NbL[Ìêð +jü·é©Ç¤©ÍsmèÅ·B =for comment ########################################################################## @@ -516,9 +486,8 @@ =end original -Many HTML forms send data to their server using an HTTP POST request, which -you can send with this syntax: -(TBT) +½Ì HTML tH[Í HTTP POST NGXgðgÁÄT[oÉf[^ð +èÜ·ªA±êÉÍȺÌæ¤È¶@ðg¢Ü·: $response = $browser->post( $url, [ @@ -534,8 +503,7 @@ =end original -Or if you need to send HTTP headers: -(TBT) + é¢Í HTTP wb_ðéKvª éêÍ: $response = $browser->post( $url, [ @@ -555,10 +523,9 @@ =end original -For example, the following program makes a search request to AltaVista -(by sending some form data via an HTTP POST request), and extracts from -the HTML the report of the number of matches: -(TBT) +á¦ÎAȺÌvOÍ AltaVista É (tH[f[^ð HTTP POST +NGXgoRÅM·é±ÆÅ) õNGXgðÁÄAHTML ©ç +}b`OÌÌñðWJµÜ·: use strict; use warnings; @@ -619,9 +586,7 @@ =end original -To run the same search with LWP, you'd use this idiom, which involves -the URI class: -(TBT) +¯¶õð LWP ÅÀs·éÉÍAURI ðgÁ½±Ìè^¶ðg¢Ü·: use URI; my $url = URI->new( 'http://us.imdb.com/Tsearch' ); @@ -642,10 +607,9 @@ =end original -See chapter 5 of I<Perl & LWP> for a longer discussion of HTML forms -and of form data, and chapters 6 through 9 for a longer discussion of -extracting data from HTML. 
-(TBT) +HTML tH[ÆtH[f[^ÉÖ·éæè·¢c_É¢ÄÍI<Perl & LWP> Ì +æ 5 ÍðAHTML ©çÌf[^ÌoÉÖ·éæè·¢c_É¢ÄÍ +æ 6 Í©çæ 9 ÍðQƵľ³¢B =head2 Absolutizing URLs @@ -686,9 +650,8 @@ =end original -For example, consider this program that matches URLs in the HTML -list of new modules in CPAN: -(TBT) +á¦ÎACPAN ÌVµ¢W [Ì HTML XgÉ é URL É}b`O·é +±ÌvOðl¦Ü·: use strict; use warnings; @@ -711,8 +674,7 @@ =end original -When run, it emits output that starts out something like this: -(TBT) +Às·éÆA±Ìæ¤ÈàÌðo͵ܷ: MIRRORING.FROM RECENT @@ -730,10 +692,8 @@ =end original -However, if you actually want to have those be absolute URLs, you -can use the URI module's C<new_abs> method, by changing the C<while> -loop to this: -(TBT) +µ©µAÀÛÉÍâÎ URL ªÙµ¢êAURI W [Ì +C<new_abs> \bhðgÁÄAC<while> [vð±Ìæ¤ÉÏXÅ«Ü·: while( $html =~ m/<A HREF=\"(.*?)\"/g ) { print URI->new_abs( $1, $response->base ) ,"\n"; @@ -748,7 +708,8 @@ =end original -(The C<< $response->base >> method from L<HTTP::Message|HTTP::Message> +(L<HTTP::Message|HTTP::Message> Ì C<< $response->base >> \bhÍ +method from is for returning what URL should be used for resolving relative URLs -- it's usually just the same as the URL that you requested.)