""" robotparser.py

    Copyright (C) 2000  Bastian Kleineidam

    You can choose between two licenses when using this package:
    1) GNU GPLv2
    2) PSF license for Python 2.2

    The robots.txt Exclusion Protocol is implemented as specified in
    http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html
"""
import urlparse
import urllib

__all__ = ["RobotFileParser"]


class RobotFileParser:
    """ This class provides a set of methods to read, parse and answer
    questions about a single robots.txt file.

    """

    def __init__(self, url=''):
        self.entries = []
        self.default_entry = None
        self.disallow_all = False
        self.allow_all = False
        self.set_url(url)
        self.last_checked = 0

    def mtime(self):
        """Returns the time the robots.txt file was last fetched.

        This is useful for long-running web spiders that need to
        check for new robots.txt files periodically.

        """
        return self.last_checked

    def modified(self):
        """Sets the time the robots.txt file was last fetched to the
        current time.

        """
        import time
        self.last_checked = time.time()

    def set_url(self, url):
        """Sets the URL referring to a robots.txt file."""
        self.url = url
        self.host, self.path = urlparse.urlparse(url)[1:3]

    def read(self):
        """Reads the robots.txt URL and feeds it to the parser."""
        opener = URLopener()
        f = opener.open(self.url)
        lines = [line.strip() for line in f]
        f.close()
        self.errcode = opener.errcode
        if self.errcode in (401, 403):
            self.disallow_all = True
        elif self.errcode >= 400:
            self.allow_all = True
        elif self.errcode == 200 and lines:
            self.parse(lines)

    def _add_entry(self, entry):
        if "*" in entry.useragents:
            # the default entry is considered last
            if self.default_entry is None:
                # the first default entry wins
                self.default_entry = entry
        else:
            self.entries.append(entry)

    def parse(self, lines):
        """Parse the input lines from a robots.txt file.

        We allow that a user-agent: line is not preceded by
        one or more blank lines.
        """
        # states:
        #   0: start state
        #   1: saw user-agent line
        #   2: saw an allow or disallow line
        state = 0
        linenumber = 0
        entry = Entry()

        for line in lines:
            linenumber += 1
            if not line:
                if state == 1:
                    entry = Entry()
                    state = 0
                elif state == 2:
                    self._add_entry(entry)
                    entry = Entry()
                    state = 0
            # remove optional comment and strip line
            i = line.find('#')
            if i >= 0:
                line = line[:i]
            line = line.strip()
            if not line:
                continue
            line = line.split(':', 1)
            if len(line) == 2:
                line[0] = line[0].strip().lower()
                line[1] = urllib.unquote(line[1].strip())
                if line[0] == "user-agent":
                    if state == 2:
                        self._add_entry(entry)
                        entry = Entry()
                    entry.useragents.append(line[1])
                    state = 1
                elif line[0] == "disallow":
                    if state != 0:
                        entry.rulelines.append(RuleLine(line[1], False))
                        state = 2
                elif line[0] == "allow":
                    if state != 0:
                        entry.rulelines.append(RuleLine(line[1], True))
                        state = 2
        if state == 2:
            self._add_entry(entry)

    def can_fetch(self, useragent, url):
        """using the parsed robots.txt decide if useragent can fetch url"""
        if self.disallow_all:
            return False
        if self.allow_all:
            return True
        # search for given user agent matches
        # the first match counts
        url = urllib.quote(urlparse.urlparse(urllib.unquote(url))[2]) or "/"
        for entry in self.entries:
            if entry.applies_to(useragent):
                return entry.allowance(url)
        # try the default entry last
        if self.default_entry:
            return self.default_entry.allowance(url)
        # agent not found ==> access granted
        return True

    def __str__(self):
        return ''.join([str(entry) + "\n" for entry in self.entries])


class RuleLine:
    """A rule line is a single "Allow:" (allowance==True) or "Disallow:"
       (allowance==False) followed by a path."""

    def __init__(self, path, allowance):
        if path == '' and not allowance:
            # an empty value means allow all
            allowance = True
        self.path = urllib.quote(path)
        self.allowance = allowance

    def applies_to(self, filename):
        return self.path == "*" or filename.startswith(self.path)

    def __str__(self):
        return (self.allowance and "Allow" or "Disallow") + ": " + self.path


class Entry:
    """An entry has one or more user-agents and zero or more rulelines"""

    def __init__(self):
        self.useragents = []
        self.rulelines = []

    def __str__(self):
        ret = []
        for agent in self.useragents:
            ret.extend(["User-agent: ", agent, "\n"])
        for line in self.rulelines:
            ret.extend([str(line), "\n"])
        return ''.join(ret)

    def applies_to(self, useragent):
        """check if this entry applies to the specified agent"""
        # split the name token and make it lower case
        useragent = useragent.split("/")[0].lower()
        for agent in self.useragents:
            if agent == '*':
                # we have the catch-all agent
                return True
            agent = agent.lower()
            if agent in useragent:
                return True
        return False

    def allowance(self, filename):
        """Preconditions:
        - our agent applies to this entry
        - filename is URL decoded"""
        for line in self.rulelines:
            if line.applies_to(filename):
                return line.allowance
        return True


class URLopener(urllib.FancyURLopener):
    def __init__(self, *args):
        urllib.FancyURLopener.__init__(self, *args)
        self.errcode = 200

    def prompt_user_passwd(self, host, realm):
        ## If robots.txt file is accessible, ignore user/password
        return None, None

    def http_error_default(self, url, fp, errcode, errmsg, headers):
        self.errcode = errcode
        return urllib.FancyURLopener.http_error_default(self, url, fp,
                                                        errcode, errmsg,
                                                        headers)
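A minimal usage sketch of the parse/can_fetch flow implemented above. This is written against `urllib.robotparser` from Python 3, the direct successor of this Python 2 module with the same `RobotFileParser.parse()` and `can_fetch()` API; the crawler name and URLs are hypothetical. Feeding `parse()` a list of lines avoids any network fetch:

```python
# Sketch (assumed names): exercise the robots.txt state machine directly.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
# One entry: the catch-all agent with a single Disallow rule.
rp.parse([
    "User-agent: *",
    "Disallow: /private/",
])

# First matching rule line decides; no match means access granted.
print(rp.can_fetch("Spider/1.0", "http://example.com/private/x"))    # False
print(rp.can_fetch("Spider/1.0", "http://example.com/index.html"))   # True
```

Note the ordering rule encoded in `Entry.allowance`: rule lines are checked top to bottom and the first `applies_to` match wins, so a more specific `Allow:` must precede the broader `Disallow:` it carves out.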