//proc/self/root/proc/self/root/usr/lib/python2.4/robotparser.pyc
(compiled bytecode of the Python 2.4 standard-library robotparser module; the corresponding source:)
""" robotparser.py

    Copyright (C) 2000  Bastian Kleineidam

    You can choose between two licenses when using this package:
    1) GNU GPLv2
    2) PSF license for Python 2.2

    The robots.txt Exclusion Protocol is implemented as specified in
    http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html
"""
import urlparse
import urllib

__all__ = ["RobotFileParser"]

debug = 0

def _debug(msg):
    if debug:
        print msg


class RobotFileParser:
    """ This class provides a set of methods to read, parse and answer
    questions about a single robots.txt file.

    """

    def __init__(self, url=''):
        self.entries = []
        self.default_entry = None
        self.disallow_all = False
        self.allow_all = False
        self.set_url(url)
        self.last_checked = 0

    def mtime(self):
        """Returns the time the robots.txt file was last fetched.

        This is useful for long-running web spiders that need to
        check for new robots.txt files periodically.

        """
        return self.last_checked

    def modified(self):
        """Sets the time the robots.txt file was last fetched to the
        current time.

        """
        import time
        self.last_checked = time.time()

    def set_url(self, url):
        """Sets the URL referring to a robots.txt file."""
        self.url = url
        self.host, self.path = urlparse.urlparse(url)[1:3]

    def read(self):
        """Reads the robots.txt URL and feeds it to the parser."""
        opener = URLopener()
        f = opener.open(self.url)
        lines = []
        line = f.readline()
        while line:
            lines.append(line.strip())
            line = f.readline()
        self.errcode = opener.errcode
        if self.errcode == 401 or self.errcode == 403:
            self.disallow_all = True
            _debug("disallow all")
        elif self.errcode >= 400:
            self.allow_all = True
            _debug("allow all")
        elif self.errcode == 200 and lines:
            _debug("parse lines")
            self.parse(lines)

    def _add_entry(self, entry):
        if "*" in entry.useragents:
            # the default entry is considered last
            self.default_entry = entry
        else:
            self.entries.append(entry)

    def parse(self, lines):
        """parse the input lines from a robots.txt file.
           We allow that a user-agent: line is not preceded by
           one or more blank lines."""
        # state 0: at start or after a blank line
        # state 1: saw a user-agent line
        # state 2: saw an allow or disallow line
        state = 0
        linenumber = 0
        entry = Entry()

        for line in lines:
            linenumber = linenumber + 1
            if not line:
                if state == 1:
                    _debug("line %d: warning: you should insert"
                           " allow: or disallow: directives below any"
                           " user-agent: line" % linenumber)
                    entry = Entry()
                    state = 0
                elif state == 2:
                    self._add_entry(entry)
                    entry = Entry()
                    state = 0
            # remove optional comment and strip line
            i = line.find('#')
            if i >= 0:
                line = line[:i]
            line = line.strip()
            if not line:
                continue
            line = line.split(':', 1)
            if len(line) == 2:
                line[0] = line[0].strip().lower()
                line[1] = urllib.unquote(line[1].strip())
                if line[0] == "user-agent":
                    if state == 2:
                        _debug("line %d: warning: you should insert a blank"
                               " line before any user-agent"
                               " directive" % linenumber)
                        self._add_entry(entry)
                        entry = Entry()
                    entry.useragents.append(line[1])
                    state = 1
                elif line[0] == "disallow":
                    if state == 0:
                        _debug("line %d: error: you must insert a user-agent:"
                               " directive before this line" % linenumber)
                    else:
                        entry.rulelines.append(RuleLine(line[1], False))
                        state = 2
                elif line[0] == "allow":
                    if state == 0:
                        _debug("line %d: error: you must insert a user-agent:"
                               " directive before this line" % linenumber)
                    else:
                        entry.rulelines.append(RuleLine(line[1], True))
                else:
                    _debug("line %d: warning: unknown key %s" % (linenumber,
                                                                 line[0]))
            else:
                _debug("line %d: error: malformed line %s" % (linenumber,
                                                              line))
        if state == 2:
            self.entries.append(entry)
        _debug("Parsed rules:\n%s" % str(self))

    def can_fetch(self, useragent, url):
        """using the parsed robots.txt decide if useragent can fetch url"""
        _debug("Checking robots.txt allowance for:\n"
               "  user agent: %s\n  url: %s" % (useragent, url))
        if self.disallow_all:
            return False
        if self.allow_all:
            return True
        # search for given user agent matches; the first match counts
        url = urllib.quote(urlparse.urlparse(urllib.unquote(url))[2]) or "/"
        for entry in self.entries:
            if entry.applies_to(useragent):
                return entry.allowance(url)
        # try the default entry last
        if self.default_entry:
            return self.default_entry.allowance(url)
        # agent not found ==> access granted
        return True

    def __str__(self):
        ret = ""
        for entry in self.entries:
            ret = ret + str(entry) + "\n"
        return ret


class RuleLine:
    """A rule line is a single "Allow:" (allowance==True) or "Disallow:"
       (allowance==False) followed by a path."""

    def __init__(self, path, allowance):
        if path == '' and not allowance:
            # an empty value means allow all
            allowance = True
        self.path = urllib.quote(path)
        self.allowance = allowance

    def applies_to(self, filename):
        return self.path == "*" or filename.startswith(self.path)

    def __str__(self):
        return (self.allowance and "Allow" or "Disallow") + ": " + self.path


class Entry:
    """An entry has one or more user-agents and zero or more rulelines"""

    def __init__(self):
        self.useragents = []
        self.rulelines = []

    def __str__(self):
        ret = ""
        for agent in self.useragents:
            ret = ret + "User-agent: " + agent + "\n"
        for line in self.rulelines:
            ret = ret + str(line) + "\n"
        return ret

    def applies_to(self, useragent):
        """check if this entry applies to the specified agent"""
        # split the name token and make it lower case
        useragent = useragent.split("/")[0].lower()
        for agent in self.useragents:
            if agent == '*':
                # we have the catch-all agent
                return True
            agent = agent.lower()
            if agent in useragent:
                return True
        return False

    def allowance(self, filename):
        """Preconditions:
        - our agent applies to this entry
        - filename is URL decoded"""
        for line in self.rulelines:
            _debug((filename, str(line), line.allowance))
            if line.applies_to(filename):
                return line.allowance
        return True


class URLopener(urllib.FancyURLopener):
    def __init__(self, *args):
        urllib.FancyURLopener.__init__(self, *args)
        self.errcode = 200

    def http_error_default(self, url, fp, errcode, errmsg, headers):
        self.errcode = errcode
        return urllib.FancyURLopener.http_error_default(self, url, fp,
                                                        errcode, errmsg,
                                                        headers)


def _check(a, b):
    if not b:
        ac = "access denied"
    else:
        ac = "access allowed"
    if a != b:
        print "failed"
    else:
        print "ok (%s)" % ac
    print


def _test():
    global debug
    rp = RobotFileParser()
    debug = 1

    # a robots.txt that exists
    rp.set_url('http://www.musi-cal.com/robots.txt')
    rp.read()

    # catch-all agent on the site root
    _check(rp.can_fetch('*', 'http://www.musi-cal.com/'), 1)
    # empty agent name
    _check(rp.can_fetch('', 'http://www.musi-cal.com/'), 0)
    # various cherry pickers
    _check(rp.can_fetch('CherryPickerSE',
                        'http://www.musi-cal.com/cgi-bin/event-search'
                        '?city=San+Francisco'), 0)
    _check(rp.can_fetch('CherryPickerSE/1.0',
                        'http://www.musi-cal.com/cgi-bin/event-search'
                        '?city=San+Francisco'), 0)
    _check(rp.can_fetch('CherryPickerSE/1.5',
                        'http://www.musi-cal.com/cgi-bin/event-search'
                        '?city=San+Francisco'), 0)
    # case insensitivity of the agent match
    _check(rp.can_fetch('ExtractorPro', 'http://www.musi-cal.com/blubba'), 0)
    _check(rp.can_fetch('extractorpro', 'http://www.musi-cal.com/blubba'), 0)
    # substring match on the name token
    _check(rp.can_fetch('toolpak/1.1', 'http://www.musi-cal.com/blubba'), 0)
    # tests for the catch-all * agent
    _check(rp.can_fetch('spam', 'http://www.musi-cal.com/search'), 0)
    _check(rp.can_fetch('spam', 'http://www.musi-cal.com/Musician/me'), 1)
    _check(rp.can_fetch('spam', 'http://www.musi-cal.com/'), 1)
    _check(rp.can_fetch('spam', 'http://www.musi-cal.com/'), 1)

    # a robots.txt that does not exist
    rp.set_url('http://www.lycos.com/robots.txt')
    rp.read()
    _check(rp.can_fetch('Mozilla', 'http://www.lycos.com/search'), 1)

if __name__ == '__main__':
    _test()
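For reference, a minimal, self-contained usage sketch of the module above. It feeds an in-memory robots.txt to parse() instead of fetching one over HTTP (which is what read() would do), so it runs offline; the agent names, rules, and URLs are illustrative, not taken from the file.

import robotparser

rp = robotparser.RobotFileParser()
# Feed rules directly, as read() would after downloading robots.txt.
rp.parse([
    "User-agent: CherryPickerSE",
    "Disallow: /cgi-bin/",
    "",
    "User-agent: *",
    "Disallow: /private/",
])

print rp.can_fetch("CherryPickerSE/1.0", "http://example.com/cgi-bin/search")  # False
print rp.can_fetch("SomeBot", "http://example.com/index.html")                 # True
print rp.can_fetch("SomeBot", "http://example.com/private/data")               # False

Note that Entry.applies_to matches by substring on the lowercased name token (the part before "/"), which is why "CherryPickerSE/1.0" falls under the "CherryPickerSE" entry, and any other agent falls through to the catch-all "*" entry.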