Jc @sdZddkZddkZyddklZWn#ej oddklZnXddklZddk l Z ddk l Z l Z l Z ddk lZlZyeWn#ej oddklZnXyed ZWneefj o eZnXyed ZWneefj o eZnXyed ZWneefj o eZnXyed ZWn%eefj oeefZnXd ddddddgZeideieiBZ eideiZ!eideiZ"eidi#Z$eideieiBZ%ei&dZ'ei&ddhe d6Z(de)fdYZ*e*Z+e+i,Z,eideieideigZ-d d!d"d#d$d%gZ.eid&eieid'eieid(gZ/d)gZ0e-e.e/e0d*Z1d+Z2d,Z3e1ie3_d!d d"gZ4d-gZ5d.e4e5ed/d0Z6d1Z7d2Z8eid3eiZ9d4Z:dS(5scA cleanup tool for HTML. Removes unwanted tags and content. See the `Cleaner` class for details. iN(turlsplit(tetree(tdefs(t fromstringttostringtXHTML_NAMESPACE(t_nonst_transform_result(tSettunichrtunicodetbytest basestringt clean_htmltcleantCleanertautolinkt autolink_htmlt word_breaktword_break_htmlsexpression\s*\(.*?\)s @\s*imports:\s*(?:javascript|jscript|livescript|vbscript|about|mocha):s\s+s\[if[\s\n\r]+.*?][\s\n\r]*>sdescendant-or-self::*[@style]sdescendant-or-self::a [normalize-space(@href) and substring(normalize-space(@href),1,1) != '#'] |descendant-or-self::x:a[normalize-space(@href) and substring(normalize-space(@href),1,1) != '#']t namespacestxcBs:eZdZeZeZeZeZeZ eZ eZ eZ eZ eZeZeZdZdZeZeZeZdZeddgZdZedddddd d gddddd dd dZd ZdZdZdZ dZ!ddZ"dZ#e$i%de$i&i'Z(dZ)dZ*RS(s Instances cleans the document of each of the possible offending elements. The cleaning is controlled by attributes; you can override attributes in a subclass, or set them in the constructor. ``scripts``: Removes any ``