"""HTML/XHTML tag builder

HTML Builder provides an ``HTML`` object that creates (X)HTML tags in a
Pythonic way, a ``literal`` class used to mark strings containing intentional
HTML markup, and a smart ``escape()`` function that preserves literals but
escapes other strings that may accidentally contain markup characters ("<",
">", "&") or malicious JavaScript tags. Escaped strings are returned as
literals to prevent them from being double-escaped later.

``literal`` is a subclass of ``unicode``, so it works with all string methods
and expressions. The only thing special about it is the ``.__html__`` method,
which returns the string itself. ``escape()`` follows a simple protocol: if
the object has an ``.__html__`` method, it calls that rather than ``.__str__``
to get the HTML representation. Third-party libraries that do not want to
import ``literal`` (and thus create a dependency on WebHelpers) can put an
``.__html__`` method in their own classes returning the desired HTML
representation.

When used in a mixed expression containing both literals and ordinary strings,
``literal`` tries hard to escape the strings and return a literal. However,
this depends on which value has "control" of the expression. ``literal`` seems
to be able to take control with all combinations of the ``+`` operator, but
with ``%`` and ``join`` it must be on the left side of the expression. So
these all work::

    "A" + literal("B")
    literal(", ").join(["A", literal("B")])
    literal("%s %s") % (16, literal("kg"))

But these return an ordinary string, which is prone to double-escaping later::

    "\n".join([literal('Foo!'), literal('Bar!')])
    "%s %s" % (literal("16"), literal("<em>kg</em>"))

Third-party libraries that want to avoid importing ``literal``, and with it a
dependency on WebHelpers, can add an ``.__html__`` method to any class. The
method can return the same as ``.__str__`` or something else. ``escape()``
trusts the ``.__html__`` method and does not escape its return value, so only
values that lack an ``.__html__`` method are escaped.
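
The escaping protocol and the operator behaviour described above can be
sketched roughly as follows. This is a simplified illustration, not the
WebHelpers implementation: ``Literal``, this ``escape()`` and ``Badge`` are
hypothetical names, the sketch assumes Python 3 (``str`` instead of
``unicode``), and the standard library's ``html.escape`` stands in for the
real escaping routine.

```python
import html


class Literal(str):
    # Hypothetical stand-in for ``literal``: a str subclass marked as safe.

    def __html__(self):
        return self

    def __add__(self, other):
        return Literal(str.__add__(self, escape(other)))

    def __radd__(self, other):
        # Reached for ``"A" + Literal("B")``: plain str defers to the subclass.
        return Literal(str.__add__(escape(other), self))

    def __mod__(self, args):
        # Only %s-style formatting is handled in this sketch.
        if isinstance(args, tuple):
            args = tuple(escape(arg) for arg in args)
        else:
            args = escape(args)
        return Literal(str.__mod__(self, args))

    def join(self, items):
        return Literal(str.join(self, [escape(item) for item in items]))


def escape(value):
    # Trust ``__html__`` if present; otherwise escape the string form.
    if hasattr(value, "__html__"):
        return Literal(value.__html__())
    return Literal(html.escape(str(value)))


class Badge:
    # Hypothetical third-party class that never imports Literal.

    def __init__(self, label):
        self.label = label

    def __html__(self):
        return '<span class="badge">' + html.escape(self.label) + "</span>"


# Literal on the right of + still takes control through __radd__.
mixed = "A & " + Literal("<b>B</b>")
joined = Literal(", ").join(["<A>", Literal("<b>B</b>")])
formatted = Literal("%s %s") % (16, Literal("<em>kg</em>"))
# A plain str on the left of % keeps control and returns a plain str.
plain = "%s %s" % (Literal("16"), Literal("<em>kg</em>"))
badge = escape(Badge("R&D"))
```

In this sketch ``plain`` comes back as an ordinary ``str``, so passing it to
``escape()`` later would escape the ``<em>`` markup a second time, while
``mixed``, ``joined`` and ``formatted`` stay marked as safe.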

The ``HTML`` object has the following methods for tag building:

``HTML(*strings)``
    Escape the string args, concatenate them, and return a literal. This is
    the same as ``escape(s)`` but accepts multiple strings. Multiple args are
    useful when mixing child tags with text, such as::

        html = HTML("The king is a >>", HTML.strong("fink"), "<<!")

``HTML.tag(tag, *content, **attrs)``
    Create an HTML tag ``tag``, converting the keyword args to attributes.
    An attribute whose value is ``None`` is omitted, and the ``c`` keyword
    arg supplies the tag's content. Example:

    >>> HTML.tag("a", href="http://www.yahoo.com", name=None,
    ... c="Click Here")
    literal(u'<a href="http://www.yahoo.com">Click Here</a>')

``HTML.__getattr__``
    Same as ``HTML.tag`` but using attribute access. Example:

    >>> HTML.a("Foo", href="http://example.com/", class_="important")
    literal(u'<a class="important" href="http://example.com/">Foo</a>')

The protocol is simple: if an object has an ``.__html__`` method, ``escape()``
calls it rather than ``.__str__()`` to obtain a string representation.
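
To illustrate how attribute access can stand in for an explicit ``tag`` call,
here is a minimal sketch. ``TagBuilder`` and ``builder`` are hypothetical
names, not part of WebHelpers. The sketch assumes Python 3, returns plain
strings rather than literals (so nested tags would be escaped), and assumes
that a trailing underscore is stripped from attribute names such as
``class_``.

```python
import html


class TagBuilder:
    # Hypothetical miniature of the ``HTML`` object; returns plain strings.

    def tag(self, tag, *content, **attrs):
        # ``c`` is an alternative way to pass the content as a keyword.
        if "c" in attrs:
            content = (attrs.pop("c"),)
        parts = []
        for key, value in sorted(attrs.items()):
            if value is None:
                continue  # None means: leave the attribute out
            key = key.rstrip("_")  # class_ -> class
            parts.append(' %s="%s"' % (key, html.escape(str(value))))
        body = "".join(html.escape(str(item)) for item in content)
        return "<%s%s>%s</%s>" % (tag, "".join(parts), body, tag)

    def __getattr__(self, name):
        # Only called for names not found normally, e.g. ``builder.a``.
        def make(*content, **attrs):
            return self.tag(name, *content, **attrs)
        return make


builder = TagBuilder()
link = builder.a("Foo", href="http://example.com/", class_="important")
same_style = builder.tag("a", href="http://www.yahoo.com", name=None,
                         c="Click Here")
```

Because ``__getattr__`` is consulted only when normal lookup fails, the real
``tag`` method is unaffected, and every other attribute name is treated as a
tag name.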

About XHTML and HTML
--------------------

This builder always produces tags that are valid as *both* HTML and
XHTML. "Empty" tags (like ``<br>``, ``<input>`` etc) are written like
``<br />``, with a space and a trailing ``/``.

*Only* empty tags get this treatment. The library will never, for example,
produce ``<script src="..." />``, which is invalid HTML.

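
The rule above can be sketched as a small rendering function. ``EMPTY_TAGS``
and ``render`` are hypothetical names, and the set of empty tags is an
assumed subset of HTML's void elements, not the list this library uses.

```python
# Assumed subset of HTML void elements, for illustration only.
EMPTY_TAGS = frozenset(["br", "hr", "img", "input", "link", "meta"])


def render(tag, content=""):
    # Empty tags get a space and a trailing slash, valid in HTML and XHTML.
    if tag in EMPTY_TAGS and not content:
        return "<%s />" % tag
    # Every other tag always gets an explicit closing tag.
    return "<%s>%s</%s>" % (tag, content, tag)


line_break = render("br")
script = render("script")
```

Here ``script`` has no content but is still written with a separate closing
tag, since the self-closing form is reserved for empty tags.
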
The `W3C HTML validator