Kirill Smelkov
cpython
Commits
b00e8f10
Commit
b00e8f10
authored
Apr 16, 2012
by
Vinay Sajip
Added cookbook example for BOM insertion.
parent
ee9e485c
Showing 1 changed file with 44 additions and 0 deletions
Doc/howto/logging-cookbook.rst
@@ -1544,3 +1544,47 @@ works::
    if __name__ == '__main__':
        main()

Inserting a BOM into messages sent to a SysLogHandler
-----------------------------------------------------
`RFC 5424 <http://tools.ietf.org/html/rfc5424>`_ requires that a
Unicode message be sent to a syslog daemon as a sequence of bytes with the
following structure: an optional pure-ASCII component, followed by a UTF-8 Byte
Order Mark (BOM), followed by Unicode encoded using UTF-8. (See the `relevant
section of the specification <http://tools.ietf.org/html/rfc5424#section-6>`_.)

In Python 2.6 and 2.7, code was added to
:class:`~logging.handlers.SysLogHandler` to insert a BOM into the message, but
unfortunately, it was implemented incorrectly, with the BOM appearing at the
beginning of the message and hence not allowing any pure-ASCII component to
appear before it.

As this behaviour is broken, the incorrect BOM insertion code is being removed
from Python 2.7.4 and later. However, it is not being replaced, and if you
want to produce RFC 5424-compliant messages which include a BOM, an optional
pure-ASCII sequence before it and arbitrary Unicode after it, encoded using
UTF-8, then you need to do the following:

#. Attach a :class:`~logging.Formatter` instance to your
   :class:`~logging.handlers.SysLogHandler` instance, with a format string
   such as::

      u"ASCII section\ufeffUnicode section"

   The Unicode code point ``u'\ufeff'``, when encoded using UTF-8, will be
   encoded as a UTF-8 BOM -- the bytestring ``'\xef\xbb\xbf'``.

#. Replace the ASCII section with whatever placeholders you like, but make sure
   that the data that appears in there after substitution is always ASCII (that
   way, it will remain unchanged after UTF-8 encoding).

#. Replace the Unicode section with whatever placeholders you like; if the data
   which appears there after substitution is Unicode, that's fine -- it will be
   encoded using UTF-8.

If the formatted message is Unicode, it *will* be encoded using UTF-8
by ``SysLogHandler``. If you follow these rules, you should be able to produce
RFC 5424-compliant messages. If you don't, logging may not complain, but your
messages will not be RFC 5424-compliant, and your syslog daemon may complain.
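
The steps above can be sketched as follows. This is a minimal illustration, not
the only way to do it: the format string, the logger name ``myapp``, the
default address and the helper names ``make_formatter``,
``make_syslog_handler`` and ``encoded_payload`` are all assumptions made for
the example. It formats one record and checks that the UTF-8 BOM sits between
the ASCII prefix and the Unicode text, without sending anything to a daemon.

```python
import logging
import logging.handlers

BOM = u'\ufeff'  # encodes to the bytes EF BB BF under UTF-8

# Hypothetical format string: ASCII-only fields before the BOM,
# free-form (possibly non-ASCII) text after it.
FORMAT = u'%(name)s[%(process)d]: ' + BOM + u'%(message)s'


def make_formatter():
    """Return a Formatter that places the BOM before the message text."""
    return logging.Formatter(FORMAT)


def make_syslog_handler(address=('localhost', 514)):
    """Attach the formatter to a SysLogHandler (address is an assumption)."""
    handler = logging.handlers.SysLogHandler(address=address)
    handler.setFormatter(make_formatter())
    return handler


def encoded_payload(name, message):
    """Format one record and encode it as UTF-8 for inspection."""
    record = logging.LogRecord(name, logging.INFO, 'example.py', 0,
                               message, None, None)
    return make_formatter().format(record).encode('utf-8')


# Split the encoded bytes at the BOM to inspect the three components.
payload = encoded_payload('myapp', u'caf\xe9 ouvert')
prefix, bom, rest = payload.partition(b'\xef\xbb\xbf')
```

In real use you would add the handler returned by ``make_syslog_handler()`` to
a logger; keeping the logger name ASCII is what keeps the part before the BOM
pure ASCII.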