Commit 7012673d, authored Oct 04, 2002 by Marc-André Lemburg
Extending the encoding name normalization to handle more non-alphanumeric
characters.
Parent: 399a6890
1 changed file with 20 additions and 8 deletions

Lib/encodings/__init__.py (+20 -8) @ 7012673d
@@ -3,9 +3,9 @@
     Standard Python encoding modules are stored in this package
     directory.

-    Codec modules must have names corresponding to standard lower-case
-    encoding names with hyphens mapped to underscores, e.g. 'utf-8' is
-    implemented by the module 'utf_8.py'.
+    Codec modules must have names corresponding to normalized encoding
+    names as defined in the normalize_encoding() function below, e.g.
+    'utf-8' must be implemented by the module 'utf_8.py'.

     Each codec module must export the following interface:
@@ -18,9 +18,8 @@
     * getaliases() -> sequence of encoding name strings to use as aliases

-    Alias names returned by getaliases() must be standard encoding
-    names as defined above (lower-case, hyphens converted to
-    underscores).
+    Alias names returned by getaliases() must be normalized encoding
+    names as defined by normalize_encoding().

     Written by Marc-Andre Lemburg (mal@lemburg.com).
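The alias requirement in the docstring above can be checked mechanically: a name is already normalized exactly when normalizing it again changes nothing. The following Python 3 sketch copies the pattern and expression this commit adds; `non_normalized_aliases` is a hypothetical helper introduced only for illustration and is not part of the commit.

```python
import re

# Pattern as added by this commit: every character that is not an ASCII
# letter, digit, or dot is treated as a separator.
_norm_encoding_RE = re.compile('[^a-zA-Z0-9.]')

def normalize_encoding(encoding):
    # Same expression as in the commit.
    return '_'.join(_norm_encoding_RE.split(encoding))

def non_normalized_aliases(aliases):
    """Hypothetical helper: return the aliases that normalization would change."""
    return [name for name in aliases if normalize_encoding(name) != name]

print(non_normalized_aliases(['latin_1', 'iso-8859-1', 'l1']))
```

Only 'iso-8859-1' is reported here, because its hyphens would be rewritten to underscores. The normalization shown in this commit does not change letter case, so this check says nothing about upper-case aliases.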
@@ -28,16 +27,29 @@ Written by Marc-Andre Lemburg (mal@lemburg.com).
 """#"

-import codecs,exceptions
+import codecs,exceptions,re

 _cache = {}
 _unknown = '--unknown--'
 _import_tail = ['*']
+_norm_encoding_RE = re.compile('[^a-zA-Z0-9.]')

 class CodecRegistryError(exceptions.LookupError,
                          exceptions.SystemError):
     pass

+def normalize_encoding(encoding):
+
+    """ Normalize an encoding name.
+
+        Normalization works as follows: all non-alphanumeric
+        characters except the dot used for Python package names are
+        collapsed and replaced with a single underscore, e.g. ' -;#'
+        becomes '_'.
+
+    """
+    return '_'.join(_norm_encoding_RE.split(encoding))
+
 def search_function(encoding):

     # Cache lookup
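The behaviour of normalize_encoding() can be checked in isolation. The sketch below copies the pattern and the join/split expression from this commit into a standalone Python 3 script. Because the pattern has no `+` quantifier, each separator character is its own split point, so a run such as ' -;#' appears to yield one underscore per character rather than the single underscore the docstring describes. `_collapsing_RE` and `normalize_encoding_collapsing` are hypothetical names, not part of this commit, included only for contrast.

```python
import re

# Pattern as added by this commit.
_norm_encoding_RE = re.compile('[^a-zA-Z0-9.]')

def normalize_encoding(encoding):
    # Same expression as in the commit: split on separators, rejoin with '_'.
    return '_'.join(_norm_encoding_RE.split(encoding))

# Hypothetical variant (not in the commit): '+' makes a whole run of
# separator characters a single split point.
_collapsing_RE = re.compile('[^a-zA-Z0-9.]+')

def normalize_encoding_collapsing(encoding):
    return '_'.join(_collapsing_RE.split(encoding))

print(normalize_encoding('utf-8'))                   # utf_8
print(normalize_encoding('iso 8859-1'))              # iso_8859_1
print(normalize_encoding('mac.roman'))               # mac.roman (dot is kept)
print(normalize_encoding('latin -;#1'))              # latin____1
print(normalize_encoding_collapsing('latin -;#1'))   # latin_1
```

For names with single separators, which covers the common cases such as 'utf-8', the two variants give the same result; they differ only on runs of adjacent separator characters.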
@@ -51,7 +63,7 @@ def search_function(encoding):
     # encoding in the aliases mapping and retry the import using the
     # default import module lookup scheme with the alias name.
     #
-    modname = encoding.replace('-', '_')
+    modname = normalize_encoding(encoding)
     try:
         mod = __import__('encodings.' + modname,
                          globals(), locals(), _import_tail)
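The effect of this last hunk is that the module name passed to the import is now the normalized encoding name rather than the name with only hyphens replaced. A minimal Python 3 sketch of that lookup step follows. It uses importlib.import_module in place of the `__import__` call shown in the diff, and `find_codec_module` is a hypothetical helper that leaves out the caching and alias retry done by the real search_function.

```python
import importlib
import re

# Pattern and expression as added by this commit.
_norm_encoding_RE = re.compile('[^a-zA-Z0-9.]')

def normalize_encoding(encoding):
    return '_'.join(_norm_encoding_RE.split(encoding))

def find_codec_module(encoding):
    """Hypothetical helper: import encodings.<normalized name>, or return None."""
    modname = normalize_encoding(encoding)
    try:
        return importlib.import_module('encodings.' + modname)
    except ImportError:
        # No module of that name in the encodings package.
        return None

mod = find_codec_module('utf-8')
print(mod.__name__)                        # encodings.utf_8
print(find_codec_module('no such codec'))  # None
```

With the previous `encoding.replace('-', '_')`, a name such as 'iso 8859-1' would have kept its space and could not match a module file; after normalization it maps to 'iso_8859_1'.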