Commit 06bf3902, authored Oct 09, 2002 by Andrew M. Kuchling
Minor edits and markup fixes
Parent: 4248dcd1
Showing 1 changed file with 31 additions and 28 deletions
Doc/whatsnew/whatsnew23.tex (+31 -28)
...
...
@@ -316,24 +316,25 @@ Hisao and Martin von L\"owis.}
 \section{PEP 277: Unicode file name support for Windows NT}

 On Windows NT, 2000, and XP, the system stores file names as Unicode
-strings. Traditionally, Python has represented file names are byte
-strings, which is inadequate since it renders some file names
+strings. Traditionally, Python has represented file names as byte
+strings, which is inadequate because it renders some file names
 inaccessible.

-Python allows now to use arbitrary Unicode strings (within limitations
-of the file system) for all functions that expect file names, in
-particular \function{open}. If a Unicode string is passed to
-\function{os.listdir}, Python returns now a list of Unicode strings.
-A new function \function{getcwdu} returns the current directory as a
-Unicode string.
+Python now allows using arbitrary Unicode strings (within the
+limitations of the file system) for all functions that expect file
+names, in particular the \function{open()} built-in. If a Unicode
+string is passed to \function{os.listdir}, Python now returns a list
+of Unicode strings. A new function, \function{os.getcwdu()}, returns
+the current directory as a Unicode string.

-Byte strings continue to work as file names, the system will
-transparently convert them to Unicode using the \code{mbcs} encoding.
+Byte strings still work as file names, and Python will transparently
+convert them to Unicode using the \code{mbcs} encoding.

-Other systems allow Unicode strings as file names as well, but convert
-them to byte strings before passing them to the system, which may
-cause UnicodeErrors. Applications can test whether arbitrary Unicode
-strings are supported as file names with \code{os.path.unicode_file_names}.
+Other systems also allow Unicode strings as file names, but convert
+them to byte strings before passing them to the system which may cause
+a \exception{UnicodeError} to be raised. Applications can test whether
+arbitrary Unicode strings are supported as file names by checking
+\member{os.path.unicode_file_names}, a Boolean value.

 \begin{seealso}
...
...
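As a rough illustration of the behaviour the revised PEP 277 text describes, the sketch below uses present-day Python 3 rather than Python 2.3. In Python 3, `str` is the Unicode type, so `os.getcwd()` stands in for the `os.getcwdu()` named in the text, and the Boolean flag is spelled `os.path.supports_unicode_filenames` rather than the `os.path.unicode_file_names` name used in this revision. The helper `list_both_ways`, the temporary directory, and the file name `example.txt` are assumptions made for the demonstration.

```python
import os
import tempfile


def list_both_ways(directory):
    """Return (text_names, byte_names) for the same directory."""
    # A str (Unicode) path yields a list of str names.
    text_names = os.listdir(directory)
    # A bytes path yields a list of bytes names.
    byte_names = os.listdir(os.fsencode(directory))
    return text_names, byte_names


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file, created only for this demonstration.
    open(os.path.join(tmp, "example.txt"), "w").close()
    text_names, byte_names = list_both_ways(tmp)

# Boolean flag: True where arbitrary Unicode file names are supported.
unicode_names_ok = os.path.supports_unicode_filenames

# Python 3 counterpart of os.getcwdu(): the current directory as str.
cwd = os.getcwd()
```

The type of the argument passed to `os.listdir` decides the type of the names that come back, which is the same rule the diff text states for Unicode strings.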
@@ -493,31 +494,33 @@ strings \samp{True} and \samp{False} instead of \samp{1} and \samp{0}.
 \section{PEP 293: Codec Error Handling Callbacks}

 When encoding a Unicode string into a byte string, unencodable
-characters may be encountered. So far, Python allowed to specify the
-error processing as either ``strict'' (raise \code{UnicodeError},
-default), ``ignore'' (skip the character), or ``replace'' (with
-question mark). It may be desirable to specify an alternative
-processing of the error, e.g. by inserting an XML character reference
-or HTML entity reference into the converted string.
+characters may be encountered. So far, Python has allowed specifying
+the error processing as either ``strict'' (raising
+\exception{UnicodeError}), ``ignore'' (skip the character), or
+``replace'' (with question mark), defaulting to ``strict''. It may be
+desirable to specify an alternative processing of the error, e.g. by
+inserting an XML character reference or HTML entity reference into the
+converted string.

 Python now has a flexible framework to add additional processing
-strategies; new error handlers can be added with
+strategies. New error handlers can be added with
 \function{codecs.register_error}. Codecs then can access the error
-handler with \code{codecs.lookup_error}. An equivalent C API has been
-added for codecs written in C. The error handler gets various state
-information, such as the string being converted, the position in the
-string where the error was detected, and the target encoding. It can
-then either raise an exception, or return a replacement string.
+handler with \function{codecs.lookup_error}. An equivalent C API has
+been added for codecs written in C. The error handler gets the
+necessary state information, such as the string being converted, the
+position in the string where the error was detected, and the target
+encoding. The handler can then either raise an exception, or return a
+replacement string.

 Two additional error handlers have been implemented using this
-framework: ``backslashreplace'' using Python backslash quoting to
+framework: ``backslashreplace'' uses Python backslash quoting to
 represent the unencodable character, and ``xmlcharrefreplace'' emits
 XML character references.

 \begin{seealso}

 \seepep{293}{Codec Error Handling Callbacks}{Written and implemented by
-Walter Dörwald.}
+Walter D\"orwald.}

 \end{seealso}
...
...
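The error-handler framework described in the PEP 293 hunk can be sketched as follows, assuming present-day Python 3. The handler name `bracket-codepoint` and the function `bracket_codepoint` are hypothetical and exist only for this example. In current Python the handler returns a tuple of the replacement string and the position at which encoding should resume, which is slightly more than the "replacement string" the text mentions.

```python
import codecs


def bracket_codepoint(exc):
    """Replace each unencodable character with a marker such as [U+00E9]."""
    if not isinstance(exc, UnicodeEncodeError):
        raise exc  # this sketch only handles encoding errors
    # exc.object is the string being converted; exc.start and exc.end
    # delimit the span that could not be encoded.
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("[U+%04X]" % ord(ch) for ch in bad)
    # Return the replacement and the index where encoding resumes.
    return replacement, exc.end


# Register the hypothetical handler under a name usable as errors=...
codecs.register_error("bracket-codepoint", bracket_codepoint)

text = "caf\u00e9"
custom = text.encode("ascii", "bracket-codepoint")
backslash = text.encode("ascii", "backslashreplace")
xmlref = text.encode("ascii", "xmlcharrefreplace")

# Codecs retrieve a registered handler by name with lookup_error.
looked_up = codecs.lookup_error("bracket-codepoint")
```

The two built-in handlers named in the diff, `backslashreplace` and `xmlcharrefreplace`, are selected the same way as the custom one: by passing the registered name as the `errors` argument.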