Commit 49e31be2, authored Jun 02, 2013 by Stefan Behnel; committed by Stefan Behnel on Jul 14, 2013
add explicit section on Cython's Python string types
Parent: e510060a
Showing 1 changed file with 37 additions and 8 deletions
docs/src/tutorial/strings.rst
@@ -3,14 +3,43 @@
Unicode and passing strings
===========================
Similar to the string semantics in Python 3, Cython also strictly
separates byte strings and unicode strings. Above all, this means
that by default there is no automatic conversion between byte strings
and unicode strings (except for what Python 2 does in string operations).
All encoding and decoding must pass through an explicit encoding/decoding
step. For simple cases, the module-level ``c_string_type`` and
``c_string_encoding`` directives can be used to implicitly insert these
encoding/decoding steps to ease conversion between Python and C strings.
Similar to the string semantics in Python 3, Cython strictly separates
byte strings and unicode strings. Above all, this means that by default
there is no automatic conversion between byte strings and unicode strings
(except for what Python 2 does in string operations). All encoding and
decoding must pass through an explicit encoding/decoding step. To ease
conversion between Python and C strings in simple cases, the module-level
``c_string_type`` and ``c_string_encoding`` directives can be used to
implicitly insert these encoding/decoding steps.
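
The explicit encoding/decoding step described above can be illustrated in plain Python. This is only a minimal sketch: the helper names ``to_c_bytes`` and ``from_c_bytes`` are hypothetical and not part of Cython, and UTF-8 is an assumed encoding chosen for the example.

```python
# Hypothetical helpers (not part of Cython) that make the
# encoding/decoding step between text and bytes explicit.

def to_c_bytes(text, encoding="utf-8"):
    """Encode a Unicode string into a byte string, e.g. before passing it to C."""
    return text.encode(encoding)


def from_c_bytes(data, encoding="utf-8"):
    """Decode a byte string, e.g. one received from C, into a Unicode string."""
    return data.decode(encoding)


text = u"Gr\u00fc\u00dfe"
data = to_c_bytes(text)            # explicit encoding step
assert isinstance(data, bytes)
assert from_c_bytes(data) == text  # explicit decoding step
```

Nothing is converted implicitly here: the caller always names the direction and the encoding.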
Python string types in Cython code
----------------------------------
Cython supports three Python string types: :type:`bytes`, :type:`str`
and :type:`unicode`. The :type:`str` type is special in that it is the
byte string in Python 2 and the Unicode string in Python 3 (for Cython
code compiled with language level 2, i.e. the default). Thus, in Python
2, both :type:`bytes` and :type:`str` represent the byte string type,
whereas in Python 3, :type:`str` and :type:`unicode` represent the Python
Unicode string type. The switch is made at C compile time; the Python
version that is used to run Cython is not relevant.
When compiling Cython code with language level 3, the :type:`str` type
is identified with exactly the Unicode string type at Cython compile time,
i.e. it does not identify with :type:`bytes` when running in Python 2.
Note that the :type:`str` type is not compatible with the :type:`unicode`
type in Python 2, i.e. you cannot assign a Unicode string to a variable
or argument that is typed :type:`str`. The attempt will result in either
a compile time error (if detectable) or a ``TypeError`` exception at
runtime. You should therefore be careful when you statically type a
string variable in code that must be compatible with Python 2, as this
Python version allows a mix of byte strings and unicode strings for data
and users normally expect code to be able to work with both. Code that
only targets Python 3 can safely type variables and arguments as either
:type:`bytes` or :type:`unicode`.
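
Code that must accept both byte strings and Unicode strings, as users of Python 2 often expect, can leave the argument untyped and normalise it by hand. The following plain Python sketch shows one way to do that. The function name ``as_text`` is hypothetical, and the UTF-8 default is an assumption made for the example.

```python
# Hypothetical helper: accept either string kind and return Unicode text.
# The argument is deliberately left untyped so that both kinds are allowed.

def as_text(value, encoding="utf-8"):
    if isinstance(value, bytes):
        # Byte string (this includes Python 2 ``str``): decode explicitly.
        return value.decode(encoding)
    if isinstance(value, type(u"")):
        # Already a Unicode string: return it unchanged.
        return value
    raise TypeError("expected a byte string or Unicode string, got %s"
                    % type(value).__name__)
```

Rejecting every other type with a ``TypeError`` mirrors the runtime failure described above for a mistyped string argument.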
General notes about C strings
-----------------------------
...
...