Commit 87813c9f authored by scoder, committed by GitHub

Merge pull request #2389 from gabrieldemarmiesse/test_string_14

Adding tests for "Unicode and passing strings" part 14
parents 216f998e 3f5848fc
for_bytes.pyx:

cdef bytes bytes_string = b'hello world'

cdef char c
for c in bytes_string:
    if c == 'A':
        print("Found the letter A")

for_unicode.pyx:

cdef unicode ustring = u'Hello world'

# NOTE: no typing required for 'uchar' !
for uchar in ustring:
    if uchar == u'A':
        print("Found the letter A")

if_char_in.pyx:

cpdef void is_in(Py_UCS4 uchar_val):
    if uchar_val in u'abcABCxY':
        print("The character is in the string.")
    else:
        print("The character isn't in the string")
@@ -620,22 +620,14 @@ C code::
     for c in c_string[:100]:
         if c == 'A': ...
 
-The same applies to bytes objects::
+The same applies to bytes objects:
 
-    cdef bytes bytes_string = ...
-
-    cdef char c
-    for c in bytes_string:
-        if c == 'A': ...
+.. literalinclude:: ../../examples/tutorial/string/for_bytes.pyx
 
 For unicode objects, Cython will automatically infer the type of the
-loop variable as :c:type:`Py_UCS4`::
+loop variable as :c:type:`Py_UCS4`:
 
-    cdef unicode ustring = ...
-
-    # NOTE: no typing required for 'uchar' !
-    for uchar in ustring:
-        if uchar == u'A': ...
+.. literalinclude:: ../../examples/tutorial/string/for_unicode.pyx
 
 The automatic type inference usually leads to much more efficient code
 here. However, note that some unicode operations still require the
@@ -648,11 +640,9 @@ loop to enforce one-time coercion before running Python operations on
 it.
 
 There are also optimisations for ``in`` tests, so that the following
-code will run in plain C code, (actually using a switch statement)::
+code will run in plain C code, (actually using a switch statement):
 
-    cdef Py_UCS4 uchar_val = get_a_unicode_character()
-    if uchar_val in u'abcABCxY':
-        ...
+.. literalinclude:: ../../examples/tutorial/string/if_char_in.pyx
 
 Combined with the looping optimisation above, this can result in very
 efficient character switching code, e.g. in unicode parsers.
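
As a rough illustration of the "character switching" pattern that the documentation text in this diff describes, the sketch below combines a per-character loop with ``in`` tests against string literals. It is not part of the commit: the function name count_char_classes and the three character classes are hypothetical, chosen only for the example. It is written as plain Python so that it runs unmodified. Under Cython, the text in the diff suggests that typing the input as unicode is what would let the loop variable be inferred as Py_UCS4 and the ``in`` tests be compiled to C; that typing is deliberately left out here.

```python
def count_char_classes(text):
    """Count vowels, ASCII digits and all other characters in a string.

    Hypothetical helper for illustration only; not taken from the commit.
    """
    vowels = digits = other = 0
    for ch in text:                  # one character per iteration
        if ch in u'aeiouAEIOU':      # membership test against a literal
            vowels += 1
        elif ch in u'0123456789':
            digits += 1
        else:
            other += 1
    return vowels, digits, other


if __name__ == "__main__":
    print(count_char_classes(u"Hello world 42"))
```

The branches are kept as ``in`` tests on constant literals, mirroring if_char_in.pyx, because that is the form the documentation says Cython can optimise.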