Commit 9ec014ff authored by Xavier Thompson

README: Add section about strings

parent 388b9085
@@ -33,3 +33,28 @@ cd src
make helloworld
```
## GIL-free strings

So far, Cython+ has merely relied on C++'s `std::string` to represent strings. As it happens, writing our own GIL-free string implementation is one of our next steps.

As I started thinking about how to wrap the C socket API, I realised that it would be much easier with our own strings. In particular, C APIs often accept `NULL` as a valid value for `char *` string arguments, but there is no good way to represent `NULL` with `std::string`.

So I wrote a minimal string implementation by wrapping a `std::string` in a Cython+ GIL-free type. This has several advantages:
- Aliasing a string does not create a copy, unlike copying a `std::string`
- We can write our own methods, e.g. to mirror Python's strings
- We can enforce immutability if we want
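
The first and last points can be illustrated in plain C++. This is only a sketch and not the Cython+ implementation: it uses `std::shared_ptr<const std::string>` as a stand-in for a reference-counted heap object, and the names `StrRef`, `make_str`, `shares_buffer` and `alias_demo` are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a reference-counted, heap-allocated string:
// copying the handle copies a pointer, not the characters.
using StrRef = std::shared_ptr<const std::string>;

// Allocate the wrapped std::string once, on the heap.
inline StrRef make_str(std::string s) {
    return std::make_shared<const std::string>(std::move(s));
}

// True when both handles refer to the same underlying string object.
inline bool shares_buffer(const StrRef& a, const StrRef& b) {
    return a.get() == b.get();
}

// Small self-check: aliasing shares the object, copying the value does not.
inline void alias_demo() {
    StrRef a = make_str("hello");
    StrRef b = a;                  // alias: no character copy
    assert(shares_buffer(a, b));
    std::string c = *a;            // std::string copy owns its own storage
    assert(c == *a);
}
```

Because the pointee is `const`, the shared characters cannot be modified through any handle, which is one way to get immutability for free in this model.
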
Implemented string methods include:
- `find`
- `substr` (plays the role of Python's slice operator)
- `__add__` (string concatenation using `+`)
- `split`
- `join`
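
To make the intended behaviour of these methods concrete, here is a rough C++ sketch of a wrapper around `std::string`. It is not the Cython+ code: the class name `Str`, the signatures, and the Python-like conventions (`find` returning `-1`, `split` keeping empty fields, `join` called on the separator) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical immutable wrapper; not the Cython+ type itself.
class Str {
    std::string data_;
public:
    explicit Str(std::string s = "") : data_(std::move(s)) {}
    const std::string& str() const { return data_; }
    std::size_t size() const { return data_.size(); }

    // Index of the first occurrence of `needle`, or -1 (Python's convention).
    long find(const Str& needle) const {
        std::size_t pos = data_.find(needle.data_);
        return pos == std::string::npos ? -1 : static_cast<long>(pos);
    }

    // Half-open slice [start, stop), clamped to the string bounds.
    Str substr(std::size_t start, std::size_t stop) const {
        if (stop > data_.size()) stop = data_.size();
        if (start >= stop) return Str();
        return Str(data_.substr(start, stop - start));
    }

    // Concatenation returns a new string; neither operand is modified.
    Str operator+(const Str& other) const { return Str(data_ + other.data_); }

    // Split on a non-empty separator, keeping empty fields.
    std::vector<Str> split(const Str& sep) const {
        assert(!sep.data_.empty());
        std::vector<Str> parts;
        std::size_t begin = 0;
        for (;;) {
            std::size_t end = data_.find(sep.data_, begin);
            if (end == std::string::npos) break;
            parts.push_back(Str(data_.substr(begin, end - begin)));
            begin = end + sep.data_.size();
        }
        parts.push_back(Str(data_.substr(begin)));  // trailing field
        return parts;
    }

    // Join `parts`, using this string as the separator.
    Str join(const std::vector<Str>& parts) const {
        std::string out;
        for (std::size_t i = 0; i < parts.size(); ++i) {
            if (i != 0) out += data_;
            out += parts[i].data_;
        }
        return Str(std::move(out));
    }
};
```

With these conventions, `Str("a,b,,c").split(Str(","))` yields four parts (the third one empty), and joining them with `Str("-")` gives `a-b--c`.
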

Since Cython+ GIL-free types are allocated on the heap and referenced by pointer, `NULL` is a valid value for a string.
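
The following C++ sketch shows why a pointer-based string maps naturally onto nullable `char *` arguments. The helper names `name_length_or_zero`, `c_str_or_null` and `null_demo` are hypothetical, and a raw `const std::string *` stands in for a Cython+ object reference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical C-style function: a null `name` means "no name given",
// in the way getaddrinfo() accepts a null node or service argument.
inline std::size_t name_length_or_zero(const char* name) {
    return name == nullptr ? 0 : std::strlen(name);
}

// Convert a nullable pointer-to-string into a nullable `const char *`.
// A plain std::string cannot express the null case: even an empty
// std::string yields a valid, non-null c_str().
inline const char* c_str_or_null(const std::string* s) {
    return s == nullptr ? nullptr : s->c_str();
}

// Small self-check covering both the non-null and the null case.
inline void null_demo() {
    std::string host = "localhost";
    assert(name_length_or_zero(c_str_or_null(&host)) == 9);
    assert(name_length_or_zero(c_str_or_null(nullptr)) == 0);
}
```
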

Initially, I wanted to avoid `std::string` because it incurs unneeded but unavoidable zeroing or copying overhead when interacting with C I/O APIs. This is explained very well in [the `basic_string::resize_and_overwrite` proposal for C++23](http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2021/p1072r7.html). So the first version used `std::string_view` (C++17) with hand-crafted memory management using `malloc` and `free`. But that was incompatible with some C++ string libraries such as the excellent [fmt](https://fmt.dev/), which is why I went back to using `std::string` as the backend.

That initial implementation remains available in the `string` branch.
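
For readers unfamiliar with the overhead in question, here is a small C++17 sketch contrasting the two approaches. It is not the code from the `string` branch: `read_into_string`, `RawStr` and `read_into_raw` are hypothetical names, and `std::fread` stands in for any C I/O call that fills a caller-provided buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <string_view>

// Approach 1: std::string as the destination buffer.
// resize(n) writes n zero characters before fread overwrites them,
// so the buffer is written twice.
inline std::string read_into_string(std::FILE* f, std::size_t n) {
    assert(f != nullptr);
    std::string buf;
    buf.resize(n);                                      // zero-fills n bytes
    std::size_t got = std::fread(buf.data(), 1, n, f);  // overwrites them
    buf.resize(got);                                    // keep what was read
    return buf;
}

// Approach 2: uninitialized malloc'd storage exposed as a string_view.
// Hypothetical owner type; copying is disabled to avoid a double free.
struct RawStr {
    char* ptr = nullptr;
    std::size_t len = 0;
    RawStr() = default;
    RawStr(const RawStr&) = delete;
    RawStr& operator=(const RawStr&) = delete;
    RawStr(RawStr&& o) noexcept : ptr(o.ptr), len(o.len) {
        o.ptr = nullptr;
        o.len = 0;
    }
    ~RawStr() { std::free(ptr); }
    std::string_view view() const { return std::string_view(ptr, len); }
};

inline RawStr read_into_raw(std::FILE* f, std::size_t n) {
    assert(f != nullptr);
    RawStr s;
    s.ptr = static_cast<char*>(std::malloc(n > 0 ? n : 1));  // no zeroing
    if (s.ptr != nullptr) s.len = std::fread(s.ptr, 1, n, f);
    return s;
}
```

The second variant avoids the redundant zero-fill, but its result is a `std::string_view` over manually managed memory rather than a `std::string`, which is the kind of mismatch with other string libraries described above.
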