===================================================
 Pygolang - Go-like features for Python and Cython
===================================================

Package `golang` provides Go-like features for Python:

- `gpython` is a Python interpreter with support for lightweight threads.
- `go` spawns a lightweight thread.
- `chan` and `select` provide channels with Go semantics.
- `func` allows methods to be defined separately from their class.
- `defer` allows scheduling a cleanup from the main control flow.
- `error` and package `errors` provide error chaining.
- `b` and `u` provide a way to make sure an object is either bytes or unicode.
- `gimport` allows importing Python modules by full path in a Go workspace.

Package `golang.pyx` provides__ similar features for Cython/nogil.

__ `Cython/nogil API`_

Additional packages and utilities are also provided__ to close other gaps
between Python/Cython and Go environments.

__ `Additional packages and utilities`_

.. contents::
   :depth: 1


GPython
-------

Command `gpython` provides a Python interpreter that supports lightweight threads
via tight integration with gevent__. The standard library of GPython is API
compatible with the Python standard library, but in place of OS threads lightweight
coroutines are provided, and IO is internally organized via a
libuv__/libev__-based IO scheduler. Consequently programs can spawn lots of
coroutines cheaply, and modules like `time`, `socket`, `ssl`, `subprocess` etc.
can all be used from all coroutines simultaneously, and in the same blocking way
as if every coroutine were a full OS thread. This makes it possible to scale programs
without changing the concurrency model or existing code.

__ http://www.gevent.org/
__ http://libuv.org/
__ http://software.schmorp.de/pkg/libev.html


Additionally GPython sets UTF-8 to be default encoding always, and puts `go`,
`chan`, `select` etc into builtin namespace.

.. note::

   GPython is optional and the rest of Pygolang can be used from under standard Python too.
   However without gevent integration `go` spawns full - not lightweight - OS thread.


Goroutines and channels
-----------------------

`go` spawns a coroutine, or a thread if gevent was not activated. It is possible to
exchange data between either threads or coroutines via channels. `chan`
creates a new channel with Go semantics - either synchronous or buffered. Use
`chan.recv`, `chan.send` and `chan.close` for communication. `nilchan`
stands for the nil channel. `select` can be used to multiplex on several
channels. For example::

    ch1 = chan()    # synchronous channel
    ch2 = chan(3)   # channel with buffer of size 3

    def _():
        ch1.send('a')
        ch2.send('b')
    go(_)

    ch1.recv()      # will give 'a'
    ch2.recv_()     # will give ('b', True)

    ch2 = nilchan   # rebind ch2 to nil channel
    _, _rx = select(
        ch1.recv,           # 0
        ch1.recv_,          # 1
        (ch1.send, obj),    # 2
        ch2.recv,           # 3
        default,            # 4
    )
    if _ == 0:
        # _rx is what was received from ch1
        ...
    if _ == 1:
        # _rx is (rx, ok) of what was received from ch1
        ...
    if _ == 2:
        # we know obj was sent to ch1
        ...
    if _ == 3:
        # this case will never be selected because
        # send/recv on nil channel block forever.
        ...
    if _ == 4:
        # default case
        ...

By default `chan` creates new channel that can carry arbitrary Python objects.
However type of channel elements can be specified via `chan(dtype=X)` - for
example `chan(dtype='C.int')` creates new channel whose elements are C
integers. `chan.nil(X)` creates typed nil channel. `Cython/nogil API`_
explains how channels with non-Python dtypes, besides in-Python usage, can be
additionally used for interaction in between Python and nogil worlds.


Methods
-------

The `func` decorator allows methods to be defined separately from their class.

For example::

  @func(MyClass)
  def my_method(self, ...):
      ...

will define `MyClass.my_method()`.

`func` can also be used on plain functions, for example::

  @func
  def my_function(...):
      ...

Defer / recover / panic
-----------------------

`defer` schedules a cleanup to be executed when the current function
returns. It is similar to `try`/`finally` but does not force the cleanup part
to sit far away at the end. For example::

   wc = wcfs.join(zurl)    │     wc = wcfs.join(zurl)
   defer(wc.close)         │     try:
                           │        ...
   ...                     │        ...
   ...                     │        ...
   ...                     │     finally:
                           │        wc.close()

If a deferred cleanup fails, a previously unhandled exception, if any, is not
lost: it is chained with the new one (`PEP 3134`__) and included in the traceback
dump, even on Python2.

__ https://www.python.org/dev/peps/pep-3134/

For completeness there are `recover` and `panic`, which allow programming with
Go-style error handling, for example::

   def _():
      r = recover()
      if r is not None:
         print("recovered. error was: %s" % (r,))
   defer(_)

   ...

   panic("aaa")

But `recover` and `panic` are probably of less utility since they can be
practically natively modelled with `try`/`except`.

If `defer` is used, the function that uses it must be wrapped with `@func`
decorator.
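
For readers without Pygolang at hand, the shape of `defer` (cleanup registered
right next to the acquisition, run when the function leaves) can be
approximated with the standard library. The sketch below uses
`contextlib.ExitStack` as a stand-in; it is an analogy only, not how `defer` is
implemented, and `Resource`, `work` and `log` are hypothetical names introduced
for illustration.

```python
from contextlib import ExitStack

log = []  # records the order of events, for illustration only


class Resource:
    """Hypothetical resource that must be closed after use."""

    def __init__(self, name):
        self.name = name
        log.append("open " + name)

    def close(self):
        log.append("close " + self.name)


def work():
    with ExitStack() as deferred:
        a = Resource("a")
        deferred.callback(a.close)  # cleanup registered right after acquisition
        b = Resource("b")
        deferred.callback(b.close)
        log.append("working")
    # On leaving the with block the callbacks have run in reverse (LIFO) order.


work()
print(log)
```

As with Go's `defer`, the callbacks run last-in first-out, so `b` is closed
before `a`.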

Errors
------

In concurrent systems the operational stack generally differs from the execution
code flow, which makes code stack traces significantly less useful for
understanding an error. Pygolang provides support for error chaining that makes
it possible to build an operational error stack and to inspect the resulting errors:

`error` is an error type that can be used by itself or subclassed. By
providing an `.Unwrap()` method, an error can optionally wrap another error,
thus forming an error chain. `errors.Is` reports whether an item in the error
chain matches a target. `fmt.Errorf` provides a handy way to build wrapping errors.
For example::

   e1 = error("problem")
   e2 = fmt.Errorf("doing something for %s: %w", "joe", e1)
   print(e2)         # prints "doing something for joe: problem"
   errors.Is(e2, e1) # gives True

   # OpError is an example class that represents an error of operation op(path).
   class OpError(error):
      def __init__(e, op, path, err):
         e.op   = op
         e.path = path
         e.err  = err

      # .Error() should be used to define what error's string is.
      # it is automatically used by error to also provide both .__str__ and .__repr__.
      def Error(e):
         return "%s %s: %s" % (e.op, e.path, e.err)

      # provided .Unwrap() indicates that this error is chained.
      def Unwrap(e):
         return e.err

   mye = OpError("read", "file.txt", io.ErrUnexpectedEOF)
   print(mye)                          # prints "read file.txt: unexpected EOF"
   errors.Is(mye, io.EOF)              # gives False
   errors.Is(mye, io.ErrUnexpectedEOF) # gives True

Both wrapped and wrapping error can be of arbitrary Python type - not
necessarily of `error` or its subclass.

`error` is also used to represent at Python level an error returned by
Cython/nogil call (see `Cython/nogil API`_) and preserves Cython/nogil error
chain for inspection at Python level.

Pygolang error chaining integrates with Python error chaining and takes
`.__cause__` attribute into account for exception created via `raise X from Y`
(`PEP 3134`__).
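
The Python-side chaining mentioned here can be seen with the standard library
alone. The following is a minimal sketch of what `raise X from Y` records; it
does not use Pygolang, and `cause_chain` and `read_config` are hypothetical
helpers written only for this illustration.

```python
def cause_chain(exc):
    """Return [exc, exc.__cause__, ...] following PEP 3134 __cause__ links."""
    chain = []
    seen = set()
    while exc is not None and id(exc) not in seen:  # guard against cycles
        seen.add(id(exc))
        chain.append(exc)
        exc = exc.__cause__
    return chain


def read_config():
    try:
        raise EOFError("unexpected EOF")
    except EOFError as e:
        # `from e` stores e in the new exception's __cause__ attribute
        raise RuntimeError("read config.txt") from e


try:
    read_config()
except RuntimeError as err:
    chain = cause_chain(err)

print(" <- ".join(type(e).__name__ for e in chain))
```

Walking `__cause__` this way is the kind of link that, per the paragraph above,
Pygolang takes into account when inspecting an error chain.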

__ https://www.python.org/dev/peps/pep-3134/


Strings
-------

`b` and `u` provide a way to make sure an object is either bytes or unicode.
`b(obj)` converts a str/unicode/bytes obj to a UTF-8 encoded bytestring, while
`u(obj)` converts a str/unicode/bytes obj to a unicode string. For example::

   b("привет мир")   # -> gives bytes corresponding to UTF-8 encoding of "привет мир".

   def f(s):
      s = u(s)       # make sure s is unicode, decoding as UTF-8(*) if it was bytes.
      ...            # (*) but see below about lack of decode errors.

The conversion, in both encoding and decoding, never fails and never loses
information: `b(u(·))` and `u(b(·))` are always the identity for bytes and unicode
respectively, even if the bytes input is not valid UTF-8.
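
To get a feel for what a lossless round trip over invalid UTF-8 looks like, here
is a sketch that uses the Python 3 standard `surrogateescape` error handler. It
is only an analogy for the bytes direction and makes no claim about how `b` and
`u` are implemented; `to_text` and `to_bytes` are hypothetical names.

```python
def to_text(data: bytes) -> str:
    # Undecodable bytes become lone surrogates instead of raising an error.
    return data.decode("utf-8", errors="surrogateescape")


def to_bytes(text: str) -> bytes:
    # Lone surrogates produced by to_text are turned back into the original bytes.
    return text.encode("utf-8", errors="surrogateescape")


raw = b"caf\xc3\xa9 \xff\xfe"  # valid UTF-8 followed by two invalid bytes
text = to_text(raw)
assert to_bytes(text) == raw   # the round trip preserves every byte
```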


Import
------

`gimport` provides a way to import Python modules by full path in a Go workspace.

For example::

    lonet = gimport('lab.nexedi.com/kirr/go123/xnet/lonet')

will import either

- `lab.nexedi.com/kirr/go123/xnet/lonet.py`, or
- `lab.nexedi.com/kirr/go123/xnet/lonet/__init__.py`

located in `src/` under `$GOPATH`.

Cython/nogil API
----------------

Cython package `golang` provides a *nogil* API with goroutines, channels and
other features that mirror the corresponding Python package. The Cython API is
not only faster than the Python version, but also, due to its *nogil* property,
allows building concurrent systems without the limitations imposed by Python's
GIL, all while still programming in a Python-like language. A brief description
of the Cython/nogil API follows:

`go` spawns a new task - a coroutine or a thread, depending on the activated runtime.
`chan[T]` represents a channel with Go semantics and elements of type `T`.
Use `makechan[T]` to create a new channel, and `chan[T].recv`, `chan[T].send`,
`chan[T].close` for communication. `nil` stands for the nil channel. `select`
can be used to multiplex on several channels. For example::

   cdef nogil:
      struct Point:
         int x
         int y

      void worker(chan[int] chi, chan[Point] chp):
         chi.send(1)

         cdef Point p
         p.x = 3
         p.y = 4
         chp.send(p)

      void myfunc():
         cdef chan[int]   chi = makechan[int]()       # synchronous channel of integers
         cdef chan[Point] chp = makechan[Point](3)    # channel with buffer of size 3 and Point elements

         go(worker, chi, chp)

         i = chi.recv()    # will give 1
         p = chp.recv()    # will give Point(3,4)

         chp = nil         # rebind chp to nil channel
         cdef cbool ok
         cdef int j = 33
         _ = select([
             chi.recvs(&i),         # 0
             chi.recvs(&i, &ok),    # 1
             chi.sends(&j),         # 2
             chp.recvs(&p),         # 3
             default,               # 4
         ])
         if _ == 0:
             # i is what was received from chi
             ...
         if _ == 1:
             # (i, ok) is what was received from chi
             ...
         if _ == 2:
             # we know j was sent to chi
             ...
         if _ == 3:
             # this case will never be selected because
             # send/recv on nil channel block forever.
             ...
         if _ == 4:
             # default case
             ...

Python channels are represented by the `pychan` cdef class. Python
channels that carry non-Python elements (`pychan.dtype != DTYPE_PYOBJECT`) can
be converted to Cython/nogil `chan[T]` via `pychan.chan_*()`.
Similarly, a Cython/nogil `chan[T]` can be wrapped into a `pychan` via
`pychan.from_chan_*()`. This provides a mechanism for interaction
between the *nogil* and Python worlds. For example::

   def myfunc(pychan pych):
      if pych.dtype != DTYPE_INT:
         raise TypeError("expected chan[int]")

      cdef chan[int] ch = pych.chan_int()  # pychan -> chan[int]
      with nogil:
         # use ch in nogil code. Both Python and nogil parts can
         # send/receive on the channel simultaneously.
         ...

   def mytick(): # -> pychan
      cdef chan[int] ch
      with nogil:
         # create a channel that is connected to some nogil task of the program
         ch = ...

      # wrap the channel into pychan. Both Python and nogil parts can
      # send/receive on the channel simultaneously.
      cdef pychan pych = pychan.from_chan_int(ch)  # pychan <- chan[int]
      return pych

`error` is the interface that represents errors. `errors.New` and `fmt.errorf`
provide a way to build errors from text. An error can optionally wrap another
error by implementing the `errorWrapper` interface and providing an `.Unwrap()` method.
`errors.Is` reports whether an item in the error chain matches a target. `fmt.errorf`
with the `%w` specifier provides a handy way to build wrapping errors. For example::

   e1 = errors.New("problem")
   e2 = fmt.errorf("doing something for %s: %w", "joe", e1)
   e2.Error()        # gives "doing something for joe: problem"
   errors.Is(e2, e1) # gives True

An `error` can be exposed to Python via the `pyerror` cdef class wrapper
instantiated by `pyerror.from_error()`. `pyerror` preserves the Cython/nogil error
chain for inspection by Python-level `errors.Is`.


`panic` stops normal execution of current goroutine by throwing a C-level
exception. On Python/C boundaries C-level exceptions have to be converted to
Python-level exceptions with `topyexc`. For example::

   cdef void _do_something() nogil:
      ...
      panic("bug")   # hit a bug

   # do_something is called by Python code - it is thus on Python/C boundary
   cdef void do_something() nogil except +topyexc:
      _do_something()

   def pydo_something():
      with nogil:
         do_something()


See |libgolang.h|_ and |golang.pxd|_ for details of the API.
See also |testprog/golang_pyx_user/|_ for demo project that uses Pygolang in
Cython/nogil mode.

.. |libgolang.h| replace:: `libgolang.h`
.. _libgolang.h: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/libgolang.h

.. |golang.pxd| replace:: `golang.pxd`
.. _golang.pxd: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/_golang.pxd

.. |testprog/golang_pyx_user/| replace:: `testprog/golang_pyx_user/`
.. _testprog/golang_pyx_user/: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/pyx/testprog/golang_pyx_user

--------

Additional packages and utilities
---------------------------------

The following additional packages and utilities are also provided to close gaps
between Python/Cython and Go environments:

.. contents::
   :local:

Concurrency
~~~~~~~~~~~

In addition to `go` and channels, the following packages are provided to help
handle concurrency in structured ways:

- |golang.context|_ (py__, pyx__) provides contexts to propagate deadlines, cancellation and
  task-scoped values among spawned goroutines [*]_.

  .. |golang.context| replace:: `golang.context`
  .. _golang.context: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/context.h
  __ https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/context.py
  __ https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/_context.pxd

- |golang.sync|_ (py__, pyx__) provides `sync.WorkGroup` to spawn group of goroutines working
  on a common task. It also provides low-level primitives - for example
  `sync.Once`, `sync.WaitGroup`, `sync.Mutex` and `sync.RWMutex` - that are
  sometimes useful too.

  .. |golang.sync| replace:: `golang.sync`
  .. _golang.sync: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/sync.h
  __ https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/sync.py
  __ https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/_sync.pxd

- |golang.time|_ (py__, pyx__) provides timers integrated with channels.

  .. |golang.time| replace:: `golang.time`
  .. _golang.time: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/time.h
  __ https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/time.py
  __ https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/_time.pxd

.. [*] See `Go Concurrency Patterns: Context`__ for overview.

__ https://blog.golang.org/context


String conversion
~~~~~~~~~~~~~~~~~

`qq` (import from `golang.gcompat`) provides `%q` functionality that quotes as
Go would do. For example the following code will print name quoted in `"`
without escaping printable UTF-8 characters::

   print('hello %s' % qq(name))

`qq` accepts both `str` and `bytes` (`unicode` and `str` on Python2)
and also any other type that can be converted to `str`.

Package |golang.strconv|_ provides direct access to conversion routines, for
example `strconv.quote` and `strconv.unquote`.

.. |golang.strconv| replace:: `golang.strconv`
.. _golang.strconv: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/strconv.py


Benchmarking and testing
~~~~~~~~~~~~~~~~~~~~~~~~

`py.bench` allows benchmarking Python code similarly to `go test -bench` and `py.test`.
For example, running `py.bench` on the following code::

    def bench_add(b):
        x, y = 1, 2
        for i in xrange(b.N):
            x + y

gives something like::

    $ py.bench --count=3 bench_add.py
    ...
    pymod: bench_add.py
    Benchmarkadd    50000000        0.020 µs/op
    Benchmarkadd    50000000        0.020 µs/op
    Benchmarkadd    50000000        0.020 µs/op

Package |golang.testing|_ provides corresponding runtime bits, e.g. `testing.B`.

`py.bench` produces output in `Go benchmark format`__, and so benchmark results
can be analyzed and compared with standard Go tools, for example with
`benchstat`__.
Additionally package |golang.x.perf.benchlib|_ can be used to load and process
such benchmarking data in Python.
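
To illustrate what processing such data involves, here is a minimal
standard-library sketch that parses result lines of the shape shown in the
sample output above (name, iteration count, then value/unit pairs). It is not
the `benchlib` API; `parse_bench_line` and the regular expression are
assumptions made for this example, and the real format has more features than
are handled here.

```python
import re

# One result line: benchmark name, iteration count, then value/unit pairs.
_BENCH_RE = re.compile(r"^(Benchmark\S+)\s+(\d+)\s+(.*)$")


def parse_bench_line(line):
    """Return (name, niter, {unit: value}) or None for non-result lines."""
    m = _BENCH_RE.match(line.strip())
    if m is None:
        return None
    name, niter, rest = m.group(1), int(m.group(2)), m.group(3).split()
    if len(rest) % 2 != 0:
        return None  # values and units must come in pairs
    measurements = {unit: float(value)
                    for value, unit in zip(rest[0::2], rest[1::2])}
    return name, niter, measurements


output = """\
pymod: bench_add.py
Benchmarkadd    50000000        0.020 µs/op
Benchmarkadd    50000000        0.020 µs/op
"""

# Keep only the lines that parsed as benchmark results.
results = [r for r in map(parse_bench_line, output.splitlines()) if r is not None]
print(results[0])
```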

.. |golang.testing| replace:: `golang.testing`
.. _golang.testing: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/testing.py
.. |golang.x.perf.benchlib| replace:: `golang.x.perf.benchlib`
.. _golang.x.perf.benchlib: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/x/perf/benchlib.py
__ https://github.com/golang/proposal/blob/master/design/14313-benchmark-format.md
__ https://godoc.org/golang.org/x/perf/cmd/benchstat