===================================================
 Pygolang - Go-like features for Python and Cython
===================================================

Package `golang` provides Go-like features for Python:

- `gpython` is a Python interpreter with support for lightweight threads.
- `go` spawns a lightweight thread.
- `chan` and `select` provide channels with Go semantics.
- `func` allows defining methods separately from their class.
- `defer` allows scheduling a cleanup from the main control flow.
- `error` and package `errors` provide error chaining.
- `b` and `u` provide a way to make sure an object is either bytes or unicode.
- `gimport` allows importing Python modules by full path in a Go workspace.

Package `golang.pyx` provides__ similar features for Cython/nogil.

__ `Cython/nogil API`_

Additional packages and utilities are also provided__ to close other gaps
between Python/Cython and Go environments.

__ `Additional packages and utilities`_

.. contents::
   :depth: 1


GPython
-------

Command `gpython` provides a Python interpreter that supports lightweight threads
via tight integration with gevent__. The standard library of GPython is
API-compatible with the Python standard library, but in place of OS threads
lightweight coroutines are provided, and IO is internally organized via a
libuv__/libev__-based IO scheduler. Consequently programs can spawn lots of
coroutines cheaply, and modules like `time`, `socket`, `ssl`, `subprocess` etc.
can all be used from all coroutines simultaneously, and in the same blocking way
as if every coroutine were a full OS thread. This makes it possible to scale
programs without changing the concurrency model or existing code.

__ http://www.gevent.org/
__ http://libuv.org/
__ http://software.schmorp.de/pkg/libev.html


Additionally GPython sets UTF-8 to be default encoding always, and puts `go`,
`chan`, `select` etc into builtin namespace.

.. note::

   GPython is optional and the rest of Pygolang can be used from under standard Python too.
   However without gevent integration `go` spawns a full - not lightweight - OS thread.
   GPython can also be used with the threads - not gevent - runtime. Please see
   `GPython options`_ for details.


Goroutines and channels
-----------------------

`go` spawns a coroutine, or a thread if gevent was not activated. It is possible to
exchange data between either threads or coroutines via channels. `chan`
creates a new channel with Go semantics - either synchronous or buffered. Use
`chan.recv`, `chan.send` and `chan.close` for communication. `nilchan`
stands for the nil channel. `select` can be used to multiplex on several
channels. For example::

    ch1 = chan()    # synchronous channel
    ch2 = chan(3)   # channel with buffer of size 3

    def _():
        ch1.send('a')
        ch2.send('b')
    go(_)

    ch1.recv()      # will give 'a'
    ch2.recv_()     # will give ('b', True)

    ch2 = nilchan   # rebind ch2 to nil channel
    _, _rx = select(
        ch1.recv,           # 0
        ch1.recv_,          # 1
        (ch1.send, obj),    # 2
        ch2.recv,           # 3
        default,            # 4
    )
    if _ == 0:
        # _rx is what was received from ch1
        ...
    if _ == 1:
        # _rx is (rx, ok) of what was received from ch1
        ...
    if _ == 2:
        # we know obj was sent to ch1
        ...
    if _ == 3:
        # this case will never be selected because
        # send/recv on nil channel block forever.
        ...
    if _ == 4:
        # default case
        ...

By default `chan` creates a new channel that can carry arbitrary Python objects.
However the type of channel elements can be specified via `chan(dtype=X)` - for
example `chan(dtype='C.int')` creates a new channel whose elements are C
integers. `chan.nil(X)` creates a typed nil channel. `Cython/nogil API`_
explains how channels with non-Python dtypes, besides in-Python usage, can
additionally be used for interaction between Python and nogil worlds.

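For readers more familiar with the standard library, a buffered channel behaves
roughly like a bounded `queue.Queue` shared between threads. The sketch below
uses only `queue` and `threading`, not Pygolang; the `None` sentinel is an
assumption of this sketch standing in for `chan.close`, and the sketch does not
attempt to model synchronous channels or `select`.

```python
import queue
import threading

# Rough stdlib analogue of chan(3): at most 3 items may be queued at once.
buf = queue.Queue(maxsize=3)

def worker():
    for item in ("a", "b", "c"):
        buf.put(item)      # blocks only while 3 items are already queued
    buf.put(None)          # sentinel (assumption of this sketch) marking the end

t = threading.Thread(target=worker)
t.start()

received = []
while True:
    item = buf.get()       # blocks until the worker has put something
    if item is None:
        break
    received.append(item)
t.join()

print(received)            # ['a', 'b', 'c']
```

With `chan` the end of the stream would be signalled by `chan.close` rather
than by an in-band sentinel value.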

Methods
-------

`func` decorator allows defining methods separately from their class.

For example::

  @func(MyClass)
  def my_method(self, ...):
      ...

will define `MyClass.my_method()`.

`func` can also be used on plain functions, for example::

  @func
  def my_function(...):
      ...

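To make the idea of defining a method outside its class concrete, here is a
minimal pure-Python sketch. The `attach_method` decorator is hypothetical and
is not how `func` is implemented; it only illustrates the general mechanism of
installing a function on an existing class.

```python
def attach_method(cls):
    """Return a decorator that installs the decorated function on cls.

    Hypothetical helper for illustration only; Pygolang's `func` does more.
    """
    def decorator(f):
        setattr(cls, f.__name__, f)
        return f
    return decorator


class MyClass(object):
    def __init__(self, name):
        self.name = name


@attach_method(MyClass)
def greet(self):
    return "hello from %s" % self.name


print(MyClass("x").greet())    # hello from x
```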

Defer / recover / panic
-----------------------

`defer` allows scheduling a cleanup to be executed when the current function
returns. It is similar to `try`/`finally` but does not force the cleanup part
to be far away at the end. For example::

   wc = wcfs.join(zurl)    │     wc = wcfs.join(zurl)
   defer(wc.close)         │     try:
                           │        ...
   ...                     │        ...
   ...                     │        ...
   ...                     │     finally:
                           │        wc.close()

If a deferred cleanup fails, the previously unhandled exception, if any, won't be
lost - it will be chained with the cleanup's exception (`PEP 3134`__) and
included in the traceback dump even on Python2.

__ https://www.python.org/dev/peps/pep-3134/

For completeness there are `recover` and `panic` that allow programming with
Go-style error handling, for example::

   def _():
      r = recover()
      if r is not None:
         print("recovered. error was: %s" % (r,))
   defer(_)

   ...

   panic("aaa")

But `recover` and `panic` are probably of less utility since they can be
practically natively modelled with `try`/`except`.

If `defer` is used, the function that uses it must be wrapped with the `@func`
decorator.

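The idea of registering a cleanup right next to the acquisition can also be
approximated with the standard library. The following sketch uses
`contextlib.ExitStack` rather than Pygolang's `defer`; the `process` function
and its log messages are made up for illustration. Like Go's deferred calls,
the registered callbacks run in last-in, first-out order.

```python
import contextlib


def process(log):
    with contextlib.ExitStack() as stack:
        log.append("open a")
        stack.callback(log.append, "close a")   # cleanup registered next to acquisition
        log.append("open b")
        stack.callback(log.append, "close b")
        log.append("work")
    # callbacks have run here, in reverse order of registration
    return log


print(process([]))
# ['open a', 'open b', 'work', 'close b', 'close a']
```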

Errors
------

In concurrent systems the operational stack generally differs from the execution
code flow, which makes code stack traces significantly less useful for
understanding an error. Pygolang provides support for error chaining that makes
it possible to build an operational error stack and to inspect resulting errors:

`error` is an error type that can be used by itself or subclassed. By
providing an `.Unwrap()` method, an error can optionally wrap another error,
this way forming an error chain. `errors.Is` reports whether an item in the
error chain matches a target. `fmt.Errorf` provides a handy way to build
wrapping errors. For example::

   e1 = error("problem")
   e2 = fmt.Errorf("doing something for %s: %w", "joe", e1)
   print(e2)         # prints "doing something for joe: problem"
   errors.Is(e2, e1) # gives True

   # OpError is an example class that represents an error of operation op(path).
   class OpError(error):
      def __init__(e, op, path, err):
         e.op   = op
         e.path = path
         e.err  = err

      # .Error() should be used to define what error's string is.
      # it is automatically used by error to also provide both .__str__ and .__repr__.
      def Error(e):
         return "%s %s: %s" % (e.op, e.path, e.err)

      # provided .Unwrap() indicates that this error is chained.
      def Unwrap(e):
         return e.err

   mye = OpError("read", "file.txt", io.ErrUnexpectedEOF)
   print(mye)                          # prints "read file.txt: unexpected EOF"
   errors.Is(mye, io.EOF)              # gives False
   errors.Is(mye, io.ErrUnexpectedEOF) # gives True

Both wrapped and wrapping error can be of arbitrary Python type - not
necessarily of `error` or its subclass.

`error` is also used to represent at Python level an error returned by
Cython/nogil call (see `Cython/nogil API`_) and preserves Cython/nogil error
chain for inspection at Python level.

Pygolang error chaining integrates with Python error chaining and takes
`.__cause__` attribute into account for exception created via `raise X from Y`
(`PEP 3134`__).

__ https://www.python.org/dev/peps/pep-3134/

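As a rough mental model of chain inspection, the sketch below walks an error
chain by following either `.Unwrap()` or `.__cause__`. It is a simplified
illustration in plain Python 3, not Pygolang's implementation: the helpers
`unwrap` and `chain_contains` are hypothetical, and matching is done by
identity only, which may differ from the rules `errors.Is` applies.

```python
def unwrap(err):
    """Return the next error in the chain, or None (hypothetical helper)."""
    if hasattr(err, "Unwrap"):
        return err.Unwrap()
    return getattr(err, "__cause__", None)


def chain_contains(err, target):
    """Report whether target appears in err's chain (identity match only)."""
    while err is not None:
        if err is target:
            return True
        err = unwrap(err)
    return False


class OpError(Exception):
    def __init__(self, op, err):
        super().__init__("%s: %s" % (op, err))
        self.err = err

    def Unwrap(self):
        return self.err


eof = Exception("unexpected EOF")
other = Exception("other")

# chain formed via .Unwrap()
wrapped = OpError("read", eof)
print(chain_contains(wrapped, eof))     # True
print(chain_contains(wrapped, other))   # False

# chain formed via `raise X from Y`
try:
    try:
        raise eof
    except Exception as exc:
        raise RuntimeError("wrapper") from exc
except RuntimeError as r:
    caught = r
print(chain_contains(caught, eof))      # True
```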

Strings
-------

`b` and `u` provide a way to make sure an object is either bytes or unicode.
`b(obj)` converts str/unicode/bytes obj to UTF-8 encoded bytestring, while
`u(obj)` converts str/unicode/bytes obj to unicode string. For example::

   b("привет мир")   # -> gives bytes corresponding to UTF-8 encoding of "привет мир".

   def f(s):
      s = u(s)       # make sure s is unicode, decoding as UTF-8(*) if it was bytes.
      ...            # (*) but see below about lack of decode errors.

The conversion in both encoding and decoding never fails and never loses
information: `b(u(·))` and `u(b(·))` are always identity for bytes and unicode
correspondingly, even if bytes input is not valid UTF-8.

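The standard library offers a comparable lossless round-trip for bytes through
the `surrogateescape` error handler. The sketch below uses only that handler,
on Python 3; it is shown as an analogy for the property described above and
makes no claim about the scheme `b` and `u` use internally.

```python
# Valid UTF-8 for "пр" followed by the byte 0xff, which is not valid UTF-8.
raw = b"\xd0\xbf\xd1\x80\xff"

# Decoding does not fail: the invalid byte is mapped to a lone surrogate.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
back = text.encode("utf-8", errors="surrogateescape")

print(back == raw)    # True
```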

Import
------

`gimport` provides a way to import Python modules by full path in a Go workspace.

For example

::

    lonet = gimport('lab.nexedi.com/kirr/go123/xnet/lonet')

will import either

- `lab.nexedi.com/kirr/go123/xnet/lonet.py`, or
- `lab.nexedi.com/kirr/go123/xnet/lonet/__init__.py`

located in `src/` under `$GOPATH`.

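The lookup rule can be spelled out as a small path computation. The helper
`candidate_paths` below is hypothetical and only mirrors the two locations
listed above; it assumes `$GOPATH` holds a single directory and uses POSIX
path conventions, and it does not perform the import itself.

```python
import posixpath


def candidate_paths(gopath, pkgpath):
    """Return the two files a Go-workspace import of pkgpath could map to.

    Hypothetical helper: assumes a single-entry GOPATH and POSIX paths.
    """
    base = posixpath.join(gopath, "src", *pkgpath.split("/"))
    return [base + ".py", posixpath.join(base, "__init__.py")]


for p in candidate_paths("/home/user/go", "lab.nexedi.com/kirr/go123/xnet/lonet"):
    print(p)
# /home/user/go/src/lab.nexedi.com/kirr/go123/xnet/lonet.py
# /home/user/go/src/lab.nexedi.com/kirr/go123/xnet/lonet/__init__.py
```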

Cython/nogil API
----------------

Cython package `golang` provides a *nogil* API with goroutines, channels and
other features that mirror the corresponding Python package. The Cython API is
not only faster compared to the Python version, but also, due to its *nogil*
property, allows building concurrent systems without the limitations imposed by
Python's GIL, all while still programming in a Python-like language. A brief
description of the Cython/nogil API follows:

`go` spawns a new task - a coroutine, or a thread, depending on the activated runtime.
`chan[T]` represents a channel with Go semantics and elements of type `T`.
Use `makechan[T]` to create a new channel, and `chan[T].recv`, `chan[T].send`,
`chan[T].close` for communication. `nil` stands for the nil channel. `select`
can be used to multiplex on several channels. For example::

   cdef nogil:
      struct Point:
         int x
         int y

      void worker(chan[int] chi, chan[Point] chp):
         chi.send(1)

         cdef Point p
         p.x = 3
         p.y = 4
         chp.send(p)

      void myfunc():
         cdef chan[int]   chi = makechan[int]()       # synchronous channel of integers
         cdef chan[Point] chp = makechan[Point](3)    # channel with buffer of size 3 and Point elements

         go(worker, chi, chp)

         i = chi.recv()    # will give 1
         p = chp.recv()    # will give Point(3,4)

         chp = nil         # rebind chp to nil channel
         cdef cbool ok
         cdef int j = 33
         _ = select([
             chi.recvs(&i),         # 0
             chi.recvs(&i, &ok),    # 1
             chi.sends(&j),         # 2
             chp.recvs(&p),         # 3
             default,               # 4
         ])
         if _ == 0:
             # i is what was received from chi
             ...
         if _ == 1:
             # (i, ok) is what was received from chi
             ...
         if _ == 2:
             # we know j was sent to chi
             ...
         if _ == 3:
             # this case will never be selected because
             # send/recv on nil channel block forever.
             ...
         if _ == 4:
             # default case
             ...

Python channels are represented by the `pychan` cdef class. Python
channels that carry non-Python elements (`pychan.dtype != DTYPE_PYOBJECT`) can
be converted to Cython/nogil `chan[T]` via `pychan.chan_*()`.
Similarly Cython/nogil `chan[T]` can be wrapped into `pychan` via
`pychan.from_chan_*()`. This provides an interaction mechanism
between *nogil* and Python worlds. For example::

   def myfunc(pychan pych):
      if pych.dtype != DTYPE_INT:
         raise TypeError("expected chan[int]")

      cdef chan[int] ch = pych.chan_int()  # pychan -> chan[int]
      with nogil:
         # use ch in nogil code. Both Python and nogil parts can
         # send/receive on the channel simultaneously.
         ...

   def mytick(): # -> pychan
      cdef chan[int] ch
      with nogil:
         # create a channel that is connected to some nogil task of the program
         ch = ...

      # wrap the channel into pychan. Both Python and nogil parts can
      # send/receive on the channel simultaneously.
      cdef pychan pych = pychan.from_chan_int(ch)  # pychan <- chan[int]
      return pych


`error` is the interface that represents errors. `errors.New` and `fmt.errorf`
provide a way to build errors from text. An error can optionally wrap another
error by implementing the `errorWrapper` interface and providing an `.Unwrap()`
method. `errors.Is` reports whether an item in the error chain matches a target.
`fmt.errorf` with the `%w` specifier provides a handy way to build wrapping
errors. For example::

   e1 = errors.New("problem")
   e2 = fmt.errorf("doing something for %s: %w", "joe", e1)
   e2.Error()        # gives "doing something for joe: problem"
   errors.Is(e2, e1) # gives True

An `error` can be exposed to Python via the `pyerror` cdef class wrapper
instantiated by `pyerror.from_error()`. `pyerror` preserves the Cython/nogil error
chain for inspection by Python-level `errors.Is`.


`panic` stops normal execution of current goroutine by throwing a C-level
exception. On Python/C boundaries C-level exceptions have to be converted to
Python-level exceptions with `topyexc`. For example::

   cdef void _do_something() nogil:
      ...
      panic("bug")   # hit a bug

   # do_something is called by Python code - it is thus on Python/C boundary
   cdef void do_something() nogil except +topyexc:
      _do_something()

   def pydo_something():
      with nogil:
         do_something()


See |libgolang.h|_ and |golang.pxd|_ for details of the API.
See also |testprog/golang_pyx_user/|_ for a demo project that uses Pygolang in
Cython/nogil mode.

.. |libgolang.h| replace:: `libgolang.h`
.. _libgolang.h: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/libgolang.h

.. |golang.pxd| replace:: `golang.pxd`
.. _golang.pxd: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/_golang.pxd

.. |testprog/golang_pyx_user/| replace:: `testprog/golang_pyx_user/`
.. _testprog/golang_pyx_user/: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/pyx/testprog/golang_pyx_user

--------

Additional packages and utilities
---------------------------------

The following additional packages and utilities are also provided to close gaps
between Python/Cython and Go environments:

.. contents::
   :local:

Concurrency
~~~~~~~~~~~

In addition to `go` and channels, the following packages are provided to help
handle concurrency in structured ways:

- |golang.context|_ (py__, pyx__) provides contexts to propagate deadlines, cancellation and
  task-scoped values among spawned goroutines [*]_.

  .. |golang.context| replace:: `golang.context`
  .. _golang.context: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/context.h
  __ https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/context.py
  __ https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/_context.pxd

- |golang.sync|_ (py__, pyx__) provides `sync.WorkGroup` to spawn a group of goroutines working
  on a common task. It also provides low-level primitives - for example
  `sync.Once`, `sync.WaitGroup`, `sync.Mutex` and `sync.RWMutex` - that are
  sometimes useful too.

  .. |golang.sync| replace:: `golang.sync`
  .. _golang.sync: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/sync.h
  __ https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/sync.py
  __ https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/_sync.pxd

- |golang.time|_ (py__, pyx__) provides timers integrated with channels.

  .. |golang.time| replace:: `golang.time`
  .. _golang.time: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/time.h
  __ https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/time.py
  __ https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/_time.pxd


.. [*] See `Go Concurrency Patterns: Context`__ for overview.

__ https://blog.golang.org/context


String conversion
~~~~~~~~~~~~~~~~~

`qq` (import from `golang.gcompat`) provides `%q` functionality that quotes as
Go would do. For example the following code will print name quoted in `"`
without escaping printable UTF-8 characters::

   print('hello %s' % qq(name))

`qq` accepts both `str` and `bytes` (`unicode` and `str` on Python2)
and also any other type that can be converted to `str`.

Package |golang.strconv|_ provides direct access to conversion routines, for
example `strconv.quote` and `strconv.unquote`.

.. |golang.strconv| replace:: `golang.strconv`
.. _golang.strconv: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/strconv.py


Benchmarking and testing
474
~~~~~~~~~~~~~~~~~~~~~~~~
Kirill Smelkov's avatar
Kirill Smelkov committed
475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492

`py.bench` allows benchmarking Python code similarly to `go test -bench` and `py.test`.
For example, running `py.bench` on the following code::

    def bench_add(b):
        x, y = 1, 2
        for i in xrange(b.N):
            x + y

gives something like::

    $ py.bench --count=3 x.py
    ...
    pymod: bench_add.py
    Benchmarkadd    50000000        0.020 µs/op
    Benchmarkadd    50000000        0.020 µs/op
    Benchmarkadd    50000000        0.020 µs/op

Package |golang.testing|_ provides corresponding runtime bits, e.g. `testing.B`.

`py.bench` produces output in `Go benchmark format`__, and so benchmark results
can be analyzed and compared with standard Go tools, for example with
`benchstat`__.
Additionally package |golang.x.perf.benchlib|_ can be used to load and process
such benchmarking data in Python.

.. |golang.testing| replace:: `golang.testing`
.. _golang.testing: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/testing.py
.. |golang.x.perf.benchlib| replace:: `golang.x.perf.benchlib`
.. _golang.x.perf.benchlib: https://lab.nexedi.com/nexedi/pygolang/tree/master/golang/x/perf/benchlib.py
__ https://github.com/golang/proposal/blob/master/design/14313-benchmark-format.md
__ https://godoc.org/golang.org/x/perf/cmd/benchstat

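Because the benchmark output is line-oriented, a single result line is easy to
pick apart by hand. The parser below is a minimal sketch that handles only the
simple `name  iterations  value unit` shape shown in the sample output; the
function name `parse_bench_line` is hypothetical and this is not the
`golang.x.perf.benchlib` API.

```python
def parse_bench_line(line):
    """Parse one benchmark result line into (name, niter, {unit: value}).

    Returns None for lines that are not benchmark results.
    Hypothetical helper for illustration only.
    """
    fields = line.split()
    if len(fields) < 4 or not fields[0].startswith("Benchmark"):
        return None
    name, niter = fields[0], int(fields[1])
    rest = fields[2:]
    # remaining fields come in (value, unit) pairs
    measurements = {unit: float(value)
                    for value, unit in zip(rest[0::2], rest[1::2])}
    return name, niter, measurements


print(parse_bench_line("Benchmarkadd    50000000        0.020 µs/op"))
# ('Benchmarkadd', 50000000, {'µs/op': 0.02})
print(parse_bench_line("pymod: bench_add.py"))
# None
```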

--------

GPython options
---------------

GPython mimics and supports most of Python command-line options, like `gpython
-c <commands>` to run Python statements from command line, or `gpython -m
<module>` to execute a module. Such options have the same meaning as in
standard Python and are not documented here.

GPython-specific options and environment variables are listed below:

`-X gpython.runtime=(gevent|threads)`
    Specify which runtime GPython should use. `gevent` provides lightweight
    coroutines, while with `threads` `go` spawns a full OS thread. `gevent` is
    the default. The runtime to use can also be specified via the
    `$GPYTHON_RUNTIME` environment variable.