Commits · 4fbdd270a310e083cbf112730e0447e479f4ee7b · Kirill Smelkov / wendelin.core

20 Nov, 2018 1 commit
- X Proof that it is possible to change mmapping while under pagefault to it · 4fbdd270
  Kirill Smelkov authored Nov 20, 2018
```
Even though the kernel does not generally release mmap_sem on IO caused by
pagefault.
```
  4fbdd270
30 Oct, 2018 1 commit

Kirill Smelkov authored Oct 30, 2018

* master:
  lib.xnumpy.structured: New utility to create structured view of an array
  bigarray: Factor-out our custom numpy.lib.stride_tricks.as_strided-alike into lib/xnumpy.py

e1f05973

29 Oct, 2018 9 commits

lib.xnumpy.structured: New utility to create structured view of an array · 32ca80e2

Kirill Smelkov authored Oct 29, 2018

Structured creates a view of the array, interpreting its minor axis as fully covered by a dtype.

It is similar to arr.view(dtype) + corresponding reshape, but does
not have limitations of ndarray.view(). For example:

  In [1]: a = np.arange(3*3, dtype=np.int32).reshape((3,3))

  In [2]: a
  Out[2]:
  array([[0, 1, 2],
         [3, 4, 5],
         [6, 7, 8]], dtype=int32)

  In [3]: b = a[:2,:2]

  In [4]: b
  Out[4]:
  array([[0, 1],
         [3, 4]], dtype=int32)

  In [5]: dtxy = np.dtype([('x', np.int32), ('y', np.int32)])

  In [6]: dtxy
  Out[6]: dtype([('x', '<i4'), ('y', '<i4')])

  In [7]: b.view(dtxy)
  ---------------------------------------------------------------------------
  ValueError                                Traceback (most recent call last)
  <ipython-input-66-af98529aa150> in <module>()
  ----> 1 b.view(dtxy)

  ValueError: To change to a dtype of a different size, the array must be C-contiguous

  In [8]: structured(b, dtxy)
  Out[8]: array([(0, 1), (3, 4)], dtype=[('x', '<i4'), ('y', '<i4')])

Structured always creates a view and never copies data.

Here is the original context where separately playing with .shape and .dtype
was not enough, since it was creating an array copy and OOM'ing the machine:

klaus/wendelin@cbe4938b
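
A minimal sketch of how such a view can be built without copying, assuming the minor axis of the input is contiguous in memory. This is not the code from lib/xnumpy.py: the name `structured_sketch` and the way the owning buffer is located are assumptions made for illustration only.

```python
import numpy as np

def structured_sketch(arr, dtype):
    """Hypothetical helper: view arr's minor axis as a single dtype item."""
    dtype = np.dtype(dtype)
    if arr.ndim < 1:
        raise ValueError("need at least one dimension")
    if arr.strides[-1] != arr.itemsize:
        raise ValueError("minor axis must be contiguous")
    if arr.shape[-1] * arr.itemsize != dtype.itemsize:
        raise ValueError("dtype must cover the minor axis exactly")

    # Walk up to the array that owns the memory: np.ndarray(buffer=...) needs
    # one contiguous buffer, which a strided slice like a[:2,:2] is not.
    base = arr
    while isinstance(base.base, np.ndarray):
        base = base.base
    if not base.flags.c_contiguous:
        raise ValueError("this sketch assumes a C-contiguous owning array")

    # Byte offset of arr's first element inside the owning buffer.
    offset = (arr.__array_interface__['data'][0]
              - base.__array_interface__['data'][0])

    # Same memory, one dimension less: the minor axis becomes one record.
    # Assumes non-negative strides on the remaining axes.
    return np.ndarray(shape=arr.shape[:-1], dtype=dtype, buffer=base,
                      offset=offset, strides=arr.strides[:-1])

a = np.arange(3*3, dtype=np.int32).reshape((3, 3))
b = a[:2, :2]
dtxy = np.dtype([('x', np.int32), ('y', np.int32)])
s = structured_sketch(b, dtxy)
a[0, 0] = 100   # the change is visible through s, because s aliases a
```

Because the result is constructed directly over the owning buffer with explicit strides, no intermediate copy of the data is made.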

32ca80e2

bigarray: Factor-out our custom numpy.lib.stride_tricks.as_strided-alike into lib/xnumpy.py · 6a5dfefa

Kirill Smelkov authored Oct 29, 2018

We are going to use this code in another place, so move it out to a
common place as a preparatory step first.

On a related note: since ArrayRef is generic and quite independent from
BigArray (it supports BigArray, but equally supports other arrays, e.g.
plain ones), the proper place for it might also be lib/xnumpy.py.
We might get to this topic a bit later.

6a5dfefa

. · 33ff0c80
Kirill Smelkov authored Oct 29, 2018

33ff0c80
. · b4a3a0bd
Kirill Smelkov authored Oct 29, 2018

b4a3a0bd
. · c4b66f7d
Kirill Smelkov authored Oct 29, 2018

c4b66f7d
. · fd647f2b
Kirill Smelkov authored Oct 29, 2018

fd647f2b
. · 2ad300e8
Kirill Smelkov authored Oct 29, 2018

2ad300e8

Merge remote-tracking branch 'nxd/master' into t · 009b9fb9

Kirill Smelkov authored Oct 29, 2018

* nxd/master:
  bigarray: RAMArray
  bigarray/tests: Factor out a way to specify on which BigFile/BigFileH an array is tested into fixture parameter

009b9fb9

X `expected a single-segment buffer object` problem gone · fbebef94
Kirill Smelkov authored Oct 29, 2018

fbebef94

26 Oct, 2018 1 commit

X xnumpy.restructure · 2569b175

Kirill Smelkov authored Oct 26, 2018

Currently fails with:

/home/kirr/src/wendelin/wendelin.core/lib/xnumpy.py in restructure(arr, dtype)
     82     print 'stridev:', stridev
     83     #return np.ndarray.__new__(type(arr), shape, dtype, buffer(arr), 0, stridev)
---> 84     return np.ndarray(shape, dtype, buffer(arr), 0, stridev)

TypeError: expected a single-segment buffer object

2569b175

21 Oct, 2018 2 commits

X notes on why lazy-invalidate approach was taken · c1f5bb19
Kirill Smelkov authored Oct 21, 2018

c1f5bb19

X ptrace when client is under pagefault or syscall won't work · d36b171f

Kirill Smelkov authored Oct 21, 2018

The kernel sends SIGSTOP to interrupt tracee, but the signal will be
processed only when the process returns from kernel space, e.g. here

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/entry/common.c?id=v4.19-rc8-151-g23469de647c4#n160

This way the tracer won't receive obligatory information that tracee
stopped (via wait...) and even though ptrace(ATTACH) succeeds, all other
ptrace commands will fail:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/ptrace.c?id=v4.19-rc8-151-g23469de647c4#n1140
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/ptrace.c?id=v4.19-rc8-151-g23469de647c4#n207

My original idea was to use ptrace to run code in the process to change its
memory mappings while the triggering process is under pagefault/read
to wcfs. The above shows it won't work: trying to ptrace the
client from under wcfs will just block forever (the kernel will be
waiting for the read operation to finish before ptrace can proceed, and the
read will first be waiting for the ptrace stop to complete = deadlock).

d36b171f

19 Oct, 2018 5 commits
- X mmap over under pagefault to this mmapping works · c27c1940
  Kirill Smelkov authored Oct 19, 2018
```
kirr/go-fuse@f822c9db
```
  c27c1940
- . · 2d2bb578
  Kirill Smelkov authored Oct 19, 2018
  
  2d2bb578
- X δFtail settled · 27d91d47
  Kirill Smelkov authored Oct 19, 2018
  
  27d91d47
- . · fee4a7e4
  Kirill Smelkov authored Oct 19, 2018
  
  fee4a7e4
- . · 06ed10ee
  Kirill Smelkov authored Oct 19, 2018
  
  06ed10ee
18 Oct, 2018 2 commits

X invalidation design draftly settled · 9b4a42a3
Kirill Smelkov authored Oct 18, 2018

9b4a42a3

X test that ZBlk objects can be actually removed from ZODB Connection cache... · 69c94fbc

Kirill Smelkov authored Oct 18, 2018

X test that ZBlk objects can be actually removed from ZODB Connection cache and cause invalidation to be missed

____________________________________ test_bigfile_filezodb_vs_cache_invalidation ____________________________________

    def test_bigfile_filezodb_vs_cache_invalidation():
        root = dbopen()
        conn = root._p_jar
        db   = conn.db()
        conn.close()
        del root, conn

        tm1 = TransactionManager()
        tm2 = TransactionManager()

        conn1 = db.open(transaction_manager=tm1)
        root1 = conn1.root()

        # setup zfile with fileh view to it
        root1['zfile3'] = f1 = ZBigFile(blksize)
        tm1.commit()

        fh1 = f1.fileh_open()
        tm1.commit()

        # set zfile initial data
        vma1 = fh1.mmap(0, 1)
        Blk(vma1, 0)[0] = 1
        tm1.commit()

        # read zfile and setup fileh for it in conn2
        conn2 = db.open(transaction_manager=tm2)
        root2 = conn2.root()

        f2 = root2['zfile3']
        fh2 = f2.fileh_open()
        vma2 = fh2.mmap(0, 1)

        assert Blk(vma2, 0)[0] == 1 # read data in conn2 + make sure read correctly

        # now zfile content is both in ZODB.Connection cache and in _ZBigFileH
        # cache for each conn1 and conn2. Modify data in conn1 and make sure it
        # fully propagate to conn2.

        Blk(vma1, 0)[0] = 2
        tm1.commit()

        # still should be read as old value in conn2
        assert Blk(vma2, 0)[0] == 1
        # and even after virtmem pages reclaim
        # ( verifies that _p_invalidate() in ZBlk.loadblkdata() does not lead to
        #   reloading data as updated )
        ram_reclaim_all()
        assert Blk(vma2, 0)[0] == 1

        # FIXME: this simulates ZODB Connection cache pressure and currently
        # removes ZBlk corresponding to blk #0 from conn2 cache.
        # In turn this leads to conn2 missing that block invalidation on follow-up
        # transaction boundary.
        #
        # See FIXME notes on ZBlkBase._p_invalidate() for detailed description.
        conn2._cache.minimize()

        tm2.commit()                # transaction boundary for t2

        # data from tm1 should propagate -> ZODB -> ram pages for _ZBigFileH in conn2
>       assert Blk(vma2, 0)[0] == 2
E       assert 1 == 2

tests/test_filezodb.py:615: AssertionError

69c94fbc

17 Oct, 2018 1 commit
- X another example of wc1 invalidation bug related to ZBlk pinning to #blk · d1a579b2
  Kirill Smelkov authored Oct 17, 2018
```
This one is less exotic compared to format changes rewrite.
```
  d1a579b2
16 Oct, 2018 3 commits
- X found another bug in wc 1 invalidation · 48eb692f
  Kirill Smelkov authored Oct 16, 2018
  
  48eb692f
- X found bug in wc 1 invalidation · 5a4562fc
  Kirill Smelkov authored Oct 16, 2018
  
  5a4562fc
- . · d7ff6655
  Kirill Smelkov authored Oct 16, 2018
  
  d7ff6655
15 Oct, 2018 5 commits
- . · bdc59b5e
  Kirill Smelkov authored Oct 15, 2018
  
  bdc59b5e
- . · 11ce2ba7
  Kirill Smelkov authored Oct 15, 2018
  
  11ce2ba7
- . · e8c26821
  Kirill Smelkov authored Oct 15, 2018
  
  e8c26821
- . · 46e6f6a0
  Kirill Smelkov authored Oct 15, 2018
  
  46e6f6a0
- . · 7f4eb022
  Kirill Smelkov authored Oct 15, 2018
  
  7f4eb022
12 Oct, 2018 3 commits

. · 15123fbf
Kirill Smelkov authored Oct 12, 2018

15123fbf

RAMArray · 99b91c84

Kirill Smelkov authored Oct 11, 2018

RAMArray is compatible with ZBigArray in API and semantics, but stores its
data in RAM only. It is useful in situations where a ZBigArray-compatible
data type is needed, but the amount of data is small and the data itself
is needed only temporarily - e.g. in a simulation.

Please see details in individual patches.

Original merge request by @klaus (!8).

/cc @Tyagov
/reviewed-on !9

99b91c84

bigarray: RAMArray · fc9b69d8

Kirill Smelkov authored Oct 11, 2018

RAMArray is compatible with ZBigArray in API and semantics, but stores its
data in RAM only. It is useful in situations where a ZBigArray-compatible
data type is needed, but the amount of data is small and the data itself
is needed only temporarily - e.g. in a simulation.

The implementation is based on mmapping temporary files from /dev/shm/... and
passing them as file handles to BigArray, similarly to how ZBigArray works.
We don't use just numpy.ndarray because of append: for ZBigArray append
works in O(1), but more importantly it does not copy data. This way
mmappings previously created for ZBigArray views continue to correctly
alias array data. If we used ndarray directly, that property would not be
preserved, since ndarray.resize copies data.

Original patch by Klaus Wölfel <klaus@nexedi.com>
(nexedi/wendelin.core!8)
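
A minimal sketch of the aliasing property described above, using only the standard `mmap` module and NumPy on Linux. It is not the RAMArray implementation: the function name `append_without_copy_demo` is hypothetical, and the "append" is reduced to growing a temporary file and mapping it again.

```python
import mmap
import os
import tempfile

import numpy as np

def append_without_copy_demo():
    """Grow a RAM-backed file and check that an old mapping still aliases it."""
    # Use /dev/shm when it exists; otherwise fall back to the default tmp dir.
    shm = "/dev/shm" if os.path.isdir("/dev/shm") else None
    page = mmap.PAGESIZE
    with tempfile.TemporaryFile(dir=shm) as f:
        f.truncate(page)
        # Shared, writable mapping of the first page, exposed as an array.
        old = np.frombuffer(mmap.mmap(f.fileno(), page), dtype=np.uint8)
        old[0] = 1

        # "Append": only the file grows, existing pages are not copied.
        f.truncate(2 * page)
        new = np.frombuffer(mmap.mmap(f.fileno(), 2 * page), dtype=np.uint8)
        new[0] = 7       # write through the new, larger mapping
        new[page] = 9    # data in the appended region

        # old still refers to the same file pages, so it observes the write.
        return int(old[0]), int(new[page]), len(old), len(new)

old_first, new_tail, old_len, new_len = append_without_copy_demo()
```

Both mappings refer to the same file pages, so a write through the larger mapping is observed through the older one. An in-memory ndarray that is reallocated on growth gives no such guarantee to views created earlier.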

fc9b69d8

11 Oct, 2018 6 commits

. · 100995d6
Kirill Smelkov authored Oct 11, 2018

100995d6
. · 899b6102
Kirill Smelkov authored Oct 11, 2018

899b6102

X readBlk: Fix thinko in already case · 29c9f13d

Kirill Smelkov authored Oct 11, 2018

We were checking for `loading.err != nil` as the indication of success
and it should have been `err == nil`. The symptom of the bug was that
\0 instead of data was sometimes read:

	wcfs: 2018/10/11 19:18:12 < 22: i7.READ {Fh 0 [2097152 +131072)  L 0 RDONLY,0x8000}                             <-- NOTE

	I1011 19:18:12.556125    6330 wcfs.go:538] readBlk #1 dest[0:+2097152]
	I1011 19:18:12.556361    6330 wcfs.go:538] readBlk #1 dest[0:+2097152]
	wcfs: 2018/10/11 19:18:12 ZBlk0.PySetState #11
	wcfs: 2018/10/11 19:18:12 ZBigFile.loadblk(1) -> 2097152B

	wcfs: 2018/10/11 19:18:12 > 22:     OK,  131072B data "\x00\x00\x00\x00\x00\x00\x00\x00"...                     <-- XXX not "hello world"

	wcfs: 2018/10/11 19:18:12 < 24: i7.READ {Fh 0 [2359296 +131072)  L 0 RDONLY,0x8000}
	wcfs: 2018/10/11 19:18:12 > 23:     OK,  131072B data "\x00\x00\x00\x00\x00\x00\x00\x00"...
	wcfs: 2018/10/11 19:18:12 > 0:     NOTIFY_STORE_CACHE, {i7 [2097152 +2097152)} 2097152B data "hello wo"...      <-- NOTE

29c9f13d

X don't overalign end by 1 blksize if end is already aligned · d58c71e8

Kirill Smelkov authored Oct 11, 2018

Else:

	wcfs: 2018/10/10 17:52:15 < 40: i7.READ {Fh 0 [4063232 +131072)  L 0 RDONLY,0x8000}
	wcfs: 2018/10/10 17:52:15 > 39:     OK,  131072B data
	wcfs: 2018/10/10 17:52:15 > 40:     OK,  131072B data
	wcfs: 2018/10/10 17:52:15 < 41: i7.GETATTR {Fh 0}
	wcfs: 2018/10/10 17:52:15 Response: INODE_NOTIFY_STORE_CACHE: OK
	wcfs: 2018/10/10 17:52:15 > 41:     OK, {tA=1s {M0100444 SZ=4194304 L=1 1000:1000 B0*0 i0:7 A 0.000000 M 1539183135.261177 C 1539183135.261177}}

	# XXX vvv why we store 2M after read @4M even though read gives len=0 ?
	wcfs: 2018/10/10 17:52:15 > 0:     NOTIFY_STORE_CACHE, {i7 [4194304 +2097152)} 2097152B data
	wcfs: 2018/10/10 17:52:15 < 42: i7.READ {Fh 0 [4194304 +4096)  L 0 RDONLY,0x8000}
	wcfs: 2018/10/10 17:52:15 > 42:     OK,

	wcfs: 2018/10/10 17:52:15 < 43: i7.GETATTR {Fh 0}
	wcfs: 2018/10/10 17:52:15 > 43:     OK, {tA=1s {M0100444 SZ=4194304 L=1 1000:1000 B0*0 i0:7 A 0.000000 M 1539183135.261177 C 1539183135.261177}}
	wcfs: 2018/10/10 17:52:15 Response: INODE_NOTIFY_STORE_CACHE: OK
	wcfs: 2018/10/10 17:52:15 < 44: i7.READ {Fh 0 [4198400 +4096)  L 0 RDONLY,0x8000}
	wcfs: 2018/10/10 17:52:15 > 44:     OK,

        data = readfile(fpath + "/head/data")
>       assert len(data) == fsize
E       AssertionError: assert 4198400 == 4194304
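
The class of bug named in the title can be sketched with plain integer arithmetic. This is not the wcfs code: the function names are hypothetical, and only the block size and file size are taken from the log above.

```python
def align_up_buggy(end, blksize):
    # Always moves to the next block boundary, even when end is already on one.
    return (end // blksize + 1) * blksize

def align_up(end, blksize):
    # Ceiling division: an already-aligned end is left unchanged.
    return -(-end // blksize) * blksize

blksize = 2097152   # 2M block size, as in the NOTIFY_STORE_CACHE entries
fsize = 4194304     # SZ reported by GETATTR

overaligned = align_up_buggy(fsize, blksize)   # one extra block past fsize
aligned = align_up(fsize, blksize)             # stays at fsize
```

With the first variant an end that already sits on a block boundary is pushed one full block further, which matches the shape of the extra `[4194304 +2097152)` store seen in the log.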

d58c71e8

bigarray/tests: Factor out a way to specify on which BigFile/BigFileH an array... · 7365979b

Kirill Smelkov authored Oct 11, 2018

bigarray/tests: Factor out a way to specify on which BigFile/BigFileH an array is tested into fixture parameter

Currently we have only one BigFile and its BigFileH handle. However in
the next patch, for RAMArray, we'll be adding handles for opened RAM
files, and it would be good to test whole BigArray functionality on
data served by those handles too.

Prepare for this and first factor out into testbig fixture the way to
open such handles.

7365979b

. · 5a793aa3
Kirill Smelkov authored Oct 11, 2018

5a793aa3

10 Oct, 2018 1 commit
- . · a3907113
  Kirill Smelkov authored Oct 10, 2018
  
  a3907113