Commit 3f578560 authored by Kirill Smelkov's avatar Kirill Smelkov

go/neo/t/neotest: ZODB benchmarks

Add to neotest zbench-local and zbench-cluster commands that perform
ZODB benchmarks on FileStorage, ZEO and NEO with Python and Go clients
either locally, or with a server and client running on 2 different nodes.

There are 2 client programs: tzodb.py and tzodb.go which for now compute
hash of whole latest objects stream in a ZODB database. On server side
neotest is taught to launch ZEO and various NEO clusters and to execute
client load on them.

Two test datasets are used: wczblk1-8 - the dataset with wendelin.core ZBlk1
objects covering 8M array, and prod1-1024 - synthethic dataset that tries to
represent regular ERP5 instance. Both datasets are very small and so we can
assume they reside completely in server disk cache while running benchmarks.
Benchmark timings will thus give pure storage software processing latency, as
pagecache hit time is on par, or less, to 1µs.

Example output:

	x/src/lab.nexedi.com/kirr/neo/go/neo/t$ ./neotest zbench-local
	dataset:	wczblk1-8
	node:
	cluster:	deco

	*** generating fs1 data...
	I: RAM:  7.47GB
	I: WORK: 0.01GB
	gen signal t=0...1.05e+06  float64  (= 0.01GB)
	gen signal blk [0:1048576]  (100.0%)
	VIRT: 297 MB	RSS: 48MB

	*** generating sqlite data...
	I: RAM:  7.47GB
	I: WORK: 0.01GB
	gen signal t=0...1.05e+06  float64  (= 0.01GB)
	gen signal blk [0:1048576]  (100.0%)
	VIRT: 386 MB	RSS: 58MB
	2018-07-10 19:57:35.7065 ERROR     NEO        [           app: 91] primary master is down
	Cluster state changed

	*** generating sql data...
	2018-07-10 19:57:35 140115116649600 [Note] /usr/sbin/mysqld (mysqld 10.1.29-MariaDB-6+b1) starting as process 27574 ...
	2018-07-10 19:57:39 140205509999744 [Note] /usr/sbin/mysqld (mysqld 10.1.29-MariaDB-6+b1) starting as process 27603 ...
	2018-07-10 19:57:42 139692109810816 [Note] /usr/sbin/mysqld (mysqld 10.1.29-MariaDB-6+b1) starting as process 27633 ...
	2018-07-10 19:57:45 139759221546112 [Note] mysqld (mysqld 10.1.29-MariaDB-6+b1) starting as process 27662 ...
	I: RAM:  7.47GB
	I: WORK: 0.01GB
	gen signal t=0...1.05e+06  float64  (= 0.01GB)
	gen signal blk [0:1048576]  (100.0%)
	VIRT: 387 MB	RSS: 59MB
	2018-07-10 19:57:48.2565 ERROR     NEO        [           app: 91] primary master is down
	Cluster state changed

	*** FileStorage
	Benchmarkfs1-zhash.py 2127 16.3 µs/object	# crc32:14640593  nread=8540363  t=0.035s	# POLL·2 C1·73 C1E·38 C3·12 C6·36 C7s·0 C8·112 C9·0 C10·62
	Benchmarkfs1-zhash.py 2127 16.6 µs/object	# crc32:14640593  nread=8540363  t=0.035s	# POLL·0 C1·113 C1E·21 C3·16 C6·56 C7s·0 C8·136 C9·0 C10·41
	Benchmarkfs1-zhash.py 2127 15.9 µs/object	# crc32:14640593  nread=8540363  t=0.034s	# POLL·0 C1·71 C1E·36 C3·22 C6·50 C7s·0 C8·167 C9·0 C10·47
	Benchmarkfs1-zhash.py 2127 15.9 µs/object	# crc32:14640593  nread=8540363  t=0.034s	# POLL·0 C1·77 C1E·32 C3·11 C6·55 C7s·0 C8·184 C9·0 C10·31
	Benchmarkfs1-zhash.py 2127 16.0 µs/object	# crc32:14640593  nread=8540363  t=0.034s	# POLL·0 C1·78 C1E·15 C3·12 C6·51 C7s·0 C8·140 C9·0 C10·44

	# 16 clients in parallel
	Benchmarkfs1-zhash.py·P16 2127 129.0 µs/object	# crc32:14640593  nread=8540363  t=0.274s
	Benchmarkfs1-zhash.py·P16 2127 132.6 µs/object	# crc32:14640593  nread=8540363  t=0.282s
	Benchmarkfs1-zhash.py·P16 2127 135.0 µs/object	# crc32:14640593  nread=8540363  t=0.287s
	Benchmarkfs1-zhash.py·P16 2127 135.3 µs/object	# crc32:14640593  nread=8540363  t=0.288s
	Benchmarkfs1-zhash.py·P16 2127 136.6 µs/object	# crc32:14640593  nread=8540363  t=0.291s
	Benchmarkfs1-zhash.py·P16 2127 122.8 µs/object	# crc32:14640593  nread=8540363  t=0.261s
	Benchmarkfs1-zhash.py·P16 2127 130.9 µs/object	# crc32:14640593  nread=8540363  t=0.279s
	Benchmarkfs1-zhash.py·P16 2127 126.4 µs/object	# crc32:14640593  nread=8540363  t=0.269s
	Benchmarkfs1-zhash.py·P16 2127 125.8 µs/object	# crc32:14640593  nread=8540363  t=0.268s
	Benchmarkfs1-zhash.py·P16 2127 108.3 µs/object	# crc32:14640593  nread=8540363  t=0.230s
	Benchmarkfs1-zhash.py·P16 2127 131.0 µs/object	# crc32:14640593  nread=8540363  t=0.279s
	Benchmarkfs1-zhash.py·P16 2127 124.1 µs/object	# crc32:14640593  nread=8540363  t=0.264s
	Benchmarkfs1-zhash.py·P16 2127 129.3 µs/object	# crc32:14640593  nread=8540363  t=0.275s
	Benchmarkfs1-zhash.py·P16 2127 125.0 µs/object	# crc32:14640593  nread=8540363  t=0.266s
	Benchmarkfs1-zhash.py·P16 2127 131.5 µs/object	# crc32:14640593  nread=8540363  t=0.280s
	Benchmarkfs1-zhash.py·P16 2127 131.4 µs/object	# crc32:14640593  nread=8540363  t=0.280s
	# POLL·0 C1·4 C1E·13 C3·11 C6·79 C7s·0 C8·14 C9·0 C10·0

	...

And its summary via benchstat:

	x/src/lab.nexedi.com/kirr/neo/go/neo/t$ benchstat -split node,cluster,dataset x.log
	name                                 time/object
	cluster:deco dataset:wczblk1-8
	fs1-zhash.py                         16.1µs ± 3%
	fs1-zhash.py·P16                      130µs ± 5%
	fs1-zhash.go                         3.00µs ±10%
	fs1-zhash.go+prefetch128             3.40µs ±18%
	fs1-zhash.go·P16                     10.2µs ±71%
	zeo/py/fs1-zhash.py                   336µs ± 3%
	zeo/py/fs1-zhash.py·P16              3.22ms ± 6%
	zeo/py/fs1-zhash.go                   112µs ± 2%
	zeo/py/fs1-zhash.go+prefetch128      60.9µs ± 1%
	zeo/py/fs1-zhash.go·P16              1.07ms ± 5%
	neo/py(!log)/sqlite·P1-zhash.py       291µs ± 2%
	neo/py(!log)/sqlite·P1-zhash.py·P16  2.86ms ± 1%
	neo/py(!log)/sql·P1-zhash.py          318µs ± 4%
	neo/py(!log)/sql·P1-zhash.py·P16     3.99ms ± 0%
	cluster:deco dataset:prod1-1024
	fs1-zhash.py                         12.3µs ± 1%
	fs1-zhash.py·P16                      106µs ±10%
	fs1-zhash.go                         2.56µs ±10%
	fs1-zhash.go+prefetch128             2.68µs ± 8%
	fs1-zhash.go·P16                     9.48µs ±43%
	zeo/py/fs1-zhash.py                   319µs ± 3%
	zeo/py/fs1-zhash.py·P16              3.13ms ± 3%
	zeo/py/fs1-zhash.go                   101µs ± 5%
	zeo/py/fs1-zhash.go+prefetch128      56.9µs ± 1%
	zeo/py/fs1-zhash.go·P16              1.19ms ± 4%
	neo/py(!log)/sqlite·P1-zhash.py       281µs ± 3%
	neo/py(!log)/sqlite·P1-zhash.py·P16  2.80ms ± 1%
	neo/py(!log)/sql·P1-zhash.py          316µs ± 1%
	neo/py(!log)/sql·P1-zhash.py·P16     3.91ms ± 1%

Since there is no NEO/go support yet, corresponding neotest parts are merged,
but commented-out with appropriate remark.

Parallel access is simulated with spawning many OS processes for now.
This will change in the nearby followup patch to zwrk.

Results of ZODB benchmarking were discussed in

	http://navytux.spb.ru/~kirr/neo.html#performance-tests		, and
	http://navytux.spb.ru/~kirr/neo.html#results-and-discussion

Some draft history related to this patch:

	lab.nexedi.com/kirr/neo/commit/e0d875bc	X neotest: Teach it to benchmark NEO with storage partitioned to several nodes
	lab.nexedi.com/kirr/neo/commit/590f0a46	X neo/py uses n(replica) as n(real-replica) - 1
	lab.nexedi.com/kirr/neo/commit/b655da26	X save time not benchmarking things we do not show
	lab.nexedi.com/kirr/neo/commit/f834f40d	X zhash: Show N(obj) read, not 1, in place of N(iter)
	lab.nexedi.com/kirr/neo/commit/a16e8d52	X teach golang to access ZEO
	lab.nexedi.com/kirr/neo/commit/b9827725	X switch to using no compression, because this way it is more fair for comparing storage latencies
	lab.nexedi.com/kirr/neo/commit/c0067335	X neotest: Don't depend on killall
	lab.nexedi.com/kirr/neo/commit/2bcd6ebb	X neotest: add zbench-local & zbench-cluster subcomands
	lab.nexedi.com/kirr/neo/commit/fb165ad9	X neotest: Also benchmark NEO/py with logging disabled
	lab.nexedi.com/kirr/neo/commit/2118ba38	X neotest: Help mysqlk_install_db find its basedir under SlapOS
	lab.nexedi.com/kirr/neo/commit/80eaa05e	X zgenprod1 tool
	lab.nexedi.com/kirr/neo/commit/eb0e516f	X check hash result and error if mismatch (zhash.* part); neotest part pending
	lab.nexedi.com/kirr/neo/commit/046370db	X benchify rest of bench-cluster
	lab.nexedi.com/kirr/neo/commit/2d13818e	X bench-local + zhash: Add output in std bench format
	lab.nexedi.com/kirr/neo/commit/1d692a3b	X add NEO/go with SHA1 disabled (both Sgo and Cgo to regular benchmarks)
parent 26006d7e
...@@ -2,4 +2,6 @@ ...@@ -2,4 +2,6 @@
/var /var
/tcpu /tcpu
/tcpu_go /tcpu_go
/tzodb
/tzodb_go
/ioping.tmp /ioping.tmp
This diff is collapsed.
// Copyright (C) 2017-2018 Nexedi SA and Contributors.
// Kirill Smelkov <kirr@nexedi.com>
//
// This program is free software: you can Use, Study, Modify and Redistribute
// it under the terms of the GNU General Public License version 3, or (at your
// option) any later version, as published by the Free Software Foundation.
//
// You can also Link and Combine this program with other software covered by
// the terms of any of the Free Software licenses or any of the Open Source
// Initiative approved licenses and Convey the resulting work. Corresponding
// source of such a combination shall include the source code for all other
// software used.
//
// This program is distributed WITHOUT ANY WARRANTY; without even the implied
// warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
//
// See COPYING file for full licensing terms.
// See https://www.nexedi.com/licensing for rationale and options.
// +build ignore
// tzodb - ZODB-related benchmarks
package main
import (
"context"
"crypto/sha1"
"crypto/sha256"
"crypto/sha512"
"flag"
"fmt"
"hash"
"hash/adler32"
"hash/crc32"
"io"
"log"
"os"
"time"
"lab.nexedi.com/kirr/go123/prog"
"lab.nexedi.com/kirr/go123/xerr"
"lab.nexedi.com/kirr/neo/go/zodb"
_ "lab.nexedi.com/kirr/neo/go/zodb/wks"
"github.com/pkg/errors"
)
// hasher is hash.Hash which also knows its name
type hasher struct {
hash.Hash
name string
}
// hasher that discards data
type nullHasher struct {}
func (h nullHasher) Write(b []byte) (int, error) { return len(b), nil }
func (h nullHasher) Sum(b []byte) []byte { return []byte{0} }
func (h nullHasher) Reset() {}
func (h nullHasher) Size() int { return 1 }
func (h nullHasher) BlockSize() int { return 1 }
// hashFlags installs common hash flags and returns function to retrieve selected hasher.
func hashFlags(flags *flag.FlagSet) func() hasher {
fnull := flags.Bool("null", false, "don't compute hash - just read data")
fadler32 := flags.Bool("adler32",false, "compute Adler32 checksum")
fcrc32 := flags.Bool("crc32", false, "compute CRC32 checksum")
fsha1 := flags.Bool("sha1", false, "compute SHA1 cryptographic hash")
fsha256 := flags.Bool("sha256", false, "compute SHA256 cryptographic hash")
fsha512 := flags.Bool("sha512", false, "compute SHA512 cryptographic hash")
return func() hasher {
var h hasher
inith := func(name string, ctor func() hash.Hash) {
if h.name != "" {
log.Fatalf("duplicate hashes: %s and %s specified", h.name, name)
}
h.name = name
h.Hash = ctor()
}
switch {
case *fnull: inith("null", func() hash.Hash { return nullHasher{} })
case *fadler32: inith("adler32", func() hash.Hash { return adler32.New() })
case *fcrc32: inith("crc32", func() hash.Hash { return crc32.NewIEEE() })
case *fsha1: inith("sha1", sha1.New)
case *fsha256: inith("sha256", sha256.New)
case *fsha512: inith("sha512", sha512.New)
}
return h
}
}
// ----------------------------------------
const zhashSummary = "compute hash of whole latest objects stream in a ZODB database."
func zhashUsage(w io.Writer) {
fmt.Fprintf(w,
`Usage: tzodb zhash [options] <url>
`)
}
func zhashMain(argv []string) {
ctx := context.Background()
flags := flag.NewFlagSet("", flag.ExitOnError)
flags.Usage = func() { zhashUsage(os.Stderr); flags.PrintDefaults() }
fhash := hashFlags(flags)
fcheck := flags.String("check", "", "verify resulting hash to be = expected")
fbench := flags.String("bench", "", "use benchmarking format for output")
useprefetch := flags.Bool("useprefetch", false, "prefetch loaded objects")
flags.Parse(argv[1:])
if flags.NArg() != 1 {
flags.Usage()
os.Exit(1)
}
url := flags.Arg(0)
h := fhash()
if h.Hash == nil {
log.Fatal("no hash function specified")
}
err := zhash(ctx, url, h, *useprefetch, *fbench, *fcheck)
if err != nil {
log.Fatal(err)
}
}
func zhash(ctx context.Context, url string, h hasher, useprefetch bool, bench, check string) (err error) {
defer xerr.Context(&err, "zhash")
stor, err := zodb.OpenStorage(ctx, url, &zodb.OpenOptions{ReadOnly: true})
if err != nil {
return err
}
defer func() {
err2 := stor.Close()
err = xerr.First(err, err2)
}()
const nprefetch = 128 // XXX -> 512 ?
// prefetchBlk prefetches block of nprefetch objects starting from xid
//var tprevLoadBlkStart time.Time
prefetchBlk := func(ctx context.Context, xid zodb.Xid) {
if !useprefetch {
return
}
//t1 := time.Now()
for i := 0; i < nprefetch; i++ {
//fmt.Printf("prefetch %v\n", xid)
stor.Prefetch(ctx, xid)
xid.Oid++
}
//t2 := time.Now()
//δt := t2.Sub(t1)
//fmt.Printf("tprefetch: %s", δt)
//if !tprevLoadBlkStart.IsZero() {
// fmt.Printf("\ttprevload: %s", t1.Sub(tprevLoadBlkStart))
//}
//fmt.Printf("\n")
//
//tprevLoadBlkStart = t2
}
lastTid, err := stor.LastTid(ctx)
if err != nil {
return err
}
tstart := time.Now()
oid := zodb.Oid(0)
nread := 0
loop:
for {
xid := zodb.Xid{Oid: oid, At: lastTid}
if xid.Oid % nprefetch == 0 {
prefetchBlk(ctx, xid)
}
buf, _, err := stor.Load(ctx, xid)
if err != nil {
switch errors.Cause(err).(type) {
case *zodb.NoObjectError:
break loop
default:
return err
}
}
h.Write(buf.Data)
//fmt.Fprintf(os.Stderr, "%d @%s\t%s: %x\n", uint(oid), serial, h.name, h.Sum(nil))
//fmt.Fprintf(os.Stderr, "\tdata: %x\n", buf.Data)
nread += len(buf.Data)
oid += 1
buf.Release()
}
tend := time.Now()
δt := tend.Sub(tstart)
x := "zhash.go"
if useprefetch {
x += fmt.Sprintf("+prefetch%d", nprefetch)
}
hresult := fmt.Sprintf("%s:%x", h.name, h.Sum(nil))
if bench == "" {
fmt.Printf("%s ; oid=0..%d nread=%d t=%s (%s / object) x=%s\n",
hresult, oid-1, nread, δt, δt / time.Duration(oid), x) // XXX /oid cast ?
} else {
topic := fmt.Sprintf(bench, x) // XXX -> better text/template
fmt.Printf("Benchmark%s %d %.1f µs/object\t# %s nread=%d t=%s\n",
topic, oid-1, float64(δt) / float64(oid) / float64(time.Microsecond),
hresult, nread, δt)
}
if check != "" && hresult != check {
return fmt.Errorf("%s: hash mismatch: expected %s ; got %s\t# x=%s", url, check, hresult, x)
}
return nil
}
// ----------------------------------------
var commands = prog.CommandRegistry{
{"zhash", zhashSummary, zhashUsage, zhashMain},
}
func main() {
log.SetPrefix("tzodb: ")
log.SetFlags(0)
prog := prog.MainProg{
Name: "tzodb",
Summary: "tzodb is a tool to run ZODB-related benchmarks",
Commands: commands,
HelpTopics: nil,
}
prog.Main()
}
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright (C) 2017-2018 Nexedi SA and Contributors.
# Kirill Smelkov <kirr@nexedi.com>
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
"""tzodb - ZODB-related benchmarks"""
from __future__ import print_function
import zodbtools.util
from ZODB.POSException import POSKeyError
from ZODB.utils import p64, u64
import hashlib
from tcpu import Adler32Hasher, CRC32Hasher
import sys
import logging
from time import time
from getopt import getopt, GetoptError
# hasher that discards data
class NullHasher:
name = "null"
def update(self, data):
pass
def hexdigest(self):
return "00"
# {} name -> hasher
hashRegistry = {
"null": NullHasher,
"adler32": Adler32Hasher,
"crc32": CRC32Hasher,
"sha1": hashlib.sha1,
"sha256": hashlib.sha256,
"sha512": hashlib.sha512,
}
def usage(w):
print(\
"""Usage: tzodb zhash [options] url
options:
--null don't compute hash - just read data
--adler32 compute Adler32 checksum
--crc32 compute CRC32 checksum
--sha1 compute SHA1 cryptographic hash
--sha256 compute SHA256 cryptographic hash
--sha512 compute SHA512 cryptographic hash
--check=<expected> verify resulting hash to be = expected
--bench=<topic> use benchmarking format for output
""", file=w)
def zhash():
"""zhash - compute hash of whole latest objects stream in a ZODB database"""
if len(sys.argv) < 2 or sys.argv[1] != "zhash":
usage(sys.stderr)
exit(1)
try:
optv, argv = getopt(sys.argv[2:], "h", ["help", "check=", "bench="] + hashRegistry.keys())
except GetoptError as e:
print("E: %s" % e, file=sys.stderr)
usage(sys.stderr)
exit(1)
bench=None
check=None
for opt, arg in optv:
if opt in ("-h", "--help"):
print(__doc__)
usage(sys.stdout)
sys.exit()
if opt in ("--bench"):
bench=arg
continue
if opt in ("--check"):
check=arg
continue
opt = opt.lstrip("-")
hctor = hashRegistry[opt]
h = hctor()
if len(argv) != 1:
print(__doc__)
usage(sys.stderr)
sys.exit(1)
url = argv[0]
# log -> stderr
l = logging.getLogger()
l.addHandler(logging.StreamHandler())
stor = zodbtools.util.storageFromURL(url, read_only=True)
last_tid = stor.lastTransaction()
before = p64(u64(last_tid) + 1)
tstart = time()
oid = 0
nread = 0
while 1:
try:
data, serial, _ = stor.loadBefore(p64(oid), before)
except POSKeyError:
break
h.update(data)
#print('%s @%s\t%s: %s' % (oid, u64(serial), h.name, h.hexdigest()), file=sys.stderr)
#print('\tdata: %s' % (data.encode('hex'),), file=sys.stderr)
nread += len(data)
oid += 1
tend = time()
dt = tend - tstart
x = "zhash.py"
hresult = "%s:%s" % (h.name, h.hexdigest())
if bench is None:
print('%s ; oid=0..%d nread=%d t=%.3fs (%.1fµs / object) x=%s' % \
(hresult, oid-1, nread, dt, dt * 1E6 / oid, x))
else:
topic = bench % x
print('Benchmark%s %d %.1f µs/object\t# %s nread=%d t=%.3fs' % \
(topic, oid-1, dt * 1E6 / oid, hresult, nread, dt))
if check != None and hresult != check:
print("%s: hash mismatch: expected %s ; got %s\t# x=%s" % (url, check, hresult, x), file=sys.stderr)
sys.exit(1)
def main():
# XXX stub (no commands dispatching here)
zhash()
if __name__ == '__main__':
main()
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright (C) 2017 Nexedi SA and Contributors.
#
# This program is free software: you can Use, Study, Modify and Redistribute
# it under the terms of the GNU General Public License version 3, or (at your
# option) any later version, as published by the Free Software Foundation.
#
# You can also Link and Combine this program with other software covered by
# the terms of any of the Free Software licenses or any of the Open Source
# Initiative approved licenses and Convey the resulting work. Corresponding
# source of such a combination shall include the source code for all other
# software used.
#
# This program is distributed WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# See COPYING file for full licensing terms.
# See https://www.nexedi.com/licensing for rationale and options.
"""zgenprod1 - generate synthetic test database according to NEO PROD1 statistics"""
from __future__ import print_function
from neo.tests.stat_zodb import PROD1
from random import Random
import zodbtools.util
import sys
import logging
def usage(w):
print(\
"""Usage: zgenprod1 url n""", file=w)
def main():
argv = sys.argv[1:]
if len(argv) != 2:
print(__doc__)
usage(sys.stderr)
sys.exit(1)
url = argv[0]
n = int(argv[1])
# log -> stderr
l = logging.getLogger()
l.addHandler(logging.StreamHandler())
zin = PROD1(Random(0)).as_storage(n)
zout = zodbtools.util.storageFromURL(url)
zout.copyTransactionsFrom(zin)
zout.close()
if __name__ == '__main__':
main()
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment