- 19 Aug, 2024 2 commits
-
-
Levin Zimmermann authored
Before this patch, the 'KnownMasterList' field of the 'NotPrimaryMaster' packet was expected to be structured in the following way:

    ArrayHeader (KnownMasterList)
        ArrayHeader (KnownMaster)
            ArrayHeader (Address)
                Host (string)
                Port (uint16)

However NEO/py sends the following structure:

    ArrayHeader (KnownMasterList)
        ArrayHeader (Address)
            Host (string)
            Port (uint16)

This also makes sense: 'KnownMaster' doesn't need to add another nesting level, because it only includes the address. This patch amends the NEO/go protocol definition to transparently represent the nesting as it's sent by NEO/py. See also 18287612 for a similar issue.

/reviewed-by @kirr
/reviewed-on kirr/neo!6
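The nesting difference can be made visible with a small standalone sketch. It is not the NEO/go encoder: `Layout` is a hypothetical helper that prints one pair of brackets for every Go struct or slice, on the assumption (taken from the neighbouring commit message) that each of them becomes one msgpack array. `KnownMaster` here is a stand-in wrapper type used only to reproduce the extra level.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Address mirrors the Host/Port pair named in the commit message.
type Address struct {
	Host string
	Port uint16
}

// KnownMaster is a hypothetical wrapper that only embeds the address.
type KnownMaster struct {
	Address
}

// Layout renders v with one bracket pair per struct or slice,
// i.e. one bracket pair per msgpack array header.
func Layout(v interface{}) string {
	return layout(reflect.ValueOf(v))
}

func layout(v reflect.Value) string {
	var n int
	var elem func(int) reflect.Value
	switch v.Kind() {
	case reflect.Struct:
		n, elem = v.NumField(), v.Field
	case reflect.Slice:
		n, elem = v.Len(), v.Index
	default:
		// Leaf value: print its kind, e.g. "string" or "uint16".
		return v.Kind().String()
	}
	parts := make([]string, n)
	for i := range parts {
		parts[i] = layout(elem(i))
	}
	return "[" + strings.Join(parts, " ") + "]"
}

func main() {
	m := Address{Host: "m1", Port: 2051}
	fmt.Println("wrapped:", Layout([]KnownMaster{{m}}))
	fmt.Println("direct: ", Layout([]Address{m}))
}
```

The wrapped variant shows three bracket levels and the direct one two, which corresponds to the layout NEO/py sends.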
-
Levin Zimmermann authored
Some NEO protocol packets have the field 'RowList'. This field contains information about each row of a partition table. In NEO/go the information of each row is represented with the 'RowInfo' type [1]. This type is defined as a struct with the field 'CellList'. 'CellList' is defined as a list of 'CellInfo' [1] (i.e. an entry for each cell). NEO/go {en,de}codes struct types with 'genStructHead' (structures in golang are encoded as arrays in msgpack) [2]. From the 'RowList' definition, the msgpack decoder currently expects the following msgpack array structure:

    ArrayHeader (RowList)
        ArrayHeader (RowInfo)
            ArrayHeader (CellList)
                ArrayHeader (CellInfo)
                    int32 (NID)
                    enum (State)

However NEO/py actually sends:

    ArrayHeader (RowList)
        ArrayHeader (CellList)
            ArrayHeader (CellInfo)
                int32 (NID)
                enum (State)

In other words, currently the NEO/go msgpack decoder expects one more nesting level, which NEO/py doesn't provide (and which also doesn't seem to be necessary, as the outer nesting would always contain only one element). We could adjust the msgpack {en,de}coder to introduce an exception for the 'RowInfo' type. However, as the protocol definition in 'proto.go' aims to transparently reflect the structure of the packets on the wire, it seems more appropriate to fix this directly in the protocol definition. This is also less error-prone, as we don't have to fix all the different positions of the encoder, decoder & sizer, and it's less code (particularly if 'RowInfo' doesn't stay the only case of such an issue).
[1] https://lab.nexedi.com/kirr/neo/-/blob/1ad088c8/go/neo/proto/proto.go#L391-394
[2] https://lab.nexedi.com/kirr/neo/-/blob/1ad088c8/go/neo/proto/protogen.go#L1770-1775

--------

kirr: I've applied the following interdiff to the original patch of c93d5dbc:

    --- a/go/neo/neo_test.go
    +++ b/go/neo/neo_test.go
    @@ -100,7 +100,7 @@ func _TestMasterStorage(t0 *tEnv) {
     PTid: 1,
     NumReplicas: 0,
     RowList: []proto.RowInfo{
    - proto.RowInfo{proto.CellInfo{proto.NID(proto.STORAGE, 1), proto.UP_TO_DATE}},
    + {proto.CellInfo{proto.NID(proto.STORAGE, 1), proto.UP_TO_DATE}},
     },
     }))
    @@ -173,7 +173,7 @@ func _TestMasterStorage(t0 *tEnv) {
     PTid: 1,
     NumReplicas: 0,
     RowList: []proto.RowInfo{
    - proto.RowInfo{proto.CellInfo{proto.NID(proto.STORAGE, 1), proto.UP_TO_DATE}},
    + {proto.CellInfo{proto.NID(proto.STORAGE, 1), proto.UP_TO_DATE}},
     },
     }))
    --- a/go/neo/proto/proto_test.go
    +++ b/go/neo/proto/proto_test.go
    @@ -210,9 +210,9 @@ func TestMsgMarshal(t *testing.T) {
     PTid: 0x0102030405060708,
     NumReplicas: 34,
     RowList: []RowInfo{
    - {CellInfo{11, UP_TO_DATE}, CellInfo{17, OUT_OF_DATE}},
    - {CellInfo{11, FEEDING}},
    - {CellInfo{11, CORRUPTED}, CellInfo{15, DISCARDED}, CellInfo{23, UP_TO_DATE}},
    + {{11, UP_TO_DATE}, {17, OUT_OF_DATE}},
    + {{11, FEEDING}},
    + {{11, CORRUPTED}, {15, DISCARDED}, {23, UP_TO_DATE}},
     },
     },
    @@ -229,9 +229,9 @@ func TestMsgMarshal(t *testing.T) {
     hex("cf0102030405060708") +
     hex("22") +
     hex("93") +
    - hex("92"+"920bd40001"+"9211d40000") +
    - hex("91"+"920bd40002") +
    - hex("93"+"920bd40003"+"920fd40004"+"9217d40001"),
    + hex("92"+"920bd40401"+"9211d40400") +
    + hex("91"+"920bd40402") +
    + hex("93"+"920bd40403"+"920fd40404"+"9217d40401"),
     },
     // map[Oid]struct {Tid,Tid,bool}

for cosmetics and because the tests were failing as

    --- FAIL: TestMsgMarshal (0.00s)
    proto_test.go:106: M/proto.AnswerPartitionTable: encode result unexpected:
    proto_test.go:107: have: 93cf0102030405060708229392920bd404019211d4040091920bd4040293920bd40403920fd404049217d40401
    proto_test.go:108: want: 93cf0102030405060708229392920bd400019211d4000091920bd4000293920bd40003920fd400049217d40001

/reviewed-by @kirr
/reviewed-on kirr/neo!6
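The 'RowList' bytes expected by the corrected test can be reproduced by hand. The following is a minimal sketch and not the generated NEO/go marshalling code: `EncodeRowList` is a hypothetical helper that only handles arrays with fewer than 16 entries and node IDs below 128, and the numeric cell states and the extension type 4 are read off the test vectors quoted above rather than from the protocol definition.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// Cell states as they appear in the quoted test bytes (d4 04 xx).
const (
	OutOfDate uint8 = 0
	UpToDate  uint8 = 1
	Feeding   uint8 = 2
	Corrupted uint8 = 3
	Discarded uint8 = 4
)

// cellStateExt is the msgpack extension type seen in the test vectors.
const cellStateExt = 0x04

// CellInfo is a simplified stand-in for proto.CellInfo.
type CellInfo struct {
	NID   uint8 // must stay below 128 to fit a msgpack positive fixint
	State uint8
}

// EncodeRowList writes array(rows) -> array(cells) -> array(2){NID, State}.
// There is no extra per-row wrapper array between RowList and CellList.
func EncodeRowList(rows [][]CellInfo) []byte {
	out := []byte{0x90 | byte(len(rows))} // fixarray header: RowList
	for _, cells := range rows {
		out = append(out, 0x90|byte(len(cells))) // fixarray header: CellList
		for _, c := range cells {
			out = append(out, 0x92)                        // fixarray header: CellInfo
			out = append(out, c.NID)                       // positive fixint
			out = append(out, 0xd4, cellStateExt, c.State) // fixext 1
		}
	}
	return out
}

// EncodeRowListHex returns the encoding as a lowercase hex string.
func EncodeRowListHex(rows [][]CellInfo) string {
	return hex.EncodeToString(EncodeRowList(rows))
}

func main() {
	rows := [][]CellInfo{
		{{11, UpToDate}, {17, OutOfDate}},
		{{11, Feeding}},
		{{11, Corrupted}, {15, Discarded}, {23, UpToDate}},
	}
	fmt.Println(EncodeRowListHex(rows))
}
```

The output corresponds to the part of the 'have' line that follows the `22` byte (NumReplicas) in the test failure above.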
-
- 06 Aug, 2024 1 commit
-
-
Kirill Smelkov authored
* master:
  go/neo/neonet: DialLink: Fix SIGSEGV in case client handshake fails
  go/neo/neonet: Demonstrate DialLink misbehaviour when all handshake attempts fail
-
- 04 Aug, 2024 2 commits
-
-
Levin Zimmermann authored
In case the last 'handshakeClient' call returns an error, 'DialLink' returns 'link = nil, err = nil'. Callers of 'DialLink' then don't recognize that 'link' is 'nil', as it's the convention to only check if 'err' is 'nil', which leads to a 'segmentation violation' as soon as subsequent code tries to access fields of 'link':

```
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x14 pc=0x7087ae]

goroutine 5 [running]:
lab.nexedi.com/kirr/neo/go/neo/neonet.(*NodeLink).NewConn(0x0)
	/srv/slapgrid/slappart82/srv/runner/instance/slappart6/software_release/parts/wendelin.core/wcfs/neo/go/neo/neonet/connection.go:404 +0x4e
lab.nexedi.com/kirr/neo/go/neo/xneo.Dial.func1()
	/srv/slapgrid/slappart82/srv/runner/instance/slappart6/software_release/parts/wendelin.core/wcfs/neo/go/neo/xneo/connect.go:138 +0x52
lab.nexedi.com/kirr/neo/go/internal/xio.WithCloseOnErrCancel.func2()
	/srv/slapgrid/slappart82/srv/runner/instance/slappart6/software_release/parts/wendelin.core/wcfs/neo/go/internal/xio/xio.go:114 +0x6a
created by lab.nexedi.com/kirr/neo/go/internal/xio.WithCloseOnErrCancel in goroutine 21
	/srv/slapgrid/slappart82/srv/runner/instance/slappart6/software_release/parts/wendelin.core/wcfs/neo/go/internal/xio/xio.go:109 +0x1ad
```

This patch fixes this issue so that now 'err' and 'link' are never both 'nil' again.

/reviewed-by @kirr
/reviewed-on !10
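The invariant "never return link = nil together with err = nil" can be sketched in isolation. This is not the actual 'DialLink' code: `DialAny`, `Link`, `RejectAll` and `AcceptFrom` are hypothetical names, and the retry loop is only an assumption about how several handshake attempts might be chained.

```go
package main

import (
	"errors"
	"fmt"
)

// Link is a hypothetical stand-in for *neonet.NodeLink.
type Link struct{ Attempt int }

// DialAny runs handshake for attempts 1..n and returns the first link that
// succeeds. When every attempt fails it returns the last handshake error,
// so the result is never (nil, nil).
func DialAny(n int, handshake func(attempt int) (*Link, error)) (*Link, error) {
	err := errors.New("no handshake attempts configured")
	for i := 1; i <= n; i++ {
		link, herr := handshake(i)
		if herr == nil && link != nil {
			return link, nil
		}
		if herr == nil {
			// Guard against a handshake that itself breaks the convention.
			herr = errors.New("handshake returned neither link nor error")
		}
		err = fmt.Errorf("handshake attempt %d: %w", i, herr)
	}
	return nil, err
}

// RejectAll behaves like a server that rejects every handshake.
func RejectAll(attempt int) (*Link, error) {
	return nil, errors.New("unexpected EOF")
}

// AcceptFrom returns a handshake that fails before attempt n and succeeds from n on.
func AcceptFrom(n int) func(int) (*Link, error) {
	return func(attempt int) (*Link, error) {
		if attempt < n {
			return nil, errors.New("unexpected EOF")
		}
		return &Link{Attempt: attempt}, nil
	}
}

func main() {
	link, err := DialAny(2, RejectAll)
	fmt.Println(link == nil, err)
}
```

With this shape, callers that only check 'err' stay safe, because a nil link always comes with a non-nil error.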
-
Kirill Smelkov authored
Levin found that when all handshake attempts fail, DialLink returns both link=nil and err=nil, which breaks what callers expect and leads to a segmentation fault when accessing that nil link.

-> Add a test to demonstrate the problem. With xfail removed that test currently fails as

    --- FAIL: TestDialLink_AllHandshakeErr (0.00s)
    panic: lab.nexedi.com/kirr/neo/go/neo/neonet.TestDialLink_AllHandshakeErr.gox.func4.1: lab.nexedi.com/kirr/neo/go/neo/neonet.TestDialLink_AllHandshakeErr.func2: DialLink to handshake-rejecting server:
    have: link=<nil> err=<nil>
    want: link=<nil> err=client:1 - server:2: handshake (client): unexpected EOF [recovered]

We will fix the problem in the next patch.

/reported-by @levin.zimmermann
/reported-at !10
-
- 23 Jul, 2024 2 commits
-
-
Levin Zimmermann authored
I noticed a minor mistake I made when writing the tests. It needs to be fixed in order to have reliable test results. /reviewed-by @kirr /reviewed-on kirr/neo!9
-
Levin Zimmermann authored
During our work on 'wendelin.core' URI normalization, Kirill Smelkov noted an inconsistency between the NEO/py and NEO/go URI parsers [1]: NEO/py drops empty query options, while NEO/go preserves them. Let's fix this inconsistency by adjusting NEO/go to the NEO/py behaviour. [1] nexedi/wendelin.core!28 (comment 212447) /reviewed-by @kirr /reviewed-on kirr/neo!9
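A minimal sketch of dropping empty query options with Go's standard `net/url` package follows. `DropEmptyOptions` is a hypothetical helper, not the NEO/go parser, and it assumes that "empty" means an option whose value is the empty string (for example `ca=` or a bare `compress`).

```go
package main

import (
	"fmt"
	"net/url"
)

// DropEmptyOptions removes query options whose value is empty and returns
// the re-encoded query. url.Values.Encode sorts the result by key.
func DropEmptyOptions(rawQuery string) (string, error) {
	opts, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", err
	}
	for name, values := range opts {
		kept := values[:0]
		for _, v := range values {
			if v != "" {
				kept = append(kept, v)
			}
		}
		if len(kept) == 0 {
			delete(opts, name) // every value was empty: drop the option
		} else {
			opts[name] = kept
		}
	}
	return opts.Encode(), nil
}

func main() {
	q, err := DropEmptyOptions("ca=&read-only=1&compress")
	fmt.Println(q, err)
}
```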
-
- 21 Jul, 2024 1 commit
-
-
Kirill Smelkov authored
Hello Kirill,

in nexedi/neoppod!18 and nexedi/neoppod!21 we could find a common solution for a zurl format that previously diverged between NEOgo and NEOpy. The purpose of this MR is to sync again NEOgo and NEOpy zurl format. After merging this, we can continue to sync NEO zurl format in 'wendelin.core' & 'slapos'. Then we finally have unified approach again, which simplifies understanding and reduces unnecessary mental overhead. As this is strongly related to nexedi/neoppod!21 I thought it'd be a good idea to generally reduce difference and to replace WIP commits with merged NEOpy upstream commits.

Best, Levin

/reviewed-by @kirr
/reviewed-on kirr/neo!7

* lev/sync-zurl:
  client: Don't allow oPtion_nAme in zurl
  app: Remember SSL credentials so that it is possible to retrieve them
  client: Allow to force TLS via neos:// scheme
  client: Don't allow master_nodes and name to be present in options
  Revert "."
  Revert "Y client: Fix URI scheme to move credentials out of query"
  Revert "X Adjust NEO/go to neo:// URL change + py fixups"
  Revert "fixup! Y client: Fix URI scheme to move credentials out of query"
  Revert "Y client: Don't allow master_nodes and name to be present in options"
  go/client/zurl: Sync format to py upstream
-
- 19 Jul, 2024 10 commits
-
-
Kirill Smelkov authored
Julien notes this is very likely unneeded: nexedi/neoppod!21 (diffs, comment 195929) We had it like this since 01a01c8c (client: Add support for zodburi), but I rechecked zodburi codebase now and it does not do any similar lowering anywhere. So drop support for case normalization in zurl options. /cc @levin.zimmermann /reviewed-by @jm /reviewed-on nexedi/neoppod!21 (cherry-picked from commit 798c9f25)
-
Kirill Smelkov authored
Unfortunately after creating SSL context it is not possible, or at least I could not find how, to retrieve original credentials with which the context was created. However wendelin.core needs to be able to take a client storage, reconstruct zurl to refer to that particular storage, and pass that zurl to wcfs, so that wcfs, in turn, could access the same ZODB database. Given a NEO client instance, it is already possible to retrieve master_nodes, cluster name, and detect whether SSL is in use. However without being able to retrieve original SSL credentials, reconstructed zurl will not be complete and wcfs won't be able to use exactly the same secrets as python part does. -> Help wendelin.core by remembering which ca/cert/key were used to build SSL context. This information is used by zstor_2zurl in wendelin.core here: https://lab.nexedi.com/nexedi/wendelin.core/blob/885b3556/lib/zodb.py#L390-418 /cc @levin.zimmermann /reviewed-by @jm /reviewed-on nexedi/neoppod!21
-
Kirill Smelkov authored
Similarly to how it is done with e.g. http:// and https:// - if neos:// is given TLS usage is forced and ca/cert/key must be there either in the URI itself, or in $NEO_CA, $NEO_CERT and $NEO_KEY environment variables mimicking the way how e.g. for https:// TLS credentials are taken from host environment, not from the uri. The latter might be usability convenience, but is also useful for WCFS which needs to be able to remove secrets from uri on zurl normalization. Please see discussion at nexedi/neoppod!18 (comment 184439) for details. /cc @levin.zimmermann /reviewed-by @jm /reviewed-on nexedi/neoppod!21 (cherry-picked from commit bc3e38ea)
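The credential lookup described above could look roughly like this on the Go side. It is a sketch only: the commit itself is for NEO/py, `ResolveCredentials` and `MapEnv` are hypothetical names, the environment is injected as a function (for example `os.Getenv`) to keep the sketch testable, and giving the URI precedence over the environment is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// Credentials holds the three TLS file paths named in the commit message.
type Credentials struct{ CA, Cert, Key string }

// ResolveCredentials returns the TLS credentials for a neos:// zurl. Each of
// ca/cert/key is taken from the URI query if present there, else from
// $NEO_CA, $NEO_CERT and $NEO_KEY as reported by getenv.
func ResolveCredentials(zurl string, getenv func(string) string) (Credentials, error) {
	u, err := url.Parse(zurl)
	if err != nil {
		return Credentials{}, err
	}
	if u.Scheme != "neos" {
		return Credentials{}, fmt.Errorf("%q: this sketch only handles neos://", u.Scheme)
	}
	q := u.Query()
	pick := func(opt, env string) string {
		if v := q.Get(opt); v != "" {
			return v
		}
		return getenv(env)
	}
	c := Credentials{
		CA:   pick("ca", "NEO_CA"),
		Cert: pick("cert", "NEO_CERT"),
		Key:  pick("key", "NEO_KEY"),
	}
	if c.CA == "" || c.Cert == "" || c.Key == "" {
		return Credentials{}, fmt.Errorf("neos:// requires ca, cert and key in the URI or the environment")
	}
	return c, nil
}

// MapEnv adapts a map to the getenv signature; a nil map means an empty environment.
func MapEnv(env map[string]string) func(string) string {
	return func(name string) string { return env[name] }
}

func main() {
	env := MapEnv(map[string]string{"NEO_CA": "/env/ca.pem", "NEO_CERT": "/env/cert.pem", "NEO_KEY": "/env/key.pem"})
	fmt.Println(ResolveCredentials("neos://db@m1:2051", env))
}
```

Keeping the secrets in the environment lets a caller pass around a zurl that contains no credentials at all, which is the property the commit message mentions for WCFS.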
-
Kirill Smelkov authored
Because the list of masters and the cluster name must already be present in netloc and path. Previously e.g. "neo://db@α,β,γ?master_nodes=a,b,c" would mean to use master nodes {a,b,c}, not {α,β,γ}. Now it is treated as an invalid URI to remove the ambiguity. Same for the cluster name. /cc @levin.zimmermann /reviewed-by @jm /reviewed-on nexedi/neoppod!21 (cherry-picked from commit 22ccebd6)
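The same rule expressed as a Go check, as a sketch only: the commit above is for NEO/py, and `CheckOptions` is a hypothetical helper that looks at the query part of a zurl alone.

```go
package main

import (
	"fmt"
	"net/url"
)

// reserved lists the options that would duplicate what netloc and path
// already carry: the master list and the cluster name.
var reserved = []string{"master_nodes", "name"}

// CheckOptions rejects a query that contains any reserved option.
func CheckOptions(rawQuery string) error {
	opts, err := url.ParseQuery(rawQuery)
	if err != nil {
		return err
	}
	for _, name := range reserved {
		if _, ok := opts[name]; ok {
			return fmt.Errorf("invalid zurl: option %q must not be present in the query", name)
		}
	}
	return nil
}

func main() {
	fmt.Println(CheckOptions("master_nodes=a,b,c"))
	fmt.Println(CheckOptions("read-only=1"))
}
```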
-
Levin Zimmermann authored
This reverts commit a2f192cb. This has been merged upstream with nexedi/neoppod@17af7f27. We should rather cherry-pick upstream commit.
-
Levin Zimmermann authored
This reverts commit b9a42957. In nexedi/neoppod!18 and nexedi/neoppod!21 a common solution for a zurl format was found. This common format keeps credentials in the query, therefore we should revert patch b9a42957.
-
Levin Zimmermann authored
This reverts the py part of kirr/neo@8c974485. go parts of this patch are handled in 70c0a984.
-
Levin Zimmermann authored
This reverts commit cf685fb5. This used to be a divergence between NEO/py and NEO/go, however in nexedi/neoppod!18 and nexedi/neoppod!21 a common solution for a zurl format was found. This common format keeps credentials in the query, therefore we should revert patch cf685fb5.
-
Levin Zimmermann authored
This reverts commit 6047f893 in order to replace it with py upstream commit nexedi/neoppod@22ccebd6.
-
Levin Zimmermann authored
NEO/go and NEO/py zurl format diverged over time: - kirr/neo@8c974485 However with nexedi/neoppod!21 a common solution was found. From there, this patch aims to adjust NEO/go zurl format to be in sync with NEO/py zurl format again.
-
- 02 Feb, 2024 3 commits
-
-
Kirill Smelkov authored
* master: go/zodb: Handle common options in zurl in generic layer
-
Kirill Smelkov authored
/reviewed-by @kirr
/reviewed-on kirr/neo!4

* kirr/t+new-uri:
  Revert "Y client: Adjust URI scheme to move client-specific options to fragment"
  fixup! client.go: Fix URI client option parsing for supported + unsupported options
  client.go: Fix URI client option parsing for supported + unsupported options
  fixup! client_test: Add tests for NEO URI parser
  client_test: Add tests for NEO URI parser
  fixup! client: Refactor openClientByURL for easier testing
  client: Refactor openClientByURL for easier testing
  Y go/zodb: Handle common options in zurl in generic layer
-
Kirill Smelkov authored
Offload drivers from handling options such as ?read-only=1 and force them to deal with such options only via DriverOptions, never zurl. See added comment for details. /reviewed-by @levin.zimmermann /reviewed-on !4
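The split between generic and driver-specific options can be sketched as follows. This is an illustration under assumptions, not the go/zodb implementation: `DriverOptions` is a reduced stand-in with a single field, and `SplitCommonOptions` is a hypothetical helper showing how a generic layer might strip `?read-only=1` from the zurl before a driver sees it.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// DriverOptions is a reduced stand-in for the options passed to a driver.
type DriverOptions struct {
	ReadOnly bool
}

// SplitCommonOptions removes the common read-only option from zurl and
// returns the remaining zurl together with the options derived from it.
func SplitCommonOptions(zurl string) (string, DriverOptions, error) {
	var opt DriverOptions
	u, err := url.Parse(zurl)
	if err != nil {
		return "", opt, err
	}
	q := u.Query()
	if vs, ok := q["read-only"]; ok {
		ro, err := strconv.ParseBool(vs[0])
		if err != nil {
			return "", opt, fmt.Errorf("read-only: %w", err)
		}
		opt.ReadOnly = ro
		q.Del("read-only")
		u.RawQuery = q.Encode() // the driver never sees read-only
	}
	return u.String(), opt, nil
}

func main() {
	fmt.Println(SplitCommonOptions("neo://db@m1:2051?read-only=1&compress=1"))
}
```

With this split, a driver only has to understand its own options and reads the read-only flag from the options struct.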
-
- 29 Jan, 2024 8 commits
-
-
Levin Zimmermann authored
This reverts commit kirr/neo@4c9414ea. This patch was added at a time when nexedi/neoppod!18 wasn't resolved yet, but we already wanted to proceed with WCFS. Now the NEO MR is resolved and we decided to mostly leave the NEO zurl as it was originally implemented in nexedi/neoppod!6. This means we don't need this patch anymore which changed the NEO zurl format.
-
Kirill Smelkov authored
readonly is handled by common zodb.OpenDriver.
-
Levin Zimmermann authored
Before this patch, the parser ignored options which were already supported by the client (for instance 'read-only') and even raised an error. But the client can already use this option: as a9246333 describes, this should happen in the local storage URL parser. Furthermore, not-yet-supported client options (for instance compress) broke the NEO client before this patch. Now these options only raise a warning which informs the user that they are ignored. Why? We want to use pre-complete NEO in real-world projects together with NEO/py clusters. Those real-world projects may already specify options which aren't supported by our NEO/go client yet. But it doesn't matter so much, because those options are mostly relevant for other NEO/py cluster clients (e.g. zope nodes). Instead of filtering those parameters in a higher level (e.g. SlapOS) before passing them to NEO/go, NEO/go should already support any valid NEO URL and raise warnings for not yet implemented features.
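The warn-instead-of-fail policy can be sketched like this. The option sets are illustrative only ('read-only' and 'compress' are the two examples named above), `CheckClientOptions` is a hypothetical helper, and treating every other option as an error is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
)

// Illustrative option sets; the real lists in NEO/go may differ.
var (
	supported       = map[string]bool{"read-only": true}
	notYetSupported = map[string]bool{"compress": true}
)

// CheckClientOptions returns one warning per option that is valid for NEO
// but not implemented yet, and an error for options that are not known at all.
func CheckClientOptions(rawQuery string) ([]string, error) {
	opts, err := url.ParseQuery(rawQuery)
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(opts))
	for name := range opts {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order of warnings

	var warnings []string
	for _, name := range names {
		switch {
		case supported[name]:
			// nothing to report
		case notYetSupported[name]:
			warnings = append(warnings, fmt.Sprintf("option %q is not supported yet and is ignored", name))
		default:
			return warnings, fmt.Errorf("invalid option %q", name)
		}
	}
	return warnings, nil
}

func main() {
	fmt.Println(CheckClientOptions("compress=1&read-only=1"))
}
```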
-
Kirill Smelkov authored
- use simplified parseURL signature - DriverOptions are not passed nor changed there. - read-only is handled by generic zodb layer not neo.parseURL .
-
Levin Zimmermann authored
This test was missing so far. In particular recent changes of the NEO URI scheme [1], but also problems with valid old URIs [2], underlined the need for comprehensive NEO URI parser tests. [1] kirr/neo@4c9414ea [2] 573514c6 (comment 184417)
-
Kirill Smelkov authored
  - no need to pass DriverOptions into parseURL - it is only zurl that is parsed, and also DriverOptions should not be changed by the opener.
  - no need to document "If anything fails within this process an error and nil are returned." because that is standard omnipresent Go convention.
-
Levin Zimmermann authored
With all the recent changes of the NEO URI scheme we need to reliably test the function which parses the URI and converts it into the different parameters. Testing is much simpler if we can analyse the URI parsing in isolation. Therefore this patch moves NEO URI parsing to an external function.
-
Kirill Smelkov authored
Offload drivers from handling options such as ?read-only=1 and force them to deal with such options only via DriverOptions, never zurl. See added comment for details.
-
- 22 Aug, 2023 1 commit
-
-
Kirill Smelkov authored
I missed the following build failure in go/neo/cmd:

    # lab.nexedi.com/kirr/neo/go/neo/cmd/neo
    ./storage.go:128:37: cannot use master (variable of type string) as []string value in argument to neo.NewStorage
-
- 02 Aug, 2023 8 commits
-
-
Levin Zimmermann authored
See !2 for discussion, context and details.

/reviewed-by @kirr

* t-with-multiple-master-nodes:
  fixup! client_test: Add nmaster={1,2} to test matrix
  fixup! client_test: Support test cluster /w >1 master
  fixup! TalkMaster: Switch master if dialed M is secondary
  fixup! Node: Add support for NEO cluster with > 1 master
  fixup! Dial: Catch NotPrimaryMaster & return custom error
  fixup! proto: Implement Error for NotPrimaryMaster
  fixup! proto.NotPrimaryMaster: Fix .Primary data type (2)
  fixup! proto.NotPrimaryMaster: Fix .Primary data type (1)
  client_test: Add nmaster={1,2} to test matrix
  client_test: Support test cluster /w >1 master
  proto.NotPrimaryMaster: Fix .Primary data type
  TalkMaster: Switch master if dialed M is secondary
  Dial: Catch NotPrimaryMaster & return custom error
  proto: Implement Error for NotPrimaryMaster
  openClientByURL: Fix for >1 master (split URL host)
  Client.URL: Fix incomplete URL if > 1 master nodes
  Node: Add support for NEO cluster with > 1 master
-
Kirill Smelkov authored
Actually do test nmaster=2 case.
-
Kirill Smelkov authored
  - show all options in error context and ran test kind
  - skip nmaster > 1 for NEO/go as that is currently not implemented on NEO/go server
  - use json for interacting with runneo.py, so that we can use whatever builtin type for any argument without hardcoding ad-hoc handling of specific arguments inside runneo.py.
  - adjust comments + cosmetics
-
Kirill Smelkov authored
  - validate received NotPrimaryMaster
  - use Address.String() instead of printf with format that works only for ipv6
  - add some logging, comments and TODO
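Why a fixed printf format is fragile can be shown with the standard library. The sketch below is not the NEO/go `Address` type: it is a hypothetical reimplementation of a `String` method on top of `net.JoinHostPort`, and the bracketed `"[%s]:%d"` format is only an assumed example of a format that suits IPv6 literals and nothing else.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// Address is a simplified stand-in for the NEO/go address type.
type Address struct {
	Host string
	Port uint16
}

// String formats the address as host:port. net.JoinHostPort adds brackets
// only when the host contains a colon, as IPv6 literals do.
func (a Address) String() string {
	return net.JoinHostPort(a.Host, strconv.Itoa(int(a.Port)))
}

func main() {
	for _, a := range []Address{{"::1", 2051}, {"127.0.0.1", 2051}, {"m1.example", 2051}} {
		// The second column shows what an IPv6-only format would produce.
		fmt.Printf("%-22s %s\n", a, fmt.Sprintf("[%s]:%d", a.Host, a.Port))
	}
}
```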
-
Kirill Smelkov authored
  - no need to keep Node.MasterAddr anymore - the address of current PM is managed by TalkMaster and is provided as part of operational context to user functions that TalkMaster runs.
  - correct docstrings.
  - cosmetics.
-
Kirill Smelkov authored
Expect NotPrimaryMaster only if we are trying to connect to a master.
-
Kirill Smelkov authored
Provide details in the error message.
-
Kirill Smelkov authored
Change .Primary type from int8 back to int32.

5d93e434 says that .Primary type is not NodeID. That is true, but changing it to int8 was a mistake:

1. PSignedNull is explicitly defined to come with '!l' struct code, which according to https://docs.python.org/3/library/struct.html#module-struct comes as 4-bytes integer on the wire:

   https://lab.nexedi.com/nexedi/neoppod/blob/v1.12-13-gf2ea4be2/neo/lib/protocol.py#L560-562

2. verifying this via serializing NotPrimaryMaster on NEO/py also confirms that .Primary occupies 4 bytes, not one:

    In [1]: from neo.lib.protocol import NotPrimaryMaster
    In [2]: NotPrimaryMaster(0x01020304, [('m111', 111), ('m222', 222)])._body
    Out[2]: '\x01\x02\x03\x04\x00\x00\x00\x02\x00\x00\x00\x04m111\x00o\x00\x00\x00\x04m222\x00\xde'
             ^^^^^^^^^^^^^^^^ NOTE NOTE NOTE

-> change .Primary type back to being 4-bytes integer, but to int32 instead of NodeID because, as 5d93e434 correctly says, .Primary comes as array index, not a node ID. The following place of NEO/py code explicitly confirms this:

   https://lab.nexedi.com/nexedi/neoppod/blob/v1.12-13-gf2ea4be2/neo/master/handlers/identification.py#L155-159

Add corresponding test.
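The quoted NEO/py body can be decoded with a few lines of Go to confirm the 4-byte reading. This is a sketch, not the generated zproto-marshal.go code: `DecodeNotPrimaryMaster` is a hypothetical helper, and the layout it assumes (big-endian int32 index, uint32 list length, then per entry a uint32 string length, the host bytes and a uint16 port) is inferred from the bytes shown above.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// SampleBody is the NotPrimaryMaster body printed by NEO/py in the commit message.
const SampleBody = "\x01\x02\x03\x04\x00\x00\x00\x02\x00\x00\x00\x04m111\x00o\x00\x00\x00\x04m222\x00\xde"

// Address is a simplified stand-in for the NEO/go address type.
type Address struct {
	Host string
	Port uint16
}

// NotPrimaryMaster holds the decoded fields.
type NotPrimaryMaster struct {
	Primary         int32 // index into KnownMasterList, not a node ID
	KnownMasterList []Address
}

var errTruncated = errors.New("truncated NotPrimaryMaster body")

// DecodeNotPrimaryMaster decodes the layout inferred from the sample body.
func DecodeNotPrimaryMaster(b []byte) (NotPrimaryMaster, error) {
	var m NotPrimaryMaster
	if len(b) < 8 {
		return m, errTruncated
	}
	m.Primary = int32(binary.BigEndian.Uint32(b)) // '!l': 4 bytes, big-endian
	n := binary.BigEndian.Uint32(b[4:])
	b = b[8:]
	for i := uint32(0); i < n; i++ {
		if len(b) < 4 {
			return m, errTruncated
		}
		l := binary.BigEndian.Uint32(b)
		b = b[4:]
		if uint64(len(b)) < uint64(l)+2 {
			return m, errTruncated
		}
		addr := Address{Host: string(b[:l]), Port: binary.BigEndian.Uint16(b[l:])}
		m.KnownMasterList = append(m.KnownMasterList, addr)
		b = b[l+2:]
	}
	return m, nil
}

func main() {
	m, err := DecodeNotPrimaryMaster([]byte(SampleBody))
	fmt.Printf("%#x %v %v\n", m.Primary, m.KnownMasterList, err)
}
```

Reading .Primary as a single byte would shift every following field by three bytes, so the list length and the addresses would no longer decode.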
-
- 01 Aug, 2023 1 commit
-
-
Kirill Smelkov authored
Rerun `go generate`. As the diff in zproto-marshal.go shows, changing NotPrimaryMaster.Primary type from NodeID to int8 actually does make a difference. This happens because NodeID type is based on int32, and changing that to int8 changes how the NotPrimaryMaster structure is laid out in memory and on the wire. The changes to zproto-marshal.go in 5d93e434 seem to have been done by hand and do not match the change to proto.go, even though the head of zproto-marshal.go says // Code generated by protogen.go; DO NOT EDIT.
-
- 18 Jul, 2023 1 commit
-
-
Levin Zimmermann authored
Tests should work both with one master and with more than one master.
-