fixup! download: add support for slapos.libnetworkcache
See commit 47ab68e0.
-
Owner
This adds 2 things:
- Do not cache restricted data. I think it's quite obvious but ask me otherwise.
- New ${networkcache-section:file-urlmd5-exclude} option to exclude URLs. This is required because caching URLs whose contents change (e.g. https://lab.nexedi.com/nexedi/slapos/raw/master/software/fluentd/instance.cfg) does not work (the server returns a random version, which is rarely the correct one, and the client uploads all the time). Given that GitLab URLs don't distinguish branches from tags or commits, we'd need something like:
file-urlmd5-exclude = !https://lab\.nexedi\.com/[^/]+/[^/]+/raw/[0-9a-f]{40}/ !https://lab\.nexedi\.com/[^/]+/[^/]+/raw/[^/]*\d+\.\d+\.\d+ https://lab\.nexedi\.com/
Waiting for comments before testing.
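To make the intended filtering concrete, here is a minimal sketch of how such an option value could be evaluated. It is not the code from the commit: the helper names parse_exclude and is_excluded are hypothetical, and it assumes that patterns are whitespace-separated regular expressions matched from the start of the URL, that a leading ! marks a pattern as "do cache", and that the first matching pattern wins (which is what the ordering of the example above suggests).

```python
import re

# The example value from the comment above, as a whitespace-separated list.
EXAMPLE = r"""
  !https://lab\.nexedi\.com/[^/]+/[^/]+/raw/[0-9a-f]{40}/
  !https://lab\.nexedi\.com/[^/]+/[^/]+/raw/[^/]*\d+\.\d+\.\d+
  https://lab\.nexedi\.com/
"""

def parse_exclude(value):
    """Split the option value into (is_include, compiled_regex) rules.

    Hypothetical helper: a leading '!' negates the pattern, i.e. a URL
    matching it is explicitly NOT excluded.
    """
    rules = []
    for token in value.split():
        include = token.startswith('!')
        rules.append((include, re.compile(token[1:] if include else token)))
    return rules

def is_excluded(url, rules):
    """Return True if the URL must not go through the network cache.

    Assumption: rules are tried in order and the first match decides.
    A URL matching no rule is cached as usual.
    """
    for include, regex in rules:
        if regex.match(url):
            return not include
    return False

RULES = parse_exclude(EXAMPLE)
BASE = "https://lab.nexedi.com/nexedi/slapos/raw/"
PATH = "/software/fluentd/instance.cfg"
for ref in ("master", "1.0.123", "a" * 40):
    print(ref, is_excluded(BASE + ref + PATH, RULES))
```

Under these assumptions the order of the patterns is significant: with the catch-all https://lab\.nexedi\.com/ pattern moved to the front, every URL on that host would be excluded, including those pinned to a commit or a version tag.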
-
Owner
I'd really prefer to have the same syntax and same vocabulary that we use for binary network cache. See https://lab.nexedi.com/nexedi/slapos.core/-/blob/acff45335692847d3aa5e87017b6325abab75699/slapos.cfg.example#L162
So it would be file-urlmd5-blacklist and the content would be exactly the same as what I pointed to above.
-
Owner
I don't know this well, so I don't have much to say except that it's not good to use words like "blacklist" ( https://en.wikipedia.org/wiki/Blacklist_(computing)#Controversy_over_use_of_the_term ). We should really consider changing slapos.cfg instead, or at least not spread this kind of terminology any further.
The syntax is ! for include and without to exclude? We could use two options maybe (maybe the problem is in which order to apply them?)
Newer GitLab uses a - in the URL: https://lab.nexedi.com/nexedi/slapos/-/raw/master/software/fluentd/instance.cfg ; once we make the real regex we should allow these.
Is https://developer.mozilla.org/en-US/docs/Web/API/URL_Pattern_API more suitable than regular expressions here? I think we don't like the idea of adding dependencies in buildout, so it might be a bad idea. EDIT: this seems to be a different kind of thing, it's not just a way to describe URL patterns with a string.
Another thing I am wondering: isn't it possible/better to refuse the URLs on the server side?
-
Owner
it's not good to use words like "blacklist" ( https://en.wikipedia.org/wiki/Blacklist_(computing)#Controversy_over_use_of_the_term )
I'd use 'blacklist' just because I find this controversy stupid.
I chose something that was a little closer to Git ($GIT_DIR/info/exclude, but could not do exactly the same because too limited) and I also had the feeling it was less strange to have negative patterns by naming like this.
The syntax is ! for include and without to exclude ?
So yes, like gitignore. I think everyone knows this way of filtering so it should be fine. And well, this option will be so rarely touched.
we could use two options maybe ( maybe the problem is in which order to apply them ? )
Indeed the order matters and with 2 options you lose a lot.
Another thing I am wondering, isn't it possible/better to refuse the URLs on server side ?
Unfortunately no, this is too late. First because one of the purposes is to avoid useless shadir lookups for things that should not be cached. And also the way a blob is indexed is exclusively the client's business.
I'd really prefer to have the same syntax and same vocabulary that we use for binary network cache.
You don't take into account what I tried to solve by doing it differently. Do you dispute the goal?
-
Owner
I'm waiting on this to cherry-pick/fixup it on !30 (merged). Is this ready to push on master?
-
mentioned in merge request !30 (merged)
-
Owner
Pushed to master. You can include it in !30 (merged).
-
Owner
Unfortunately, @jp is against the principle of excluding some stuff from shacache. The philosophy is "everything buildout is using should go into shacache". The files with URLs pointing to a branch will of course change over time, but with the new filter mechanism of slapos.libnetworkcache, we can make buildout select the most recent version of the files. @jp is aware that the state of the files in shacache for the master branch will be incoherent (but in a sense this is already the same problem when using GitLab directly: only a local git repository or a tag is coherent).
So I will partially revert this commit (only the part for exclude) so that Xavier can still rebase with the part for not caching authenticated material.
Sorry for not warning beforehand. I just had the discussion yesterday afternoon.
After slapos.libnetworkcache!9 (merged) is done, we can discuss the changes needed in buildout for using those cached branches (maybe add an option --accept-shacache-incoherent-buildout-files).