Commit 3985f158 authored by Kazuhiko Shiozaki's avatar Kazuhiko Shiozaki

CMFCategory: keep only the first occurrence of each category while preserving order.

parent c3f1131f
......@@ -30,7 +30,7 @@
"""\
ERP portal_categories tool.
"""
from collections import deque
from collections import deque, OrderedDict
import re
from BTrees.OOBTree import OOTreeSet
from Products.ERP5Type.Globals import InitializeClass, DTMLFile
......@@ -1189,7 +1189,8 @@ class CategoryTool(BaseTool):
# note that we don't want to cast as set at this point to keep ordering (and duplicates).
value = [x for x in tuple(value) if x != relative_url]
context.categories = value = tuple(value)
# Keep only the first occurrence of each category while preserving order.
context.categories = value = tuple(OrderedDict.fromkeys(value))
if context.isTempDocument():
return
......
......@@ -221,6 +221,14 @@ class TestCMFCategory(ERP5TypeTestCase):
p1.setRegionList(region_list)
self.assertEqual(p1.getRegion(), None)
region_list = [self.region1, self.region2, self.region1]
p1.setRegionList(region_list)
self.assertEqual(p1.getRegionList(), self.region_list)
p1.setRegionValueList([region_value_list[0],
region_value_list[1],
region_value_list[0]])
self.assertEqual(p1.getRegionList(), self.region_list)
def test_03_CategoryValue(self):
# Test if we can get categories values
region_value = self.portal.portal_categories.resolveCategory('region/%s' % self.region1)
......@@ -1251,13 +1259,13 @@ class TestCMFCategory(ERP5TypeTestCase):
def _set(*args, **kw):
return category_tool._setCategoryMembership(person, *args, **kw)
_set(bc.id, list('aa'))
self.assertEqual(get(bc.id), list('aa'))
self.assertEqual(get(bc.id), list('a'))
_set(bc.id, list('baa'))
self.assertEqual(get(bc.id), list('aba'))
self.assertEqual(get(bc.id), list('ab'))
_set(bc.id, map(base, 'bb'), 1)
self.assertEqual(get(bc.id), list('bb'))
self.assertEqual(get(bc.id), list('b'))
_set(bc.id, map(base, 'abb'), 1)
self.assertEqual(get(bc.id), list('bab'))
self.assertEqual(get(bc.id), list('ba'))
_set(bc.id, ())
def test_relatedIndex(self):
......
  • Shouldn't we make a special setter for that ? a bit like set{Category}ValueSet but preserving the order ?

  • ( I guess there are code depending on the ability to set duplicates categories, but I don't know examples of specific cases )

  • I cannot imagine any usecase where we need to store duplicate category relations intentionally... and that cannot be represented on catalog. Do you really want to support such functionality ?

  • That's true that it's not supported by catalog. I don't know such a use case either, but I see we have Set getters and setter and I guess we added them for the purpose of controlling presence of duplicates, so probably there's a reason to allow duplicates. If we disallow duplicates at CMFActivity level , Set getters becomes a useless.

  • In my opinion, Set setter is badly implemented because it does not guarantee the order of values thus Default getter value is not predictable. As a part of this change,Set getter / setter should become an alias of normal List getter / setter.

  • @seb Set setter is initially implemented by you at 2004 😄 (69ba1a39).

    @romain Set getter is initially implemented by you at 2006 😄 (2c00c5a1).

    Can I ask your opinion ?

  • I agree with @jerome. Set accessors must be used if you want to get rid of duplicated categories, and CMFCategory change must be reverted. I also agree that the order must be preserved by the Set accessors, and so, they must be fixed.

  • Well, if Set-type getter returns a set, it is anyway non-ordered object in Python, thus the order of the return value is unpredictable and I feel it is useless. Or should Set-type getter return a list or a tuple (keeping the order but having no duplicate) that sounds a bit strange ?

    I'm still wondering if we REALLY have a strong requirement to store duplicate category relations. Anybody @nexedi has ?

    Anyway current Set-type accessors are bad because :

    • Set-type setter does not preserve the order when stroging into categories property.
    • Set-type getter does not preserve the order stored in categories property.

    BTW, the reason why I only modified setter in this commit is for the performance, i.e. we can assume that getters are much more often called than setters and once setter is well implemented to remove duplicate, we don't need to remove duplicate in getter.

  • Do we really use set accessors? Since the order of categories is necessary to determine the default value and set is unordered, set is simply incompatible with ERP5 category system. No?

  • Or should Set-type getter return a list or a tuple (keeping the order but having no duplicate) that sounds a bit strange ?

    To me it is not so strange, I understand these accessor qualificators as expectations on the semantic level rather than on type level: a Set setter should accept any vector and may complain when finding duplicate values (or silently discard all but one, but then we need to specify which is kept). Another example is that List getter could return a tuple if it wants, IMHO.

    But besides this, I agree with @kazuhiko that the point here is whether we have a use-case of multiple all-identical relationships on documents. If there is none, then I think related API should either discard duplicates (again, specifying which is kept) or complain on duplicates (I tend to prefer the latter). Accepting meaningless garbage is only asking for trouble later:

    • maybe when some other change will suddenly cause the shadowed entry to become significant
    • maybe when another modification codepath will trash all but one (ex: set 2 source relations to the same document on a document with a monovalued my_source field, then change the value in that field, bam both relations are replaced with a single new one to the new source).

    Do note that "all-identical" is a critically important part in my argument:

    • same base category
    • same related document
    • Set-type getter ALREADY returns a list instance, not a set instance. So the issue is just 'it is unordered'.
    • Set-type setter has hardcoded keep_default=1. Even if we keep this behaviour, I think we should keep the order in the rest.
    • possible usage of 'duplicate' categories is representing 'steps' like ('xxx/step1', 'xxx/step2', 'xxx/step1', 'xxx/step3'), according to @jp

    So I created a new merge request !862 (merged).

  • mentioned in merge request !862 (merged)

    Toggle commit list
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment