Provide a way to express option values such that relative indentation
is preserved. Tighten config file syntax to allow fewer variations.
Commit 7052070c authored Jan 10, 2013 by Jim Fulton
4 changed files
--- a/src/zc/buildout/buildout.txt
+++ b/src/zc/buildout/buildout.txt
@@ -690,9 +690,84 @@ Now, we'll fix the typo again and we'll get the directories we expect:
 Configuration file syntax
 -------------------------
-As mentioned earlier, buildout configuration files use the format
-defined by the Python ConfigParser module with extensions.  The
-extensions are:
+A buildout configuration file consists of a sequence of sections.  A
+section has a section header followed by 0 or more section options.
+(Buildout configuration files may be viewed as a variation on INI
+files.)
+A section header consists of a section name enclosed in square braces.
+A section name consists of one or more non-whitespace characters
+other than square braces ('[', ']'), curly braces ('{', '}'), colons
+(':') or equal signs ('='). Whitespace surrounding section names is ignored.
+Options consist of option names, followed by optional space or tab
+characters, an optional plus or minus sign, an equal sign, and
+values.  An option value may be spread over multiple lines as long as
+the lines after the first start with a whitespace character.  An
+option name consists of one or more non-whitespace characters other
+than equal signs, square braces ("[", "]"), curly braces ("{", "}"),
+plus signs or colons (":"). The option name '<' is reserved.  An
+option's data consists of the characters following the equal sign on
+the start line, plus the continuation lines.
+Option values have extra whitespace stripped.  How this is done
+depends on whether the value has non-whitespace characters on the
+first line.  If an option value has non-whitespace characters on the
+first line, then each line is stripped and blank lines are removed.
+For example, in::
+  [foo]
+  bar = 1
+  baz = a
+        b
+        c
+.. -> text
+    >>> import pprint, StringIO, zc.buildout.configparser
+    >>> pprint.pprint(zc.buildout.configparser.parse(StringIO.StringIO(
+    ...     text), 'test'))
+    {'foo': {'bar': '1', 'baz': 'a\nb\nc'}}
+The value of ``bar`` is ``'1'`` and the value of ``baz`` is
+``'a\nb\nc'``.
+If the first line of an option has no non-whitespace characters, then the
+value is dedented (with ``textwrap.dedent``), trailing spaces in lines
+are removed, and leading and trailing blank lines are removed.  For
+example, in::
+  [foo]
+  bar =
+  baz =
+    a
+      b
+
+    c
+.. -> text
+    >>> pprint.pprint(zc.buildout.configparser.parse(StringIO.StringIO(
+    ...     text), 'test'))
+    {'foo': {'bar': '', 'baz': 'a\n  b\n\nc'}}
+The value of bar is ``''``, and the value of baz is ``'a\n  b\n\nc'``.
+Lines starting with '#' or ';' characters are comments.  Comments can
+also be placed after the closing square bracket (']') in a section header.
+Buildout configuration data are Python strings, which are bytes in
+Python 2 and unicode in Python 3.
+Sections and options within sections may be repeated.  Multiple
+occurrences of a section are treated as if they were concatenated.
+The last option value for a given name in a section overrides previous
+values.
+In addition to the syntactic details above:
 - option names are case sensitive
@@ -702,18 +777,6 @@ extensions are:
 - option values can be appended or removed using the - and +
  operators.
-The ConfigParser syntax is very flexible.  Section names can contain
-any characters other than newlines and right square braces ("]").
-Option names can contain any characters other than newlines, colons,
-and equal signs, can not start with a space, and don't include
-trailing spaces.
-It is likely that, in the future, some characters will be given
-special buildout-defined meanings.  This is already true of the
-characters ":", "$", "%", "(", and ")".  For now, it is a good idea to
-keep section and option names simple, sticking to alphanumeric
-characters, hyphens, and periods.
 Annotated sections
 ------------------
@@ -839,15 +902,15 @@ examples:
    ...
    ... [debug]
    ... recipe = recipes:debug
-    ... File 1 = ${data-dir:path}/file
+    ... File-1 = ${data-dir:path}/file
-    ... File 2 = ${debug:File 1}/log
+    ... File-2 = ${debug:File-1}/log
    ...
    ... [data-dir]
    ... recipe = recipes:mkdir
    ... path = mydata
    ... """)
-We used a string-template substitution for File 1 and File 2.  This
+We used a string-template substitution for File-1 and File-2.  This
 type of substitution uses the string.Template syntax.  Names
 substituted are qualified option names, consisting of a section name
 and option name joined by a colon.
@@ -861,8 +924,8 @@ substituted.
    Installing data-dir.
    data-dir: Creating directory mydata
    Installing debug.
-    File 1 /sample-buildout/mydata/file
+    File-1 /sample-buildout/mydata/file
-    File 2 /sample-buildout/mydata/file/log
+    File-2 /sample-buildout/mydata/file/log
    recipe recipes:debug
 Note that the substitution of the data-dir path option reflects the
@@ -878,8 +941,8 @@ the buildout:
    Develop: '/sample-buildout/recipes'
    Updating data-dir.
    Updating debug.
-    File 1 /sample-buildout/mydata/file
+    File-1 /sample-buildout/mydata/file
-    File 2 /sample-buildout/mydata/file/log
+    File-2 /sample-buildout/mydata/file/log
    recipe recipes:debug
 We can see that mydata was not recreated.
@@ -904,8 +967,8 @@ _buildout_section_name_ to get the current section name.
    ...
    ... [debug]
    ... recipe = recipes:debug
-    ... File 1 = ${data-dir:path}/file
+    ... File-1 = ${data-dir:path}/file
-    ... File 2 = ${:File 1}/log
+    ... File-2 = ${:File-1}/log
    ... my_name = ${:_buildout_section_name_}
    ...
    ... [data-dir]
@@ -918,8 +981,8 @@ _buildout_section_name_ to get the current section name.
    Uninstalling debug.
    Updating data-dir.
    Installing debug.
-    File 1 /sample-buildout/mydata/file
+    File-1 /sample-buildout/mydata/file
-    File 2 /sample-buildout/mydata/file/log
+    File-2 /sample-buildout/mydata/file/log
    my_name debug
    recipe recipes:debug
@@ -940,8 +1003,8 @@ example, we can leave data-dir out of the parts list:
    ...
    ... [debug]
    ... recipe = recipes:debug
-    ... File 1 = ${data-dir:path}/file
+    ... File-1 = ${data-dir:path}/file
-    ... File 2 = ${debug:File 1}/log
+    ... File-2 = ${debug:File-1}/log
    ...
    ... [data-dir]
    ... recipe = recipes:mkdir
@@ -956,8 +1019,8 @@ It will still be treated as a part:
    Uninstalling debug.
    Updating data-dir.
    Installing debug.
-    File 1 /sample-buildout/mydata/file
+    File-1 /sample-buildout/mydata/file
-    File 2 /sample-buildout/mydata/file/log
+    File-2 /sample-buildout/mydata/file/log
    recipe recipes:debug
    >>> cat('.installed.cfg') # doctest: +ELLIPSIS
@@ -979,8 +1042,8 @@ the data-dir part after the debug part, it will be included before:
    ...
    ... [debug]
    ... recipe = recipes:debug
-    ... File 1 = ${data-dir:path}/file
+    ... File-1 = ${data-dir:path}/file
-    ... File 2 = ${debug:File 1}/log
+    ... File-2 = ${debug:File-1}/log
    ...
    ... [data-dir]
    ... recipe = recipes:mkdir
@@ -994,8 +1057,8 @@ It will still be treated as a part:
    Develop: '/sample-buildout/recipes'
    Updating data-dir.
    Updating debug.
-    File 1 /sample-buildout/mydata/file
+    File-1 /sample-buildout/mydata/file
-    File 2 /sample-buildout/mydata/file/log
+    File-2 /sample-buildout/mydata/file/log
    recipe recipes:debug
    >>> cat('.installed.cfg') # doctest: +ELLIPSIS
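
The whitespace rules that this hunk adds to buildout.txt can be sketched in a few lines of standalone Python 3. This is only an illustration modeled on the documented behavior: `normalize_value` is a hypothetical helper, not a function in zc.buildout, and it takes the text after the equal sign plus the raw continuation lines as separate arguments.

```python
import re
import textwrap

# Same pattern as leading_blank_lines in the configparser.py part of this commit.
leading_blank_lines = re.compile(r"^(\s*\n)+")


def normalize_value(first, continuation=()):
    """Hypothetical helper illustrating the two stripping modes.

    first        -- text following '=' on the option's first line
    continuation -- raw continuation lines (leading whitespace intact)
    """
    value = first.strip()
    block_mode = not value  # nothing but whitespace after '='
    for line in continuation:
        if block_mode:
            # Keep leading whitespace so relative indentation survives.
            line = line.rstrip()
        else:
            # Strip each line and drop blank ones.
            line = line.strip()
            if not line:
                continue
        value = "%s\n%s" % (value, line)
    if value[:1].isspace():
        # Block mode: dedent, then drop leading and trailing blank lines.
        value = leading_blank_lines.sub("", textwrap.dedent(value.rstrip()))
    return value


print(repr(normalize_value(" a", ["        b", "        c"])))
# 'a\nb\nc'
print(repr(normalize_value("", ["    a", "      b", "", "    c"])))
# 'a\n  b\n\nc'
```

The second call keeps the two extra spaces in front of `b`, which is the "relative indentation is preserved" behavior named in the commit message.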

--- a/src/zc/buildout/configparser.py
+++ b/src/zc/buildout/configparser.py
@@ -18,6 +18,7 @@
 # - dict of dicts is a much simpler api
 import re
+import textwrap
 class Error(Exception):
    """Base class for ConfigParser exceptions."""
@@ -44,6 +45,8 @@ class Error(Exception):
    def __repr__(self):
        return self.message
+    __str__ = __repr__
 class ParsingError(Error):
    """Raised when a configuration file does not follow legal syntax."""
@@ -68,19 +71,13 @@ class MissingSectionHeaderError(ParsingError):
        self.lineno = lineno
        self.line = line
-SECTCRE = re.compile(
-    r'\['                                 # [
-    r'(?P<header>[^]]+)'                  # very permissive!
-    r'\]'                                 # ]
-    )
-OPTCRE = re.compile(
-    r'(?P<option>[^:=\s][^:=]*)'          # very permissive!
-    r'\s*(?P<vi>[:=])\s*'                 # any number of space/tab,
-                                          # followed by separator
-                                          # (either : or =), followed
-                                          # by any # space/tab
-    r'(?P<value>.*)$'                     # everything up to eol
-    )
+section_header = re.compile(
+    r'\[\s*(?P<header>[^\s[\]:{}]+)\s*]\s*([#;].*)?$').match
+option_start = re.compile(
+    r'(?P<name>[^\s{}[\]=:]+\s*[-+]?)'
+    r'='
+    r'(?P<value>.*)$').match
+leading_blank_lines = re.compile(r"^(\s*\n)+")
 def parse(fp, fpname):
    """Parse a sectioned setup file.
@@ -92,63 +89,56 @@ def parse(fp, fpname):
    leading whitespace.  Blank lines, lines beginning with a '#',
    and just about everything else are ignored.
    """
-    _sections = {}
+    sections = {}
    cursect = None                            # None, or a dictionary
+    blockmode = None
    optname = None
    lineno = 0
    e = None                                  # None, or an exception
    while True:
        line = fp.readline()
        if not line:
-            break
+            break # EOF
        lineno = lineno + 1
-        # comment or blank line?
-        if line.strip() == '' or line[0] in '#;':
-            continue
-        if line.split(None, 1)[0].lower() == 'rem' and line[0] in "rR":
-            # no leading whitespace
-            continue
-        # continuation line?
+        if line[0] in '#;':
+            continue # comment
        if line[0].isspace() and cursect is not None and optname:
-            value = line.strip()
-            if value:
-                cursect[optname] = "%s\n%s" % (cursect[optname], value)
-        # a section header or option header?
+            # continuation line
+            if blockmode:
+                line = line.rstrip()
+            else:
+                line = line.strip()
+                if not line:
+                    continue
+            cursect[optname] = "%s\n%s" % (cursect[optname], line)
        else:
-            # is it a section header?
-            mo = SECTCRE.match(line)
+            mo = section_header(line)
            if mo:
+                # section header
                sectname = mo.group('header')
-                if sectname in _sections:
+                if sectname in sections:
-                    cursect = _sections[sectname]
+                    cursect = sections[sectname]
                else:
-                    cursect = {}
+                    sections[sectname] = cursect = {}
-                    _sections[sectname] = cursect
                # So sections can't start with a continuation line
                optname = None
-            # no section header in the file?
            elif cursect is None:
+                if not line.strip():
+                    continue
+                # no section header in the file?
                raise MissingSectionHeaderError(fpname, lineno, line)
-            # an option line?
            else:
-                mo = OPTCRE.match(line)
+                mo = option_start(line)
                if mo:
-                    optname, vi, optval = mo.group('option', 'vi', 'value')
-                    # This check is fine because the OPTCRE cannot
-                    # match if it would set optval to None
-                    if optval is not None:
-                        if vi in ('=', ':') and ';' in optval:
-                            # ';' is a comment delimiter only if it follows
-                            # a spacing character
-                            pos = optval.find(';')
-                            if pos != -1 and optval[pos-1].isspace():
-                                optval = optval[:pos]
-                        optval = optval.strip()
-                    # allow empty values
-                    if optval == '""':
-                        optval = ''
+                    # option start line
+                    optname, optval = mo.group('name', 'value')
                    optname = optname.rstrip()
+                    optval = optval.strip()
                    cursect[optname] = optval
+                    blockmode = not optval
                else:
                    # a non-fatal parsing error occurred.  set up the
                    # exception but keep going. the exception will be
@@ -157,8 +147,17 @@ def parse(fp, fpname):
                    if not e:
                        e = ParsingError(fpname)
                    e.append(lineno, repr(line))
    # if any parsing errors occurred, raise an exception
    if e:
        raise e
-    return _sections
+    for sectname in sections:
+        section = sections[sectname]
+        for name in section:
+            value = section[name]
+            if value[:1].isspace():
+                section[name] = leading_blank_lines.sub(
+                    '', textwrap.dedent(value.rstrip()))
+    return sections
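
To see what the two new patterns capture, they can be exercised on their own. The sketch below copies `section_header` and `option_start` from the hunk above; `header_name` and `option_parts` are hypothetical wrappers added only for this illustration, and they apply the same `rstrip()`/`strip()` calls that `parse` applies to the matched groups.

```python
import re

# Patterns copied from the configparser.py hunk in this commit.
section_header = re.compile(
    r'\[\s*(?P<header>[^\s[\]:{}]+)\s*]\s*([#;].*)?$').match
option_start = re.compile(
    r'(?P<name>[^\s{}[\]=:]+\s*[-+]?)'
    r'='
    r'(?P<value>.*)$').match


def header_name(line):
    """Hypothetical wrapper: section name, or None if not a header line."""
    mo = section_header(line)
    return mo.group('header') if mo else None


def option_parts(line):
    """Hypothetical wrapper: (name, value) as parse() stores them, or None."""
    mo = option_start(line)
    if mo is None:
        return None
    name, value = mo.group('name', 'value')
    return name.rstrip(), value.strip()


print(header_name('[   s2  ]         # a comment'))   # s2
print(header_name('[s3]; comment'))                   # s3
print(option_parts('c=1'))                            # ('c', '1')
print(option_parts('x =           a b        '))      # ('x', 'a b')
print(option_parts('b    += 1'))                      # ('b    +', '1')
```

In the last call the plus sign and the whitespace before it stay inside the captured name. That matches the `'b    +'` key in the expected output of configparser.test below.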
--- a/src/zc/buildout/configparser.test
+++ b/src/zc/buildout/configparser.test
+Some tests of the basic config-file parser:
+First, an example that illustrates a well-formed configuration::
+  [s1]
+  a = 1
+  [   s2  ]         # a comment
+  long = a
+      b
+      c
+  l2 =
+      a
+      # not a comment
+  # comment
+  ; also a comment
+      b
+        c
+  empty =
+  c=1
+  b    += 1
+  [s3]; comment
+  x =           a b        
+.. -> text
+    >>> import pprint, StringIO, zc.buildout.configparser
+    >>> pprint.pprint(zc.buildout.configparser.parse(StringIO.StringIO(
+    ...     text), 'test'))
+    {'s1': {'a': '1'},
+     's2': {'b    +': '1',
+            'c': '1',
+            'empty': '',
+            'l2': 'a\n\n\n# not a comment\n\n\nb\n\n  c',
+            'long': 'a\nb\nc'},
+     's3': {'x': 'a b'}}
+Here's an example with leading blank lines:
+    >>> text = '\n\n[buildout]\nz=1\n\n'
+    >>> pprint.pprint(zc.buildout.configparser.parse(StringIO.StringIO(
+    ...     text), 'test'))
+    {'buildout': {'z': '1'}}
+Some examples that should error:
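
The error examples themselves are not part of this excerpt. Purely as an illustration (these are not the commit's own test cases), the following sketch uses the patterns from the configparser.py diff to list lines that are neither a section header nor an option start. `rejected_lines` is a hypothetical helper. It skips comments, blank lines, and continuation lines, and it does not reproduce the parser's section tracking or exception types.

```python
import re

# Patterns copied from the configparser.py hunk in this commit.
section_header = re.compile(
    r'\[\s*(?P<header>[^\s[\]:{}]+)\s*]\s*([#;].*)?$').match
option_start = re.compile(
    r'(?P<name>[^\s{}[\]=:]+\s*[-+]?)'
    r'='
    r'(?P<value>.*)$').match


def rejected_lines(text):
    """Hypothetical helper: (lineno, line) pairs matching neither pattern."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), 1):
        # Skip blank lines, comments and continuation lines.
        if not line.strip() or line[0] in '#;' or line[0].isspace():
            continue
        if not (section_header(line) or option_start(line)):
            bad.append((lineno, line))
    return bad


sample = """\
[debug]
recipe = recipes:debug
File 1 = ${data-dir:path}/file
[data dir]
path: mydata
"""

for lineno, line in rejected_lines(sample):
    print(lineno, repr(line))
# 3 'File 1 = ${data-dir:path}/file'
# 4 '[data dir]'
# 5 'path: mydata'
```

The option name `File 1` is rejected because of the embedded space, which is consistent with the buildout.txt hunks in this commit renaming it to `File-1`. A colon is no longer accepted as a name/value separator either.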
--- a/src/zc/buildout/tests.py
+++ b/src/zc/buildout/tests.py
@@ -15,6 +15,9 @@ from zc.buildout.buildout import print_
 from zope.testing import renormalizing
 import doctest
+import manuel.capture
+import manuel.doctest
+import manuel.testing
 import os
 import pkg_resources
 import re
@@ -3015,41 +3018,46 @@ normalize_S = (
 def test_suite():
    test_suite = [
-        doctest.DocFileSuite(
+        manuel.testing.TestSuite(
+            manuel.doctest.Manuel() + manuel.capture.Manuel(),
+            'configparser.test'),
+        manuel.testing.TestSuite(
+            manuel.doctest.Manuel(
+                checker=renormalizing.RENormalizing([
+                    zc.buildout.testing.normalize_path,
+                    zc.buildout.testing.normalize_endings,
+                    zc.buildout.testing.normalize_script,
+                    zc.buildout.testing.normalize_egg_py,
+                    zc.buildout.testing.not_found,
+                    (re.compile(r'zc.buildout-version = >=\S+'), ''),
+                    (re.compile(r"Installing 'zc.buildout >=\S+"), ''),
+                    (re.compile('__buildout_signature__ = recipes-\S+'),
+                     '__buildout_signature__ = recipes-SSSSSSSSSSS'),
+                    (re.compile('executable = [\S ]+python\S*', re.I),
+                     'executable = python'),
+                    (re.compile('[-d]  (setuptools|distribute)-\S+[.]egg'),
+                     'setuptools.egg'),
+                    (re.compile('zc.buildout(-\S+)?[.]egg(-link)?'),
+                     'zc.buildout.egg'),
+                    (re.compile('creating \S*setup.cfg'), 'creating setup.cfg'),
+                    (re.compile('hello\%ssetup' % os.path.sep), 'hello/setup'),
+                    (re.compile('Picked: (\S+) = \S+'),
+                     'Picked: \\1 = V.V'),
+                    (re.compile(r'We have a develop egg: zc.buildout (\S+)'),
+                     'We have a develop egg: zc.buildout X.X.'),
+                    (re.compile(r'\\[\\]?'), '/'),
+                    (re.compile('WindowsError'), 'OSError'),
+                    (re.compile(r'\[Error \d+\] Cannot create a file '
+                                r'when that file already exists: '),
+                     '[Errno 17] File exists: '
+                     ),
+                    (re.compile('distribute'), 'setuptools'),
+                    (re.compile('Got zc.recipe.egg \S+'), 'Got zc.recipe.egg'),
+                    ])
+                ) + manuel.capture.Manuel(),
            'buildout.txt',
            setUp=buildout_txt_setup,
            tearDown=zc.buildout.testing.buildoutTearDown,
-            checker=renormalizing.RENormalizing([
-                zc.buildout.testing.normalize_path,
-                zc.buildout.testing.normalize_endings,
-                zc.buildout.testing.normalize_script,
-                zc.buildout.testing.normalize_egg_py,
-                zc.buildout.testing.not_found,
-                (re.compile(r'zc.buildout-version = >=\S+'), ''),
-                (re.compile(r"Installing 'zc.buildout >=\S+"), ''),
-                (re.compile('__buildout_signature__ = recipes-\S+'),
-                 '__buildout_signature__ = recipes-SSSSSSSSSSS'),
-                (re.compile('executable = [\S ]+python\S*', re.I),
-                 'executable = python'),
-                (re.compile('[-d]  (setuptools|distribute)-\S+[.]egg'),
-                 'setuptools.egg'),
-                (re.compile('zc.buildout(-\S+)?[.]egg(-link)?'),
-                 'zc.buildout.egg'),
-                (re.compile('creating \S*setup.cfg'), 'creating setup.cfg'),
-                (re.compile('hello\%ssetup' % os.path.sep), 'hello/setup'),
-                (re.compile('Picked: (\S+) = \S+'),
-                 'Picked: \\1 = V.V'),
-                (re.compile(r'We have a develop egg: zc.buildout (\S+)'),
-                 'We have a develop egg: zc.buildout X.X.'),
-                (re.compile(r'\\[\\]?'), '/'),
-                (re.compile('WindowsError'), 'OSError'),
-                (re.compile(r'\[Error \d+\] Cannot create a file '
-                            r'when that file already exists: '),
-                 '[Errno 17] File exists: '
-                 ),
-                (re.compile('distribute'), 'setuptools'),
-                (re.compile('Got zc.recipe.egg \S+'), 'Got zc.recipe.egg'),
-                ])
            ),
        doctest.DocFileSuite(
            'runsetup.txt', 'repeatable.txt', 'setup.txt',