Commit 193f5cdd authored by Kirill Smelkov's avatar Kirill Smelkov

BigFile: Fixes, Tests and on-server Append support

For my current project I needed to rework BigFile interface to support on-server data
appension. But while doing so I've discovered in many places current BigFile code
could crash and is not working properly. Thus this patch series contain:

    1) fixes for discovered bugs;
    2) support for on-server data append via exposing ._dataAppend(data_chunk) method;
    3) basic tests for BigFile (so that we know fixes are proper and we won't regress on
       the same things in the future).

Newly introduced BigFile tests were verified to pass on the testrunner:

https://nexedi.erp5.net/test_result_module/20150303-3789A033
https://nexedi.erp5.net/test_result_module/20150303-3789A033/7

NOTE running the whole testsuite failed because of:

  - erp5_test_result:testTaskDistribution failure https://nexedi.erp5.net/test_result_module/20150303-3789A033/23 .

testTaskDistribution is currently failing from time-to-time even on ERP5-MASTER
tests, e.g. here https://nexedi.erp5.net/test_result_module/20150226-1F4459D0,
so from my point of view that should not be related to BigFile at all.

Origin: http://lab.nexedi.cn/kirr/erp5 y/bigfile-fixes+append-v2
Reviewed-on: https://lab.nexedi.cn/nexedi/erp5/merge_requests/3   (= merge request !3)

~~~~~~~~

* 'y/bigfile-fixes+append-v2' of http://lab.nexedi.cn/kirr/erp5:
  BigFile: Basic tests
  bt5/erp5_big_file: Regenerate
  BigFile: Fix most errors reported by pyflakes
  BigFile: Factor out code to append data chunk to ._appendData()
  BigFile: .data can be  BTreeData or None or (possibly non-empty) str
  BigFile: Factor out "get .data mtime" into helper
  BigFile: No need to explicitly convert btree=None -> BTreeData() in PUT
parents 4ee61a23 9bf0d1e1
......@@ -2,7 +2,7 @@
<ZopeData>
<record id="1" aka="AAAAAAAAAAE=">
<pickle>
<global name="ERP5Form" module="Products.ERP5Form.Form"/>
<global name="ERP5 Form" module="erp5.portal_type"/>
</pickle>
<pickle>
<dictionary>
......
......@@ -2,7 +2,7 @@
<ZopeData>
<record id="1" aka="AAAAAAAAAAE=">
<pickle>
<global name="ERP5Form" module="Products.ERP5Form.Form"/>
<global name="ERP5 Form" module="erp5.portal_type"/>
</pickle>
<pickle>
<dictionary>
......
<?xml version="1.0"?>
<ZopeData>
<record id="1" aka="AAAAAAAAAAE=">
<pickle>
<global name="Test Component" module="erp5.portal_type"/>
</pickle>
<pickle>
<dictionary>
<item>
<key> <string>_recorded_property_dict</string> </key>
<value>
<persistent> <string encoding="base64">AAAAAAAAAAI=</string> </persistent>
</value>
</item>
<item>
<key> <string>default_reference</string> </key>
<value> <string>testBigFile</string> </value>
</item>
<item>
<key> <string>description</string> </key>
<value>
<none/>
</value>
</item>
<item>
<key> <string>id</string> </key>
<value> <string>test.erp5.testBigFile</string> </value>
</item>
<item>
<key> <string>portal_type</string> </key>
<value> <string>Test Component</string> </value>
</item>
<item>
<key> <string>sid</string> </key>
<value>
<none/>
</value>
</item>
<item>
<key> <string>text_content_error_message</string> </key>
<value>
<tuple/>
</value>
</item>
<item>
<key> <string>text_content_warning_message</string> </key>
<value>
<tuple/>
</value>
</item>
<item>
<key> <string>version</string> </key>
<value> <string>erp5</string> </value>
</item>
<item>
<key> <string>workflow_history</string> </key>
<value>
<persistent> <string encoding="base64">AAAAAAAAAAM=</string> </persistent>
</value>
</item>
</dictionary>
</pickle>
</record>
<record id="2" aka="AAAAAAAAAAI=">
<pickle>
<global name="PersistentMapping" module="Persistence.mapping"/>
</pickle>
<pickle>
<dictionary>
<item>
<key> <string>data</string> </key>
<value>
<dictionary/>
</value>
</item>
</dictionary>
</pickle>
</record>
<record id="3" aka="AAAAAAAAAAM=">
<pickle>
<global name="PersistentMapping" module="Persistence.mapping"/>
</pickle>
<pickle>
<dictionary>
<item>
<key> <string>data</string> </key>
<value>
<dictionary>
<item>
<key> <string>component_validation_workflow</string> </key>
<value>
<persistent> <string encoding="base64">AAAAAAAAAAQ=</string> </persistent>
</value>
</item>
</dictionary>
</value>
</item>
</dictionary>
</pickle>
</record>
<record id="4" aka="AAAAAAAAAAQ=">
<pickle>
<global name="WorkflowHistoryList" module="Products.ERP5Type.patches.WorkflowTool"/>
</pickle>
<pickle>
<tuple>
<none/>
<list>
<dictionary>
<item>
<key> <string>action</string> </key>
<value> <string>validate</string> </value>
</item>
<item>
<key> <string>validation_state</string> </key>
<value> <string>validated</string> </value>
</item>
</dictionary>
</list>
</tuple>
</pickle>
</record>
</ZopeData>
1.1 added tests for BigFile
test.erp5.testBigFile
\ No newline at end of file
erp5_full_text_mroonga_catalog
\ No newline at end of file
1
\ No newline at end of file
1.1
\ No newline at end of file
......@@ -24,14 +24,42 @@ from ZPublisher.HTTPRequest import FileUpload
from ZPublisher import HTTPRangeSupport
from webdav.common import rfc1123_date
from mimetools import choose_boundary
from Products.CMFCore.utils import getToolByName, _setCacheHeaders,\
_ViewEmulator
from Products.CMFCore.utils import _setCacheHeaders, _ViewEmulator
from DateTime import DateTime
import re
class BigFile(File):
"""
Support storing huge file.
No convertion is allowed for now.
NOTE BigFile maintains the following invariant:
data property is either
- BTreeData instance, or
- str(*), or
- None.
(*) str has to be supported because '' is a default value for `data` field
from Data property sheet.
Even more - for
a) compatibility reasons, and
b) desire to support automatic migration of File-based documents
from document_module to BigFiles
non-empty str for data also have to be supported.
XXX(kirr) I'm not sure supporting non-empty str is a good idea (it
would be simpler if .data could be either BTreeData or "empty"),
but neither I'm experienced enough in erp5 nor know what are
appropriate compatibility requirements.
We discussed with Romain and settled on "None or str or BTreeData"
invariant for now.
"""
meta_type = 'ERP5 Big File'
......@@ -98,6 +126,11 @@ class BigFile(File):
if data is None:
btree = BTreeData()
elif isinstance(data, str):
# we'll want to append content to this file -
# - automatically convert str (empty or not) to BTreeData
btree = BTreeData()
btree.write(data, 0)
else:
btree = data
seek(0)
......@@ -116,6 +149,14 @@ class BigFile(File):
self.serialize()
return btree, len(btree)
def _data_mtime(self):
"""get .data mtime if present and fallback to self._p_mtime"""
# there is no data._p_mtime when data is None or str.
# so try and fallback to self._p_mtime
data = self._baseGetData()
mtime = getattr(data, '_p_mtime', self._p_mtime)
return mtime
def _range_request_handler(self, REQUEST, RESPONSE):
# HTTP Range header handling: return True if we've served a range
# chunk out of our data.
......@@ -147,13 +188,10 @@ class BigFile(File):
try: mod_since=long(DateTime(date).timeTime())
except: mod_since=None
if mod_since is not None:
if data is not None:
last_mod = long(data._p_mtime)
else:
if self._p_mtime:
last_mod = long(self._p_mtime)
else:
last_mod = long(0)
last_mod = self._data_mtime()
if last_mod is None:
last_mod = 0
last_mod = long(last_mod)
if last_mod > mod_since:
# Modified, so send a normal response. We delete
# the ranges, which causes us to skip to the 200
......@@ -172,10 +210,7 @@ class BigFile(File):
RESPONSE.setHeader('Content-Range',
'bytes */%d' % self.getSize())
RESPONSE.setHeader('Accept-Ranges', 'bytes')
if data is not None:
RESPONSE.setHeader('Last-Modified', rfc1123_date(data._p_mtime))
else:
RESPONSE.setHeader('Last-Modified', rfc1123_date(self._p_mtime))
RESPONSE.setHeader('Last-Modified', rfc1123_date(self._data_mtime()))
RESPONSE.setHeader('Content-Type', self.content_type)
RESPONSE.setHeader('Content-Length', self.getSize())
RESPONSE.setStatus(416)
......@@ -188,10 +223,7 @@ class BigFile(File):
start, end = ranges[0]
size = end - start
if data is not None:
RESPONSE.setHeader('Last-Modified', rfc1123_date(data._p_mtime))
else:
RESPONSE.setHeader('Last-Modified', rfc1123_date(self._p_mtime))
RESPONSE.setHeader('Last-Modified', rfc1123_date(self._data_mtime()))
RESPONSE.setHeader('Content-Type', self.content_type)
RESPONSE.setHeader('Content-Length', size)
RESPONSE.setHeader('Accept-Ranges', 'bytes')
......@@ -199,6 +231,7 @@ class BigFile(File):
'bytes %d-%d/%d' % (start, end - 1, self.getSize()))
RESPONSE.setStatus(206) # Partial content
# NOTE data cannot be None here (if it is - ranges are not satisfiable)
if isinstance(data, str):
RESPONSE.write(data[start:end])
return True
......@@ -227,10 +260,7 @@ class BigFile(File):
RESPONSE.setHeader('Content-Length', size)
RESPONSE.setHeader('Accept-Ranges', 'bytes')
if data is not None:
RESPONSE.setHeader('Last-Modified', rfc1123_date(data._p_mtime))
else:
RESPONSE.setHeader('Last-Modified', rfc1123_date(self._p_mtime))
RESPONSE.setHeader('Last-Modified', rfc1123_date(self._data_mtime()))
RESPONSE.setHeader('Content-Type',
'multipart/%sbyteranges; boundary=%s' % (
draftprefix, boundary))
......@@ -244,6 +274,7 @@ class BigFile(File):
'Content-Range: bytes %d-%d/%d\r\n\r\n' % (
start, end - 1, self.getSize()))
# NOTE data cannot be None here (if it is - ranges are not satisfiable)
if isinstance(data, str):
RESPONSE.write(data[start:end])
......@@ -280,7 +311,7 @@ class BigFile(File):
data = self._baseGetData()
mime = self.getContentType()
RESPONSE.setHeader('Content-Length', len(data))
RESPONSE.setHeader('Content-Length', data is not None and len(data) or 0)
RESPONSE.setHeader('Content-Type', mime)
if inline is _MARKER:
# by default, use inline for text and image formats
......@@ -313,7 +344,8 @@ class BigFile(File):
content_range = REQUEST.get_header('Content-Range', None)
if content_range is None:
btree = None
# truncate the file
self._baseSetData(None)
else:
current_size = int(self.getSize())
query_range = re.compile('bytes \*/\*')
......@@ -321,8 +353,6 @@ class BigFile(File):
'(?P<last_byte>[0-9]+)/' \
'(?P<total_content_length>[0-9]+)')
if query_range.match(content_range):
data = self._baseGetData()
RESPONSE.setHeader('X-Explanation', 'Resume incomplete')
RESPONSE.setHeader('Range', 'bytes 0-%s' % (current_size-1))
RESPONSE.setStatus(308)
......@@ -349,26 +379,29 @@ class BigFile(File):
RESPONSE.setStatus(400)
return RESPONSE
else:
btree = self._baseGetData()
if btree is None:
btree = BTreeData()
else:
RESPONSE.setHeader('X-Explanation', 'Can not parse range')
RESPONSE.setStatus(400) # Partial content
return RESPONSE
data, size = self._read_data(file, data=btree)
content_type=self._get_content_type(file, data, self.__name__,
type or self.content_type)
self.update_data(data, content_type, size)
self._appendData(file, content_type=type)
RESPONSE.setStatus(204)
return RESPONSE
def _appendData(self, data_chunk, content_type=None):
"""append data chunk to the end of the file
NOTE if content_type is specified, it will change content_type for the
whole file.
"""
data, size = self._read_data(data_chunk, data=self._baseGetData())
content_type=self._get_content_type(data_chunk, data, self.__name__,
content_type or self.content_type)
self.update_data(data, content_type, size)
# CMFFile also brings the IContentishInterface on CMF 2.2, remove it.
removeIContentishInterface(BigFile)
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment