Commit 56edddf6 authored by Boris Kocherov

Merge branch 'nexedi'

parents 7f44ed79 00604a3a
Install Cloudooo
================

::

  $ python2.6 setup.py install

Warnings:

- you must have installed setuptools>=0.6c11 in this python.

Install LibreOffice / OpenOffice.org
====================================

Install LibreOffice or OpenOffice.org.

- http://www.libreoffice.org/download/
- http://download.openoffice.org/

Create Configuration File
=========================

The configuration file is used to start the application using paster.

::

  $ cp ./cloudooo/sample/sample.conf . # Copy to current folder

The next step is to define some attributes in cloudooo.conf:

- working_path - folder in which the application runs. This folder must be created beforehand.
- uno_path - folder where the UNO library is installed (e.g. /opt/libreoffice/basis-link/program/)
- soffice_binary_path - folder where soffice.bin is installed (e.g. /opt/libreoffice/program/)
@@ -27,59 +33,77 @@ Create Configuration File
Run Application
===============

::

  $ paster serve ./cloudooo.conf

or run as a daemon:

::

  $ paster serve ./cloudooo.conf --daemon

Stop Application
================

::

  $ kill -1 PASTER_PID

Warning: always use SIGHUP, because only this signal stops all processes correctly.
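The stop command above sends signal 1 (SIGHUP) to the paster process. A minimal Python sketch of the same action, assuming a POSIX system; `stop_paster` is a hypothetical helper name and is not part of Cloudooo:

```python
import os
import signal


def stop_paster(pid):
    """Send SIGHUP (signal number 1) to the process with the given PID.

    Hypothetical helper: equivalent to running `kill -1 PASTER_PID`.
    """
    os.kill(pid, signal.SIGHUP)
```

The PID still has to be obtained separately, for example from the pid file written when paster runs as a daemon.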
Cloudooo Description
====================

- XMLRPC + WSGI form a bridge for easy access to LibreOffice / OpenOffice.org: an XMLRPC server is implemented inside WSGI (Paster).
- PyUno is used to connect to LibreOffice / OpenOffice.org started with an open socket. All features are handled by pyuno.
- Only one process has access to LibreOffice / OpenOffice.org at a time.
- All clients receive the same object (proxy) when they connect to the XMLRPC server.

Managing LibreOffice / OpenOffice.org process

- start 'soffice.bin':

  - Pyuno starts the 'soffice.bin' processes and communicates with them through sockets;
  - 'soffice.bin' processes run in the background;

- control 'soffice.bin':

  - If the socket loses the connection, cloudooo kills the process, restarts the processes and submits the file again;

XMLRPC Server - XMLRPC + WSGI
-----------------------------

- Send document to 'soffice.bin' and return the converted document with metadata
- XMLRPC receives a file and connects to a 'soffice.bin' process through pyuno;
- pyuno opens a new document, writes it, adds metadata and returns the edited or converted document to XMLRPC, which returns it to the user;
- When 'soffice.bin' is no longer in use, make sure that it was finalized;
- Export to another format;
- Invite document and return metadata only;
- Edit metadata of the document;
- Problems and possible solutions

  - 'soffice.bin' is stalled;

    - finalize the process, start 'soffice.bin' and submit the document again (without restarting cloudooo);

  - 'soffice.bin' has crashed;

    - finalize the process, verify that all processes were killed, start 'soffice.bin' and submit the document again (without restarting cloudooo)

  - 'soffice.bin' received the document and stalled;

    - if 'soffice.bin' isn't responding, kill the process and start it again

  - The document that was sent is corrupted;

    - write the error to the log and verify that the process is not left in memory

FFMPEGHandler
......
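The "Problems and possible solutions" list in the README describes a kill, restart and resubmit cycle. A minimal sketch of that cycle, assuming a hypothetical `backend` object with `convert`, `kill` and `start` methods and a hypothetical `StalledError`; none of these names come from Cloudooo itself:

```python
class StalledError(Exception):
    """Raised by the hypothetical backend when 'soffice.bin' stops responding."""


def convert_with_retry(backend, document, max_attempts=2):
    """Submit a document; on failure restart the backend and submit it again."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return backend.convert(document)
        except StalledError as error:
            last_error = error
            backend.kill()   # finalize the stalled or crashed process
            backend.start()  # start a fresh 'soffice.bin' without restarting the server
    # Every attempt failed, for example because the document is corrupted.
    raise last_error
```

Bounding the number of attempts keeps a corrupted document from restarting the process forever; the caller can log the final error.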
@@ -123,3 +123,16 @@ class Handler(object):
       return self.input.getContent()
     finally:
       self.input.trash()
+
+  @staticmethod
+  def getAllowedConversionFormatList(source_mimetype):
+    """Returns a list content_type and their titles which are supported
+    by enabled handlers.
+
+    [('audio/ogg;codecs=opus', 'Opus Audio File Format'),
+     ('video/webm', 'Webm Video File Format'),
+     ...
+    ]
+    """
+    # XXX NotImplemented
+    return []
@@ -91,3 +91,16 @@ class Handler(object):
     metadata -- expected an dictionary with metadata.
     """
     raise NotImplementedError
+
+  @staticmethod
+  def getAllowedConversionFormatList(source_mimetype):
+    """Returns a list content_type and their titles which are supported
+    by enabled handlers.
+
+    [('image/jpeg', 'Jpeg Image File Format'),
+     ('image/png', 'Png Image File Format'),
+     ...
+    ]
+    """
+    # XXX NotImplemented
+    return []
@@ -29,6 +29,7 @@
 import json
 import re
 import pkg_resources
+import mimetypes
 from base64 import decodestring, encodestring
 from os import environ, path
 from subprocess import Popen, PIPE
@@ -39,7 +40,7 @@ from cloudooo.handler.ooo.mimemapper import mimemapper
 from cloudooo.handler.ooo.document import FileSystemDocument
 from cloudooo.handler.ooo.monitor.timeout import MonitorTimeout
 from cloudooo.handler.ooo.monitor import monitor_sleeping_time
-from cloudooo.util import logger
+from cloudooo.util import logger, parseContentType
 from psutil import pid_exists

 try:
@@ -238,6 +239,37 @@ class Handler(object):
       self.document.trash()
     return doc_loaded

+  @staticmethod
+  def getAllowedConversionFormatList(source_mimetype):
+    """Returns a list content_type and their titles which are supported
+    by enabled handlers.
+
+    [('application/vnd.oasis.opendocument.text', 'ODF Text Document'),
+     ('application/pdf', 'PDF - Portable Document Format'),
+     ...
+    ]
+    """
+    # XXX please never guess extension from mimetype
+    output_set = set()
+    if "/" in source_mimetype:
+      parsed_mimetype_type = parseContentType(source_mimetype).gettype()
+      # here `guess_all_extensions` never handles mimetype parameters
+      # (even for `text/plain;charset=UTF-8` which is standard)
+      extension_list = mimetypes.guess_all_extensions(parsed_mimetype_type) # XXX never guess
+    else:
+      extension_list = [source_mimetype]
+
+    for ext in extension_list:
+      for ext, title in mimemapper.getAllowedExtensionList(extension=ext.replace(".", "")):
+        if ext in ("fodt", ".fodt"): # BBB
+          output_set.add(("application/vnd.oasis.opendocument.text-flat-xml", title))
+          continue
+        if ext:
+          mimetype, _ = mimetypes.guess_type("a." + ext) # XXX never guess
+          if mimetype:
+            output_set.add((mimetype, title))
+    return list(output_set)
+
 def bootstrapHandler(configuration_dict):
   # Bootstrap handler
   from signal import signal, SIGINT, SIGQUIT, SIGHUP
......
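The handler above strips content-type parameters before calling `mimetypes.guess_all_extensions`, because that function only knows bare `type/subtype` values. The sketch below shows the same idea in Python 3, as an assumption-laden analogue: it uses the standard `email.message.Message` parser in place of Cloudooo's `parseContentType`, and the helper names `base_content_type` and `extensions_for` are hypothetical.

```python
import mimetypes
from email.message import Message


def base_content_type(content_type):
    """Return the 'type/subtype' part of a Content-Type value, without parameters."""
    message = Message()
    message["Content-Type"] = content_type
    return message.get_content_type()


def extensions_for(content_type):
    """Guess file extensions for a content type, ignoring its parameters."""
    return mimetypes.guess_all_extensions(base_content_type(content_type))
```

For instance, `extensions_for("text/plain;charset=UTF-8")` looks up `text/plain`, so the result includes `.txt`. As the `XXX` comments in the diff warn, guessing extensions from MIME types depends on the local MIME tables and is only a fallback.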
@@ -74,8 +74,10 @@ drawing_expected_tuple = (
 web_expected_tuple = (
     ('html', 'HTML Document'),
+    ('jpg', 'JPEG - Joint Photographic Experts Group'),
     ('odt', 'Text (Writer/Web)'),
     ('pdf', 'PDF - Portable Document Format'),
+    ('png', 'PNG - Portable Network Graphic'),
     ('sxw', 'OpenOffice.org 1.0 Text Document (Writer/Web)'),
     ('txt', 'Text (Writer/Web)'),
     ('txt', 'Text - Choose Encoding (Writer/Web)'),
......
@@ -66,20 +66,18 @@ class TestServer(TestCase):
     """Verify if getAllowedExtensionList returns is a list with extension and
     ui_name. The request is by extension"""
     doc_allowed_list = self.proxy.getAllowedExtensionList({'extension': "doc"})
-    doc_allowed_list.sort()
-    for arg in doc_allowed_list:
-      self.assertTrue(tuple(arg) in text_expected_tuple,
-                      "%s not in %s" % (arg, text_expected_tuple))
+    # Verify all expected types ("doc" MAY NOT be present)
+    self.assertEquals(sorted([(a, b) for a, b in doc_allowed_list if a != "doc"]),
+                      sorted(list(filter(lambda (a, b): a != "doc", text_expected_tuple))))

   def testGetAllowedExtensionListByMimetype(self):
     """Verify if getAllowedExtensionList returns is a list with extension and
     ui_name. The request is by mimetype"""
     request_dict = {"mimetype": "application/msword"}
     msword_allowed_list = self.proxy.getAllowedExtensionList(request_dict)
-    msword_allowed_list.sort()
-    for arg in msword_allowed_list:
-      self.assertTrue(tuple(arg) in text_expected_tuple,
-                      "%s not in %s" % (arg, text_expected_tuple))
+    # Verify all expected types ("doc" MAY NOT be present)
+    self.assertEquals(sorted([(a, b) for a, b in msword_allowed_list if a != "doc"]),
+                      sorted(list(filter(lambda (a, b): a != "doc", text_expected_tuple))))

   def ConversionScenarioList(self):
     return [
@@ -381,14 +379,9 @@ class TestServer(TestCase):
     response_code, response_dict, response_message = \
                   self.proxy.getAllowedTargetItemList(mimetype)
     self.assertEquals(response_code, 200)
-    # Verify if all expected types are in response list
-    for arg in text_expected_tuple:
-      self.assertTrue(list(arg) in response_dict['response_data'],
-                      "%s not in %s" % (arg, response_dict['response_data']))
-    # Verify if all types in response list are expected
-    for arg in response_dict['response_data']:
-      self.assertTrue(tuple(arg) in text_expected_tuple,
-                      "%s not in %s" % (arg, text_expected_tuple))
+    # Verify all expected types ("odt" MAY NOT be present)
+    self.assertEquals(sorted([(a, b) for a, b in response_dict['response_data'] if a != "odt"]),
+                      sorted(list(filter(lambda (a, b): a != "odt", text_expected_tuple))))

   def testGetTableItemListFromOdt(self):
     """Test if getTableItemList can get the table item list from odt file"""
@@ -481,10 +474,10 @@ class TestServer(TestCase):
     data = encodestring(open("./data/granulate_test.odt").read())
     image_list = self.proxy.getImageItemList(data, "odt")
     self.assertEquals([['10000000000000C80000009CBF079A6E41EE290C.jpg', ''],
-                      ['10000201000000C80000004EF26C99A54A61B987.png', 'TioLive Logo'],
-                      ['10000201000000C80000004EF26C99A54A61B987.png', ''],
+                      ['10000201000000C80000004E85B3F70C71E07CE8.png', 'TioLive Logo'],
+                      ['10000201000000C80000004E85B3F70C71E07CE8.png', ''],
                       ['2000004F0000423300001370ADF6545B2997B448.svm', 'Python Logo'],
-                      ['10000201000000C80000004EF26C99A54A61B987.png', 'Again TioLive Logo']],
+                      ['10000201000000C80000004E85B3F70C71E07CE8.png', 'Again TioLive Logo']],
                       image_list)

   def testGetImageItemListFromDoc(self):
@@ -492,10 +485,10 @@ class TestServer(TestCase):
     data = encodestring(open("./data/granulate_test.doc").read())
     image_list = self.proxy.getImageItemList(data, "doc")
     self.assertEquals([['10000000000000C80000009CBF079A6E41EE290C.jpg', ''],
-                      ['10000201000000C80000004EF26C99A54A61B987.png', 'TioLive Logo'],
-                      ['10000201000000C80000004EF26C99A54A61B987.png', ''],
+                      ['10000201000000C80000004E85B3F70C71E07CE8.png', 'TioLive Logo'],
+                      ['10000201000000C80000004E85B3F70C71E07CE8.png', ''],
                       ['2000031600004233000013702113A0E70B910778.wmf', 'Python Logo'],
-                      ['10000201000000C80000004EF26C99A54A61B987.png', 'Again TioLive Logo']],
+                      ['10000201000000C80000004E85B3F70C71E07CE8.png', 'Again TioLive Logo']],
                       image_list)

   def testGetImageFromOdt(self):
......
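The rewritten assertions in this test file compare two sorted lists after dropping one extension whose presence is optional. They rely on `lambda (a, b): ...`, a tuple-parameter form that only exists in Python 2. A minimal Python 3 sketch of the same comparison follows; `without_extension` is a hypothetical helper name, not part of the test suite:

```python
def without_extension(pairs, excluded):
    """Return sorted (extension, title) tuples, dropping the excluded extension.

    Accepts lists of lists (as returned over XML-RPC) as well as tuples of
    tuples (as in the expected-value constants) and normalizes both to tuples.
    """
    return sorted((ext, title) for ext, title in pairs if ext != excluded)
```

With this helper, a check such as `without_extension(response, "doc") == without_extension(expected, "doc")` passes whether or not the optional entry is present on either side, while still failing on any other missing or unexpected entry.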
@@ -29,7 +29,7 @@
 from zope.interface import implements
 from cloudooo.interfaces.handler import IHandler
 from cloudooo.file import File
-from cloudooo.util import logger
+from cloudooo.util import logger, parseContentType
 from subprocess import Popen, PIPE
 from tempfile import mktemp

@@ -115,3 +115,17 @@ class Handler(object):
       return self.document.getContent()
     finally:
       self.document.trash()
+
+  @staticmethod
+  def getAllowedConversionFormatList(source_mimetype):
+    """Returns a list content_type and their titles which are supported
+    by enabled handlers.
+
+    [('text/plain', 'Plain Text'),
+     ...
+    ]
+    """
+    source_mimetype = parseContentType(source_mimetype).gettype()
+    if source_mimetype in ("application/pdf", "pdf"):
+      return [("text/plain", "Plain Text")]
+    return []
@@ -64,6 +64,19 @@ class TestHandler(HandlerTestCase):
     self.assertEquals(metadata["title"], 'Set Metadata Test')
     self.assertEquals(metadata['creator'], 'gabriel\'@')

+  def testGetAllowedConversionFormatList(self):
+    """Test all combination of mimetype
+
+    None of the types below define any mimetype parameter to not ignore so far.
+    """
+    get = Handler.getAllowedConversionFormatList
+    # Handled mimetypes
+    self.assertEquals(get("application/pdf;ignored=param"),
+      [("text/plain", "Plain Text")])
+
+    # Unhandled mimetypes
+    self.assertEquals(get("text/plain;ignored=param"), [])
+    self.assertEquals(get("text/plain;charset=UTF-8;ignored=param"), [])
+
 def test_suite():
   return make_suite(TestHandler)
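The behaviour exercised by this test can be reproduced without a Cloudooo installation. The sketch below is a standalone Python 3 stand-in for the PDF handler's static method, under the assumption that ignoring parameters simply means keeping the text before the first `;`; the function name `allowed_targets_for_pdf_handler` is hypothetical.

```python
def allowed_targets_for_pdf_handler(source):
    """Return (content_type, title) pairs the PDF handler could produce.

    `source` may be a MIME type, with or without parameters, or a bare
    extension such as "pdf".
    """
    # Keep only the part before the first ';' so parameters are ignored.
    base_type = source.split(";", 1)[0].strip().lower()
    if base_type in ("application/pdf", "pdf"):
        return [("text/plain", "Plain Text")]
    return []
```

Usage mirrors the assertions in the diff: `allowed_targets_for_pdf_handler("application/pdf;ignored=param")` yields the plain-text target, and any `text/plain` input yields an empty list.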
@@ -28,9 +28,9 @@
 from zope.interface import implements
 from cloudooo.interfaces.handler import IHandler
 from cloudooo.file import File
-from cloudooo.util import logger
+from cloudooo.util import logger, parseContentType
 from subprocess import Popen, PIPE
-from tempfile import mktemp
+from tempfile import mktemp, mkdtemp
 from os.path import basename
 from base64 import b64decode

@@ -55,6 +55,9 @@ class Handler(object):
     )
     return path

+  def makeTempDir(self, *args, **kw):
+    return mkdtemp(*args, dir=self.file.directory_name, **kw)
+
   def convertPathToUrl(self, path):
     if path.startswith("/"):
       return "file://" + path
@@ -96,6 +99,20 @@ class Handler(object):
     """
     raise NotImplementedError

+  @staticmethod
+  def getAllowedConversionFormatList(source_mimetype):
+    """Returns a list content_type and their titles which are supported
+    by enabled handlers.
+
+    [('application/pdf', 'PDF - Portable Document Format'),
+     ...
+    ]
+    """
+    source_mimetype = parseContentType(source_mimetype).gettype()
+    if source_mimetype in ("text/html", "htm", "html"):
+      return [("application/pdf", "PDF - Portable Document Format")]
+    return []
+
   def makeSwitchOptionList(self, allowed_option_list, option_dict):
     """
     A switch option is enable if it exists.
@@ -342,6 +359,8 @@ class Handler(object):
       "include_in_outline",
     ], conversion_kw)
     command += self.makeSwitchOptionList(["default_header"], conversion_kw)
+    # put cache in the temp dir - to disable cache
+    command += ["--cache-dir", self.makeTempDir(prefix="cache")]
     command += self.makeOneStringArgumentOptionList([
       #"cache_dir", # we decide
       "encoding",
......
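The change above points `--cache-dir` at a fresh directory created inside the handler's working directory, so every conversion starts with an empty cache that disappears when the working directory is trashed. A minimal Python 3 sketch of that pattern, where `parent_directory` stands in for `self.file.directory_name` and both function names are hypothetical:

```python
import tempfile


def make_temp_dir(parent_directory, prefix="cache"):
    """Create a uniquely named directory inside parent_directory and return its path."""
    return tempfile.mkdtemp(prefix=prefix, dir=parent_directory)


def cache_option(parent_directory):
    """Build the command-line fragment that points the cache at a fresh directory."""
    return ["--cache-dir", make_temp_dir(parent_directory)]
```

Because `mkdtemp` never reuses an existing name, two conversions sharing the same parent directory still get separate cache directories.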
@@ -77,6 +77,18 @@ class TestHandler(HandlerTestCase):
     handler = Handler(self.tmp_url, "", "png", **self.kw)
     self.assertRaises(NotImplementedError, handler.setMetadata)

+  def testGetAllowedConversionFormatList(self):
+    """Test all combination of mimetype
+
+    None of the types below define any mimetype parameter to not ignore so far.
+    """
+    get = Handler.getAllowedConversionFormatList
+    # Handled mimetypes
+    self.assertEquals(get("text/html;ignored=param"),
+      [("application/pdf", "PDF - Portable Document Format")])
+
+    # Unhandled mimetypes
+    self.assertEquals(get("application/pdf;ignored=param"), [])
+
 def test_suite():
   return make_suite(TestHandler)
@@ -35,7 +35,8 @@ from zope.interface import implements
 from cloudooo.interfaces.handler import IHandler
 from cloudooo.file import File
-from cloudooo.util import logger, zipTree, unzip
+from cloudooo.util import logger, zipTree, unzip, parseContentType
+from cloudooo.handler.ooo.handler import Handler as OOoHandler

 AVS_OFFICESTUDIO_FILE_UNKNOWN = "0"
 AVS_OFFICESTUDIO_FILE_DOCUMENT_DOCX = "65"
@@ -93,6 +94,9 @@ class Handler(object):
       The source format of the inputed file
     """
     self.base_folder_url = base_folder_url
+    self._data = data
+    self._source_format = source_format
+    self._init_kw = kw
     self.file = File(base_folder_url, data, source_format)
     self.environment = kw.get("env", {})

@@ -113,12 +117,13 @@ class Handler(object):
     config_file_name = os.path.join(root_dir, "config.xml")

     if source_format in yformat_tuple:
-      os.mkdir(input_dir)
-      unzip(self.file.getUrl(), input_dir)
-      for _, _, files in os.walk(input_dir):
-        input_file_name, = files
-        break
-      input_file_name = os.path.join(input_dir, input_file_name)
+      if self._data.startswith("PK\x03\x04"):
+        os.mkdir(input_dir)
+        unzip(self.file.getUrl(), input_dir)
+        for _, _, files in os.walk(input_dir):
+          input_file_name, = files
+          break
+        input_file_name = os.path.join(input_dir, input_file_name)
     if destination_format in yformat_tuple:
       os.mkdir(output_dir)
       output_file_name = os.path.join(output_dir, "body.txt")
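The new `startswith("PK\x03\x04")` guard unzips the input only when it begins with the ZIP local file header signature, so a y-format document supplied as a raw, non-zipped body is passed through untouched. A minimal Python 3 sketch of that check, operating on `bytes` rather than a Python 2 `str`; `looks_like_zip` is a hypothetical helper name:

```python
ZIP_LOCAL_FILE_HEADER = b"PK\x03\x04"


def looks_like_zip(data):
    """Return True if data starts with the ZIP local file header signature."""
    return data[:4] == ZIP_LOCAL_FILE_HEADER
```

Note that an archive containing no entries starts with the end-of-central-directory signature (`PK\x05\x06`) instead, so this check only recognizes archives that hold at least one file, which is the case the handler cares about.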
@@ -169,14 +174,61 @@ class Handler(object):
       self.file.trash()

   def getMetadata(self, base_document=False):
-    """Returns a dictionary with all metadata of document.
-    along with the metadata.
+    r"""Returns a dictionary with all metadata of document.
+    /!\ Not Implemented: no format are handled correctly.
     """
-    raise NotImplementedError
+    # XXX Cloudooo takes the first handler that can "handle" source_mimetype.
+    # However, docx documents metadata can only be "handled" by the ooo handler.
+    # Handlers should provide a way to tell if such capability is available for the required source mimetype.
+    # We have to define a precise direction on how to know/get what are handlers capabilities according to Cloudooo configuration.
+    # And then, this method MUST raise on unhandled format. Here xformats are "handled" by cheating.
+    if self._source_format in (
+          "docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
+          "xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
+          "pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation",
+        ):
+      return OOoHandler(self.base_folder_url, self._data, self._source_format, **self._init_kw).getMetadata(base_document)
+    return {}

   def setMetadata(self, metadata={}):
-    """Returns image with new metadata.
+    r"""Returns document with new metadata.
+    /!\ Not Implemented: no format are handled correctly.
     Keyword arguments:
     metadata -- expected an dictionary with metadata.
     """
-    raise NotImplementedError
+    # XXX Cloudooo takes the first handler that can "handle" source_mimetype.
+    # However, docx documents metadata can only be "handled" by the ooo handler.
+    # Handlers should provide a way to tell if such capability is available for the required source mimetype.
+    # We have to define a precise direction on how to know/get what are handlers capabilities according to Cloudooo configuration.
+    # And then, this method MUST raise on unhandled format. Here xformats are "handled" by cheating.
+    if self._source_format in (
+          "docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
+          "xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
+          "pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation",
+        ):
+      return OOoHandler(self.base_folder_url, self._data, self._source_format, **self._init_kw).setMetadata(metadata)
+    return self.file.getContent()
+
+  @staticmethod
+  def getAllowedConversionFormatList(source_mimetype):
+    """Returns a list content_type and their titles which are supported
+    by enabled handlers.
+
+    [('application/x-asc-text', 'OnlyOffice Text Document'),
+     ...
+    ]
+    """
+    source_mimetype = parseContentType(source_mimetype).gettype()
+    if source_mimetype in ("docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document"):
+      return [("application/x-asc-text", "OnlyOffice Text Document")]
+    if source_mimetype in ("docy", "application/x-asc-text"):
+      return [("application/vnd.openxmlformats-officedocument.wordprocessingml.document", "Word 2007 Document")]
+    if source_mimetype in ("xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"):
+      return [("application/x-asc-spreadsheet", "OnlyOffice Spreadsheet")]
+    if source_mimetype in ("xlsy", "application/x-asc-spreadsheet"):
+      return [("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", "Excel 2007 Spreadsheet")]
+    if source_mimetype in ("pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation"):
+      return [("application/x-asc-presentation", "OnlyOffice Presentation")]
+    if source_mimetype in ("ppty", "application/x-asc-presentation"):
+      return [("application/vnd.openxmlformats-officedocument.presentationml.presentation", "PowerPoint 2007 Presentation")]
+    return []
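The chain of `if` statements above encodes a symmetric mapping: each Office Open XML format converts to its OnlyOffice counterpart and back. One possible table-driven rendering is sketched below in Python 3. It is an illustration under stated assumptions, not the handler's code: the names `_FORMAT_PAIRS`, `_build_table` and `allowed_conversions` are hypothetical, and parameters are dropped by splitting on `;` instead of calling `parseContentType`.

```python
_OOXML = "application/vnd.openxmlformats-officedocument."

# (OOXML extension, OOXML MIME type, OOXML title,
#  OnlyOffice extension, OnlyOffice MIME type, OnlyOffice title)
_FORMAT_PAIRS = (
    ("docx", _OOXML + "wordprocessingml.document", "Word 2007 Document",
     "docy", "application/x-asc-text", "OnlyOffice Text Document"),
    ("xlsx", _OOXML + "spreadsheetml.sheet", "Excel 2007 Spreadsheet",
     "xlsy", "application/x-asc-spreadsheet", "OnlyOffice Spreadsheet"),
    ("pptx", _OOXML + "presentationml.presentation", "PowerPoint 2007 Presentation",
     "ppty", "application/x-asc-presentation", "OnlyOffice Presentation"),
)


def _build_table(pairs):
    """Map every extension and MIME type to the target on the other side."""
    table = {}
    for x_ext, x_mime, x_title, y_ext, y_mime, y_title in pairs:
        table[x_ext] = table[x_mime] = [(y_mime, y_title)]
        table[y_ext] = table[y_mime] = [(x_mime, x_title)]
    return table


_TARGETS = _build_table(_FORMAT_PAIRS)


def allowed_conversions(source):
    """Return the (content_type, title) targets for a MIME type or extension."""
    key = source.split(";", 1)[0].strip().lower()
    return list(_TARGETS.get(key, []))
```

Adding a new format pair then means adding one row to the table, and both conversion directions stay consistent by construction.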
XLSY;v2;5883;BAKAAgAAA+cHAAAEAwgAAAD3FgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAGMFAAAAEQAAAAEMAAAABwEAAAAACAEAAAAABAoAAAAFAAAAAAUAAAAABnwAAAAHGgAAAAQGCgAAAEEAcgBpAGEAbAAGBQAAAAAAACRABxoAAAAEBgoAAABBAHIAaQBhAGwABgUAAAAAAAAkQAcaAAAABAYKAAAAQQByAGkAYQBsAAYFAAAAAAAAJEAHGgAAAAQGCgAAAEEAcgBpAGEAbAAGBQAAAAAAACRACB8AAAAJGgAAAAAGDgAAAEcARQBOAEUAUgBBAEwAAQSkAAAADhYDAAADPwAAAAABAQEBAQMBAQYEAAAAAAcEAAAAAAgEAAAAAAkEpAAAAA0GGAAAAAABBAEEAAAAAAUBAAYEAAAAAAcBAAgBAAMhAAAAAAEAAQEAAwEBBgQAAAAABwQAAAAACAQBAAAACQQAAAAAAyEAAAAAAQABAQADAQEGBAAAAAAHBAAAAAAIBAEAAAAJBAAAAAADIQAAAAABAAEBAAMBAQYEAAAAAAcEAAAAAAgEAgAAAAkEAAAAAAMhAAAAAAEAAQEAAwEBBgQAAAAABwQAAAAACAQCAAAACQQAAAAAAyEAAAAAAQABAQADAQEGBAAAAAAHBAAAAAAIBAAAAAAJBAAAAAADIQAAAAABAAEBAAMBAQYEAAAAAAcEAAAAAAgEAAAAAAkEAAAAAAMhAAAAAAEAAQEAAwEBBgQAAAAABwQAAAAACAQAAAAACQQAAAAAAyEAAAAAAQABAQADAQEGBAAAAAAHBAAAAAAIBAAAAAAJBAAAAAADIQAAAAABAAEBAAMBAQYEAAAAAAcEAAAAAAgEAAAAAAkEAAAAAAMhAAAAAAEAAQEAAwEBBgQAAAAABwQAAAAACAQAAAAACQQAAAAAAyEAAAAAAQABAQADAQEGBAAAAAAHBAAAAAAIBAAAAAAJBAAAAAADIQAAAAABAAEBAAMBAQYEAAAAAAcEAAAAAAgEAAAAAAkEAAAAAAMhAAAAAAEAAQEAAwEBBgQAAAAABwQAAAAACAQAAAAACQQAAAAAAyEAAAAAAQABAQADAQEGBAAAAAAHBAAAAAAIBAAAAAAJBAAAAAADIQAAAAABAAEBAAMBAQYEAAAAAAcEAAAAAAgEAQAAAAkEKwAAAAM
hAAAAAAEAAQEAAwEBBgQAAAAABwQAAAAACAQBAAAACQQpAAAAAyEAAAAAAQABAQADAQEGBAAAAAAHBAAAAAAIBAEAAAAJBCwAAAADIQAAAAABAAEBAAMBAQYEAAAAAAcEAAAAAAgEAQAAAAkEKgAAAAMhAAAAAAEAAQEAAwEBBgQAAAAABwQAAAAACAQBAAAACQQJAAAAAkoAAAADRQAAAAABAAEBAAMBAAYEAAAAAAcEAAAAAAgEAAAAAAkEpAAAAAwEAAAAAA0GGAAAAAABBAEEAAAAAAUBAAYEAAAAAAcBAAgBAA8qAQAAECkAAAAABAAAAAAAAAABAQAAAAAEDAAAAE4AbwByAG0AYQBsAAUEAAAAAAAAABAnAAAAAAQAAAADAAAAAQEAAAAABAoAAABDAG8AbQBtAGEABQQAAAAPAAAAEC8AAAAABAAAAAYAAAABAQAAAAAEEgAAAEMAbwBtAG0AYQAgAFsAMABdAAUEAAAAEAAAABAtAAAAAAQAAAAEAAAAAQEAAAAABBAAAABDAHUAcgByAGUAbgBjAHkABQQAAAARAAAAEDUAAAAABAAAAAcAAAABAQAAAAAEGAAAAEMAdQByAHIAZQBuAGMAeQAgAFsAMABdAAUEAAAAEgAAABArAAAAAAQAAAAFAAAAAQEAAAAABA4AAABQAGUAcgBjAGUAbgB0AAUEAAAAEwAAABgAAAAAAwAAAAABAAELAAAAAgYAAAAABAAAAADwDgAAAOsOAAABGwAAAAAGDAAAAFMAaABlAGUAdAAxAAEEAQAAAAIBAgIkAAAAAx8AAAABAQACBAEEAAADBAEAAAAEBAAAAAAFBX+XU/ByCidABAoAAABBADEAOgBDADcAFhEAAAAXDAAAAAQBAAAAAQYBAAAAAQsKAAAAAQWamZmZmZkpQA48AAAAAAVxPQrXowA0QAEFhPuDDJW9OkACBXE9CtejADRAAwWE+4MMlb06QAQFcT0K16MANEAFBXE9CtejADRADwYAAAAAAQEBAQkQBgAAAAABAQEBAAlVBQAACr4AAAAABAEAAAACBZqZmZmZmSlAAwEABgEABAaiAAAABSkAAAAFCAAAAAAAAAAAAAAAAQQAAAAAAAAAAgEAAAAEAwgAAAAAAAAAAADwPwUpAAAABQgAAAAAAAAAAQAAAAEEAAAAAAAAAAIBAAAABAMIAAAAAAAAAAAAAEAFQQAAAAUIAAAAAAAAAAIAAAABBAAAAAAAAAACAQAAAAQEEwAAAAABAAwGCgAAAEEAMQArAEIAMQADCAAAAAAAAAAAAAhACr4AAAAABAIAAAACBZqZmZmZmSlAAwEABgEABAaiAAAABSkAAAAFCAAAAAEAAAAAAAAAAQQAAAAAAAAAAgEAAAAEAwgAAAAAAAAAAAAAQAUpAAAABQgAAAABAAAAAQAAAAEEAAAAAAAAAAIBAAAABAMIAAAAAAAAAAAACEAFQQAAAAUIAAAAAQAAAAIAAAABBAAAAAAAAAACAQAAAAQEEwAAAAABAAwGCgAAAEEAMgArAEIAMgADCAAAAAAAAAAAABRACr4AAAAABAMAAAACBZqZmZmZmSlAAwEABgEABAaiAAAABSkAAAAFCAAAAAIAAAAAAAAAAQQAAAAAAAAAAgEAAAAEAwgAAAAAAAAAAAAIQAUpAAAABQgAAAACAAAAAQAAAAEEAAAAAAAAAAIBAAAABAMIAAAAAAAAAAAAEEAFQQAAAAUIAAAAAgAAAAIAAAABBAAAAAAAAAACAQAAAAQEEwAAAAABAAwGCgAAAEEAMwArAEIAMwADCAAAAAAAAAAAABxACr4AAAAABAQAAAACBZqZmZmZmSlAAwEABgEABAaiAAAABSkAAAAFCAAAAAMAAAAAAAAAAQQAAAAAAAAAAgEAAAAEAwgAAAAAAAAAAAAQQAUpAAAABQgAAAADAAAAAQAAAAEEAAAAAAAAAAIBAAAABAMIAAAAAAAAAAAAFEAFQQA
AAAUIAAAAAwAAAAIAAAABBAAAAAAAAAACAQAAAAQEEwAAAAABAAwGCgAAAEEANAArAEIANAADCAAAAAAAAAAAACJACr4AAAAABAUAAAACBZqZmZmZmSlAAwEABgEABAaiAAAABSkAAAAFCAAAAAQAAAAAAAAAAQQAAAAAAAAAAgEAAAAEAwgAAAAAAAAAAAAUQAUpAAAABQgAAAAEAAAAAQAAAAEEAAAAAAAAAAIBAAAABAMIAAAAAAAAAAAAGEAFQQAAAAUIAAAABAAAAAIAAAABBAAAAAAAAAACAQAAAAQEEwAAAAABAAwGCgAAAEEANQArAEIANQADCAAAAAAAAAAAACZACr4AAAAABAYAAAACBZqZmZmZmSlAAwEABgEABAaiAAAABSkAAAAFCAAAAAUAAAAAAAAAAQQAAAAAAAAAAgEAAAAEAwgAAAAAAAAAAAAYQAUpAAAABQgAAAAFAAAAAQAAAAEEAAAAAAAAAAIBAAAABAMIAAAAAAAAAAAAHEAFQQAAAAUIAAAABQAAAAIAAAABBAAAAAAAAAACAQAAAAQEEwAAAAABAAwGCgAAAEEANgArAEIANgADCAAAAAAAAAAAACpACr4AAAAABAcAAAACBZqZmZmZmSlAAwEABgEABAaiAAAABSkAAAAFCAAAAAYAAAAAAAAAAQQAAAAAAAAAAgEAAAAEAwgAAAAAAAAAAAAcQAUpAAAABQgAAAAGAAAAAQAAAAEEAAAAAAAAAAIBAAAABAMIAAAAAAAAAAAAIEAFQQAAAAUIAAAABgAAAAIAAAABBAAAAAAAAAACAQAAAAQEEwAAAAABAAwGCgAAAEEANwArAEIANwADCAAAAAAAAAAAAC5ADK0IAAANqAgAAAABAAAAAgEgAAAAAAQDAAAAAQUpXI/C9SjwPwIEAAAAAAMFAAAAAAAA8D8CIAAAAAAECgAAAAEFrkfhehSuB0ACBBQAAAADBcP1KFyPwuU/BlMIAAAKSQgAAAEPAAAAAAoAAABlAG4ALQBVAFMACOEHAAAHeQcAAAAAAAAABTEFAAAABgAAAAABAAAAAQEGAAAAAAEAAAABA5oBAAAACQAAAAAEAAAAAAAAAAEJAAAAAAQAAAAAAAAAA0oAAAAARQAAAPr7AQAAAAACFwAAAAMSAAAAAA0AAAABCAAAAPoAAAFFAob7AxgAAAD6+wAFAAAAAgAAAAACBwAAAPoAAAAAAPsEAAAAAAdCAAAAAgYAAAAAAQAAAAYKBgAAAAABAAAAAA0GAAAAAAEAAAAACAYAAAAAAQAAAAAMBgAAAAABAAAAAAsGAAAAAAEAAAAAC+MAAAAB3gAAAAAgAAAAUwBoAGUAZQB0ADEAIQAkAEEAJAAxADoAJABBACQANwABtAAAAAAOAAAARwBlAG4AZQByAGEAbAABCQAAAAAEAAAABwAAAAIQAAAAAAIAAAAxAAEEAAAAAAAAAAIQAAAAAAIAAAAyAAEEAAAAAQAAAAIQAAAAAAIAAAAzAAEEAAAAAgAAAAIQAAAAAAIAAAA0AAEEAAAAAwAAAAIQAAAAAAIAAAA1AAEEAAAABAAAAAIQAAAAAAIAAAA2AAEEAAAABQAAAAIQAAAAAAIAAAA3AAEEAAAABgAAAAOaAQAAAAkAAAAABAAAAAEAAAABCQAAAAAEAAAAAQAAAANKAAAAAEUAAAD6+wEAAAAAAhcAAAADEgAAAAANAAAAAQgAAAD6AP8BQgIO+wMYAAAA+vsABQAAAAIAAAAAAgcAAAD6AAAAAAD7BAAAAAAHQgAAAAIGAAAAAAEAAAAGCgYAAAAAAQAAAAANBgAAAAABAAAAAAgGAAAAAAEAAAAADAYAAAAAAQAAAAALBgAAAAABAAAAAAvjAAAAAd4AAAAAIAAAAFMAaABlAGUAdAAxACEAJABCACQAMQA6ACQAQgAkADcAAbQAAAAADgAAAEcAZQBuAGUAcgBhAGwAAQkAAAAABAAAAAcAAAACEAA
AAAACAAAAMgABBAAAAAAAAAACEAAAAAACAAAAMwABBAAAAAEAAAACEAAAAAACAAAANAABBAAAAAIAAAACEAAAAAACAAAANQABBAAAAAMAAAACEAAAAAACAAAANgABBAAAAAQAAAACEAAAAAACAAAANwABBAAAAAUAAAACEAAAAAACAAAAOAABBAAAAAYAAAADoAEAAAAJAAAAAAQAAAACAAAAAQkAAAAABAAAAAIAAAADSgAAAABFAAAA+vsBAAAAAAIXAAAAAxIAAAAADQAAAAEIAAAA+gD/AdMCIPsDGAAAAPr7AAUAAAACAAAAAAIHAAAA+gAAAAAA+wQAAAAAB0IAAAACBgAAAAABAAAABgoGAAAAAAEAAAAADQYAAAAAAQAAAAAIBgAAAAABAAAAAAwGAAAAAAEAAAAACwYAAAAAAQAAAAAL6QAAAAHkAAAAACAAAABTAGgAZQBlAHQAMQAhACQAQwAkADEAOgAkAEMAJAA3AAG6AAAAAA4AAABHAGUAbgBlAHIAYQBsAAEJAAAAAAQAAAAHAAAAAhAAAAAAAgAAADMAAQQAAAAAAAAAAhAAAAAAAgAAADUAAQQAAAABAAAAAhAAAAAAAgAAADcAAQQAAAACAAAAAhAAAAAAAgAAADkAAQQAAAADAAAAAhIAAAAABAAAADEAMQABBAAAAAQAAAACEgAAAAAEAAAAMQAzAAEEAAAABQAAAAISAAAAAAQAAAAxADUAAQQAAAAGAAAABQsAAAAABgAAADEAMAAwAAYHAAAAAAIAAAAwAAgJAAAAAAQAAACfJdkDCAkAAAAABAAAAEdf8AUT3gAAAAAJAAAAAAQAAACfJdkDAQsAAAABBgAAAAABAAAAAQIGAAAAAAEAAAAAAwYAAAAAAQAAAAAIBgAAAAABAAAAAwkGAAAAAAEAAAACCgYAAAAAAQAAAAILRQAAAABAAAAA+vsBAAAAAAIAAAAAAyoAAAD6+wAXAAAAAxIAAAAADQAAAAEIAAAA+gCzAbMCs/sCBwAAAPoAAAAAAPsEAAAAAA0JAAAAAAQAAABHX/AFDgYAAAAAAQAAAAAQBgAAAAABAAAAAREGAAAAAAEAAAAAEgsAAAAABgAAADEAMAAwABYHAQAAAAkAAAAABAAAAEdf8AUBCwAAAAEGAAAAAAEAAAABAgYAAAAAAQAAAAADBgAAAAABAAAAAQRKAAAAAEUAAAAAQAAAAPr7AQAAAAACAAAAAAMqAAAA+vsAFwAAAAMSAAAAAA0AAAABCAAAAPoAswGzArP7AgcAAAD6AAAAAAD7BAAAAAAIBgAAAAABAAAAAwkGAAAAAAEAAAACCgYAAAAAAQAAAAILRQAAAABAAAAA+vsBAAAAAAIAAAAAAyoAAAD6+wAXAAAAAxIAAAAADQAAAAEIAAAA+gCzAbMCs/sCBwAAAPoAAAAAAPsEAAAAAA0JAAAAAAQAAACfJdkDDgYAAAAAAQAAAAAYSgAAAABFAAAA+vsBAAAAAAIFAAAAAgAAAAADKgAAAPr7ABcAAAADEgAAAAANAAAAAQgAAAD6ALMBswKz+wIHAAAA+gAAAAAA+wQAAAAACFMAAAAABgAAAAABAAAAAwMGAAAAAAEAAAAABDgAAAAAMwAAAPr7AQAAAAACBQAAAAIAAAAAAxgAAAD6+wAFAAAAAgAAAAACBwAAAPoAAAAAAPsEAAAAAAkGAAAAAAEAAAABCUoAAAAARQAAAPr7AQAAAAACFwAAAAMSAAAAAA0AAAABCAAAAPoA/wH/Av/7AxgAAAD6+wAFAAAAAgAAAAACBwAAAPoAAAAAAPsEAAAAAAsAAAAAGAYAAAACAQAAAAAAAAAA
\ No newline at end of file
...@@ -51,6 +51,8 @@ class TestHandler(HandlerTestCase): ...@@ -51,6 +51,8 @@ class TestHandler(HandlerTestCase):
def testConvertXlsy(self): def testConvertXlsy(self):
"""Test conversion of xlsy to xlsx and back""" """Test conversion of xlsy to xlsx and back"""
x_data = Handler(self.tmp_url, open("data/test_body.xlsy").read(), "xlsy", **self.kw).convert("xlsx")
self.assertIn("xl/", x_data[:2000])
x_data = Handler(self.tmp_url, open("data/test.xlsy").read(), "xlsy", **self.kw).convert("xlsx") x_data = Handler(self.tmp_url, open("data/test.xlsy").read(), "xlsy", **self.kw).convert("xlsx")
self.assertIn("xl/", x_data[:2000]) self.assertIn("xl/", x_data[:2000])
...@@ -81,16 +83,38 @@ class TestHandler(HandlerTestCase): ...@@ -81,16 +83,38 @@ class TestHandler(HandlerTestCase):
self.assertTrue(y_body_data.startswith("DOCY;v5;7519;"), "%r... does not start with 'DOCY;v5;7519;'" % (y_body_data[:20],)) self.assertTrue(y_body_data.startswith("DOCY;v5;7519;"), "%r... does not start with 'DOCY;v5;7519;'" % (y_body_data[:20],))
y_zip.open("media/image1.png") y_zip.open("media/image1.png")
def testgetMetadataFromImage(self): def testgetMetadata(self):
"""Test getMetadata not implemented form yformats""" """Test getMetadata from yformats (not implemented)"""
handler = Handler(self.tmp_url, "", "xlsy", **self.kw) handler = Handler(self.tmp_url, "", "xlsy", **self.kw)
self.assertRaises(NotImplementedError, handler.getMetadata) # Of course, expected behavior should be a dict of internal metadata
# but don't know how to handle it so far.
self.assertEquals(handler.getMetadata(), {})
def testsetMetadata(self): def testsetMetadata(self):
"""Test setMetadata not implemented for yformats""" """Test setMetadata for yformats (not implemented)"""
handler = Handler(self.tmp_url, "", "xlsy", **self.kw) handler = Handler(self.tmp_url, "", "xlsy", **self.kw)
self.assertRaises(NotImplementedError, handler.setMetadata) # Of course, expected behavior should be an updated data with new
# internal metadata but don't know how to handle it so far.
self.assertEquals(handler.setMetadata(), "")
def testGetAllowedConversionFormatList(self):
"""Test all combination of mimetype
None of the types below define any mimetype parameter to not ignore so far.
"""
get = Handler.getAllowedConversionFormatList
self.assertEquals(get("application/x-asc-text;ignored=param"),
[("application/vnd.openxmlformats-officedocument.wordprocessingml.document", "Word 2007 Document")])
self.assertEquals(get("application/x-asc-spreadsheet;ignored=param"),
[("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", "Excel 2007 Spreadsheet")])
self.assertEquals(get("application/x-asc-presentation;ignored=param"),
[("application/vnd.openxmlformats-officedocument.presentationml.presentation", "PowerPoint 2007 Presentation")])
self.assertEquals(get("application/vnd.openxmlformats-officedocument.wordprocessingml.document;ignored=param"),
[("application/x-asc-text", "OnlyOffice Text Document")])
self.assertEquals(get("application/vnd.openxmlformats-officedocument.spreadsheetml.sheet;ignored=param"),
[("application/x-asc-spreadsheet", "OnlyOffice Spreadsheet")])
self.assertEquals(get("application/vnd.openxmlformats-officedocument.presentationml.presentation;ignored=param"),
[("application/x-asc-presentation", "OnlyOffice Presentation")])
def test_suite(): def test_suite():
return make_suite(TestHandler) return make_suite(TestHandler)
...@@ -87,7 +87,7 @@ class IManager(Interface): ...@@ -87,7 +87,7 @@ class IManager(Interface):
metadata_dict : Metadatas to include in content metadata_dict : Metadatas to include in content
""" """
def getAllowedConversionFormatList(source_mimetype): def getAllowedConversionFormatList(self, source_mimetype):
"""Returns a list content_type and their titles which are supported """Returns a list content_type and their titles which are supported
by enabled handlers. by enabled handlers.
......
...@@ -28,11 +28,11 @@ ...@@ -28,11 +28,11 @@
############################################################################## ##############################################################################
import mimetypes import mimetypes
from mimetypes import guess_all_extensions, guess_extension from mimetypes import guess_type, guess_all_extensions, guess_extension
from base64 import encodestring, decodestring from base64 import encodestring, decodestring
from zope.interface import implements from zope.interface import implements
from interfaces.manager import IManager, IERP5Compatibility from interfaces.manager import IManager, IERP5Compatibility
from cloudooo.util import logger from cloudooo.util import logger, parseContentType
from cloudooo.interfaces.granulate import ITableGranulator from cloudooo.interfaces.granulate import ITableGranulator
from cloudooo.interfaces.granulate import IImageGranulator from cloudooo.interfaces.granulate import IImageGranulator
from cloudooo.interfaces.granulate import ITextGranulator from cloudooo.interfaces.granulate import ITextGranulator
...@@ -41,6 +41,7 @@ from fnmatch import fnmatch ...@@ -41,6 +41,7 @@ from fnmatch import fnmatch
#XXX Must be removed #XXX Must be removed
from cloudooo.handler.ooo.granulator import OOGranulator from cloudooo.handler.ooo.granulator import OOGranulator
from cloudooo.handler.ooo.mimemapper import mimemapper from cloudooo.handler.ooo.mimemapper import mimemapper
from cloudooo.handler.wkhtmltopdf.handler import Handler as WkhtmltopdfHandler
class HandlerNotFound(Exception): class HandlerNotFound(Exception):
pass pass
...@@ -59,6 +60,35 @@ def getHandlerClass(source_format, destination_format, mimetype_registry, ...@@ -59,6 +60,35 @@ def getHandlerClass(source_format, destination_format, mimetype_registry,
raise HandlerNotFound('No Handler found for %r=>%r' % (source_format, raise HandlerNotFound('No Handler found for %r=>%r' % (source_format,
destination_format)) destination_format))
def BBB_guess_type(url):
base = url.split("/")[-1].lstrip(".")
# if base.endswith(".ms.docx"): return ("application/vnd.openxmlformats-officedocument.wordprocessingml.document", "Microsoft Word 2007-2013 XML")
split = base.split(".")
ext = '' if len(split) == 1 else split[-1]
return {
"docy": ("application/x-asc-text", None),
"xlsy": ("application/x-asc-spreadsheet", None),
"ppty": ("application/x-asc-presentation", None),
}.get(ext, None) or guess_type(url)
def BBB_guess_extension(mimetype, title=None):
return {
# title : extension
"Flat XML ODF Text Document": ".fodt",
"MET - OS/2 Metafile": ".met",
"Microsoft Excel 2007-2013 XML": ".ms.xlsx",
"Microsoft PowerPoint 2007-2013 XML": ".ms.pptx",
"Microsoft PowerPoint 2007-2013 XML AutoPlay": ".ms.ppsx",
"Microsoft Word 2007-2013 XML": ".ms.docx",
}.get(title, None) or {
# mediatype : extension
"application/postscript": ".eps",
"application/vnd.ms-excel": ".xls",
"application/vnd.ms-excel.sheet.macroenabled.12": ".xlsm",
"application/vnd.ms-powerpoint": ".ppt",
"text/plain": ".txt",
"image/jpeg": ".jpg",
}.get(parseContentType(mimetype).gettype(), None) or guess_extension(mimetype)
class Manager(object): class Manager(object):
"""Manipulates requisitons of client and temporary files in file system.""" """Manipulates requisitons of client and temporary files in file system."""
...@@ -84,7 +114,20 @@ class Manager(object): ...@@ -84,7 +114,20 @@ class Manager(object):
""" """
self.kw['zip'] = zip self.kw['zip'] = zip
self.kw['refresh'] = refresh self.kw['refresh'] = refresh
handler_class = getHandlerClass(source_format, # XXX Force the use of wkhtmltopdf handler if converting from html to pdf
# with conversion parameters.
# This is a hack that quickly enables the use of wkhtmltopdf without
# conflicting with other "html to pdf" conversion method
# (i.e. using the ooo handler) that does not use such a parameter.
# This hack should be removed after defining and implementing a way to
# use the conversion_kw in a possible interoperable way between all
# "html to pdf" handlers.
if (conversion_kw and
source_format in ("html", "text/html") and
destination_format in ("pdf", "application/pdf")):
handler_class = WkhtmltopdfHandler
else:
handler_class = getHandlerClass(source_format,
destination_format, destination_format,
self.mimetype_registry, self.mimetype_registry,
self.handler_dict) self.handler_dict)
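The condition that forces the wkhtmltopdf handler can be read as a small predicate. The sketch below isolates it for illustration only; `uses_wkhtmltopdf` and the `orientation` key are hypothetical names, not part of the commit.

```python
HTML_FORMATS = ("html", "text/html")
PDF_FORMATS = ("pdf", "application/pdf")


def uses_wkhtmltopdf(source_format, destination_format, conversion_kw=None):
    """Return True when the shortcut in the hunk above would apply.

    The shortcut requires all three: non-empty conversion parameters,
    an HTML source and a PDF destination. Otherwise the regular
    registry lookup (getHandlerClass) is used.
    """
    return bool(conversion_kw
                and source_format in HTML_FORMATS
                and destination_format in PDF_FORMATS)


# "orientation" is a made-up parameter name used only for this example.
print(uses_wkhtmltopdf("html", "pdf", {"orientation": "landscape"}))
print(uses_wkhtmltopdf("html", "pdf"))
```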
...@@ -143,36 +186,69 @@ class Manager(object): ...@@ -143,36 +186,69 @@ class Manager(object):
return metadata_dict return metadata_dict
def getAllowedExtensionList(self, request_dict={}): def getAllowedExtensionList(self, request_dict={}):
"""List types which can be generated from given type """BBB: extension should not be used, use MIMEType with getAllowedConversionFormatList
List extension which can be generated from given type
Type can be given as: Type can be given as:
- content type
- filename extension - filename extension
- document type ('text', 'spreadsheet', 'presentation' or 'drawing') - document type
e.g
self.getAllowedMimetypeList(dict(document_type="text"))
return extension_list
""" """
mimetype = request_dict.get('mimetype') mimetype = request_dict.get('mimetype')
extension = request_dict.get('extension') extension = request_dict.get('extension')
document_type = request_dict.get('document_type') document_type = request_dict.get('document_type')
if not mimetype:
if extension:
mimetype, _ = BBB_guess_type("a." + extension)
elif document_type:
# BBB no other choice than to ask ooo.mimemapper
return mimemapper.getAllowedExtensionList(document_type=document_type)
if mimetype: if mimetype:
allowed_extension_list = [] allowed_extension_set = set()
for ext in guess_all_extensions(mimetype): for content_type, title in self.getAllowedConversionFormatList(mimetype):
ext = ext.replace('.', '') ext = BBB_guess_extension(content_type, title)
extension_list = mimemapper.getAllowedExtensionList(extension=ext, if ext:
document_type=document_type) allowed_extension_set.add((ext.lstrip('.'), title))
for extension in extension_list: return list(allowed_extension_set) or [('', '')]
if extension not in allowed_extension_list:
allowed_extension_list.append(extension)
return allowed_extension_list
elif extension:
extension = extension.replace('.', '')
return mimemapper.getAllowedExtensionList(extension=extension,
document_type=document_type)
elif document_type:
return mimemapper.getAllowedExtensionList(document_type=document_type)
else: else:
return [('', '')] return [('', '')]
def getAllowedConversionFormatList(self, source_mimetype):
r"""Returns a list content_type and their titles which are supported
by enabled handlers.
[('application/vnd.oasis.opendocument.text', 'ODF Text Document'),
('application/pdf', 'PDF - Portable Document Format'),
...
]
This methods gets handler conversion mimetype availability according
to Cloudooo's mimetype registry.
/!\ unlike `self.getAllowedExtensionList`, it may return empty list
instead of `[('', '')]`.
/!\ the returned list may have the same mimetype twice with different title.
"""
handler_dict = {} # handler_dict["ooo"] = ["text/*", "application/*"]
for entry in self.mimetype_registry:
split_entry = entry.split()
if fnmatch(source_mimetype, split_entry[0]):
if split_entry[2] in handler_dict:
handler_dict[split_entry[2]].append(split_entry[1])
else:
handler_dict[split_entry[2]] = [split_entry[1]]
output_mimetype_set = set()
for handler, mimetype_filter_list in handler_dict.items():
for output_mimetype in self.handler_dict[handler].getAllowedConversionFormatList(source_mimetype):
for mimetype_filter in mimetype_filter_list:
if fnmatch(output_mimetype[0], mimetype_filter):
output_mimetype_set.add(output_mimetype)
break
return list(output_mimetype_set)
def run_convert(self, filename='', data=None, meta=None, extension=None, def run_convert(self, filename='', data=None, meta=None, extension=None,
orig_format=None): orig_format=None):
"""Method to support the old API. Wrapper getFileMetadataItemList but """Method to support the old API. Wrapper getFileMetadataItemList but
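The new `getAllowedConversionFormatList` filters in two passes: it first collects, per handler, the output patterns of every registry entry whose source pattern matches, then keeps only the handler outputs that match one of those patterns. A minimal sketch of that logic, assuming registry entries are whitespace-separated `source_pattern output_pattern handler_name` strings as the `split_entry` indices suggest; the handler name `x2t`, the sample registry and the `handler_formats` mapping are invented for the example, and the code is Python 3.

```python
from fnmatch import fnmatch


def allowed_conversions(source_mimetype, registry, handler_formats):
    """Return sorted (mimetype, title) pairs reachable from source_mimetype.

    registry: iterable of "source_pattern output_pattern handler_name".
    handler_formats: dict mapping a handler name to a callable that
    returns the (mimetype, title) pairs the handler can produce.
    """
    # Pass 1: collect the output filters of every matching registry entry.
    filters_by_handler = {}
    for entry in registry:
        source_pattern, output_pattern, handler_name = entry.split()[:3]
        if fnmatch(source_mimetype, source_pattern):
            filters_by_handler.setdefault(handler_name, []).append(output_pattern)

    # Pass 2: keep the handler outputs accepted by at least one filter.
    result = set()
    for handler_name, patterns in filters_by_handler.items():
        for output in handler_formats[handler_name](source_mimetype):
            if any(fnmatch(output[0], pattern) for pattern in patterns):
                result.add(output)
    return sorted(result)


# Hypothetical registry and handler used only to exercise the function.
registry = ["application/x-asc-* application/vnd.openxmlformats-* x2t"]
handler_formats = {
    "x2t": lambda source: [
        ("application/vnd.openxmlformats-officedocument.wordprocessingml.document",
         "Word 2007 Document"),
        ("application/pdf", "PDF - Portable Document Format"),
    ],
}
print(allowed_conversions("application/x-asc-text", registry, handler_formats))
```

Using a set removes exact duplicates, but as the docstring warns, the same mimetype can still appear twice with different titles.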
...@@ -242,7 +318,7 @@ class Manager(object): ...@@ -242,7 +318,7 @@ class Manager(object):
""" """
# calculate original extension required by convertFile from # calculate original extension required by convertFile from
# given content_type (orig_format) # given content_type (orig_format)
original_extension = guess_extension(orig_format).strip('.') original_extension = BBB_guess_extension(orig_format).strip('.')
# XXX - ugly way to remove "/" and "." # XXX - ugly way to remove "/" and "."
orig_format = orig_format.split('.')[-1] orig_format = orig_format.split('.')[-1]
orig_format = orig_format.split('/')[-1] orig_format = orig_format.split('/')[-1]
...@@ -282,7 +358,12 @@ class Manager(object): ...@@ -282,7 +358,12 @@ class Manager(object):
""" """
response_dict = {} response_dict = {}
try: try:
extension_list = self.getAllowedExtensionList({"mimetype": content_type}) mimetype_list = self.getAllowedConversionFormatList(content_type)
extension_list = []
for m, t in mimetype_list:
ext = BBB_guess_extension(m, t)
if ext:
extension_list.append((ext.lstrip('.'), t))
response_dict['response_data'] = extension_list response_dict['response_data'] = extension_list
return (200, response_dict, '') return (200, response_dict, '')
except Exception, e: except Exception, e:
......
...@@ -30,6 +30,8 @@ import logging ...@@ -30,6 +30,8 @@ import logging
import mimetypes import mimetypes
import pkg_resources import pkg_resources
import os import os
import mimetools
import cStringIO
from zipfile import ZipFile, ZIP_DEFLATED from zipfile import ZipFile, ZIP_DEFLATED
logger = logging.getLogger('Cloudooo') logger = logging.getLogger('Cloudooo')
...@@ -133,3 +135,18 @@ def unzip(source, destination): ...@@ -133,3 +135,18 @@ def unzip(source, destination):
zipfile = ZipFile(source) zipfile = ZipFile(source)
zipfile.extractall(destination) zipfile.extractall(destination)
zipfile.close() zipfile.close()
def parseContentType(content_type):
"""Parses `text/plain;charset="utf-8"` to a mimetools.Message object.
Note: Content type or MIME type are built like `maintype/subtype[;params]`.
parsed_content_type = parseContentType('text/plain;charset="utf-8"')
parsed_content_type.gettype() -> 'text/plain'
parsed_content_type.getmaintype() -> 'text'
parsed_content_type.getsubtype() -> 'plain'
parsed_content_type.getplist() -> 'charset="utf-8"'
parsed_content_type.getparam('charset') -> 'utf-8'
parsed_content_type.typeheader -> 'text/plain;charset="utf-8"'
"""
return mimetools.Message(cStringIO.StringIO("Content-Type:" + content_type.replace("\r\n", "\r\n\t")))
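`mimetools` and `cStringIO` exist only in Python 2. For readers on Python 3, a comparable parse can be sketched with `email.message.Message` from the standard library; this is an assumption about how one might port the helper, not code from the commit, and the accessor names differ from the `mimetools` ones listed in the docstring.

```python
from email.message import Message


def parse_content_type(content_type):
    """Return a Message whose Content-Type header is `content_type`."""
    message = Message()
    message["Content-Type"] = content_type
    return message


parsed = parse_content_type('text/plain;charset="utf-8"')
print(parsed.get_content_type())      # maintype/subtype, parameters dropped
print(parsed.get_content_maintype())
print(parsed.get_content_subtype())
print(parsed.get_param("charset"))    # parameter value, unquoted
```

Dropping the parameters this way is what lets the tests in this commit pass `;ignored=param` suffixes without affecting the lookup.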