Commit 35cde2d7 authored by Yusei Tahara

import PortalTransforms


git-svn-id: https://svn.erp5.org/repos/public/erp5/trunk@24294 20353a03-c40f-0410-a6d1-a30d3c3de9de
parent 63a35fb1
DON'T USE ChangeLog, USE HISTORY.txt instead.
2004-07-24 Christian Heimes <heimes@faho.rwth-aachen.de>
* Changed version to stick to Archetypes version.
2004-05-25 Christian Heimes <heimes@faho.rwth-aachen.de>
* Separate MimetypesRegistry to a new product
2004-04-20 Christian Heimes <heimes@faho.rwth-aachen.de>
* transforms/rest.py: rest transform is now using the zope implementation if
available
2004-04-07 Christian Heimes <heimes@faho.rwth-aachen.de>
* transforms/text_pre_to_html.py: new transform for preformatted plain text
* transforms/text_to_html.py: changed <br/> to <br />
2004-03-17 Christian Heimes <heimes@faho.rwth-aachen.de>
* transforms/pdf_to_text.py: return text utf-8 encoded
2004-02-04 Sylvain Thénault <syt@logilab.fr>
* transforms/office_com.py: fix wrong import
2003-12-03 Sidnei da Silva <sidnei@awkly.org>
* mime_types/magic.py (guessMime): Don't try to be so magic :)
2003-11-18 Andreas Jung <andreas@andreas-jung.com>
* commandtransform.py: fixed sick cleanDir() implementation
2003-11-17 Andreas Jung <andreas@andreas-jung.com>
* added rtf_to_html.py converter
* added rtf as a mimetype to mime_types/__init__.py
* added rtf_to_xml.py converter
* added pdf_to_text.py converter
* removed dependency from CMFDefault.utils for misc converters
(integrated code into libtransforms/utils.py)
2003-11-14 Sidnei da Silva <sidnei@plone.org>
* MimeTypesRegistry.py (MimeTypesRegistry.classify): If no results
this far, use magic.py module, written by Jason Petrone, and
updated by Gabriel Wicke with the data from gnome-vfs-mime-magic.
2003-11-07 Sylvain Thénault <syt@logilab.fr>
* use the same license as Archetypes (BSD like instead of GPL)
* www/tr_widgets.zpt: fix bug in the list widget (space before the
parameter's name, making it unrecognized)
* zope/Transform.py: fix set parameters to correctly remap
transform if editable inputs or output. (fix #837244)
* TransformEngine.py: better error messages, a few lines wrapping
* zope/__init__.py: use pt_globals instead of globals for variable
handling the product globals, making it reloadable
* Extensions/Install.py: use pt_globals
* www/listMimeTypes.zpt: use mt/normalized as id instead of mt/name
2003-11-05 Sylvain Thénault <syt@logilab.fr>
* unsafe_transforms/command.py: added dummy output mime type to avoid
error when added via the ZMI (fix #837252)
2003-10-30 Sylvain Thénault <syt@logilab.fr>
* fixed addMimeType, editMimeType and tr_widget templates (fix #832958)
2003-10-03 Sidnei da Silva <sidnei@dreamcatcher.homeunix.org>
* utils.py (TransformException.getToolByName): Modified
getToolByName to have a fallback mimetypes_registry, so we can
simplify BaseUnit.
2003-09-23 Sylvain Thénault <syt@logilab.fr>
* MimeTypesRegistry.py: make unicode error handling configurable
* zope/MimeTypesTool.py: add a property for unicode error handling
* zope/Transform.py: make tests work
2003-08-19 Sylvain Thénault <syt@logilab.fr>
* transforms/rest.py: override "traceback" setting to avoid
sys.exit !
* transforms/text_to_html.py: use html_quote
2003-08-12 Sylvain Thénault <syt@logilab.fr>
* TransformEngine.py: set "encoding" in datastream metadata if
transform provides an "output_encoding" attribute. Fix access to
"id" instead of "name()"
* zope/Transform.py: add some code to handle output encoding...
2003-08-08 Sylvain Thénault <syt@logilab.fr>
* MimeTypesRegistry.py: use suffix map as the standard mimetypes
module does, hopefully correcting the behaviour of classify
* unsafe_transforms/build_transforms.py: fix inputs and output
mime type of ps_to_text transform
2003-08-07 Sylvain Thenault <sylvain.thenault@logilab.fr>
* encoding.py: new module which aims to detect encoding of text
files
* MimeTypesRegistry.py: use the encoding module in iadapter
2003-08-06 Sylvain Thenault <sylvain.thenault@logilab.fr>
* MimeTypesRegistry.py (classify): return
'application/octet-stream' instead of None
* transforms/text_to_html.py: replace '\n' with <br/> instead of
<pre> wrapping
* unsafe_transforms/build_transforms.py: create a ps_to_text
transform if ps2ascii is available
* tests/test_transforms.py: handle name of transforms to test on
command line
* transforms/__init__.py: do not print traceback on missing binary
exception
2003-08-01 Sylvain Thenault <sylvain.thenault@logilab.fr>
* transforms/text_to_html.py: new transform to wrap plain text in
<pre> for html
* transforms/test_transforms.py: add test for text_to_html
2003-07-28 Sylvain Thenault <sylvain.thenault@logilab.fr>
* zope/TransformsChain.py: fixes to make it work within Zope.
* www/editTransformsChain.zpt: add inputs / output information.
2003-07-28 Sylvain Thenault <sylvain.thenault@logilab.fr>
* transforms/rest.py: remove class="document"
* tests/test_transforms.py: added missing output for the identity
transform's test, fix initialize method.
2003-07-21 Sylvain Thenault <sylvain.thenault@logilab.fr>
* transforms/identity.py: added identity transform (used for instance
to convert text/x-rest to text/plain).
* tests/test_transforms.py: added test for the identity transform.
2003-07-11 Sylvain Thenault <sylvain.thenault@logilab.fr>
* unsafe_transforms/xml.py: just make it work.
* unsafe_transforms/command.py: add missing "name" argument to the
constructor. Use popen3 instead of popen4.
* unsafe_transforms/build_transforms.py: create an xml_to_html
transform if an xslt processor is available (however this transform
is not configured for any doctypes / dtds). Create tidy_html
transform if the tidy command is available.
* tests/test_transforms.py: add test cases for the xml and
html_tidy transform.
* transform.py: added transform_customize hook.
* docs/user_manual.rst: explain difference between python distro
and zope product. Added notice about archetypes integration.
* docs/dev_manual.rst: minor fixes.
2003-07-10 Sylvain Thenault <sylvain.thenault@logilab.fr>
* refactoring to permit use of this package outside zope :)
Zope mode is triggered when "import Zope" doesn't fail
* fix bug in word_to_html / office_wvware transform
* add a generic test for transforms. It's much easier now to
add a test for a transform :)
* add licensing information
* interfaces.py: complete / cleanup interfaces
* bin/transform: add command line tool
* unsafe_transforms/command.py: bug fix
* addTransformsChain.zpt: fix typo
* fix #768927
2003-07-09 Sylvain Thenault <sylvain.thenault@logilab.fr>
* code cleaning:
- moved Transform and TransformsChain in their own files
- removed no more used bindingmixin and sourceAdapter
- merged transform and chain classes together
- generic cache and misc utilities in the new utils.py.
* ready for 1.0 alpha1 :)
2003-07-05 Sylvain Thenault <sylvain.thenault@logilab.fr>
* make the PortalTransforms product from the original transform
package and the mimetypes / transforms tools originally defined in
Archetypes.
* drop the ability to use it as a standalone python package, since
there was too much duplicated code to make it work.
* some work on tests to make them succeed :)
* MimeTypesTool.py (MimeTypesTool.lookup): return an empty list
instead of None when no matching mime types is found.
2003-05-14 Sidnei da Silva <sidnei@x3ng.com>
* interface.py: Trying to normalize the way interfaces are
imported in different versions of Zope.
2003-04-21 Sidnei da Silva <sidnei@x3ng.com>
* __init__.py: Fixed lots of things here and there to make it work
with the new BaseUnit in Archetypes.
2003-04-20 Sidnei da Silva <sidnei@x3ng.com>
* tests/output/rest3.out: Fixed subtitle and added a test.
2003-04-19 Sidnei da Silva <sidnei@x3ng.com>
* tests/test_rest.py (BaseReSTFileTest.testSame): Added tests
based on input/output dirs to make it easy to add new tests for reST.
* transforms/rest.py (rest.convert): Rendering of
reST was broken. It was not rendering references the right way,
and it didn't seem like it was doing the right thing with
titles. Updated to use docutils.core.publish_string.
* tests/test_all.py (test_suite): Added lynx_dump to transform
html -> text. With tests.
2003-04-18 Sidnei da Silva <sidnei@x3ng.com>
* tests/test_all.py (test_suite): Removed dependencies from
CMFCore on testsuite.
* __init__.py: Made it work without being inside Products. We
eventually need to make a distutils setup, and then this can be
removed. If someone knows a better way to do this, please do.
lynx
pdftohtml
python-docutils
import os
from Products.CMFCore.DirectoryView import addDirectoryViews
from Products.CMFCore.DirectoryView import registerDirectory
from Products.CMFCore.DirectoryView import createDirectoryView
from Products.CMFCore.DirectoryView import manage_listAvailableDirectories
from Products.CMFCore.utils import getToolByName
from Products.CMFCore.utils import minimalpath
from Globals import package_home
from Acquisition import aq_base
from OFS.ObjectManager import BadRequestException
from Products.PortalTransforms import GLOBALS
from Products.PortalTransforms import skins_dir
from StringIO import StringIO
def install(self):
    out = StringIO()
    qi=getToolByName(self, 'portal_quickinstaller')
    qi.installProduct('MimetypesRegistry',)

    id = 'portal_transforms'
    if hasattr(aq_base(self), id):
        pt = getattr(self, id)
        if not getattr(aq_base(pt), '_new_style_pt', None) == 1:
            print >>out, 'Removing old portal transforms tool'
            self.manage_delObjects([id,])

    if not hasattr(aq_base(self), id):
        addTool = self.manage_addProduct['PortalTransforms'].manage_addTool
        addTool('Portal Transforms')
        print >>out, 'Installing portal transforms tool'

    updateSafeHtml(self, out)

    correctMapping(self, out)

    # not required right now
    # installSkin(self)

    return out.getvalue()

def correctMapping(self, out):
    pt = getToolByName(self, 'portal_transforms')
    pt_ids = pt.objectIds()

    for m_in, m_out_dict in pt._mtmap.items():
        for m_out, transforms in m_out_dict.items():
            for transform in transforms:
                if transform.id not in pt_ids:
                    #error, mapped transform is no object in portal_transforms. correct it!
                    print >>out, "have to unmap transform (%s) cause its not in portal_transforms ..." % transform.id
                    try:
                        pt._unmapTransform(transform)
                    except:
                        raise
                    else:
                        print >>out, "...ok"

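The cleanup in correctMapping walks a nested mapping and looks for transforms that are mapped but no longer exist as objects in the tool. A minimal standalone sketch of that detection step follows, in plain Python 3 and independent of Zope. It assumes the mapping has the shape input mime type -> output mime type -> list of transforms, as the loops above suggest; the function name `find_stale_transforms` and the use of plain id strings instead of transform objects are illustrative assumptions, not part of the product.

```python
def find_stale_transforms(mtmap, registered_ids):
    """Return the ids that are mapped in ``mtmap`` but not registered.

    ``mtmap`` is assumed to look like
    {input_mimetype: {output_mimetype: [transform_id, ...]}}.
    """
    stale = []
    for m_in, m_out_dict in mtmap.items():
        for m_out, transform_ids in m_out_dict.items():
            for transform_id in transform_ids:
                # a mapped id with no matching registered object is stale
                if transform_id not in registered_ids and transform_id not in stale:
                    stale.append(transform_id)
    return stale


# hypothetical data: 'ghost' is mapped but was deleted from the tool
mtmap = {
    'text/plain': {'text/html': ['text_to_html', 'ghost']},
    'text/html': {'text/plain': ['lynx_dump']},
}
stale = find_stale_transforms(mtmap, {'text_to_html', 'lynx_dump'})
```

In the installer itself each stale transform is then handed to the tool's own unmapping routine; the sketch only covers finding them.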
def updateSafeHtml(self, out):
    print >>out, 'Update safe_html...'
    safe_html_id = 'safe_html'
    safe_html_module = "Products.PortalTransforms.transforms.safe_html"
    pt = getToolByName(self, 'portal_transforms')
    for id in pt.objectIds():
        transform = getattr(pt, id)
        if transform.id == safe_html_id and transform.module == safe_html_module:
            try:
                disable_transform = transform.get_parameter_value('disable_transform')
            except KeyError:
                print >>out, ' replace safe_html (%s, %s) ...' % (transform.name(), transform.module)
                try:
                    pt.unregisterTransform(id)
                    pt.manage_addTransform(id, safe_html_module)
                except:
                    raise
                else:
                    print >>out, ' ...done'
    print >>out, '...done'

def installSkin(self):
    skinstool=getToolByName(self, 'portal_skins')

    fullProductSkinsPath = os.path.join(package_home(GLOBALS), skins_dir)
    productSkinsPath = minimalpath(fullProductSkinsPath)
    registered_directories = manage_listAvailableDirectories()
    if productSkinsPath not in registered_directories:
        registerDirectory(skins_dir, GLOBALS)
    try:
        addDirectoryViews(skinstool, skins_dir, GLOBALS)
    except BadRequestException, e:
        pass # directory view has already been added

    files = os.listdir(fullProductSkinsPath)
    for productSkinName in files:
        if os.path.isdir(os.path.join(fullProductSkinsPath, productSkinName)) \
               and productSkinName != 'CVS':
            for skinName in skinstool.getSkinSelections():
                path = skinstool.getSkinPath(skinName)
                path = [i.strip() for i in path.split(',')]
                try:
                    if productSkinName not in path:
                        path.insert(path.index('custom') +1, productSkinName)
                except ValueError:
                    if productSkinName not in path:
                        path.append(productSkinName)
                path = ','.join(path)
                skinstool.addSkinSelection(skinName, path)
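The skin path handling in installSkin boils down to a small string operation: split the comma-separated layer list, put the new layer right after 'custom' if that layer exists, otherwise append it, and never add it twice. A minimal sketch of just that step, in plain Python 3 and without the skins tool, might look like this. The helper name `add_layer` and the example layer names are hypothetical.

```python
def add_layer(path, layer, anchor='custom'):
    """Insert ``layer`` after ``anchor`` in a comma-separated skin path.

    Falls back to appending when ``anchor`` is not in the path, and
    leaves the path alone when ``layer`` is already present.
    """
    layers = [name.strip() for name in path.split(',')]
    if layer not in layers:
        try:
            layers.insert(layers.index(anchor) + 1, layer)
        except ValueError:
            # no anchor layer in this skin selection
            layers.append(layer)
    return ','.join(layers)


with_anchor = add_layer('custom, plone_templates', 'portal_transforms')
without_anchor = add_layer('plone_templates,plone_styles', 'portal_transforms')
already_there = add_layer('custom,portal_transforms', 'portal_transforms')
```

Placing the layer directly after 'custom' keeps site-level customisations first in the lookup order while still letting the product's skins override the stock layers behind them.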
1.4.0-final - 2006-06-16
========================
* Shut down a noisy logging message to DEBUG level.
[hannosch]
* Converted logging infrastructure from zLOG usage to Python's logging module.
[hannosch]
* Avoid DeprecationWarning for manageAddDelete.
[hannosch]
* Spring-cleaning of tests infrastructure.
[hannosch]
1.4.0-beta1 - 2006-03-26
========================
* removed odd archetypes 1.3 style version checking
[jensens]
* Removed BBB code for CMFCorePermissions import location.
[hannosch]
* removed deprecation-warning for ToolInit
[jensens]
1.3.9-final02 - 2006-01-15
==========================
* nothing - the odd version checking needs a version change to stick to
Archetypes version.
[yenzenz]
1.3.9-RC1 - 2005-12-29
======================
* Fixed [ 1293684 ], unregistered Transforms are not unmapped.
A transformation was deleted from portal_transforms, but remained
active.
http://sourceforge.net/tracker/index.php?func=detail&aid=1293684&group_id=75272&atid=543430
Added a cleanup that unmaps deleted transforms on reinstall
[csenger]
* Replaced the safe_html transformation with a configurable version
with the same functionality. Migration is handled on reinstall.
http://trac.plone.org/plone/ticket/4538
[csenger] [dreamcatcher]
* Removed CoUnInitialize call. According to Mark Hammond: The
right thing to do is call that function, although almost no one
does (including pywin32 itself, which does CoInitialize the main
thread) and I've never heard of a problem caused by this
omission.
[sidnei]
* Fix a long outstanding issue with improper COM thread model
initialization. Initialize COM for multi-threading, ignoring any
errors when someone else has already initialized differently.
https://trac.plone.org/plone/ticket/4712
[sidnei]
* Correct some wrong security settings.
[hannosch]
* Fixed the requirements look-up from the policy
(#1358085)
1.3.8-final02 - 2005-10-11
==========================
* nothing - the odd version checking needs a version change to stick to
Archetypes version.
[yenzenz]
1.3.7-final01 - 2005-08-30
==========================
* nothing - the odd version checking needs a version change to stick to
Archetypes version.
[yenzenz]
1.3.6-final02 - 2005-08-07
==========================
* nothing - the odd version checking needs a version change to stick to
Archetypes version.
[yenzenz]
1.3.6-final - 2005-08-01
========================
* Added q to the list of valid and safe html tags by limi's request.
Wrote test for safe_html parsing.
[hannosch]
* Added ins and del to the list of valid and safe html tags.
[ 1199917 ] XHTML DEL tag is removed during the safe_html conversion
[tiran]
1.3.5-final02 - 2005-07-17
==========================
* changed version to stick to appropriate Archetypes Version.
[yenzenz]
1.3.5-final - 2005-07-06
========================
* pdf_to_html can show images now. Revert it to command transformer and
make it work under windows.
[panjunyong]
* refined command based unsafe transform to make it work with windows.
[panjunyong]
* Disabled office_uno by default because it doesn't support multithread yet
[panjunyong]
* Rewrote office_uno to make it work for the recent PyUNO.
[panjunyong]
1.3.4-final01 - 2005-05-20
==========================
* nothing (I hate to write this. But the odd version checking needs it).
[yenzenz]
1.3.4-rc1 - 2005-03-25
======================
* Better error handling for safe html transformation
[tiran]
1.3.3-final - 2005-03-05
========================
* Updated link to rtf converter to http://freshmeat.net/projects/rtfconverter/
[tiran]
* Small fix for the com office converter. COM could crash if word is
invisible. Also a pop up might appear when quitting word.
[gogo]
* Fixed [ 1053846 ] Charset problem with wvware word_to_html conversion
[flacoste]
* Fixed python and test pre transforms to use html quote special characters.
Thx to stain. [ 1091670 ] Python source code does not escape HTML.
[tiran]
* Fixed [ 1121812 ] fix PortalTransforms unregisterTransformation()
unregisterTransformation() fails to remove from the zodb the persistence
wrapper added to the transformation
[dan_t]
* Fixed [ 1118739 ] popentransform does not work on windows
[duncanb]
* Fixed [ 1122175 ] extra indent syntax error in office_uno.py
[ryuuguu]
* fixed bug with some transformers' temp filename: it tried to use the original filename,
which is encoded in utf8 and may contain characters that are invalid for my Windows server's charset.
Just use filename as: unknown.suffix
[panjunyong]
* STX header level is set to 2 instead of using zope.conf. Limi forced me to
change it.
[tiran]
* fixed bug: word_to_html uses office_com under windows
1.3.2-5 - 2004-10-17
====================
* Fixed [ 1041637 ] RichWidget: STX level should be set to 3 instead of 1. The
structured text transform is now using the zope.conf option or has an
optional level parameter in the convert method.
[tiran]
* Added win32api.GetShortPathName to libtransforms/commandtransform
so binaries found in directories which have spaces in their names
will work as expected
[runyaga]
1.3.2-4 - 2004-09-30
====================
* nothing changed
1.3.2-3 - 2004-09-25
====================
* Fixed more unit tests
[tiran]
1.3.2-2 - 2004-09-17
====================
* Fixed [ 1025066 ] Serious persistency bug
[dmaurer]
* Fixed some unit test failures. Some unit tests failed because the reST
and STX output has changed slightly.
[tiran]
* Don't include the first three lines of the lynx output which are url,
title and a blank line. This fixed also a unit test because the url
which was a file in the fs did change every time.
[tiran]
* Fixed a bug in make_unpersistent. It seemed that this method touched values
inside the mapping.
[dreamcatcher]
1.3.2-1 - 2004-09-04
====================
* Disabled filters that were introduced in 1.3.1-1. The currently used
transform path algo is broken and took too long to find a path.
[tiran]
* Cleaned up major parts of PT by removing the python only implementation which
was broken anyway
* Fixed [ 1019632 ] current svn bundle (rev 2942) broken
1.3.1-1 - 2004-08-16
====================
* Introduce the concept of filters (one-hop transforms where the source and
destination are the same mimetype).
[dreamcatcher]
* Add a html filter to extract the content of the body tag (so we don't get a
double <body> when uploading full html files).
[dreamcatcher]
* Change base class for Transform to SimpleItem which is equivalent to the
previous base classes and provides a nice __repr__.
[dreamcatcher]
* Lower log levels.
[dreamcatcher]
* cache.py: Added purgeCache, fixed has cache test.
[tiran]
* Fixed non critical typo in error message: Unvalid -> Invalid
[tiran]
1.3.0-3 - 2004-08-06
====================
* Added context to the convert, convertTo and __call__ methods. The context is
the object on which the transform was called.
[tiran]
* Added isCacheable flag and setCacheable to idatastream (data.py). Now you can
disable the caching of the result of a transformation.
[tiran]
* Added __setstate__ to load new transformations from the file system.
[tiran]
* Fixed [ 1002014 ] Add policy screen doesn't accept single entry
[tiran]
1.3.0-2 - 2004-07-29
====================
* Added workaround for [ 997998 ] PT breaks ZMI/Find [tiran]
Copyright (c) 2002-2003, Benjamin Saller <bcsaller@ideasuite.com>, and
the respective authors.
All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the
distribution.
* Neither the name of Archetypes nor the names of its contributors
may be used to endorse or promote products derived from this software
without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
include ChangeLog
include README
include TODO
include version.txt
include bin/transform
include bin/transform.bat
recursive-include docs *.txt
recursive-include docs *.rst
recursive-include docs *.html
recursive-include tests/input *
recursive-include tests/output *
NAME=PortalTransforms
MAJOR_VERSION=1.0
MINOR_VERSION=4
RELEASE_TAG=
PACKAGE_NAME=${NAME}-${MAJOR_VERSION}.${MINOR_VERSION}${RELEASE_TAG}
PYTHON="/usr/bin/python"
TMPDIR=~/tmp
CURDIR=~/src/archetypes/head/PortalTransforms
BASE_DIR=${CURDIR}/..
SOFTWARE_HOME=~/src/zope/2_7/lib/python
INSTANCE_HOME=~/src/instance/shellex
PACKAGES=PortalTransforms
RM=rm -f
RMRF=rm -rf
FIND=find
XARGS=xargs
CD=cd
LN=ln -sfn
CP=cp
TAR=tar
MKDIR=mkdir -p
.PHONY : clean test reindent reindent_clean sdist
.PHONY : default
# default: The default step (invoked when make is called without a target)
default: clean test

clean :
	find . \( -name '*~' -o -name '*.py[co]' -o -name '*.bak' \) -exec rm {} \; -print

reindent :
	~/src/reindent.py -r -v .

test :
	export INSTANCE_HOME=${INSTANCE_HOME}; export SOFTWARE_HOME=${SOFTWARE_HOME}; \
	cd ${CURDIR}/tests && ${PYTHON} runalltests.py

# sdist: Create a source distribution file (implies clean).
#
sdist: reindent clean sdist_tgz

# sdist_tgz: Create a tgz archive file as a source distribution.
#
sdist_tgz:
	echo -n "${MAJOR_VERSION}.${MINOR_VERSION}${RELEASE_TAG}" >\
	  ${CURDIR}/version.txt
	${MKDIR} ${TMPDIR}/${PACKAGE_NAME}
	${CD} ${TMPDIR}/${PACKAGE_NAME} && \
	  for package in ${PACKAGES}; do ${LN} ${BASE_DIR}/$$package .; done && \
	  ${CD} ${TMPDIR} && ${TAR} czfh ${BASE_DIR}/${PACKAGE_NAME}.tgz ${PACKAGE_NAME} \
	  --exclude=${PACKAGE_NAME}.tgz\
	  --exclude=CVS \
	  --exclude=.cvsignore \
	  --exclude=makefile \
	  --exclude=Makefile \
	  --exclude=*.pyc \
	  --exclude=TAGS \
	  --exclude=*~ \
	  --exclude=.#*
	${RMRF} ${TMPDIR}/${PACKAGE_NAME}
Portal Transforms
=================
This Zope product provides two new tools for the CMF in order to make MIME
type based transformations on the portal contents, and so an easy way to
plug in new transformations for previously unsupported content types. The
provided tools are :
* portal_transforms (the transform tool) : handles transformation of data from
one mime type to another
A bunch of ready to use transformations are also provided. Look at the
documentation for more information.
Notice this package can also be used as a standalone Python package. If
you've downloaded the Python distribution, you can't make it a Zope
product since Zope files have been removed from this distribution.
This product is an off-spring of the Archetypes project.
Installation
------------
WARNING : The two installation methods may conflict; choose the one suited to
your needs.
Zope
````
* Put this package in your Zope's Products directory and restart Zope
* either use the QuickInstaller to add this product to your CMF site or add an
external method to the root of your CMF site with the following information :
:module: PortalTransforms.Install
:method: install
and click the test tab to run it.
Python
``````
* Extract the tarball
* Run "python setup.py install". See "python setup.py install --help" for
installation options.
* That's it, you should have the library and the *transform* command line tool
installed.
Documentation
-------------
See the *docs* directory in this package.
Mailing-list
------------
Discussion about this product takes place on the archetypes mailing list :
http://sourceforge.net/mail/?group_id=75272
or on the #plone channel of irc.freenode.net.
Authors
-------
Benjamin Saller <bcsaller@yahoo.com>
Sidnei da Silva <sidnei@x3ng.com>
Sylvain Thénault <sylvain.thenault@logilab.fr>
wv
xsltproc
tidy
unrtf
ppthtml
xlhtml
gs-common
TODO list for the Portal Transforms product
-------------------------------------------
* enhance unsafe_transforms/build_transforms to provide a bunch of
transformations using command/xml with various configuration
* iencoding_classifier ?
* make more transforms :)
from Products.PortalTransforms.TransformEngine import TransformTool
"""FIXME: backward compat, remove later
"""
from Products.PortalTransforms.chain import Chain as TransformsChain
import os.path
__version__ = open(os.path.join(__path__[0], 'version.txt')).read().strip()
from Products.PortalTransforms.utils import skins_dir
from Products.PortalTransforms.TransformEngine import TransformTool
GLOBALS = globals()
PKG_NAME = 'PortalTransforms'
tools = (
TransformTool,
)
# XXX backward compatibility tricks to make old PortalTransform based Mimetypes
# running (required)
import sys
this_module = sys.modules[__name__]
from Products.MimetypesRegistry import mime_types
setattr(this_module, 'mime_types', mime_types)
from Products.MimetypesRegistry import MimeTypeItem
setattr(this_module, 'MimeTypeItem', MimeTypeItem)
from Products.MimetypesRegistry import MimeTypeItem
sys.modules['Products.PortalTransforms.zope.MimeTypeItem'] = MimeTypeItem
def initialize(context):
    from Products.CMFCore.DirectoryView import registerDirectory
    #registerDirectory(skins_dir, GLOBALS)

    from Products.CMFCore import utils
    utils.ToolInit("%s Tool" % PKG_NAME,
                   tools=tools,
                   icon="tool.gif",
                   ).initialize(context)
from Products import PortalTransforms as PRODUCT
import os.path
version=PRODUCT.__version__
modname=PRODUCT.__name__
# (major, minor, patchlevel, release info) where release info is:
# -99 for alpha, -49 for beta, -19 for rc and 0 for final
# increment the release info number by one e.g. -98 for alpha2
major, minor, bugfix = version.split('.')[:3]
bugfix, release = bugfix.split('-')[:2]
relinfo=-99 #alpha
if 'beta' in release:
    relinfo=-49
if 'rc' in release:
    relinfo=-19
if 'final' in release:
    relinfo=0
numversion = (int(major), int(minor), int(bugfix), relinfo)
license = 'BSD like'
license_text = open(os.path.join(PRODUCT.__path__[0], 'LICENSE.txt')).read()
copyright = '''Copyright (c) 2003 LOGILAB S.A. (Paris, FRANCE)'''
author = "Archetypes development team"
author_email = "archetypes-devel@lists.sourceforge.net"
short_desc = "MIME types based transformations for the CMF"
long_desc = """This package provides two new CMF tools in order to
make MIME type based transformations on the portal contents and so an
easy way to plug in new transformations for previously
unsupported content types. You will find more info in the package's
README and docs directory.
.
It's part of the Archetypes project, but the only requirement to use it
is to have a CMF based site. If you are using Archetypes, this package
replaces the transform package.
.
Notice this package can also be used as a standalone Python package. If
you've downloaded the Python distribution, you can't make it a Zope
product since Zope files have been removed from this distribution.
"""
web = "http://plone.org/products/archetypes"
ftp = ""
mailing_list = "archetypes-devel@lists.sourceforge.net"
debian_name = "zope-cmftransforms"
debian_maintainer = "Sylvain Thenault"
debian_maintainer_email = "sylvain.thenault@logilab.fr"
debian_handler = "zope"
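The version handling earlier in this module turns a string such as "1.4.0-final" into a four-part tuple whose last element encodes the release stage. As a rough illustration, the same steps can be wrapped in a standalone function, written here in plain Python 3. The name `parse_version` is hypothetical; the stage numbers are the ones given in the module's comment.

```python
def parse_version(version):
    """Turn 'major.minor.bugfix-release' into a comparable tuple.

    Release info: -99 alpha, -49 beta, -19 rc, 0 final. The release
    part is matched case-sensitively, as in the module code.
    """
    major, minor, bugfix = version.split('.')[:3]
    bugfix, release = bugfix.split('-')[:2]
    relinfo = -99  # alpha unless another marker is found
    if 'beta' in release:
        relinfo = -49
    if 'rc' in release:
        relinfo = -19
    if 'final' in release:
        relinfo = 0
    return (int(major), int(minor), int(bugfix), relinfo)


final = parse_version('1.4.0-final')
beta = parse_version('1.4.0-beta1')
candidate = parse_version('1.3.4-rc1')
```

Because the result is a plain tuple, ordinary comparison orders a beta before the matching final release. Two limits are worth keeping in mind: an upper-case tag such as "RC1" is not recognised as a release candidate and falls back to the alpha value, and a version string without a "-release" part raises a ValueError when it is unpacked.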
<configure
xmlns="http://namespaces.zope.org/five"
>
<bridge
zope2=".interfaces.idatastream"
package=".z3.interfaces"
name="IDataStream"
/>
<bridge
zope2=".interfaces.itransform"
package=".z3.interfaces"
name="ITransform"
/>
<bridge
zope2=".interfaces.ichain"
package=".z3.interfaces"
name="IChain"
/>
<bridge
zope2=".interfaces.iengine"
package=".z3.interfaces"
name="IEngine"
/>
</configure>
"""Cache
"""
from time import time
from Acquisition import aq_base
class Cache:

    def __init__(self, context, _id='_v_transform_cache'):
        self.context = context
        self._id =_id

    def _genCacheKey(self, identifier, *args):
        key = identifier
        for arg in args:
            key = '%s_%s' % (key, arg)
        key = key.replace('/', '_')
        key = key.replace('+', '_')
        key = key.replace('-', '_')
        key = key.replace(' ', '_')
        return key

    def setCache(self, key, value):
        """cache a value indexed by key"""
        if not value.isCacheable():
            return
        context = self.context
        key = self._genCacheKey(key)
        if getattr(aq_base(context), self._id, None) is None:
            setattr(context, self._id, {})
        getattr(context, self._id)[key] = (time(), value)
        return key

    def getCache(self, key):
        """try to get a cached value for key

        return None if not present
        else return a tuple (time spent in cache, value)
        """
        context = self.context
        key = self._genCacheKey(key)
        dict = getattr(context, self._id, None)
        if dict is None :
            return None
        try:
            orig_time, value = dict.get(key, None)
            return time() - orig_time, value
        except TypeError:
            # dict.get returned None: nothing cached for this key
            return None

    def purgeCache(self, key=None):
        """Remove cache
        """
        context = self.context
        id = self._id
        # shasattr is neither defined nor imported in this module; use the
        # same aq_base based check as setCache instead
        if getattr(aq_base(context), id, None) is None:
            return
        if key is None:
            delattr(context, id)
        else:
            cache = getattr(context, id)
            key = self._genCacheKey(key)
            if cache.has_key(key):
                del cache[key]
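The Cache class combines two ideas: keys are normalised so that mime types become safe identifiers, and each value is stored with the time it was cached so that a lookup can report its age. A minimal standalone sketch of both, in plain Python 3 with an ordinary dict in place of the volatile attribute on the context object, is shown below. The names `gen_cache_key` and `DictCache` are illustrative assumptions and not part of the product.

```python
from time import time


def gen_cache_key(identifier, *args):
    """Join the parts with '_' and replace '/', '+', '-' and ' ' by '_'."""
    key = identifier
    for arg in args:
        key = '%s_%s' % (key, arg)
    for char in '/+- ':
        key = key.replace(char, '_')
    return key


class DictCache:
    """Timestamped cache kept in a plain dict (hypothetical stand-in)."""

    def __init__(self):
        self._store = {}

    def set(self, key, value):
        self._store[gen_cache_key(key)] = (time(), value)

    def get(self, key):
        """Return (seconds spent in cache, value), or None if absent."""
        entry = self._store.get(gen_cache_key(key))
        if entry is None:
            return None
        stored_at, value = entry
        return time() - stored_at, value

    def purge(self, key=None):
        """Drop one entry, or the whole cache when no key is given."""
        if key is None:
            self._store.clear()
        else:
            self._store.pop(gen_cache_key(key), None)


cache = DictCache()
cache.set('text/html', '<p>hello</p>')
age, value = cache.get('text/html')
```

Returning the age rather than an expiry decision leaves it to the caller to decide how old a cached conversion may be.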
from Products.PageTemplates.PageTemplateFile import PageTemplateFile
from Globals import Persistent
from Globals import InitializeClass
from Acquisition import Implicit
from OFS.SimpleItem import Item
from AccessControl.Role import RoleManager
from AccessControl import ClassSecurityInfo
from Products.CMFCore.permissions import ManagePortal, ManageProperties
from Products.CMFCore.utils import getToolByName
from Products.PortalTransforms.utils import TransformException, _www
from Products.PortalTransforms.interfaces import ichain
from Products.PortalTransforms.interfaces import itransform
from UserList import UserList
class chain(UserList):
    """A chain of transforms used to transform data"""

    __implements__ = (ichain, itransform)

    def __init__(self, name='',*args):
        UserList.__init__(self, *args)
        self.__name__ = name
        if args:
            self._update()

    def name(self):
        return self.__name__

    def registerTransform(self, transform):
        self.append(transform)

    def unregisterTransform(self, name):
        for i in range(len(self)):
            tr = self[i]
            if tr.name() == name:
                self.pop(i)
                break
        else:
            raise Exception('No transform named %s registered' % name)

    def convert(self, orig, data, **kwargs):
        for transform in self:
            data = transform.convert(orig, data, **kwargs)
            orig = data.getData()
        md = data.getMetadata()
        md['mimetype'] = self.output
        return data

    def __setitem__(self, key, value):
        UserList.__setitem__(self, key, value)
        self._update()

    def append(self, value):
        UserList.append(self, value)
        self._update()

    def insert(self, *args):
        # the unbound UserList method needs self passed explicitly
        UserList.insert(self, *args)
        self._update()

    def remove(self, *args):
        UserList.remove(self, *args)
        self._update()

    def pop(self, *args):
        # return the removed transform, as list.pop does
        value = UserList.pop(self, *args)
        self._update()
        return value

    def _update(self):
        self.inputs = self[0].inputs
        self.output = self[-1].output
        for i in range(len(self)):
            if hasattr(self[-i-1], 'output_encoding'):
                self.output_encoding = self[-i-1].output_encoding
                break
        else:
            try:
                del self.output_encoding
            except:
                pass

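The idea behind the chain class is that a sequence of transforms behaves like a single transform: it accepts what its first member accepts, produces what its last member produces, and feeds each result into the next step. A deliberately simplified sketch in plain Python 3 is given below. The classes `Upper`, `Wrap` and `SimpleChain` and the mime type 'text/x-upper' are hypothetical, and the transforms here take and return plain strings, whereas the real ones work on a datastream object and take the original data as a separate argument.

```python
class Upper:
    """Toy transform: plain text to upper-cased text."""
    inputs = ('text/plain',)
    output = 'text/x-upper'

    def convert(self, text):
        return text.upper()


class Wrap:
    """Toy transform: wrap text in a paragraph tag."""
    inputs = ('text/x-upper',)
    output = 'text/html'

    def convert(self, text):
        return '<p>%s</p>' % text


class SimpleChain(list):
    """A list of transforms that is itself usable as a transform."""

    @property
    def inputs(self):
        # a chain accepts whatever its first transform accepts
        return self[0].inputs

    @property
    def output(self):
        # and produces whatever its last transform produces
        return self[-1].output

    def convert(self, data):
        for transform in self:
            data = transform.convert(data)
        return data


pipeline = SimpleChain([Upper(), Wrap()])
result = pipeline.convert('hi')
```

Computing `inputs` and `output` on access, as this sketch does, avoids having to refresh them after every list mutation; the class above instead recomputes them eagerly in `_update`.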
class TransformsChain(Implicit, Item, RoleManager, Persistent):
""" a transforms chain is suite of transforms to apply in order.
It follows the transform API so that a chain is itself a transform.
"""
meta_type = 'TransformsChain'
meta_types = all_meta_types = ()
manage_options = (
({'label':'Configure',
'action':'manage_main'},
{'label':'Reload',
'action':'manage_reloadTransform'},) +
Item.manage_options
)
manage_main = PageTemplateFile('editTransformsChain', _www)
manage_reloadTransform = PageTemplateFile('reloadTransform', _www)
security = ClassSecurityInfo()
def __init__(self, id, description, ids=()):
self.id = id
self.description = description
self._object_ids = list(ids)
self.inputs = ('application/octet-stream',)
self.output = 'application/octet-stream'
self._chain = None
def __setstate__(self, state):
""" __setstate__ is called whenever the instance is loaded
from the ZODB, like when Zope is restarted.
We should rebuild the chain at this time
"""
TransformsChain.inheritedAttribute('__setstate__')(self, state)
self._chain = None
def _chain_init(self):
""" build the transforms chain """
tr_tool = getToolByName(self, 'portal_transforms')
self._chain = c = chain()
for id in self._object_ids:
object = getattr(tr_tool, id)
c.registerTransform(object)
self.inputs = c.inputs or ('application/octet-stream',)
self.output = c.output or 'application/octet-stream'
security.declarePublic('convert')
def convert(self, *args, **kwargs):
""" apply the transforms chain and return the result """
if self._chain is None:
self._chain_init()
return self._chain.convert(*args, **kwargs)
security.declarePublic('name')
def name(self):
"""return the name of the transform instance"""
return self.id
security.declarePrivate('manage_beforeDelete')
def manage_beforeDelete(self, item, container):
Item.manage_beforeDelete(self, item, container)
if self is item:
# unregister self from catalog on deletion
tr_tool = getToolByName(self, 'portal_transforms')
tr_tool.unregisterTransform(self.id)
security.declareProtected(ManagePortal, 'manage_addObject')
def manage_addObject(self, id, REQUEST=None):
""" add a new transform or chain to the chain """
assert id not in self._object_ids
self._object_ids.append(id)
self._chain_init()
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
security.declareProtected(ManagePortal, 'manage_delObjects')
def manage_delObjects(self, ids, REQUEST=None):
""" remove the selected transforms from the chain """
for id in ids:
self._object_ids.remove(id)
self._chain_init()
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
# transforms order handling #
security.declareProtected(ManagePortal, 'move_object_to_position')
def move_object_to_position(self, id, newpos):
""" overridden from OrderedFolder to store ids instead of objects
"""
oldpos = self._object_ids.index(id)
if (newpos < 0 or newpos == oldpos or newpos >= len(self._object_ids)):
return 0
self._object_ids.pop(oldpos)
self._object_ids.insert(newpos, id)
self._chain_init()
return 1
security.declareProtected(ManageProperties, 'move_object_up')
def move_object_up(self, id, REQUEST=None):
""" move object with the given id up in the list """
newpos = self._object_ids.index(id) - 1
self.move_object_to_position(id, newpos)
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
security.declareProtected(ManageProperties, 'move_object_down')
def move_object_down(self, id, REQUEST=None):
""" move object with the given id down in the list """
newpos = self._object_ids.index(id) + 1
self.move_object_to_position(id, newpos)
if REQUEST is not None:
REQUEST['RESPONSE'].redirect(self.absolute_url()+'/manage_main')
# Z transform interface #
security.declareProtected(ManagePortal, 'reload')
def reload(self):
""" reload the module where the transformation class is defined """
for tr in self.objectValues():
tr.reload()
# utilities #
security.declareProtected(ManagePortal, 'listAddableObjectIds')
def listAddableObjectIds(self):
""" return a list of addable transform """
tr_tool = getToolByName(self, 'portal_transforms')
return [id for id in tr_tool.objectIds() if not (id == self.id or id in self._object_ids)]
security.declareProtected(ManagePortal, 'objectIds')
def objectIds(self):
""" return the ids of the transforms in this chain """
return tuple(self._object_ids)
security.declareProtected(ManagePortal, 'objectValues')
def objectValues(self):
""" return the transform objects in this chain """
tr_tool = getToolByName(self, 'portal_transforms')
return [getattr(tr_tool, id) for id in self.objectIds()]
InitializeClass(TransformsChain)
from Products.PortalTransforms.interfaces import idatastream
class datastream:
    """A transformation datastream packet"""

    __implements__ = idatastream
    __slots__ = ('name', '_data', '_metadata')

    def __init__(self, name):
        self.__name__ = name
        self._data = ''
        self._metadata = {}
        self._objects = {}
        self._cacheable = 1

    def __str__(self):
        return self.getData()

    def name(self):
        return self.__name__

    def setData(self, value):
        """set the main data produced by a transform, i.e. usually a string"""
        self._data = value

    def getData(self):
        """provide access to the transformed data object, i.e. usually a string.
        This data may reference subobjects.
        """
        if callable(self._data):
            data = self._data()
        else:
            data = self._data
        return data

    def setSubObjects(self, objects):
        """set a dict-like object containing subobjects.
        keys should be object's identifier (e.g. usually a filename) and
        values object's content.
        """
        self._objects = objects

    def getSubObjects(self):
        """return a dict-like object with any optional subobjects associated
        with the data"""
        return self._objects

    def getMetadata(self):
        """return a dict-like object with any optional metadata from
        the transform"""
        return self._metadata

    def isCacheable(self):
        """Return a bool which indicates whether the result should be cached
        Default is true
        """
        return self._cacheable

    def setCacheable(self, value):
        """Set cacheable flag to yes or no
        """
        self._cacheable = not not value

    #data = property('getData', 'setData', None, """idata.data""")
    #metadata = property('getMetadata', 'setMetadata', None,
    #"""idata.metadata""")
===================================
Portal Transforms' Developer manual
===================================
:Author: Sylvain Thenault
:Contact: syt@logilab.fr
:Date: $Date: 2005-08-19 23:43:41 +0200 (Fre, 19 Aug 2005) $
:Version: $Revision: 1.5 $
:Web site: http://sourceforge.net/projects/archetypes
.. contents::
Tools interfaces
----------------
The MIME types registry
```````````````````````
::

    class isourceAdapter(Interface):

        def __call__(data, **kwargs):
            """convert data to unicode, may take optional kwargs to aid in conversion"""

    class imimetypes_registry(isourceAdapter):

        def classify(data, mimetype=None, filename=None):
            """return a content type for this data or None
            None should rarely be returned as application/octet-stream can be
            used to represent most types.
            """

        def lookup(mimetypestring):
            """Lookup for imimetypes object matching mimetypestring.
            mimetypestring may have an empty minor part or contain a wildcard (*).
            mimetypestring may be an imimetype object (in this case it will be
            returned unchanged), else it should be a RFC-2046 name.
            Return a list of mimetypes objects associated with the RFC-2046 name.
            Return an empty list if none is known.
            """

        def lookupExtension(filename):
            """ return the mimetypes object associated with the file's extension
            or None if it is not known.
            filename may be a file name like 'content.txt' or an extension like 'rest'
            """

        def mimetypes():
            """return all defined mime types, each one implements at least imimetype
            """

        def list_mimetypes():
            """return all defined mime types, as string"""

The transformation tool
```````````````````````
::

    class iengine(Interface):

        def registerTransform(transform):
            """register a transform
            transform must implement itransform
            """

        def unregisterTransform(name):
            """ unregister a transform
            name is the name of a registered transform
            """

        def convertTo(mimetype, orig, idata=None, **kwargs):
            """Convert orig to a given mimetype
            return an object implementing idatastream or None if no path has been
            found
            """

        def convert(name, orig, idata=None, **kwargs):
            """run a transform of a given name on data
            name is the name of a registered transform
            return an object implementing idatastream
            """

        def __call__(name, orig, idata=None, **kwargs):
            """run a transform returning the raw data product
            name is the name of a registered transform
            return an encoded string
            """

Writing a new transformation
----------------------------
Writing a new transform should be an easy task: you only have to implement a
simple interface. Knowing a few advanced features and the provided utilities
will help you do it faster.
Related interfaces
``````````````````
::

    class itransform(Interface):
        """A transformation plugin -- transform data somehow
        must be threadsafe and stateless"""

        inputs = Attribute("""list of imimetypes (or registered rfc-2046
        names) this transform accepts as inputs""")
        output = Attribute("output imimetype as instance or rfc-2046 name")

        def name():
            """return the name of the transform instance"""

        def convert(data, idata, **kwargs):
            """convert the data, store the result in idata and return that"""

    class idatastream(Interface):
        """data stream, is the result of a transform"""

        def setData(value):
            """set the main data produced by a transform, i.e. usually a string"""

        def getData():
            """provide access to the transformed data object, i.e. usually a string.
            This data may reference subobjects.
            """

        def setSubObjects(objects):
            """set a dict-like object containing subobjects.
            keys should be object's identifier (e.g. usually a filename) and
            values object's content.
            """

        def getSubObjects():
            """return a dict-like object with any optional subobjects associated
            with the data"""

        def getMetadata():
            """return a dict-like object with any optional metadata from
            the transform"""

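
As a rough illustration of the two interfaces above, here is a minimal sketch of a transform. ``SimpleDataStream`` and ``UpperCase`` are hypothetical names invented for this example; in the real product the engine hands a datastream object to the transform, so the stand-in class below is only an assumption that lets the sketch run on its own.

```python
class SimpleDataStream:
    """Hypothetical stand-in for an object implementing idatastream."""

    def __init__(self):
        self._data = ''
        self._metadata = {}

    def setData(self, value):
        self._data = value

    def getData(self):
        return self._data

    def getMetadata(self):
        return self._metadata


class UpperCase:
    """Toy transform (not part of the package): upper-cases plain text."""

    inputs = ('text/plain',)
    output = 'text/plain'

    def name(self):
        return 'upper_case'

    def convert(self, data, idata, **kwargs):
        # store the result in the datastream and return the datastream
        idata.setData(data.upper())
        return idata


result = UpperCase().convert('hello', SimpleDataStream())
print(result.getData())
```

The transform keeps no state between calls, which is what the "threadsafe and stateless" requirement asks for.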
Important note about encoding
`````````````````````````````
A transform receives data as an encoded string. A priori, no assumption can be
made about the encoding used. Data returned by a transform must use the same
encoding as the received data, unless the transform provides an
*output_encoding* attribute indicating the output encoding (for instance this
may be useful for XSLT based transforms).
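
The rule above can be sketched as a small helper. ``result_encoding`` and the two toy classes are hypothetical names used only for illustration; they are not part of the package.

```python
def result_encoding(transform, input_encoding):
    """Encoding of the data returned by `transform`.

    Falls back to the encoding of the received data when the transform
    does not declare an output_encoding attribute.
    """
    return getattr(transform, 'output_encoding', input_encoding)


class KeepsEncoding:
    """Toy transform that returns data in the encoding it received."""


class AlwaysUtf8:
    """Toy transform that always produces UTF-8, e.g. an XSLT based one."""
    output_encoding = 'utf-8'


same = result_encoding(KeepsEncoding(), 'iso-8859-1')
forced = result_encoding(AlwaysUtf8(), 'iso-8859-1')
```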
Configurable transformation
```````````````````````````
You can make your transformation configurable through the ZMI by setting a
*config* dictionary on your transform instance or class. Keys are parameter
names and values are parameter values. Another dictionary, *config_metadata*,
describes each parameter. In this mapping, keys are also parameter names but
values are 3-tuples: (<parameter's type>, <parameter's label>, <parameter's
description>).

Possible types for parameters are:

:int: field is an integer
:string: field is a string
:list: field is a list
:dict: field is a dictionary

You can look at the **command** and **xml** transforms for an example of a
configurable transform.
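
A minimal sketch of the two dictionaries, assuming a hypothetical ``WrapText`` transform with two parameters. The ``check_config`` helper is also invented for this example; it only verifies the shape described above and is not something the package provides.

```python
class WrapText:
    """Hypothetical transform used only to illustrate the config layout."""

    inputs = ('text/plain',)
    output = 'text/plain'

    # parameter name -> current value
    config = {
        'width': 72,
        'prefix': '',
    }

    # parameter name -> (type, label, description)
    config_metadata = {
        'width': ('int', 'Line width',
                  'Maximum number of characters per line'),
        'prefix': ('string', 'Prefix',
                   'Text inserted at the start of each line'),
    }


ALLOWED_TYPES = ('int', 'string', 'list', 'dict')


def check_config(transform):
    """Return True if every parameter is described by a valid 3-tuple."""
    for key in transform.config:
        meta = transform.config_metadata.get(key)
        if meta is None or len(meta) != 3 or meta[0] not in ALLOWED_TYPES:
            return False
    return True


print(check_config(WrapText))
```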
Images / sub-objects management
````````````````````````````````
A transformation may produce some sub-objects, for instance images when you
convert a PDF document to HTML. That's the purpose of the setSubObjects method
of the idatastream interface.
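
As a sketch of how a transform might hand back sub-objects, the following uses a hypothetical ``SimpleDataStream`` stand-in and an invented ``convert_with_image`` function; the image bytes are a placeholder, not real image data.

```python
class SimpleDataStream:
    """Hypothetical stand-in for an object implementing idatastream."""

    def __init__(self):
        self._data = ''
        self._objects = {}

    def setData(self, value):
        self._data = value

    def getData(self):
        return self._data

    def setSubObjects(self, objects):
        self._objects = objects

    def getSubObjects(self):
        return self._objects


def convert_with_image(data, idata):
    """Pretend conversion that yields HTML plus one image sub-object."""
    idata.setData('<p>%s</p><img src="page1.png" />' % data)
    # keys are identifiers (usually file names), values are the content
    idata.setSubObjects({'page1.png': b'placeholder image bytes'})
    return idata


result = convert_with_image('hello', SimpleDataStream())
print(sorted(result.getSubObjects()))
```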
Some utilities
``````````````
Transform utilities may be found in the libtransforms subpackage. You'll find
the following modules there:

*commandtransform*
    provides a base class for external command based transforms.

*retransform*
    provides a base class for regular expression based transforms.

*html4zope*
    provides a docutils HTML writer for Zope.

*utils*
    provides some utility functions.

Write a test for your transform!
````````````````````````````````
Every transform should have its test, and it is easy to write one. Imagine you
have made a transform named "colabeer" which transforms cola into beer (I let
you find the MIME types for these content types ;). Basically, your test file
should be::

    from unittest import TestSuite, makeSuite, main
    from test_transforms import make_tests

    tests = ('Products.MyTransforms.colabeer', "drink.cola", "drink.beer", None, 0)

    def test_suite():
        return TestSuite([makeSuite(test) for test in make_tests()])

    if __name__=='__main__':
        main(defaultTest='test_suite')

In this example:

- "Products.MyTransforms.colabeer" is the module defining your transform (you
  can also give the transform instance directly).

- "drink.cola" is the name of the file containing the data to give to your
  transform as input.

- "drink.beer" is the file containing the expected transform result (what the
  getData method of idatastream will return).

- Additional arguments (*None* and *0*) are respectively an optional
  normalizing function to apply to both the transform result and the output
  file content, and the number of subobjects that the transform is expected to
  produce.

This example supposes your test is in the *tests* directory of PortalTransforms
and your input and output files are respectively in *tests/input* and
*tests/output*.
REST2HTML=html.py --compact-lists --date --generator

all: user_manual.html dev_manual.html

user_manual.html: user_manual.rst
	$(REST2HTML) user_manual.rst user_manual.html

dev_manual.html: dev_manual.rst
	$(REST2HTML) dev_manual.rst dev_manual.html

clean:
	rm *.html
===================================
How to set up PyUNO for Zope
===================================
:Author: Junyong Pan <panjy at zopechina.com>, Anton Stonor <stonor@giraffen.dk>
:Date: $Date: 2003-08-12 02:50:50 -0800 (Tue, 12 Aug 2003) $
:Version: $Revision: 1.5 $
(to be refined)
Portal Transforms allows you to convert Word documents to HTML. A cool feature
if you want to preview Word documents on your web site or use Word as a web
authoring tool.

To do the actual transform, Portal Transforms relies on a third party
application to do the heavy lifting. If you have not installed such an
application, Portal Transforms will not perform Word to HTML transforms.

One of the options is Open Office. It is not the easiest application to set up
to work with Portal Transforms, but it works on both Windows and Unix and
delivers fairly good HTML.

Problems
====================
- PyUNO is cool, but it currently ships with its own Python interpreter, which
  is not compatible with Zope's.
- PyUNO is not threadsafe yet.

SETTING UP OPEN OFFICE ON WINDOWS
=======================================
WARNING: You can set up PyUNO, but you can't use it concurrently. See
'Install oood'.

1) Install Open Office 2.0

   Just run the standard installer. The PyUNO in this version ships with
   Python 2.3, which is compatible with Zope 2.7.

2) Set the environment PATH

   Add the Open Office program dir to the Windows PATH, e.g.::

       C:\Program Files\OpenOffice.org 1.9.82\program

   See this article on how to set the Windows PATH:
   http://vlaurie.com/computers2/Articles/environment.htm

   You can also look at the python.bat (located in your Open Office program
   dir) for inspiration.

3) Set the PYTHONPATH

   You need to add these directories to the PYTHONPATH:

   a) The Open Office program dir (e.g.
      ``C:\Program Files\OpenOffice.org 1.9.82\program``)
   b) The Open Office python lib dir (e.g.
      ``C:\Program Files\OpenOffice.org 1.9.82\program\python-core-2.3.4\lib``)

   From the Windows system shell, set both directories in a single command,
   separated by a semicolon (a second ``set`` would overwrite the first one),
   e.g.::

       set PYTHONPATH=C:\Program Files\OpenOffice.org 1.9.82\program;C:\Program Files\OpenOffice.org 1.9.82\program\python-core-2.3.4\lib

   You can also look at the python.bat (located in your Open Office program
   dir) for inspiration.

4) Start Open Office as UNO server

   Run::

       soffice "-accept=socket,host=localhost,port=2002;urp;"

5) Now it should work

For Debian Linux Users
=========================
see: http://bibus-biblio.sourceforge.net/html/en/LinuxInstall.html
1. install version 1.1, which doesn't contain pyuno::

       apt-get install openoffice

2. install a version of pyuno that enables UCS-4 Unicode

   - you can download it at http://sourceforge.net/projects/bibus-biblio/
   - copy it to /usr/lib/openoffice/program

3. set up environment variables::

       OPENOFFICE_PATH="/usr/lib/openoffice/program"
       export PYTHONPATH="$OPENOFFICE_PATH"
       export LD_LIBRARY_PATH="$OPENOFFICE_PATH"

Install oood
===================
Note: this product is for Linux only.
http://udk.openoffice.org/python/oood/
UNDERSTANDING OPEN OFFICE AND UNO
=============================================
Open Office allows programmers to control it remotely. Portal Transforms takes
advantage of this by scripting Open Office from Python. This is possible
through PyUNO, which exposes the Open Office API in Python.

You can't download and install PyUNO as a module for the Python interpreter
that is running your Zope server. PyUNO only comes bundled with Open Office and
the Python that is distributed with Open Office. To make PyUNO work from within
your standard Python you must extend the PYTHONPATH as done above, so that
Python will also look inside Open Office for modules. If it works you should be
able to start up a Python shell and do::

    >>> import uno

In some cases you can be unlucky and the Python used for Zope is not in sync
with the Python that is distributed with Open Office. That is solved by
rebuilding Python -- a task that is beyond the scope of this guide.
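
If you would rather check this from a script than from an interactive shell, a small helper along these lines may do. ``has_module`` is a hypothetical name, and the sketch only tests whether the module can be imported from the current PYTHONPATH; it says nothing about whether Open Office is actually running.

```python
import importlib


def has_module(name):
    """Return True if the named module can be imported."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


if has_module('uno'):
    print('PyUNO is importable')
else:
    print('PyUNO not found, check PYTHONPATH')
```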
==============================
Portal Transforms' User manual
==============================
:Author: Sylvain Thénault
:Contact: syt@logilab.fr
:Date: $Date: 2005-08-19 23:43:41 +0200 (Fre, 19 Aug 2005) $
:Version: $Revision: 1.7 $
:Web site: http://sourceforge.net/projects/archetypes
.. contents::
What does this package provide ?
================================
This package is both a Python library for MIME type based content
transformation, including a command line tool (what I call the Python package),
and a Zope product providing two new tools for the CMF (what I call the Zope
product). A Python-only distribution will be available, which won't include the
Zope specific files.
Python side
===========
The *transform* command line tool
`````````````````````````````````
::

    command line tool for MIME type based transformation

    USAGE: transform [OPTIONS] input_file output_file

    OPTIONS:
     -h / --help
         display this help message and exit.

     -o / --output <output mime type>
         output MIME type. (conflict with --transform)

     -t / --transform <transform id>
         id of the transform to apply. (conflict with --output)

    EXAMPLE:
     $ transform -o text/html dev_manual.rst dev_manual.html

Customization hook
``````````````````
You can customize the transformation engine by providing a module
"transform_customize" somewhere in your Python path. The module must provide an
*initialize* function which takes the engine as its only argument. This
function is responsible for initializing the engine with the desired
transformations. When the module is not found, the *initialize* function from
the *transforms* subpackage is used.
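
A minimal sketch of such a module follows. The ``Identity`` transform and the ``FakeEngine`` class are hypothetical and exist only so the sketch runs on its own; the only call taken from the package is ``registerTransform``, as declared by the engine interface.

```python
# contents of a hypothetical transform_customize.py

class Identity:
    """Toy transform that returns its input unchanged."""

    inputs = ('text/plain',)
    output = 'text/plain'

    def name(self):
        return 'identity'

    def convert(self, data, idata, **kwargs):
        idata.setData(data)
        return idata


def initialize(engine):
    """Hook called with the engine: register the transforms we want."""
    engine.registerTransform(Identity())


# Stand-in engine, only for trying the hook outside the package.
class FakeEngine:
    def __init__(self):
        self.names = []

    def registerTransform(self, transform):
        self.names.append(transform.name())


engine = FakeEngine()
initialize(engine)
print(engine.names)
```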
Zope side
=========
The MIME types registry
```````````````````````
This tool registers known MIME types. The information associated with a MIME
type is:

* a title

* a list of rfc-2046 types

* a list of file extensions

* a binary flag

* an icon path

You can see registered types by going to the *mimetypes_registry* object at the
root of your CMF site, using the ZMI. There you can modify existing information
or add / delete types. This product comes with a default set of MIME type icons
located in portal_skins/mimetypes_icons.

The transformation tool
```````````````````````
It's a MIME type based transformation engine. It has been designed to
transform portal content from a given MIME type to another. You can add / delete
transformations by going to the *portal_transforms* object at the root of your
CMF site, using the ZMI. Some transformations are configurable, but not all. A
transform is a Python object implementing a special interface. See the
developer documentation if you're interested in writing a new
transformation.
Archetypes integration
``````````````````````
Archetypes will use this product for automatic transformation if you have
configured it to use the new base unit (set USE_NEW_BASEUNIT to 1 in the
Archetypes config.py). If you're using the old base unit (still default in 1.0),
the transform tool won't be used (at least by the Archetypes library).
Default transformations
=======================
The default transformations are described here. They are separated in two groups,
safe and unsafe. Safe transforms are located in the *transforms* directory of this
product. Unsafe transforms are located in the *unsafe_transforms* directory and
are not registered by default. Moreover, there is no ``__init__.py`` file in this
directory, so it requires a manual intervention to make them addable to the
transforms tool. Unsafe transforms are usually so called because they allow
configuration of a path to a binary executable on the server, which may be
undesirable for Zope service providers.
Safe transforms
```````````````
*st*
    transforms Structured Text to HTML. Not configurable.

*rest*
    transforms reStructuredText to HTML. You need docutils to use this
    transformation. Not configurable.

*word_to_html*
    transforms Microsoft Word files to HTML, using either COM (on Windows),
    wvWare or PyUNO (from OpenOffice.org). Not configurable.

*pdf_to_html*
    transforms Adobe PDF to HTML. This transform requires the "pdftohtml"
    program. Not configurable.

*lynx_dump*
    transforms HTML to plain text. This transform requires the "lynx"
    program. Not configurable.

*python*
    transforms Python source code to colorized HTML. You can configure the
    colors used.

*text_to_html*
    transforms plain text files to HTML by replacing new lines with
    ``<br />``. You can configure allowable inputs for this transform.

*rest_to_text*
    This is an example use of the *identity* transform, which does
    basically nothing :). It's used here to transform ReST files
    (text/x-rst) to text/plain. You can configure allowable inputs and
    output on this transform.

Unsafe transforms
`````````````````
*command*
    this is a fully configurable transform based on external commands. For
    instance, you can obtain the same transformation as the previous
    *lynx_dump*:

    1. add a new transform named "lynx_dump" with
       "Products.PortalTransforms.unsafe_transforms.command" as module
       (this supposes that you've added an ``__init__.py`` file to the
       unsafe_transforms directory to make them importable).

    2. go to the configure tab of this transform and set the following
       parameters:

       :binary_path: '/usr/bin/lynx'
       :command_line: '-dump %s'
       :input: 'text/html'
       :output: 'text/plain'

*xml*
    this transform has been designed to handle XML files on a doctype / DTD
    basis. All the real transformation work is done by an XSLT processor. This
    transform only associates XSLT stylesheets with doctypes or DTDs, and
    gives the correct stylesheet to the processor when some content has to be
    transformed.

    FIXME: add an example on how to setup docbook transform.

Advanced features
=================
Transformation chains
`````````````````````
A transformation chain is an ordered suite of transformations. A chain
is itself a transformation. You can build a transformation chain
using the ZMI.
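
The idea of a chain can be sketched in a few lines: each transform's output becomes the next transform's input. The classes below (``Stream``, ``Strip``, ``Upper``) and the ``run_chain`` function are hypothetical stand-ins written for this example, not the product's own classes.

```python
class Stream:
    """Hypothetical stand-in for a datastream."""

    def __init__(self):
        self._data = ''

    def setData(self, value):
        self._data = value

    def getData(self):
        return self._data


class Strip:
    def convert(self, orig, data, **kwargs):
        data.setData(orig.strip())
        return data


class Upper:
    def convert(self, orig, data, **kwargs):
        data.setData(orig.upper())
        return data


def run_chain(transforms, orig, data):
    """Apply the transforms in order, feeding each result to the next."""
    for transform in transforms:
        data = transform.convert(orig, data)
        orig = data.getData()
    return data


result = run_chain([Strip(), Upper()], '  hello  ', Stream())
print(result.getData())
```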
Transformation policy
`````````````````````
You can set simple transformation policies for the transforms
tool. A policy says that when you try to convert content to a given
MIME type, you have to include a given transformation. For instance,
imagine you have an *html_tidy* transformation which tidies HTML pages: you
can say that the transformation path to text/html should include the
*html_tidy* transform.
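
One way to picture a policy, purely as a sketch: among the candidate transformation paths, keep the first one that contains every transform required for the target MIME type. ``choose_path``, the ``policy`` mapping and the path contents are assumptions made for this example and do not describe how the tool actually searches for a path.

```python
def choose_path(paths, required):
    """Return the first path containing all required transform names."""
    for path in paths:
        if all(name in path for name in required):
            return path
    return None


# candidate paths to text/html, as lists of transform names
paths = [
    ['rest_to_html'],
    ['rest_to_html', 'html_tidy'],
]

# policy: target MIME type -> transforms that must be part of the path
policy = {'text/html': ['html_tidy']}

chosen = choose_path(paths, policy['text/html'])
print(chosen)
```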
Caches
``````
For efficiency, transformation results are cached. You can set the
lifetime of a cached result using the ZMI. This is a time expressed in
seconds.
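
The effect of the lifetime setting can be illustrated with a small time-based cache. ``TimedCache`` is a hypothetical class written for this example and is not the cache used by the product; the ``clock`` argument is only there so the behaviour can be tried without waiting.

```python
import time


class TimedCache:
    """Keep values for at most `lifetime` seconds."""

    def __init__(self, lifetime, clock=time.time):
        self.lifetime = lifetime
        self.clock = clock
        self._entries = {}

    def set(self, key, value):
        # remember when the value was stored
        self._entries[key] = (self.clock(), value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.lifetime:
            # too old: drop it so the transform runs again
            del self._entries[key]
            return None
        return value


now = [0]
cache = TimedCache(10, clock=lambda: now[0])
cache.set('doc-1:text/html', '<p>hello</p>')
fresh = cache.get('doc-1:text/html')
now[0] = 11
stale = cache.get('doc-1:text/html')
```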
<configure
xmlns="http://namespaces.zope.org/five"
>
<implements
class=".chain.chain"
interface=".z3.interfaces.IChain"
/>
<implements
class=".chain.chain"
interface=".z3.interfaces.ITransform"
/>
<implements
class=".data.datastream"
interface=".z3.interfaces.IDataStream"
/>
<implements
class=".Transform.Transform"
interface=".z3.interfaces.ITransform"
/>
<implements
class=".TransformEngine.TransformTool"
interface=".z3.interfaces.IEngine"
/>
<implements
class=".libtransforms.commandtransform.commandtransform"
interface=".z3.interfaces.ITransform"
/>
<implements
class=".libtransforms.commandtransform.popentransform"
interface=".z3.interfaces.ITransform"
/>
<implements
class=".libtransforms.retransform.retransform"
interface=".z3.interfaces.ITransform"
/>
<!-- TODO: more -->
</configure>
from Interface import Interface, Attribute
class idatastream(Interface):
"""data stream, is the result of a transform"""
def setData(value):
"""set the main data produced by a transform, i.e. usually a string"""
def getData():
"""provide access to the transformed data object, i.e. usually a string.
This data may reference subobjects.
"""
def setSubObjects(objects):
"""set a dict-like object containing subobjects.
keys should be object's identifier (e.g. usually a filename) and
values object's content.
"""
def getSubObjects():
"""return a dict-like object with any optional subobjects associated
with the data"""
def getMetadata():
"""return a dict-like object with any optional metadata from
the transform
You can modify the returned dictionnary to add/change metadata
"""
def isCacheable():
"""Return a bool which indicates whether the result should be cached
Default is true
"""
def setCacheable(value):
"""Set cacheable flag to yes or no
"""
class itransform(Interface):
"""A transformation plugin -- transform data somehow
must be threadsafe and stateless"""
# inputs = Attribute("""list of imimetypes (or registered rfc-2046
# names) this transform accepts as inputs.""")
# output = Attribute("""output imimetype as instance or rfc-2046
# name""")
# output_encoding = Attribute("""output encoding of this transform.
# If not specified, the transform should output the same encoding as received data
# """)
def name():
"""return the name of the transform instance"""
def convert(data, idata, filename=None, **kwargs):
"""convert the data, store the result in idata and return that
optional argument filename may give the original file name of received data
additional arguments given to engine's convert, convertTo or __call__ are
passed back to the transform
The object on which the translation was invoked is available as context
(default: None)
"""
class ichain(itransform):
def registerTransform(transform, condition=None):
"""Append a transform to the chain"""
class iengine(Interface):
def registerTransform(transform):
"""register a transform
transform must implements itransform
"""
def unregisterTransform(name):
""" unregister a transform
name is the name of a registered transform
"""
def convertTo(mimetype, orig, data=None, object=None, context=None, **kwargs):
"""Convert orig to a given mimetype
* orig is an encoded string
* data an optional idatastream object. If None a new datastream will be
created and returned
* optional object argument is the object on which is bound the data.
If present that object will be used by the engine to bound cached data.
* optional context argument is the object on which the transformation
was called.
* additional arguments (kwargs) will be passed to the transformations.
return an object implementing idatastream or None if no path has been
found.
"""
def convert(name, orig, data=None, context=None, **kwargs):
"""run a transform of a given name on data
* name is the name of a registered transform
see convertTo docstring for more info
"""
def __call__(name, orig, data=None, context=None, **kwargs):
"""run a transform by its name, returning the raw data product
* name is the name of a registered transform.
return an encoded string.
see convert docstring for more info on additional arguments.
"""
""" package containing some utilities which aim to facilitate transformation writing
"""
import os
import sys
import tempfile
import re
import shutil
from os.path import join, basename

from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.libtransforms.utils import bin_search, sansext, getShortPathName

class commandtransform:
    """abstract class for external command based transform
    """
    __implements__ = itransform

    def __init__(self, name=None, binary=None, **kwargs):
        if name is not None:
            self.__name__ = name
        if binary is not None:
            self.binary = bin_search(binary)
            self.binary = getShortPathName(self.binary)

    def name(self):
        return self.__name__

    def initialize_tmpdir(self, data, **kwargs):
        """create a temporary directory, copy input in a file there
        return the path of the tmp dir and of the input file
        """
        # mkdtemp creates the directory itself, avoiding the race
        # between tempfile.mktemp() and os.mkdir()
        tmpdir = tempfile.mkdtemp()
        filename = kwargs.get("filename", '')
        fullname = join(tmpdir, basename(filename))
        # close the file explicitly so the external command sees all data
        filedest = open(fullname, "wb")
        filedest.write(data)
        filedest.close()
        return tmpdir, fullname

    def subObjects(self, tmpdir):
        imgs = []
        for f in os.listdir(tmpdir):
            result = re.match(r"^.+\.(?P<ext>.+)$", f)
            if result is not None:
                ext = result.group('ext')
                if ext in ('png', 'jpg', 'gif'):
                    imgs.append(f)
        path = join(tmpdir, '')
        return path, imgs

    def fixImages(self, path, images, objects):
        for image in images:
            objects[image] = open(join(path, image), 'rb').read()

    def cleanDir(self, tmpdir):
        shutil.rmtree(tmpdir)

class popentransform:
    """abstract class for external command based transform
    Command must read from stdin and write to stdout
    """
    __implements__ = itransform

    binaryName = ""
    binaryArgs = ""
    useStdin = True

    def __init__(self, name=None, binary=None, binaryArgs=None, useStdin=None,
                 **kwargs):
        if name is not None:
            self.__name__ = name
        if binary is not None:
            self.binary = bin_search(binary)
        else:
            self.binary = bin_search(self.binaryName)
        if binaryArgs is not None:
            self.binaryArgs = binaryArgs
        if useStdin is not None:
            self.useStdin = useStdin

    def name(self):
        return self.__name__

    def getData(self, couterr):
        return couterr.read()

    def convert(self, data, cache, **kwargs):
        command = "%s %s" % (self.binary, self.binaryArgs)
        if not self.useStdin:
            tmpfile, tmpname = tempfile.mkstemp(text=False) # create tmp
            os.write(tmpfile, data) # write data to tmp using a file descriptor
            os.close(tmpfile)       # close it so the other process can read it
            command = command % { 'infile' : tmpname } # apply tmp name to command

        cin, couterr = os.popen4(command, 'b')

        if self.useStdin:
            cin.write(str(data))

        status = cin.close()

        out = self.getData(couterr)
        couterr.close()

        if not self.useStdin:
            # remove tmp file
            os.unlink(tmpname)

        cache.setData(out)
        return cache
from Products.PortalTransforms.interfaces import itransform
from StringIO import StringIO
import PIL.Image
class PILTransforms:
__implements__ = itransform
__name__ = "piltransforms"
def __init__(self, name=None):
if name is not None:
self.__name__ = name
def name(self):
return self.__name__
def convert(self, orig, data, **kwargs):
imgio = StringIO()
orig = StringIO(orig)
newwidth = kwargs.get('width',None)
newheight = kwargs.get('height',None)
pil_img = PIL.Image.open(orig)
if(self.format in ['jpeg','ppm']):
pil_img.draft("RGB", pil_img.size)
pil_img = pil_img.convert("RGB")
if(newwidth or newheight):
pil_img.thumbnail((newwidth,newheight),PIL.Image.ANTIALIAS)
pil_img.save(imgio,self.format)
data.setData(imgio.getvalue())
return data
def register():
return PILTransforms()
from Products.PortalTransforms.interfaces import itransform
import re
class retransform:
"""abstract class for regex transforms (re.sub wrapper)"""
__implements__ = itransform
inputs = ('text/',)
def __init__(self, name, *args):
self.__name__ = name
self.regexes = []
for pat, repl in args:
self.addRegex(pat, repl)
def name(self):
return self.__name__
def addRegex(self, pat, repl):
r = re.compile(pat)
self.regexes.append((r, repl))
def convert(self, orig, data, **kwargs):
for r, repl in self.regexes:
orig = r.sub(repl, orig)
data.setData(orig)
return data
import re
import os
import sys
from sgmllib import SGMLParser
try:
    import win32api
    WIN32 = True
except ImportError:
    WIN32 = False

class MissingBinary(Exception): pass

envPath = os.environ['PATH']
bin_search_path = [path for path in envPath.split(os.pathsep)
                   if os.path.isdir(path)]

cygwin = 'c:/cygwin'
# cygwin support
if sys.platform == 'win32' and os.path.isdir(cygwin):
    for p in ['/bin', '/usr/bin', '/usr/local/bin' ]:
        p = os.path.join(cygwin, p)
        if os.path.isdir(p):
            bin_search_path.append(p)

if sys.platform == 'win32':
    extensions = ('.exe', '.com', '.bat', )
else:
    extensions = ()

def bin_search(binary):
    """search the bin_search_path for a given binary returning its fullname or
    raises MissingBinary"""
    result = None
    mode = os.R_OK | os.X_OK
    for path in bin_search_path:
        for ext in ('', ) + extensions:
            pathbin = os.path.join(path, binary) + ext
            if os.access(pathbin, mode) == 1:
                result = pathbin
                break

    if not result:
        raise MissingBinary('Unable to find binary "%s" in %s' %
                            (binary, os.pathsep.join(bin_search_path)))
    else:
        return result
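In `bin_search`, the `break` only leaves the inner extension loop, so the scan continues through the remaining directories. A minimal sketch of a variant that returns on the first hit, so that earlier entries of the search path take precedence, assuming Python 3. The name `find_binary`, its explicit `search_path` argument and the extra `os.path.isfile` check are assumptions made for this illustration, not part of the module above.

```python
import os


class MissingBinary(Exception):
    pass


def find_binary(binary, search_path, extensions=()):
    """Return the first readable, executable file named `binary`
    (optionally with one of `extensions`) found in `search_path`."""
    mode = os.R_OK | os.X_OK
    for path in search_path:
        for ext in ('',) + tuple(extensions):
            candidate = os.path.join(path, binary) + ext
            # isfile() keeps a directory with the same name from matching
            if os.path.isfile(candidate) and os.access(candidate, mode):
                return candidate  # first hit wins
    raise MissingBinary('Unable to find binary "%s" in %s'
                        % (binary, os.pathsep.join(search_path)))
```

Passing the search path as an argument also makes the lookup easy to exercise against a temporary directory.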

def getShortPathName(binary):
    if WIN32:
        try:
            binary = win32api.GetShortPathName(binary)
        except win32api.error:
            log("Failed to GetShortPathName for '%s'" % binary)
    return binary

def sansext(path):
    return os.path.splitext(os.path.basename(path))[0]

##########################################################################
# The code below is taken from CMFDefault.utils to remove
# dependencies for Python-only installations
##########################################################################
def bodyfinder(text):
    """ Return body or unchanged text if no body tags found.
    Always use html_headcheck() first.
    """
    lowertext = text.lower()
    bodystart = lowertext.find('<body')
    if bodystart == -1:
        return text
    bodystart = lowertext.find('>', bodystart) + 1
    if bodystart == 0:
        return text
    bodyend = lowertext.rfind('</body>', bodystart)
    if bodyend == -1:
        return text
    return text[bodystart:bodyend]

#
# HTML cleaning code
#
# These are the HTML tags that we will leave intact
VALID_TAGS = { 'a' : 1
, 'b' : 1
, 'base' : 0
, 'blockquote' : 1
, 'body' : 1
, 'br' : 0
, 'caption' : 1
, 'cite' : 1
, 'code' : 1
, 'div' : 1
, 'dl' : 1
, 'dt' : 1
, 'dd' : 1
, 'em' : 1
, 'h1' : 1
, 'h2' : 1
, 'h3' : 1
, 'h4' : 1
, 'h5' : 1
, 'h6' : 1
, 'head' : 1
, 'hr' : 0
, 'html' : 1
, 'i' : 1
, 'img' : 0
, 'kbd' : 1
, 'li' : 1
, 'meta' : 0
, 'ol' : 1
, 'p' : 1
, 'pre' : 1
, 'span' : 1
, 'strong' : 1
, 'table' : 1
, 'tbody' : 1
, 'td' : 1
, 'th' : 1
, 'title' : 1
, 'tr' : 1
, 'tt' : 1
, 'u' : 1
, 'ul' : 1
}
NASTY_TAGS = { 'script' : 1
, 'object' : 1
, 'embed' : 1
, 'applet' : 1
}
class IllegalHTML( ValueError ):
    pass

class StrippingParser( SGMLParser ):
    """ Pass only allowed tags; raise exception for known-bad. """

    from htmlentitydefs import entitydefs # replace entitydefs from sgmllib

    def __init__( self ):
        SGMLParser.__init__( self )
        self.result = ""

    def handle_data( self, data ):
        if data:
            self.result = self.result + data

    def handle_charref( self, name ):
        self.result = "%s&#%s;" % ( self.result, name )

    def handle_entityref(self, name):
        if self.entitydefs.has_key(name):
            x = ';'
        else:
            # this breaks non-standard entities that end with ';'
            x = ''
        self.result = "%s&%s%s" % (self.result, name, x)

    def unknown_starttag(self, tag, attrs):
        """ Delete all tags except for legal ones.
        """
        if VALID_TAGS.has_key(tag):
            self.result = self.result + '<' + tag
            for k, v in attrs:
                if k.lower().startswith( 'on' ):
                    raise IllegalHTML, 'Javascript event "%s" not allowed.' % k
                if v.lower().startswith( 'javascript:' ):
                    raise IllegalHTML, 'Javascript URI "%s" not allowed.' % v
                self.result = '%s %s="%s"' % (self.result, k, v)
            endTag = '</%s>' % tag
            if VALID_TAGS.get(tag):
                self.result = self.result + '>'
            else:
                self.result = self.result + ' />'
        elif NASTY_TAGS.get( tag ):
            raise IllegalHTML, 'Dynamic tag "%s" not allowed.' % tag
        else:
            pass # omit tag

    def unknown_endtag(self, tag):
        if VALID_TAGS.get( tag ):
            self.result = "%s</%s>" % (self.result, tag)
            remTag = '</%s>' % tag

def scrubHTML( html ):
    """ Strip illegal HTML tags from string text. """
    parser = StrippingParser()
    parser.feed( html )
    parser.close()
    return parser.result
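`sgmllib` and `htmlentitydefs` do not exist under those names in Python 3, so the parser above cannot run there as written. The following is a rough sketch of the same whitelist idea on top of the standard `html.parser` module, assuming Python 3. The names `ALLOWED`, `FORBIDDEN`, `Scrubber` and `scrub` are hypothetical, the whitelist is deliberately trimmed, and escaping attribute values with `html.escape` is an addition that the original code does not perform.

```python
from html import escape
from html.parser import HTMLParser

# Trimmed, hypothetical whitelist: 1 = container tag, 0 = empty tag.
ALLOWED = {'a': 1, 'b': 1, 'br': 0, 'em': 1, 'p': 1}
FORBIDDEN = {'script', 'object', 'embed', 'applet'}


class IllegalHTML(ValueError):
    pass


class Scrubber(HTMLParser):
    """Pass only allowed tags; raise IllegalHTML for known-bad input."""

    def __init__(self):
        # Keep entity and character references as written in the input.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    def handle_charref(self, name):
        self.parts.append('&#%s;' % name)

    def handle_entityref(self, name):
        self.parts.append('&%s;' % name)

    def handle_starttag(self, tag, attrs):
        if tag in FORBIDDEN:
            raise IllegalHTML('Dynamic tag "%s" not allowed.' % tag)
        if tag not in ALLOWED:
            return  # drop the tag itself, keep its text content
        out = '<' + tag
        for key, value in attrs:
            value = value or ''
            if key.startswith('on'):
                raise IllegalHTML('Javascript event "%s" not allowed.' % key)
            if value.lower().startswith('javascript:'):
                raise IllegalHTML('Javascript URI "%s" not allowed.' % value)
            out += ' %s="%s"' % (key, escape(value, quote=True))
        self.parts.append(out + ('>' if ALLOWED[tag] else ' />'))

    def handle_endtag(self, tag):
        if ALLOWED.get(tag):  # empty tags (value 0) get no closing tag
            self.parts.append('</%s>' % tag)


def scrub(html):
    parser = Scrubber()
    parser.feed(html)
    parser.close()
    return ''.join(parser.parts)


print(scrub('<div><p>Hi<br>there</p></div>'))
```

A hand-rolled filter like this is only an illustration of the technique. It should not be treated as a complete defence against hostile markup.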
##############################################################################
#
# Copyright (c) 2001 Zope Corporation and Contributors. All Rights Reserved.
#
# This software is subject to the provisions of the Zope Public License,
# Version 2.0 (ZPL). A copy of the ZPL should accompany this distribution.
# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED
# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS
# FOR A PARTICULAR PURPOSE
#
##############################################################################
"""Wrapper to integrate reStructuredText into Zope
This implementation requires docutils 0.3.4+ from http://docutils.sf.net/
Based on the new implementation of Zope 2.7.1 altered for PortalTransforms
"""
try:
import docutils
except ImportError:
raise ImportError, 'Please install docutils 0.3.3+ from http://docutils.sourceforge.net/#download.'
version = docutils.__version__.split('.')
if version < ['0', '3', '3']:
raise ImportError, """Old version of docutils found:
Got: %(version)s, required: 0.3.3+
Please remove docutils from %(path)s and replace it with a new version. You
can download docutils at http://docutils.sourceforge.net/#download.
""" % {'version' : docutils.__version__, 'path' : docutils.__path__[0] }
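The check above compares lists of strings, and string comparison is lexicographic, so a component such as `'10'` sorts before `'3'`. A minimal sketch of a numeric comparison instead, assuming Python 3. The helper name `version_tuple` is hypothetical and not part of docutils.

```python
import re


def version_tuple(version):
    """Turn '0.3.10' into (0, 3, 10).

    Parsing stops at the first component without leading digits, and a
    non-numeric suffix such as the 'b1' in '0.21b1' is ignored.
    """
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


# Numeric comparison accepts 0.3.10 as newer than 0.3.3 ...
print(version_tuple('0.3.10') >= (0, 3, 3))
# ... whereas the string-list comparison treats it as older.
print('0.3.10'.split('.') < ['0', '3', '3'])
```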
import sys, os, locale
##from App.config import getConfiguration
from docutils.core import publish_parts
# get encoding
##default_enc = sys.getdefaultencoding()
##default_output_encoding = getConfiguration().rest_output_encoding or default_enc
##default_input_encoding = getConfiguration().rest_input_encoding or default_enc
default_enc = 'utf-8'
default_output_encoding = default_enc
default_input_encoding = default_enc
# starting level for <H> elements (default behaviour inside Zope is <H3>)
default_level = 3
##initial_header_level = getConfiguration().rest_header_level or default_level
initial_header_level = default_level
# default language
##default_lang = getConfiguration().locale or locale.getdefaultlocale()[0]
default_lang = locale.getdefaultlocale()[0]
if default_lang and '_' in default_lang:
default_lang = default_lang[:default_lang.index('_')]
class Warnings:

    def __init__(self):
        self.messages = []

    def write(self, message):
        self.messages.append(message)

def render(src,
           writer='html4css1',
           report_level=1,
           stylesheet='default.css',
           input_encoding=default_input_encoding,
           output_encoding=default_output_encoding,
           language_code=default_lang,
           initial_header_level = initial_header_level,
           settings = {}):
    """get the rendered parts of the document and the warning object
    """
    # Docutils settings:
    settings = settings.copy()
    settings['input_encoding'] = input_encoding
    settings['output_encoding'] = output_encoding
    settings['stylesheet'] = stylesheet
    settings['language_code'] = language_code
    # starting level for <H> elements:
    settings['initial_header_level'] = initial_header_level + 1
    # set the reporting level to something sane:
    settings['report_level'] = report_level
    # don't break if we get errors:
    settings['halt_level'] = 6
    # remember warnings:
    settings['warning_stream'] = warning_stream = Warnings()

    parts = publish_parts(source=src, writer_name=writer,
                          settings_overrides=settings,
                          config_section='zope application')

    return parts, warning_stream

def HTML(src,
         writer='html4css1',
         report_level=1,
         stylesheet='default.css',
         input_encoding=default_input_encoding,
         output_encoding=default_output_encoding,
         language_code=default_lang,
         initial_header_level = initial_header_level,
         warnings = None,
         settings = {}):
    """ render HTML from a reStructuredText string

        - 'src' -- string containing a valid reST document
        - 'writer' -- docutils writer
        - 'report_level' - verbosity of reST parser
        - 'stylesheet' - Stylesheet to be used
        - 'input_encoding' - encoding of the reST input string
        - 'output_encoding' - encoding of the rendered HTML output
        - 'language_code' - docutils language
        - 'initial_header_level' - level of the first header tag
        - 'warnings' - will be overwritten with a string containing the warnings
        - 'settings' - dict of settings to pass in to Docutils, with priority
    """
    parts, warning_stream = render(src,
                                   writer = writer,
                                   report_level = report_level,
                                   stylesheet = stylesheet,
                                   input_encoding = input_encoding,
                                   output_encoding = output_encoding,
                                   language_code=language_code,
                                   initial_header_level = initial_header_level,
                                   settings = settings)

    header = '<h%(level)s class="title">%(title)s</h%(level)s>\n' % {
                  'level': initial_header_level,
                  'title': parts['title'],
             }

    body = '%(docinfo)s%(body)s' % {
                  'docinfo': parts['docinfo'],
                  'body': parts['body'],
             }

    if parts['title']:
        output = header + body
    else:
        output = body

    warnings = ''.join(warning_stream.messages)

    return output.encode(output_encoding)

__all__ = ("HTML", 'render')
##############################################################################
#
# ZopeTestCase
#
# COPY THIS FILE TO YOUR 'tests' DIRECTORY.
#
# This version of framework.py will use the SOFTWARE_HOME
# environment variable to locate Zope and the Testing package.
#
# If the tests are run in an INSTANCE_HOME installation of Zope,
# Products.__path__ and sys.path will be adjusted to include the
# instance's Products and lib/python directories respectively.
#
# If you explicitly set INSTANCE_HOME prior to running the tests,
# auto-detection is disabled and the specified path will be used
# instead.
#
# If the 'tests' directory contains a custom_zodb.py file, INSTANCE_HOME
# will be adjusted to use it.
#
# If you set the ZEO_INSTANCE_HOME environment variable a ZEO setup
# is assumed, and you can attach to a running ZEO server (via the
# instance's custom_zodb.py).
#
##############################################################################
#
# The following code should be at the top of every test module:
#
#   import os, sys
#   if __name__ == '__main__':
#       execfile(os.path.join(sys.path[0], 'framework.py'))
#
# ...and the following at the bottom:
#
#   if __name__ == '__main__':
#       framework()
#
##############################################################################
__version__ = '0.2.3'
# Save start state
#
__SOFTWARE_HOME = os.environ.get('SOFTWARE_HOME', '')
__INSTANCE_HOME = os.environ.get('INSTANCE_HOME', '')

if __SOFTWARE_HOME.endswith(os.sep):
    __SOFTWARE_HOME = os.path.dirname(__SOFTWARE_HOME)

if __INSTANCE_HOME.endswith(os.sep):
    __INSTANCE_HOME = os.path.dirname(__INSTANCE_HOME)

# Find and import the Testing package
#
if not sys.modules.has_key('Testing'):
    p0 = sys.path[0]
    if p0 and __name__ == '__main__':
        os.chdir(p0)
        p0 = ''
    s = __SOFTWARE_HOME
    p = d = s and s or os.getcwd()
    while d:
        if os.path.isdir(os.path.join(p, 'Testing')):
            zope_home = os.path.dirname(os.path.dirname(p))
            sys.path[:1] = [p0, p, zope_home]
            break
        p, d = s and ('','') or os.path.split(p)
    else:
        print 'Unable to locate Testing package.',
        print 'You might need to set SOFTWARE_HOME.'
        sys.exit(1)

import Testing, unittest
execfile(os.path.join(os.path.dirname(Testing.__file__), 'common.py'))

# Include ZopeTestCase support
#
if 1: # Create a new scope

    p = os.path.join(os.path.dirname(Testing.__file__), 'ZopeTestCase')

    if not os.path.isdir(p):
        print 'Unable to locate ZopeTestCase package.',
        print 'You might need to install ZopeTestCase.'
        sys.exit(1)

    ztc_common = 'ztc_common.py'
    ztc_common_global = os.path.join(p, ztc_common)

    f = 0
    if os.path.exists(ztc_common_global):
        execfile(ztc_common_global)
        f = 1
    if os.path.exists(ztc_common):
        execfile(ztc_common)
        f = 1
    if not f:
        print 'Unable to locate %s.' % ztc_common
        sys.exit(1)

# Debug
#
print 'SOFTWARE_HOME: %s' % os.environ.get('SOFTWARE_HOME', 'Not set')
print 'INSTANCE_HOME: %s' % os.environ.get('INSTANCE_HOME', 'Not set')
sys.stdout.flush()
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE rss PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN" "http://my.netscape.com/publish/formats/rss-0.91.dtd">
<rss version="0.91"><channel><title>Logilab.org news</title><language>en</language><item><title>xmltools 1.3.7</title><descr>bugfix in namespace handling</descr></item><item><title>Python-logic</title><descr>Set up of the Python-Logic special interest group</descr></item><item><title>PyReverse 0.2.3</title><descr>New features and bug fixes</descr></item><item><title>xmltools 1.3.6</title><descr>Uses the new APIs in pyxml-0.7 and 4Suite-0.12.0</descr></item><item><title>hmm-0.2</title><descr>New learning algorithms available</descr></item><item><title>Version 1.2a1 is out</title><descr>Overall refactoring of the engine. Backward incompatible changes
in the syntax of recipes and in modules, in order to ease product development.</descr></item><item><title>XMLdiff v0.5.3 (bug fixes)</title><descr>Version 0.5.3 fixes packaging bugs.</descr></item><item><title>hmm-0.1</title><descr>hmm is a module for Hidden Markov Model manipulation.</descr></item><item><title>PyReverse 0.1 (new product)</title><descr>
Beta release for this set of tools for reverse engineering python code
</descr></item><item><title>PyPaSax 0.3 (bug fixes)</title><descr>A few changes in the DTD, improved PyXML compatibility</descr></item><item><title>XMLdiff v0.5.2 (bug fixes)</title><descr>Version 0.5.2 fixes several bugs.</descr></item><item><title>Version 1.1 is out</title><descr>bugfixes over beta 3.</descr></item><item><title>Version 1.1b3 is out</title><descr>Great speed improvement for Horn. All-in-one windows installer.</descr></item><item><title>xmltools-1.3.5</title><descr>Version 1.3.5 code cleanup.</descr></item><item><title>xmltools-1.3.4</title><descr>Version 1.3.4 fixes a sever encoding bug that could cause crashes on windows machines.</descr></item><item><title>Version 1.1b1 is out</title><descr>Version 1.1b1 drops support for Python 1.5.2 in favor of Python 2.1, and features a new version of Horn, with localization support</descr></item><item><title>XMLdiff v0.5 (algorithm change, bug fixes)</title><descr>Version 0.5. The new algorithm makes it now usable either on big
documents and really faster in any cases. Fixes Unicode problem.</descr></item><item><title>XMLtools v1.3.1 (bugfixes)</title><descr>Version 1.3.1. This release fixes some minor glitches that had slipped in 1.3.</descr></item><item><title>XMLdiff v0.2 (performance improvement)</title><descr>Version 0.2. Huge performance improvement, and output cleanup.</descr></item><item><title>XMLdiff v0.1.1 (beta release)</title><descr>Version 0.1.1. Fully functionnal. Beta release.</descr></item><item><title>XPathVis v1.0beta (beta release)</title><descr>Version 1.0beta. Works nicely.</descr></item><item><title>XMLtools v1.3 (new features)</title><descr>Version 1.3. This release is compatible with Python 2.x and Unicode. It is not guaranteed to work with Python 1.5.2.</descr></item><item><title>Narval on developerWorks</title><descr>An Introduction to Narval was published on developerWorks.</descr><link>http://www-106.ibm.com/developerworks/library/l-ai/</link></item><item><title>Version 1.0.1 is out</title><descr>Version 1.0.1 is a bugfix release.</descr></item><item><title>Narval reviewed on AI.About.com</title><descr>AI.About.com published a review of Narval.</descr><link>http://ai.about.com/compute/ai/library/weekly/aa060801a.htm</link></item><item><title>Narval at BotShow 2001</title><descr>Narval was presented at the first BotShow event. The slides will soon be available online.</descr><link>http://www.ptolemee.com/botshow/text/text_fr/edito/edito_set.html</link></item><item><title>Version 1.0 is out</title><descr>Version 1.0. Celebration time, come on!</descr></item><item><title>Network-boot-HOWTO v0.2.1</title><descr>Version 0.2.1 is out.</descr></item><item><title>GuessLang v0.1.0 (beta release)</title><descr>Version 0.1.0 is out.</descr></item><item><title>Network-boot-HOWTO v0.1.1</title><descr>Version 0.1.1 is out.</descr></item><item><title>PyPaSax v0.1</title><descr>Version 0.1 is out.</descr></item><item><title>RC2 is out</title><descr>Release Candidate 2 is out. 
French documentation will be updated within a few days. We also released several applications (or maybe extension sets?) that are in alpha/beta stage. Give them a try!</descr></item><item><title>VCalSax v0.1 (beta)</title><descr>Version 0.1 is out. Still beta, but fully functional.</descr></item><item><title>Talk at LinuxExpo in English</title><descr>A translation of the talk we gave at Linux Expo 2001 is available on-line.</descr><link>http://www.logilab.com/press/linux-expo2001/</link></item><item><title>RC1 is out</title><descr>Release Candidate 1 is out. Documentation will be updated within a few days. Please help us test this one so that we can release 1.0 quicker.</descr></item><item><title>XMLtools v1.2 (stable release)</title><descr>Version 1.2 is released. Bugfixes, mainly..</descr></item><item><title>Application section on web site</title><descr>We just added a new applications section on Logilab.org web site.</descr><link>http://www.logilab.org/narval/app.html</link></item><item><title>WMgMon v0.4.0</title><descr>version 0.4.0 is out. Bugfixes and new monitor functions. </descr></item><item><title>XmlTools v1.1</title><descr>version 1.1 is out. New features in XmlTree.</descr></item><item><title>Beta5 is out</title><descr>Beta 5 is out. Lots of bugfixes in Narval and Horn, client server communication between the kernel and the graphical interface using SOAP. Windows specific bugfixes.</descr></item><item><title>XmlTools v1.0</title><descr>Initial release.</descr></item><item><title>PyGantt v0.6.0</title><descr>Version 0.6.0 released. New features added.</descr></item><item><title>Beta4 is out</title><descr>Beta 4 is out. Improved Windows compatibility. New features and bugfixes in both Narval and Horn.</descr></item><item><title>Article on Narval in Linux Gazette</title><descr>We published an article in the #59 issue of the Linux Gazette. 
It describes Narval and its use to set up Gazo, the assistant-coordinator for the translation of the Linux Gazette.</descr><link>http://www.linuxgazette.com/issue59/chauvat.html</link></item><item><title>Beta3 is out</title><descr>Beta 3 is out. No more memory leaks (almost). Time conditions work correctly. A step can be an XSL transform. Changes in the Narval DTD.</descr></item><item><title>Logilab invited at Linux Expo</title><descr>We got invited to give a talk at Linux Expo in Paris, France. The talk will be geared toward business uses of Narval. The title will be Using XML and Intelligent Personnal Assistants to enhance groupware and workflow enterprise applications. Come and meet with us!</descr><link>http://www.linuxexpoparis.com/EN/conferences</link></item><item><title>Beta2 is out</title><descr>Beta 2 is out. Installation is much easier. Tutorial. GUI improvements. Bugfixes.</descr></item><item><title>Beta1 is out</title><descr>Beta 1 is out. Many bug fixes.</descr></item><item><title>Beta0 is out</title><descr>Beta 0 is out. Logilab.org is one-line.</descr><link>http://www.logilab.org</link></item></channel></rss>
This is a test of the *reST* transform
o one
o two
o three
Heading 1
=========
Some text.
Heading 2
---------
Some text, bla ble bli blo blu. Yes, i know this is Stupid_.
.. _Stupid: http://www.example.com
=====
Title
=====
--------
Subtitle
--------
This is a test document to make sure subtitle gets the right heading.
Now the real heading
====================
The brown fox jumped over the lazy dog.
With a subheading
------------------
Some text, bla ble bli blo blu. Yes, i know this is Stupid_.
.. _Stupid: http://www.example.com
<?xml version="1.0" encoding="ISO-8859-1"?>
<xsl:transform xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>
<xsl:strip-space elements='*'/>
<xsl:output method='xml'/>
<!-- Narval prototype ====================================================== -->
<al:prototype xmlns:al="http://www.logilab.org/namespaces/Narval/1.2">
<al:description lang="fr">Transforme du RSS en du HTML.</al:description>
<al:description lang="en">Turns RSS into HTML.</al:description>
<al:input id="input"><al:match>rss</al:match></al:input>
<al:output id="output" list="yes"><al:match>html-body</al:match></al:output>
</al:prototype>
<!-- root ================================================================== -->
<xsl:template match='rss/rss/channel'>
<html-body>
<h2>
<xsl:value-of select='title'/>
</h2>
<p>
<xsl:element name='a'>
<xsl:attribute name='href'><xsl:value-of select='link'/></xsl:attribute>
<xsl:value-of select='title'/>
</xsl:element>
<em><xsl:value-of select='description'/></em>
</p>
<table>
<xsl:apply-templates select='item'/>
</table>
</html-body>
</xsl:template>
<xsl:template match='item'>
<tr>
<td>
<xsl:element name='a'>
<xsl:attribute name='href'><xsl:value-of select='link'/></xsl:attribute>
<xsl:value-of select='title'/>
</xsl:element>
<xsl:apply-templates mode='multi' select='description'/>
</td>
</tr>
</xsl:template>
</xsl:transform>
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="Docutils 0.2.8: http://docutils.sourceforge.net/" />
<title>Copying Docutils</title>
<meta name="author" content="David Goodger" />
<meta name="date" content="2002-10-03" />
<link rel="stylesheet" href="tools/stylesheets/default.css" type="text/css" />
</head>
<body>
<div class="document" id="copying-docutils">
<h1 class="title">Copying Docutils</h1>
<table class="docinfo" frame="void" rules="none">
<col class="docinfo-name" />
<col class="docinfo-content" />
<tbody valign="top">
<tr><th class="docinfo-name">Author:</th>
<td>David Goodger</td></tr>
<tr><th class="docinfo-name">Contact:</th>
<td><a class="first last reference" href="mailto:goodger&#64;users.sourceforge.net">goodger&#64;users.sourceforge.net</a></td></tr>
<tr><th class="docinfo-name">Date:</th>
<td>2002-10-03</td></tr>
<tr class="field"><th class="docinfo-name">Web site:</th><td class="field-body"><a class="reference" href="http://docutils.sourceforge.net/">http://docutils.sourceforge.net/</a></td>
</tr>
</tbody>
</table>
<p>Most of the files included in this project are in the public domain,
and therefore have no license requirement and no restrictions on
copying or usage. The exceptions are:</p>
<ul class="simple">
<li>docutils/optik.py, copyright Gregory P. Ward, released under a
BSD-style license (which can be found in the module's source code).</li>
<li>docutils/roman.py, copyright by Mark Pilgrim, released under the
<a class="reference" href="http://www.python.org/2.1.1/license.html">Python 2.1.1 license</a>.</li>
<li>test/difflib.py, copyright by the Python Software Foundation,
released under the <a class="reference" href="http://www.python.org/2.2/license.html">Python 2.2 license</a>. This file is included for
compatibility with Python versions less than 2.2; if you have Python
2.2 or higher, difflib.py is not needed and may be removed. (It's
only used to report test failures anyhow; it isn't installed
anywhere. The included file is a pre-generator version of the
difflib.py module included in Python 2.2.)</li>
</ul>
<p>(Disclaimer: I am not a lawyer.) Both the BSD license and the Python
license are <a class="reference" href="http://opensource.org/licenses/">OSI-approved</a> and <a class="reference" href="http://www.gnu.org/philosophy/license-list.html">GPL-compatible</a>. Although complicated
by multiple owners and lots of legalese, the Python license basically
lets you copy, use, modify, and redistribute files as long as you keep
the copyright attribution intact, note any changes you make, and don't
use the owner's name in vain. The BSD license is similar.</p>
</div>
<hr class="footer"/>
<div class="footer">
Generated on: 2003-04-19 15:32 UTC.
Generated by <a class="reference" href="http://docutils.sourceforge.net/">Docutils</a> from <a class="reference" href="http://docutils.sourceforge.net/rst.html">reStructuredText</a> source.
</div>
</body>
</html>
""" nice docstring """
class A : pass
# comment
def inc(i):
return i+1
def greater(a, b):
"""foo <html />"""
return a > b
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Test page for save html rendering</title>
<meta name="date" content="2005-07-22" />
</head>
<body>
<h1>Test page</h1>
<table>
<tr>
<th>Test1</th>
<td>test2</td>
</tr>
</table>
<p>This is a text used as a blind text.</p>
<ul>
<li>A sample list item1</li>
<li>A sample list item2</li>
</ul>
<p>This is again a blind text with a<br>line break.</p>
<div>
Can we <q>quote</q> or write something we <del>didn't</del> mean to write? Or how is <ins>this</ins> instead?
</div>
<hr>
<div>
<a href="http://www.plone.org"><img src="http://www.plone.org/logo.jpg"/></a> is just great.
</div>
</body>
</html>
P6
24 23
255
̙̙
\ No newline at end of file
<?xml version="1.0"?>
Logilab.org newsen<tr><td><a href="">xmltools 1.3.7</a></td></tr><tr><td><a href="">Python-logic</a></td></tr><tr><td><a href="">PyReverse 0.2.3</a></td></tr><tr><td><a href="">xmltools 1.3.6</a></td></tr><tr><td><a href="">hmm-0.2</a></td></tr><tr><td><a href="">Version 1.2a1 is out</a></td></tr><tr><td><a href="">XMLdiff v0.5.3 (bug fixes)</a></td></tr><tr><td><a href="">hmm-0.1</a></td></tr><tr><td><a href="">PyReverse 0.1 (new product)</a></td></tr><tr><td><a href="">PyPaSax 0.3 (bug fixes)</a></td></tr><tr><td><a href="">XMLdiff v0.5.2 (bug fixes)</a></td></tr><tr><td><a href="">Version 1.1 is out</a></td></tr><tr><td><a href="">Version 1.1b3 is out</a></td></tr><tr><td><a href="">xmltools-1.3.5</a></td></tr><tr><td><a href="">xmltools-1.3.4</a></td></tr><tr><td><a href="">Version 1.1b1 is out</a></td></tr><tr><td><a href="">XMLdiff v0.5 (algorithm change, bug fixes)</a></td></tr><tr><td><a href="">XMLtools v1.3.1 (bugfixes)</a></td></tr><tr><td><a href="">XMLdiff v0.2 (performance improvement)</a></td></tr><tr><td><a href="">XMLdiff v0.1.1 (beta release)</a></td></tr><tr><td><a href="">XPathVis v1.0beta (beta release)</a></td></tr><tr><td><a href="">XMLtools v1.3 (new features)</a></td></tr><tr><td><a href="http://www-106.ibm.com/developerworks/library/l-ai/">Narval on developerWorks</a></td></tr><tr><td><a href="">Version 1.0.1 is out</a></td></tr><tr><td><a href="http://ai.about.com/compute/ai/library/weekly/aa060801a.htm">Narval reviewed on AI.About.com</a></td></tr><tr><td><a href="http://www.ptolemee.com/botshow/text/text_fr/edito/edito_set.html">Narval at BotShow 2001</a></td></tr><tr><td><a href="">Version 1.0 is out</a></td></tr><tr><td><a href="">Network-boot-HOWTO v0.2.1</a></td></tr><tr><td><a href="">GuessLang v0.1.0 (beta release)</a></td></tr><tr><td><a href="">Network-boot-HOWTO v0.1.1</a></td></tr><tr><td><a href="">PyPaSax v0.1</a></td></tr><tr><td><a href="">RC2 is out</a></td></tr><tr><td><a href="">VCalSax v0.1 
(beta)</a></td></tr><tr><td><a href="http://www.logilab.com/press/linux-expo2001/">Talk at LinuxExpo in English</a></td></tr><tr><td><a href="">RC1 is out</a></td></tr><tr><td><a href="">XMLtools v1.2 (stable release)</a></td></tr><tr><td><a href="http://www.logilab.org/narval/app.html">Application section on web site</a></td></tr><tr><td><a href="">WMgMon v0.4.0</a></td></tr><tr><td><a href="">XmlTools v1.1</a></td></tr><tr><td><a href="">Beta5 is out</a></td></tr><tr><td><a href="">XmlTools v1.0</a></td></tr><tr><td><a href="">PyGantt v0.6.0</a></td></tr><tr><td><a href="">Beta4 is out</a></td></tr><tr><td><a href="http://www.linuxgazette.com/issue59/chauvat.html">Article on Narval in Linux Gazette</a></td></tr><tr><td><a href="">Beta3 is out</a></td></tr><tr><td><a href="http://www.linuxexpoparis.com/EN/conferences">Logilab invited at Linux Expo</a></td></tr><tr><td><a href="">Beta2 is out</a></td></tr><tr><td><a href="">Beta1 is out</a></td></tr><tr><td><a href="http://www.logilab.org">Beta0 is out</a></td></tr>
<p>This is a test of the *reST* transform<br /> o one<br /> o two<br /> o three</p>
<dl class="docutils">
<dt>This is a test of the <em>reST</em> transform</dt>
<dd>o one
o two
o three</dd>
</dl>
This is a test of the *reST* transform
o one
o two
o three
<h2 class="title">Heading 1</h2>
<p>Some text.</p>
<div class="section" id="heading-2">
<h3><a name="heading-2">Heading 2</a></h3>
<p>Some text, bla ble bli blo blu. Yes, i know this is <a class="reference" href="http://www.example.com">Stupid</a>.</p>
</div>
<h2 class="title">Title</h2>
<h3 class="subtitle">Subtitle</h3>
<p>This is a test document to make sure subtitle gets the right heading.</p>
<div class="section" id="now-the-real-heading">
<h3><a name="now-the-real-heading">Now the real heading</a></h3>
<p>The brown fox jumped over the lazy dog.</p>
<div class="section" id="with-a-subheading">
<h4><a name="with-a-subheading">With a subheading</a></h4>
<p>Some text, bla ble bli blo blu. Yes, i know this is <a class="reference" href="http://www.example.com">Stupid</a>.</p>
</div>
</div>
Copying Docutils
Copying Docutils
Author:
David Goodger
Contact:
goodger&#64;users.sourceforge.net
Date:
2002-10-03
Web site:http://docutils.sourceforge.net/
Most of the files included in this project are in the public domain,
and therefore have no license requirement and no restrictions on
copying or usage. The exceptions are:
docutils/optik.py, copyright Gregory P. Ward, released under a
BSD-style license (which can be found in the module's source code).
docutils/roman.py, copyright by Mark Pilgrim, released under the
Python 2.1.1 license.
test/difflib.py, copyright by the Python Software Foundation,
released under the Python 2.2 license. This file is included for
compatibility with Python versions less than 2.2; if you have Python
2.2 or higher, difflib.py is not needed and may be removed. (It's
only used to report test failures anyhow; it isn't installed
anywhere. The included file is a pre-generator version of the
difflib.py module included in Python 2.2.)
(Disclaimer: I am not a lawyer.) Both the BSD license and the Python
license are OSI-approved and GPL-compatible. Although complicated
by multiple owners and lots of legalese, the Python license basically
lets you copy, use, modify, and redistribute files as long as you keep
the copyright attribution intact, note any changes you make, and don't
use the owner's name in vain. The BSD license is similar.
Generated on: 2003-04-19 15:32 UTC.
Generated by Docutils from reStructuredText source.
Copying Docutils
Author: David Goodger
Contact: goodger@users.sourceforge.net
Date: 2002-10-03
Web site: http://docutils.sourceforge.net/
Most of the files included in this project are in the public domain,
and therefore have no license requirement and no restrictions on
copying or usage. The exceptions are:
* docutils/optik.py, copyright Gregory P. Ward, released under a
BSD-style license (which can be found in the module's source
code).
* docutils/roman.py, copyright by Mark Pilgrim, released under the
Python 2.1.1 license.
* test/difflib.py, copyright by the Python Software Foundation,
released under the Python 2.2 license. This file is included for
compatibility with Python versions less than 2.2; if you have
Python 2.2 or higher, difflib.py is not needed and may be removed.
(It's only used to report test failures anyhow; it isn't installed
anywhere. The included file is a pre-generator version of the
difflib.py module included in Python 2.2.)
(Disclaimer: I am not a lawyer.) Both the BSD license and the Python
license are OSI-approved and GPL-compatible. Although complicated by
multiple owners and lots of legalese, the Python license basically
lets you copy, use, modify, and redistribute files as long as you keep
the copyright attribution intact, note any changes you make, and don't
use the owner's name in vain. The BSD license is similar.
_________________________________________________________________
Generated on: 2003-04-19 15:32 UTC. Generated by Docutils from
reStructuredText source.
<pre class="python">
<span style="color: #004080;">&quot;&quot;&quot; nice docstring &quot;&quot;&quot;</span>
<span style="color: #C00000;">class</span> <span style="color: #000000;">A</span> <span style="color: #0000C0;">:</span> <span style="color: #C00000;">pass</span>
<span style="color: #008000;"># comment
</span>
<span style="color: #C00000;">def</span> <span style="color: #000000;">inc</span><span style="color: #0000C0;">(</span><span style="color: #000000;">i</span><span style="color: #0000C0;">)</span><span style="color: #0000C0;">:</span>
<span style="color: #C00000;">return</span> <span style="color: #000000;">i</span><span style="color: #0000C0;">+</span><span style="color: #0080C0;">1</span>
<span style="color: #C00000;">def</span> <span style="color: #000000;">greater</span><span style="color: #0000C0;">(</span><span style="color: #000000;">a</span><span style="color: #0000C0;">,</span> <span style="color: #000000;">b</span><span style="color: #0000C0;">)</span><span style="color: #0000C0;">:</span>
<span style="color: #004080;">&quot;&quot;&quot;foo &lt;html /&gt;&quot;&quot;&quot;</span>
<span style="color: #C00000;">return</span> <span style="color: #000000;">a</span> <span style="color: #0000C0;">&gt;</span> <span style="color: #000000;">b</span>
</pre>
<h1>Test page</h1>
<table>
<tr>
<th>Test1</th>
<td>test2</td>
</tr>
</table>
<p>This is a text used as a blind text.</p>
<ul>
<li>A sample list item1</li>
<li>A sample list item2</li>
</ul>
<p>This is again a blind text with a<br />line break.</p>
<div>
Can we <q>quote</q> or write something we <del>didn't</del> mean to write? Or how is <ins>this</ins> instead?
</div>
<hr />
<div>
<a href="http://www.plone.org"><img src="http://www.plone.org/logo.jpg" /></a> is just great.
</div>
\ No newline at end of file
<br />
<p><div name="Default" align="left" style=" padding: 0.00mm 0.00mm 0.00mm 0.00mm; ">
<p style="text-indent: 0.00mm; text-align: left; line-height: 4.166667mm; color: Black; background-color: White; ">
how odd: blank named file in directory
</p></div>
#
# Runs all tests in the current directory
#
# Execute like:
#   python runalltests.py
#
# Alternatively use the testrunner:
#   python /path/to/Zope/utilities/testrunner.py -qa
#
import os, sys
if __name__ == '__main__':
    execfile(os.path.join(sys.path[0], 'framework.py'))

import unittest
TestRunner = unittest.TextTestRunner
suite = unittest.TestSuite()

tests = os.listdir(os.curdir)
tests = [n[:-3] for n in tests if n.startswith('test') and n.endswith('.py')]

for test in tests:
    m = __import__(test)
    if hasattr(m, 'test_suite'):
        suite.addTest(m.test_suite())

if __name__ == '__main__':
    TestRunner().run(suite)
#!/bin/bash
#
# example test runner shell script
#
# full path to the python interpreter
export PYTHON="/usr/local/bin/python2.3"
# path to ZOPE_HOME/lib/python
export SOFTWARE_HOME="/opt/zope/releases/Zope-2_7-branch/lib/python"
# path to your instance. Don't set it if you don't have an instance
export INSTANCE_HOME="/opt/zope/instances/plone21/"
${PYTHON} runalltests.py
import os, sys
if __name__ == '__main__':
    execfile(os.path.join(sys.path[0], 'framework.py'))

from Testing import ZopeTestCase
from Products.Archetypes.tests.atsitetestcase import ATSiteTestCase

from Products.PortalTransforms.utils import TransformException
from Products.PortalTransforms.interfaces import *
from Products.PortalTransforms.chain import chain

import urllib
import time
import re

class BaseTransform:
    def name(self):
        return getattr(self, '__name__', self.__class__.__name__)

class HtmlToText(BaseTransform):
    __implements__ = itransform
    inputs = ('text/html',)
    output = 'text/plain'

    def __call__(self, orig, **kwargs):
        orig = re.sub('<[^>]*>(?i)(?m)', '', orig)
        return urllib.unquote(re.sub('\n+', '\n', orig)).strip()

    def convert(self, orig, data, **kwargs):
        orig = self.__call__(orig)
        data.setData(orig)
        return data

class HtmlToTextWithEncoding(HtmlToText):
    output_encoding = 'ascii'

class FooToBar(BaseTransform):
    __implements__ = itransform
    inputs = ('text/*',)
    output = 'text/plain'

    def __call__(self, orig, **kwargs):
        orig = re.sub('foo', 'bar', orig)
        return urllib.unquote(re.sub('\n+', '\n', orig)).strip()

    def convert(self, orig, data, **kwargs):
        orig = self.__call__(orig)
        data.setData(orig)
        return data

class TransformNoIO(BaseTransform):
    __implements__ = itransform

class BadTransformMissingImplements(BaseTransform):
    __implements__ = None
    inputs = ('text/*',)
    output = 'text/plain'

class BadTransformBadMIMEType1(BaseTransform):
    __implements__ = itransform
    inputs = ('truc/muche',)
    output = 'text/plain'

class BadTransformBadMIMEType2(BaseTransform):
    __implements__ = itransform
    inputs = ('text/plain',)
    output = 'truc/muche'

class BadTransformNoInput(BaseTransform):
    __implements__ = itransform
    inputs = ()
    output = 'text/plain'

class BadTransformWildcardOutput(BaseTransform):
    __implements__ = itransform
    inputs = ('text/plain',)
    output = 'text/*'

class TestEngine(ATSiteTestCase):

    def afterSetUp(self):
        ATSiteTestCase.afterSetUp(self)
        self.engine = self.portal.portal_transforms
        self.data = '<b>foo</b>'

    def register(self):
        # A default set of transforms to prove the interfaces work
        self.engine.registerTransform(HtmlToText())
        self.engine.registerTransform(FooToBar())

    def testRegister(self):
        self.register()

    def testFailRegister(self):
        register = self.engine.registerTransform
        self.assertRaises(TransformException, register, TransformNoIO())
        self.assertRaises(TransformException, register, BadTransformMissingImplements())
        self.assertRaises(TransformException, register, BadTransformBadMIMEType1())
        self.assertRaises(TransformException, register, BadTransformBadMIMEType2())
        self.assertRaises(TransformException, register, BadTransformNoInput())
        self.assertRaises(TransformException, register, BadTransformWildcardOutput())

    def testCall(self):
        self.register()
        data = self.engine('HtmlToText', self.data)
        self.failUnlessEqual(data, "foo")
        data = self.engine('FooToBar', self.data)
        self.failUnlessEqual(data, "<b>bar</b>")

    def testConvert(self):
        self.register()
        data = self.engine.convert('HtmlToText', self.data)
        self.failUnlessEqual(data.getData(), "foo")
        self.failUnlessEqual(data.getMetadata()['mimetype'], 'text/plain')
        self.failUnlessEqual(data.getMetadata().get('encoding'), None)
        self.failUnlessEqual(data.name(), "HtmlToText")
        self.engine.registerTransform(HtmlToTextWithEncoding())
        data = self.engine.convert('HtmlToTextWithEncoding', self.data)
        self.failUnlessEqual(data.getMetadata()['mimetype'], 'text/plain')
        self.failUnlessEqual(data.getMetadata()['encoding'], 'ascii')
        self.failUnlessEqual(data.name(), "HtmlToTextWithEncoding")

    def testConvertTo(self):
        self.register()
        data = self.engine.convertTo('text/plain', self.data, mimetype="text/html")
        self.failUnlessEqual(data.getData(), "foo")
        self.failUnlessEqual(data.getMetadata()['mimetype'], 'text/plain')
        self.failUnlessEqual(data.getMetadata().get('encoding'), None)
        self.failUnlessEqual(data.name(), "text/plain")
        self.engine.unregisterTransform('HtmlToText')
        self.engine.unregisterTransform('FooToBar')
        self.engine.registerTransform(HtmlToTextWithEncoding())
        data = self.engine.convertTo('text/plain', self.data, mimetype="text/html")
        self.failUnlessEqual(data.getMetadata()['mimetype'], 'text/plain')
        # XXX the new algorithm is choosing html_to_text instead of
        # HtmlToTextWithEncoding. Now None is the right value.
        #self.failUnlessEqual(data.getMetadata()['encoding'], 'ascii')
        self.failUnlessEqual(data.getMetadata()['encoding'], None)
        self.failUnlessEqual(data.name(), "text/plain")

    def testChain(self):
        self.register()
        hb = chain('hbar')
        hb.registerTransform(HtmlToText())
        hb.registerTransform(FooToBar())
        self.engine.registerTransform(hb)
        cache = self.engine.convert('hbar', self.data)
        self.failUnlessEqual(cache.getData(), "bar")
        self.failUnlessEqual(cache.name(), "hbar")

    def testSame(self):
        data = "This is a test"
        mt = "text/plain"
        out = self.engine.convertTo('text/plain', data, mimetype=mt)
        self.failUnlessEqual(out.getData(), data)
        self.failUnlessEqual(out.getMetadata()['mimetype'], 'text/plain')

    def testCache(self):
        data = "This is a test"
        other_data = 'some different data'
        mt = "text/plain"
        self.engine.max_sec_in_cache = 20
        out = self.engine.convertTo(mt, data, mimetype=mt, object=self)
        self.failUnlessEqual(out.getData(), data, out.getData())
        out = self.engine.convertTo(mt, other_data, mimetype=mt, object=self)
        self.failUnlessEqual(out.getData(), data, out.getData())
        self.engine.max_sec_in_cache = -1
        out = self.engine.convertTo(mt, data, mimetype=mt, object=self)
        self.failUnlessEqual(out.getData(), data, out.getData())
        out = self.engine.convertTo(mt, other_data, mimetype=mt, object=self)
        self.failUnlessEqual(out.getData(), other_data, out.getData())

def test_suite():
    from unittest import TestSuite, makeSuite
    suite = TestSuite()
    suite.addTest(makeSuite(TestEngine))
    return suite

if __name__ == '__main__':
    framework()
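The engine tests above rely on a small transform protocol: a transform exposes `inputs`, `output`, `name()` and `convert(orig, data, **kwargs)`, and a chain feeds one transform's result into the next. The sketch below imitates that flow outside Zope. It is an illustration only: `FakeDataStream` and `run_chain` are hypothetical stand-ins (not the real `datastream` or `chain` classes), the `urllib.unquote` step is left out, and the code is written for a current Python 3 interpreter rather than the Python 2 used in this commit.

```python
import re


class FakeDataStream:
    """Hypothetical stand-in for the real datastream class."""

    def __init__(self, name):
        self._name = name
        self._data = ''

    def name(self):
        return self._name

    def setData(self, value):
        self._data = value

    def getData(self):
        return self._data


class HtmlToText:
    inputs = ('text/html',)
    output = 'text/plain'

    def name(self):
        return self.__class__.__name__

    def convert(self, orig, data, **kwargs):
        # Strip tags, then collapse runs of newlines.
        text = re.sub(r'<[^>]*>', '', orig)
        data.setData(re.sub(r'\n+', '\n', text).strip())
        return data


class FooToBar:
    inputs = ('text/*',)
    output = 'text/plain'

    def name(self):
        return self.__class__.__name__

    def convert(self, orig, data, **kwargs):
        data.setData(orig.replace('foo', 'bar').strip())
        return data


def run_chain(name, transforms, orig):
    """Feed each transform's output into the next (assumed chain behaviour)."""
    data = FakeDataStream(name)
    for transform in transforms:
        data = transform.convert(orig, data)
        orig = data.getData()
    return data


result = run_chain('hbar', [HtmlToText(), FooToBar()], '<b>foo</b>')
print(result.name(), result.getData())
```

With the same input as `testChain`, the sketch ends with the data `bar` under the name `hbar`, which is what that test asserts for the real chain.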
import os, sys
if __name__ == '__main__':
    execfile(os.path.join(sys.path[0], 'framework.py'))

from Testing import ZopeTestCase
from Products.Archetypes.tests.atsitetestcase import ATSiteTestCase

from utils import input_file_path
FILE_PATH = input_file_path("demo1.pdf")

class TestGraph(ATSiteTestCase):

    def afterSetUp(self):
        ATSiteTestCase.afterSetUp(self)
        self.engine = self.portal.portal_transforms

    def testGraph(self):
        ### XXX Local file and expected output
        # PDF is binary data, so read it in binary mode
        data = open(FILE_PATH, 'rb').read()
        out = self.engine.convertTo('text/plain', data, filename=FILE_PATH)
        assert(out.getData())

def test_suite():
    from unittest import TestSuite, makeSuite
    suite = TestSuite()
    suite.addTest(makeSuite(TestGraph))
    return suite

if __name__ == '__main__':
    framework()
from __future__ import nested_scopes

import os, sys
if __name__ == '__main__':
    execfile(os.path.join(sys.path[0], 'framework.py'))

from Testing import ZopeTestCase
from Products.Archetypes.tests.atsitetestcase import ATSiteTestCase

from utils import input_file_path, output_file_path, normalize_html,\
     load, matching_inputs
from Products.PortalTransforms.data import datastream
from Products.PortalTransforms.interfaces import idatastream
from Products.MimetypesRegistry.MimeTypesTool import MimeTypesTool
from Products.PortalTransforms.TransformEngine import TransformTool
from Products.PortalTransforms.libtransforms.utils import MissingBinary
from Products.PortalTransforms.transforms.image_to_gif import image_to_gif
from Products.PortalTransforms.transforms.image_to_png import image_to_png
from Products.PortalTransforms.transforms.image_to_jpeg import image_to_jpeg
from Products.PortalTransforms.transforms.image_to_bmp import image_to_bmp
from Products.PortalTransforms.transforms.image_to_tiff import image_to_tiff
from Products.PortalTransforms.transforms.image_to_ppm import image_to_ppm
from Products.PortalTransforms.transforms.image_to_pcx import image_to_pcx
from os.path import exists
import sys

# we have to set locale because lynx output is locale sensitive !
os.environ['LC_ALL'] = 'C'

class TransformTest(ATSiteTestCase):

    def do_convert(self, filename=None):
        if filename is None and exists(self.output + '.nofilename'):
            output = self.output + '.nofilename'
        else:
            output = self.output
        input = open(self.input)
        orig = input.read()
        input.close()
        data = datastream(self.transform.name())
        res_data = self.transform.convert(orig, data, filename=filename)
        self.assert_(idatastream.isImplementedBy(res_data))
        got = res_data.getData()
        try:
            output = open(output)
        except IOError:
            import sys
            print >>sys.stderr, 'No output file found.'
            print >>sys.stderr, 'File %s created, check it !' % self.output
            output = open(output, 'w')
            output.write(got)
            output.close()
            self.assert_(0)
        expected = output.read()
        if self.normalize is not None:
            expected = self.normalize(expected)
            got = self.normalize(got)
        output.close()
        self.assertEquals(got, expected,
                          '[%s]\n\n!=\n\n[%s]\n\nIN %s(%s)' % (
            got, expected, self.transform.name(), self.input))
        self.assertEquals(self.subobjects, len(res_data.getSubObjects()),
                          '%s\n\n!=\n\n%s\n\nIN %s(%s)' % (
            self.subobjects, len(res_data.getSubObjects()), self.transform.name(), self.input))

    def testSame(self):
        self.do_convert(filename=self.input)

    def testSameNoFilename(self):
        self.do_convert()

    def __repr__(self):
        return self.transform.name()

class PILTransformsTest(ATSiteTestCase):

    def afterSetUp(self):
        ATSiteTestCase.afterSetUp(self)
        self.pt = self.portal.portal_transforms

    def test_image_to_bmp(self):
        self.pt.registerTransform(image_to_bmp())
        imgFile = open(input_file_path('logo.jpg'), 'rb')
        data = imgFile.read()
        self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/jpeg')
        data = self.pt.convertTo(target_mimetype='image/x-ms-bmp',orig=data)
        self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/x-ms-bmp')

    def test_image_to_gif(self):
        self.pt.registerTransform(image_to_gif())
        imgFile = open(input_file_path('logo.png'), 'rb')
        data = imgFile.read()
        self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/png')
        data = self.pt.convertTo(target_mimetype='image/gif',orig=data)
        self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/gif')

    def test_image_to_jpeg(self):
        self.pt.registerTransform(image_to_jpeg())
        imgFile = open(input_file_path('logo.gif'), 'rb')
        data = imgFile.read()
        self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/gif')
        data = self.pt.convertTo(target_mimetype='image/jpeg',orig=data)
        self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/jpeg')

    def test_image_to_png(self):
        self.pt.registerTransform(image_to_png())
        imgFile = open(input_file_path('logo.jpg'), 'rb')
        data = imgFile.read()
        self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/jpeg')
        data = self.pt.convertTo(target_mimetype='image/png',orig=data)
        self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/png')

    def test_image_to_pcx(self):
        self.pt.registerTransform(image_to_pcx())
        imgFile = open(input_file_path('logo.gif'), 'rb')
        data = imgFile.read()
        self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/gif')
        data = self.pt.convertTo(target_mimetype='image/pcx',orig=data)
        self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/pcx')

    def test_image_to_ppm(self):
        self.pt.registerTransform(image_to_ppm())
        imgFile = open(input_file_path('logo.png'), 'rb')
        data = imgFile.read()
        self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/png')
        data = self.pt.convertTo(target_mimetype='image/x-portable-pixmap',orig=data)
        self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/x-portable-pixmap')

    def test_image_to_tiff(self):
        self.pt.registerTransform(image_to_tiff())
        imgFile = open(input_file_path('logo.jpg'), 'rb')
        data = imgFile.read()
        self.failUnlessEqual(self.portal.mimetypes_registry.classify(data),'image/jpeg')
        data = self.pt.convertTo(target_mimetype='image/tiff',orig=data)
        self.failUnlessEqual(data.getMetadata()['mimetype'], 'image/tiff')

TRANSFORMS_TESTINFO = (
    ('Products.PortalTransforms.transforms.pdf_to_html',
     "demo1.pdf", "demo1.html", None, 0
     ),
    ('Products.PortalTransforms.transforms.word_to_html',
     "test.doc", "test_word.html", normalize_html, 0
     ),
    ('Products.PortalTransforms.transforms.lynx_dump',
     "test_lynx.html", "test_lynx.txt", None, 0
     ),
    ('Products.PortalTransforms.transforms.html_to_text',
     "test_lynx.html", "test_html_to_text.txt", None, 0
     ),
    ('Products.PortalTransforms.transforms.identity',
     "rest1.rst", "rest1.rst", None, 0
     ),
    ('Products.PortalTransforms.transforms.text_to_html',
     "rest1.rst", "rest1.html", None, 0
     ),
    ('Products.PortalTransforms.transforms.safe_html',
     "test_safehtml.html", "test_safe.html", None, 0
     ),
    ('Products.PortalTransforms.transforms.image_to_bmp',
     "logo.jpg", "logo.bmp", None, 0
     ),
    ('Products.PortalTransforms.transforms.image_to_gif',
     "logo.bmp", "logo.gif", None, 0
     ),
    ('Products.PortalTransforms.transforms.image_to_jpeg',
     "logo.gif", "logo.jpg", None, 0
     ),
    ('Products.PortalTransforms.transforms.image_to_png',
     "logo.bmp", "logo.png", None, 0
     ),
    ('Products.PortalTransforms.transforms.image_to_ppm',
     "logo.gif", "logo.ppm", None, 0
     ),
    ('Products.PortalTransforms.transforms.image_to_tiff',
     "logo.png", "logo.tiff", None, 0
     ),
    ('Products.PortalTransforms.transforms.image_to_pcx',
     "logo.png", "logo.pcx", None, 0
     ),
    )

def initialise(transform, normalize, pattern):
    global TRANSFORMS_TESTINFO
    for fname in matching_inputs(pattern):
        outname = '%s.out' % fname.split('.')[0]
        #print transform, fname, outname
        TRANSFORMS_TESTINFO += ((transform, fname, outname, normalize, 0),)

# ReST test cases
initialise('Products.PortalTransforms.transforms.rest', normalize_html, "rest*.rst")
# Python test cases
initialise('Products.PortalTransforms.transforms.python', normalize_html, "*.py")

# FIXME missing tests for image_to_html, st

TR_NAMES = None

def make_tests(test_descr=TRANSFORMS_TESTINFO):
    """generate tests classes from test info
    return the list of generated test classes
    """
    tests = []
    for _transform, tr_input, tr_output, _normalize, _subobjects in test_descr:
        # load transform if necessary
        if type(_transform) is type(''):
            try:
                _transform = load(_transform).register()
            except MissingBinary:
                # we are not interested in tests with missing binaries
                continue
            except:
                import traceback
                traceback.print_exc()
                continue
        if TR_NAMES is not None and not _transform.name() in TR_NAMES:
            print 'skip test for', _transform.name()
            continue

        class TransformTestSubclass(TransformTest):
            input = input_file_path(tr_input)
            output = output_file_path(tr_output)
            transform = _transform
            normalize = lambda x, y: _normalize(y)
            subobjects = _subobjects

        tests.append(TransformTestSubclass)

    tests.append(PILTransformsTest)
    return tests

def test_suite():
    from unittest import TestSuite, makeSuite
    suite = TestSuite()
    for test in make_tests():
        suite.addTest(makeSuite(test))
    return suite

if __name__ == '__main__':
    framework()
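`make_tests` builds one `TestCase` subclass per entry in a description table. The following is a minimal, self-contained sketch of that table-driven pattern, not the project's code: the `CASES` table, the `Case` class and `build_suite` are hypothetical names, plain string functions stand in for real transforms, and it targets a current Python 3 interpreter. One design point it illustrates: values assigned as class attributes are evaluated while the class body runs, so each generated class keeps the values of its own loop iteration, whereas a name referenced only inside a lambda or method body is looked up later, when it is called.

```python
import io
import unittest

# Hypothetical description table: (label, function, input, expected output).
CASES = (
    ("upper", str.upper, "abc", "ABC"),
    ("strip", str.strip, "  abc ", "abc"),
)


def make_tests(cases=CASES):
    """Generate one TestCase subclass per table entry."""
    tests = []
    for label, func, given, expected in cases:

        class Case(unittest.TestCase):
            # Evaluated now, during this iteration of the loop.
            given_value = given
            expected_value = expected
            # staticmethod keeps func from being bound to the test instance.
            transform = staticmethod(func)

            def test_same(self):
                self.assertEqual(self.transform(self.given_value),
                                 self.expected_value)

        Case.__name__ = "Test_%s" % label
        tests.append(Case)
    return tests


def build_suite():
    suite = unittest.TestSuite()
    loader = unittest.TestLoader()
    for test in make_tests():
        suite.addTests(loader.loadTestsFromTestCase(test))
    return suite


# Run quietly by sending the runner's report to an in-memory stream.
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(build_suite())
print(result.testsRun, result.wasSuccessful())
```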
import re
import glob
from unittest import TestSuite
from sys import modules
from os.path import join, abspath, dirname, basename

def normalize_html(s):
    s = re.sub(r"\s+", " ", s)
    s = re.sub(r"(?s)\s+<", "<", s)
    s = re.sub(r"(?s)>\s+", ">", s)
    s = re.sub(r"\r", "", s)
    return s

def build_test_suite(package_name,module_names,required=1):
    """
    Utility for building a test suite from a package name
    and a list of modules.
    If required is false, then ImportErrors will simply result
    in that module's tests not being added to the returned
    suite.
    """
    suite = TestSuite()
    try:
        for name in module_names:
            the_name = package_name+'.'+name
            __import__(the_name,globals(),locals())
            suite.addTest(modules[the_name].test_suite())
    except ImportError:
        if required:
            raise
    return suite

PREFIX = abspath(dirname(__file__))

def input_file_path(file):
    return join(PREFIX, 'input', file)

def output_file_path(file):
    return join(PREFIX, 'output', file)

def matching_inputs(pattern):
    return [basename(path) for path in glob.glob(join(PREFIX, "input", pattern))]

def load(dotted_name, globals=None):
    """ load a python module from its name """
    mod = __import__(dotted_name, globals)
    components = dotted_name.split('.')
    for comp in components[1:]:
        mod = getattr(mod, comp)
    return mod
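As a quick illustration of two helpers from this module, the sketch below copies `normalize_html` and `load` so it can run on its own under a current Python 3 interpreter, then applies them to made-up inputs. The sample strings and the choice of `os.path` as the module to load are assumptions for the example, not taken from the test data.

```python
import re


def normalize_html(s):
    # Collapse whitespace, then drop it around tag boundaries.
    s = re.sub(r"\s+", " ", s)
    s = re.sub(r"(?s)\s+<", "<", s)
    s = re.sub(r"(?s)>\s+", ">", s)
    s = re.sub(r"\r", "", s)
    return s


def load(dotted_name, globals=None):
    # __import__ returns the top-level package, so walk down to the leaf.
    mod = __import__(dotted_name, globals)
    for comp in dotted_name.split('.')[1:]:
        mod = getattr(mod, comp)
    return mod


# Two renderings that differ only in layout compare equal once normalized.
pretty = "<p>\n  some   text </p>\n"
compact = "<p>some text</p>"
print(normalize_html(pretty) == normalize_html(compact))

# load() resolves a dotted name to the submodule itself.
path_module = load("os.path")
print(path_module.basename("/tmp/word.doc"))
```

This is why the expected-output comparison in the transform tests can tolerate whitespace differences between a converter's output and the stored reference file when `normalize_html` is supplied.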
from rigging import transformer
import os
from stat import ST_MTIME

## BIG BAD FUNCTIONAL TEST OF OOo Word Conversion
## The interfaces work, but are not quite what we need
## I might have to back fill a chain from source/dest graphing
file = "/tmp/word.doc"

class curry:
    def __init__(self, func, *fixed_args):
        self.func = func
        self.fixed_args = fixed_args

    def __call__(self, *variable_args):
        return apply(self.func, self.fixed_args +
                     variable_args)

data = open("/tmp/word.doc", "r").read()
data = transformer.convert("WordToHtml", data, filename="word.doc")
print data.getData()
This diff is collapsed.
from Products.PortalTransforms.interfaces import itransform
from Products.PortalTransforms.utils import log

WARNING=100

class BrokenTransform:
    __implements__ = itransform
    __name__ = "broken transform"
    inputs = ("BROKEN",)
    output = "BROKEN"

    def __init__(self, id, module, error):
        self.id = id
        self.module = module
        self.error = error

    def name(self):
        return self.__name__

    def convert(self, orig, data, **kwargs):
        # do the format
        msg = "Calling convert on BROKEN transform %s (%s). Error: %s" % \
              (self.id, self.module, self.error)
        log(msg, severity=WARNING)
        print msg
        data.setData('')
        return data

def register():
    return broken()