Project: alecs_myu / erp5
Commit 30341500
Authored Jun 29, 2017 by francois; committed by Ayush Tiwari on Jul 07, 2017

erp5_receipt_recognition Update bt5 following merge request advices

Parent: a99e746b

Showing 17 changed files with 108 additions and 59 deletions
Changed files:
bt5/erp5_receipt_recognition/ActionTemplateItem/portal_types/Receipt%20Recognition/receipt_convert.xml  +1 -1
bt5/erp5_receipt_recognition/ActionTemplateItem/portal_types/Receipt%20Recognition/view.xml  +1 -1
bt5/erp5_receipt_recognition/ExtensionTemplateItem/portal_components/extension.erp5.ReceiptRecognition.py  +91 -39
bt5/erp5_receipt_recognition/ExtensionTemplateItem/portal_components/extension.erp5.ReceiptRecognition.xml  +1 -4
bt5/erp5_receipt_recognition/PortalTypeAllowedContentTypeTemplateItem/allowed_content_types.xml  +1 -1
bt5/erp5_receipt_recognition/PortalTypePropertySheetTemplateItem/property_sheet_list.xml  +1 -1
bt5/erp5_receipt_recognition/PortalTypeTemplateItem/portal_types/Receipt%20Recognition.xml  +1 -1
bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptRecognition_convertImage.py  +4 -4
bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptRecognition_convertImage.xml  +1 -1
bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptRecognition_view.xml  +1 -1
bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptRecognition_view/my_follow_up_title.xml  +0 -0
bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptRecognition_view/my_title.xml  +0 -0
bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptRecognition_view/my_total.xml  +0 -0
bt5/erp5_receipt_recognition/bt/template_action_path_list  +2 -2
bt5/erp5_receipt_recognition/bt/template_portal_type_allowed_content_type_list  +1 -1
bt5/erp5_receipt_recognition/bt/template_portal_type_id_list  +1 -1
bt5/erp5_receipt_recognition/bt/template_portal_type_property_sheet_list  +1 -1
bt5/erp5_receipt_recognition/ActionTemplateItem/portal_types/Receipt/receipt_convert.xml → bt5/erp5_receipt_recognition/ActionTemplateItem/portal_types/Receipt%20Recognition/receipt_convert.xml
...
@@ -77,7 +77,7 @@
       <dictionary>
         <item>
             <key> <string>text</string> </key>
-            <value> <string>string:${object_url}/ReceiptConversion_convertImage</string> </value>
+            <value> <string>string:${object_url}/ReceiptRecognition_convertImage</string> </value>
         </item>
       </dictionary>
     </pickle>
...
...
bt5/erp5_receipt_recognition/ActionTemplateItem/portal_types/Receipt/view.xml → bt5/erp5_receipt_recognition/ActionTemplateItem/portal_types/Receipt%20Recognition/view.xml
...
@@ -77,7 +77,7 @@
       <dictionary>
         <item>
             <key> <string>text</string> </key>
-            <value> <string>string:${object_url}/Receipt_view</string> </value>
+            <value> <string>string:${object_url}/ReceiptRecognition_view</string> </value>
         </item>
       </dictionary>
     </pickle>
...
...
bt5/erp5_receipt_recognition/ExtensionTemplateItem/portal_components/extension.erp5.ReceiptRecognition.py
...
@@ -6,6 +6,8 @@ to work inside erp5 and adapt to receipt binaries and with more
 explanation
 https://github.com/tmbdev/ocropy
 """
+# pylint: disable=unpacking-non-sequence
+# Pylint is confused by ocropy.
 import numpy as np
 import scipy.ndimage as ndi
...
...
@@ -14,9 +16,10 @@ from matplotlib import pylab
 import matplotlib.image as mpimg
 import scipy.stats as stats
 import re
+import cPickle
 import ocrolib

-def getReceiptValue(self, image_data):
+def getReceiptValue(self, image_data, model_name = "en-default.pyrnn"):
   """
   Function called from an erp5 script through externalMethod
   that take an image and its name and save its binarized
...
...
@@ -27,20 +30,46 @@ def getReceiptValue(self, image_data):
     Represent the erp5 object from which externalmethods or module
     objects can be called
   - image_data:
-    base64 representation of the image to analyse
+    Representation of the image to analyse
   @return:
-  - ret: float
+  - anon: float
     Represent total value paid on the receipt
   ----------------------------
   This function return the total value of the receipt in euros.
   This function look for euros only and return a price with a two digit
   precision like "135.79" or "43,89".
   """
   image_as_string = StringIO.StringIO(image_data)
   image_as_array = mpimg.imread(image_as_string, format = 'JPG')
   line_list, cleared = getLinesFromPicture(image_as_array)
   # Start the neural network
-  network, lnorm = initRnnModel()
+  network, lnorm = initRnnModel(model_name)
+  return findReceiptValue(line_list, cleared, network, lnorm)
+
+def findReceiptValue(line_list, cleared, network, lnorm):
+  """
+  Function that run the neural network through the receipt and extract
+  meaningful value
+  -----------------------------
+  @args:
+  - lines: array list
+    Represent lines of text that will be extracted
+    from the image
+  - cleared:2D array
+    Represent binarized image cropped and cleaned,
+    from which we will extract text lines
+  - network: lstm object
+    Represent the trained neural net
+  - lnorm: method from lstm object
+    Represent the size of the lstm object. Is used to scale the objects
+    to recognize from original size to the average network object.
+  @return:
+  - anon: float
+    Represent total value paid on the receipt
+  -----------------------------
+  This function can be modified to add more field to detect. It might be
+  possible to run a classification neural net on the result.
+  """
   value_list = []
   tofind = r"(EUR)|€|(TOT)"
   for _, line in enumerate(line_list):
...
...
@@ -48,15 +77,34 @@ def getReceiptValue(self, image_data):
     # Corner case: the dewarping function from the normalizer fail
     # sometimes on empty lines. Can be corrected with better segmentation
     try:
-      evaluate = getStringFromImage(binline, lnorm, network)
+      evaluate = getStringFromImage(binline, lnorm, network)
       if re.search(tofind, evaluate.upper()):
         number = re.findall(r"\d+[\.|,]\d\d", evaluate)
         value_list += [float(char.replace(',', '.')) for char in number]
     except ValueError:
       pass
   return round(max(value_list), 2)

+def getRnnModelFromDataStream(self, model_name = "en-default.pyrnn"):
+  """
+  This function load a neural network from a dataStream
+  ----------------------------
+  @args:
+  - model_name: string, default: en-default.pyrnn
+    Id of the object in data_stream_module that contain the rnn model
+  @return:
+  - network: lstm object
+    Represent the trained neural net
+  - lnorm: method from lstm object
+    Represent the size of the lstm object. Is used to scale the objects
+    to recognize from original size to the average network object.
+  ----------------------------
+  WARNING: This function present a security issue and should NOT be called with
+  an user-defined model name (see cpickle security issue)
+  """
+  network = cPickle.loads(self.data_stream_module[model_name].getData())
+  lnorm = getattr(network, "lnorm", None)
+  return network, lnorm
+
 def initRnnModel(model_name = "en-default.pyrnn"):
   """
...
...
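The value-extraction step in findReceiptValue can be tried in isolation. The following is a minimal standalone sketch, not the committed code: `extract_total` is a hypothetical helper name, it takes already-recognized text lines instead of calling the OCR network, and it targets Python 3. It also assumes a tightened character class `[.,]`, because inside brackets the `|` of the committed pattern `[\.|,]` matches a literal pipe character. Like the function in the diff, it raises ValueError when no amount is found (max() of an empty list), which appears to be what the calling script's `except ValueError` relies on.

```python
import re

# Keywords that mark a line as carrying a total (same pattern as in the diff).
TOTAL_MARKER = r"(EUR)|€|(TOT)"
# Digits, a '.' or ',' separator, then exactly two decimals.
# Assumption: "[.,]" replaces the committed "[\.|,]", which also accepts '|'.
AMOUNT = r"\d+[.,]\d\d"


def extract_total(text_lines):
    """Return the largest two-decimal amount found on marker lines.

    Hypothetical helper: takes plain strings instead of OCR'd line images.
    Raises ValueError when no amount is found (max() of an empty list).
    """
    value_list = []
    for text in text_lines:
        if re.search(TOTAL_MARKER, text.upper()):
            for amount in re.findall(AMOUNT, text):
                value_list.append(float(amount.replace(",", ".")))
    return round(max(value_list), 2)


lines = ["Bread 2,50", "Total EUR 43,89", "VAT 5.5% 2,29 EUR"]
print(extract_total(lines))
```

Taking the maximum is a heuristic: any larger amount on a line that also mentions EUR, such as the cash handed over, would win over the real total.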
@@ -65,7 +113,7 @@ def initRnnModel(model_name = "en-default.pyrnn"):
   ----------------------------
   @args:
   - model_name: string, default: en-default.pyrnn
-    Id of the object in data_stream_module that contain the rnn model
+    Id of the object in the filesystem that contain the rnn model
   @return:
   - network: lstm object
     Represent the trained neural net
...
...
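The WARNING in the getRnnModelFromDataStream docstring concerns unpickling: loading a pickle can execute arbitrary code, so the model id must not come from user input. One common mitigation is to check the id against a fixed allow-list before any data is read. This is a minimal Python 3 sketch under stated assumptions: a plain dict stands in for `data_stream_module`, and `ALLOWED_MODELS` and `load_model` are hypothetical names that do not appear in the commit.

```python
import pickle

# Hypothetical allow-list: only ids vetted by an administrator.
ALLOWED_MODELS = frozenset(["en-default.pyrnn"])


def load_model(store, model_name="en-default.pyrnn"):
    """Unpickle a model only if its id is on the allow-list.

    `store` is any mapping from id to pickled bytes; it stands in for
    data_stream_module, whose real API is not reproduced here.
    """
    if model_name not in ALLOWED_MODELS:
        raise ValueError("model %r is not allowed" % (model_name,))
    network = pickle.loads(store[model_name])
    # Mirror the diff: lnorm is an optional attribute of the network.
    lnorm = getattr(network, "lnorm", None)
    return network, lnorm


store = {"en-default.pyrnn": pickle.dumps({"weights": [0.1, 0.2]})}
network, lnorm = load_model(store)
print(network, lnorm)
```

The check only narrows which stored objects can be loaded. It does not make the pickled data itself trustworthy, so write access to the store still has to be restricted.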
@@ -108,8 +156,8 @@ def getLinesFromPicture(image_as_array):
     independent picture
   """
   grey_image = convertGreyscale(image_as_array)
-  flattened_image = imageTransformation(grey_image)
-  binarized_image = imageBinarization(flattened_image)
+  cropped_image = cropImage(grey_image)
+  binarized_image = imageBinarization(cropped_image)
   binary = 1 - binarized_image
   cleaned, scale = removeObjects(binary)
   angle = getEstimatedSkewAngle(cleaned, np.linspace(-4, 4, 24))
...
...
@@ -293,26 +341,6 @@ def removeObjects(binarized):
   binarized = np.minimum(binarized, 1 - (sums > 0) * (sums < scale))
   return binarized, scale

-def getImageWhitelevel(image):
-  """
-  Function that help flatten the image by estimating locals
-  whitelevels. This remove local extremes and give an image with
-  homogenous background and no details
-  ------------------------------
-  @args:
-  - image: 2D array
-    Represent a greyscale image
-  @return:
-  - white_image: 2D array
-    Represent a greyscale image with no local extreme
-  ------------------------------
-  This function result will be subtracted from the original image
-  to make that only local extremes stand out.
-  """
-  white_image = ndi.filters.percentile_filter(image, 50, size = (80, 2))
-  white_image = ndi.filters.percentile_filter(white_image, 50, size = (2, 80))
-  return white_image
-
 def getEstimatedSkewAngle(image, angle_list):
   """
   Function that estimate at which angle the image is the most
...
@@ -343,8 +371,37 @@ def getEstimatedSkewAngle(image, angle_list):
   _, angle = max(estimates)
   return angle

+def removeBackground(image, percentile = 50):
+  """
+  Function that help flatten the image by estimating locals
+  whitelevels. This remove local extremes and give an image with
+  homogenous background and no details
+  ------------------------------
+  @args:
+  - image: 2D array
+    Represent a greyscale image
+  - percentile: integer between -100 and 100
+    A percentile filter with a value of 50 is basically a
+    median filter, value of 0 is a minimum filter and with
+    a value of 100 a maximum filter
+  @return:
+  - 2D array
+    Represent a greyscale image with no local extreme
+  ------------------------------
+  The filter result will be subtracted from the original image
+  to make that only local extremes stand out.
+  A Kuwahara filter might give better results.
+  """
+  # Reduce extreme differences in the greyscale image
+  image = image - pylab.amin(image)
+  image /= pylab.amax(image)
+  white_image = ndi.filters.percentile_filter(image, percentile, size = (80, 2))
+  white_image = ndi.filters.percentile_filter(white_image, percentile, size = (2, 80))
+  # Get the difference between the whiteleveled image and the
+  # original one and put them between 0 and 1
+  return np.clip(image - white_image + 1, 0, 1)

-def imageTransformation(grey):
+def cropImage(image):
   """
   Function that perform cropping and flattening -- Removing
   homogenous background and small extremes-- on an image.
...
...
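The flattening idea in removeBackground (normalize, estimate the local background with a percentile filter, then keep only the deviation from it) can be illustrated on a single row of pixels. This is a simplified pure-Python 3 sketch, not the committed implementation: `percentile_filter_1d` and `remove_background_1d` are hypothetical names, the filter is one-dimensional with a small window, and its window is simply clipped at the edges, which differs from the boundary handling of `scipy.ndimage`.

```python
def percentile_filter_1d(values, percentile, size):
    """Sliding-window percentile of a list; the window is clipped at the edges."""
    half = size // 2
    filtered = []
    for i in range(len(values)):
        window = sorted(values[max(0, i - half):i + half + 1])
        # Floor of the fractional rank: 0 -> minimum, 50 -> (lower) median,
        # 100 -> maximum.
        rank = (len(window) - 1) * percentile // 100
        filtered.append(window[rank])
    return filtered


def remove_background_1d(values, percentile=50, size=5):
    """Normalize to [0, 1], estimate the background, keep only the deviations."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("flat input: nothing to normalize")
    normalized = [(v - low) / (high - low) for v in values]
    background = percentile_filter_1d(normalized, percentile, size)
    # Background pixels land near 1 (white); dark outliers stay near 0.
    return [min(1.0, max(0.0, n - b + 1.0))
            for n, b in zip(normalized, background)]


row = [200, 210, 205, 40, 200, 208, 202]  # bright paper with one dark pixel
flat = remove_background_1d(row)
print([round(v, 2) for v in flat])
```

With a median (percentile 50) the isolated dark pixel does not influence the background estimate, so it is the only value that stays far from 1 after the subtraction.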
@@ -360,22 +417,17 @@ def imageTransformation(grey):
   homogenous background
   """
   # Reduce extreme differences in the greyscale image
-  image = grey - pylab.amin(grey)
-  image /= pylab.amax(image)
-  white_image = getImageWhitelevel(image)
+  white_image = removeBackground(image)

-  # Get the difference between the whiteleveled image and the
-  # original one and put them between 0 and 1
-  flat = np.clip(image - white_image + 1, 0, 1)
   # Calculate coordinate to crop the image, can be done in another
   # function to improve readability
   mask = ndi.gaussian_filter(
-    flat, 7.0) < 0.9 * np.amax(flat)
+    white_image, 7.0) < 0.9 * np.amax(white_image)
   coords = np.argwhere(mask)
   # Bounding box of kept pixels.
   x_min, y_min = coords.min(axis = 0)
   x_max, y_max = coords.max(axis = 0)
-  return flat[x_min - 10 : x_max + 10, y_min - 10 : y_max + 10]
+  return white_image[x_min - 10 : x_max + 10, y_min - 10 : y_max + 10]

 def imageBinarization(flattened_image):
...
...
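The crop at the end of cropImage pads the bounding box by 10 pixels on each side. When the box lies within 10 pixels of the top or left edge, `x_min - 10` becomes negative, and a negative slice start counts from the end of the axis, which typically yields an empty or wrong crop. Below is a minimal sketch of the same bounding-box crop with the lower bounds clamped to zero. It is written in pure Python 3 on nested lists rather than NumPy arrays, and `crop_to_mask` and `margin` are hypothetical names not taken from the commit.

```python
def crop_to_mask(image, mask, margin=10):
    """Crop `image` to the bounding box of True cells in `mask`, plus a margin.

    `image` and `mask` are equally sized lists of rows. Lower bounds are
    clamped to 0 so a negative start never wraps around to the end of the axis.
    """
    coords = [(r, c) for r, row in enumerate(mask)
              for c, keep in enumerate(row) if keep]
    if not coords:
        raise ValueError("mask selects no pixels")
    row_min = max(0, min(r for r, _ in coords) - margin)
    col_min = max(0, min(c for _, c in coords) - margin)
    # "+ 1" keeps the last selected row/column even when margin is 0.
    row_max = max(r for r, _ in coords) + margin + 1
    col_max = max(c for _, c in coords) + margin + 1
    # Upper bounds past the edge are harmless: slicing stops at the end.
    return [row[col_min:col_max] for row in image[row_min:row_max]]


image = [[r * 10 + c for c in range(6)] for r in range(5)]
mask = [[False] * 6 for _ in range(5)]
mask[1][2] = True
mask[2][3] = True
crop = crop_to_mask(image, mask, margin=1)
print(crop)
```

The empty-mask check makes the failure explicit; in the diff, an all-False mask would instead fail inside `coords.min(axis = 0)`.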
bt5/erp5_receipt_recognition/ExtensionTemplateItem/portal_components/extension.erp5.ReceiptRecognition.xml
...
@@ -45,10 +45,7 @@
         <item>
             <key> <string>text_content_warning_message</string> </key>
             <value>
-              <tuple>
-                <string>W:243, 2: Attempting to unpack a non-sequence defined at line 181 of scipy.ndimage.measurements (unpacking-non-sequence)</string>
-                <string>W:272, 2: Attempting to unpack a non-sequence defined at line 181 of scipy.ndimage.measurements (unpacking-non-sequence)</string>
-              </tuple>
+              <tuple/>
             </value>
         </item>
         <item>
...
...
bt5/erp5_receipt_recognition/PortalTypeAllowedContentTypeTemplateItem/allowed_content_types.xml
 <allowed_content_type_list>
  <portal_type id="Receipt Recognition Module">
-  <item>Receipt</item>
+  <item>Receipt Recognition</item>
  </portal_type>
 </allowed_content_type_list>
\ No newline at end of file
bt5/erp5_receipt_recognition/PortalTypePropertySheetTemplateItem/property_sheet_list.xml
 <property_sheet_list>
- <portal_type id="Receipt">
+ <portal_type id="Receipt Recognition">
   <item>Document</item>
  </portal_type>
 </property_sheet_list>
\ No newline at end of file
bt5/erp5_receipt_recognition/PortalTypeTemplateItem/portal_types/Receipt.xml → bt5/erp5_receipt_recognition/PortalTypeTemplateItem/portal_types/Receipt%20Recognition.xml
...
@@ -28,7 +28,7 @@
         </item>
         <item>
             <key> <string>id</string> </key>
-            <value> <string>Receipt</string> </value>
+            <value> <string>Receipt Recognition</string> </value>
         </item>
         <item>
             <key> <string>init_script</string> </key>
...
...
bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptConversion_convertImage.py → bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptRecognition_convertImage.py
...
@@ -2,17 +2,17 @@ image = context.getFollowUpValue()
 if image is not None:
   try:
     total = container.ReceiptRecognition_getReceiptValue(image.getData())
-    msg = "Total found"
+    message = "Total found"
     context.edit(
       total = total,
     )
   except ValueError as e:
-    msg = "Could not find value, please submit it manually"
+    message = "Could not find value, please submit it manually"
 else:
-  msg = "Cannot find the image"
+  message = "Cannot find the image"
 if batch_mode:
   return
 context.Base_redirect(
-  'view', keep_items = dict(portal_status_message = msg, my_source = "test"))
+  'view', keep_items = dict(portal_status_message = message, my_source = "test"))
bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptConversion_convertImage.xml → bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptRecognition_convertImage.xml
...
@@ -54,7 +54,7 @@
         </item>
         <item>
             <key> <string>id</string> </key>
-            <value> <string>ReceiptConversion_convertImage</string> </value>
+            <value> <string>ReceiptRecognition_convertImage</string> </value>
         </item>
       </dictionary>
     </pickle>
...
...
bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/Receipt_view.xml → bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptRecognition_view.xml
...
@@ -89,7 +89,7 @@
         </item>
         <item>
             <key> <string>id</string> </key>
-            <value> <string>Receipt_view</string> </value>
+            <value> <string>ReceiptRecognition_view</string> </value>
         </item>
         <item>
             <key> <string>method</string> </key>
...
...
bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/Receipt_view/my_follow_up_title.xml → bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptRecognition_view/my_follow_up_title.xml
File moved
bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/Receipt_view/my_title.xml → bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptRecognition_view/my_title.xml
File moved
bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/Receipt_view/my_total.xml → bt5/erp5_receipt_recognition/SkinTemplateItem/portal_skins/erp5_receipt_recognition/ReceiptRecognition_view/my_total.xml
File moved
bt5/erp5_receipt_recognition/bt/template_action_path_list
 Receipt Recognition Module | view
-Receipt | receipt_convert
-Receipt | view
\ No newline at end of file
+Receipt Recognition | receipt_convert
+Receipt Recognition | view
\ No newline at end of file
bt5/erp5_receipt_recognition/bt/template_portal_type_allowed_content_type_list
-Receipt Recognition Module | Receipt
\ No newline at end of file
+Receipt Recognition Module | Receipt Recognition
\ No newline at end of file
bt5/erp5_receipt_recognition/bt/template_portal_type_id_list
-Receipt
+Receipt Recognition
 Receipt Recognition Module
\ No newline at end of file
bt5/erp5_receipt_recognition/bt/template_portal_type_property_sheet_list
-Receipt | Document
\ No newline at end of file
+Receipt Recognition | Document
\ No newline at end of file