bt5/erp5_receipt_recognition/bt/license · b7e7b88cdf324e432a4767be1062927c88b9be9f · francois / erp5

erp5_receipt_recognition: Execute OCR on receipt image to find total value · b7e7b88c

francois authored Mar 29, 2017

This commit contains the business template that takes a receipt image as a source,
binarizes and segments it, then applies OCR to it. It then extracts the meaning
with regular expressions. The image must already be loaded into the
image module before it can be read.
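
As an illustration of the regular-expression step, here is a minimal sketch of how a total could be pulled out of OCR text. The pattern and the names `TOTAL_RE` and `find_total` are assumptions made for this example, not the code shipped in the extension, and thousands separators are not handled.

```python
import re
from decimal import Decimal

# Hypothetical pattern: the standalone word "total", then anything that is
# not a digit or a newline (colon, spaces, currency sign), then an amount
# with an optional two-digit decimal part.
TOTAL_RE = re.compile(r"\btotal\b[^\d\n]*?(\d+(?:[.,]\d{2})?)", re.IGNORECASE)


def find_total(ocr_text):
    """Return the last amount that follows the word 'total', or None."""
    matches = TOTAL_RE.findall(ocr_text)
    if not matches:
        return None
    # Accept both "12.50" and "12,50" as decimal notation.
    return Decimal(matches[-1].replace(",", "."))


if __name__ == "__main__":
    sample = "Coffee 3.50\nSubtotal 10.00\nTOTAL: EUR 12,50\n"
    print(find_total(sample))
```

The word boundary in front of "total" keeps a "Subtotal" line from being mistaken for the total.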

The business template contains:
	* The receipt recognition module
	* An extension containing the code that binarizes, crops and
	  segments the image, then analyzes it.
	* A new type "Receipt" that contains a source image and the
	  field that holds the "total" value
	* A portal skin folder containing the extension externalMethods
	  as well as the conversion script that calls the recognition and
	  updates the Receipt "total" field
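
The binarization step mentioned above can be sketched as follows. The commit does not say which thresholding method the extension uses; Otsu's method is assumed here purely for illustration, and `otsu_threshold` and `binarize` are hypothetical names.

```python
def otsu_threshold(pixels):
    """Return the grey level (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor.
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image):
    """Map a greyscale image (rows of 0-255 ints) to rows of 0/1, 1 = ink."""
    t = otsu_threshold([p for row in image for p in row])
    return [[1 if p <= t else 0 for p in row] for row in image]


if __name__ == "__main__":
    print(binarize([[250, 245, 30], [20, 240, 25]]))
```

Dark pixels (text) come out as 1 and the light background as 0, which is the form a later segmentation step would typically work on.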

Improvements (not limited to this list):
	- Easier loading of picture: directly from the receipt page.
	- Easier loading of picture 2: from phone with OfficeJS
	  (or any renderJS) application?
	- Detect when images are sideways and rotate them upright
	- Better "boxing" and segmentation: some lines are deleted from
	  the original image during the segmentation when they are too
	  close to each other
	- Modify the neural network (LSTM) to increase the weight of signs
	  like $, euro, / and numbers
	- Use of a faster/smaller neural network: most of the time is
	  spent loading the neural network
	- Caching the neural network: See previous statement.
	- Extract currency, date and receipt issuer.
	- Use a neural network for the meaning extraction?


license 3 Bytes
