1. 07 Jul, 2017 5 commits
    • erp5_data_analysis_request Create ticket for document analysis · 7699acb7
      francois authored
      This business template (bt) is composed of 3 modules. Once an image is
      loaded into ERP5, one can create a new data analysis request ticket as a
      follow-up, which launches document conversion and recognition on the text that is read.
    • erp5_receipt_recognition_test: Add unit test for receipt recognition module · a99e746b
      francois authored
      This commit contains a testing business template for the receipt recognition module.
      It tests the "Receipt" type update as well as OCR success and failure on a set of poor
      
      This commit contains binary files that are test images.
    • erp5_receipt_recognition: Execute OCR on receipt image to find total value · 4ba30106
      francois authored
      This commit contains the business template that takes a receipt image as a source,
      binarizes then segments it, and applies OCR to it. It then extracts the meaning
      with regular expressions. The image must already be loaded in the
      image module before it can be read.
      
      The business template contains:
      	* The receipt recognition module
      	* An extension containing the code that binarizes, crops and
      	  segments the image, then analyzes it.
      	* A new type "Receipt" that contains a source image and the
      	  field that holds the "total" value
      	* A portal skin folder containing the extension externalMethods
      	  as well as the conversion script that calls the recognition and
      	  updates the Receipt "total" field
      
      Improvements (not limited to this list):
      	- Easier loading of pictures: directly from the receipt page.
      	- Easier loading of pictures 2: from a phone with an OfficeJS
      	  (or any renderJS) application?
      	- Detect when images are sideways and rotate them upright
      	- Better "boxing" and segmentation: some lines are deleted from
      	  the original image during segmentation when they are too
      	  close to others
      	- Modify the neural network (LSTM) to increase the weight of signs
      	  like $, euro, / and numbers
      	- Use a faster/smaller neural network: most of the time is
      	  spent loading the neural network
      	- Cache the neural network: see previous item.
      	- Extract currency, date and receipt issuer.
      	- Use a neural network for the meaning extraction?
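
The regular-expression step described in this commit message (finding the "total" value in the OCR output) could look roughly like the sketch below. It is not the code from the business template: the pattern, the function name `find_total`, and the sample text are all assumptions made for illustration.

```python
import re
from decimal import Decimal

# Hypothetical pattern: the standalone word "total", optional non-digit
# text on the same line (e.g. a currency code), then an amount such as
# 12.50 or 12,50. "Subtotal" is skipped because of the word boundary.
TOTAL_PATTERN = re.compile(
    r"\btotal\b[^\d\n]*(\d+(?:[.,]\d{2})?)",
    re.IGNORECASE,
)


def find_total(ocr_text):
    """Return the last amount following the word "total", or None."""
    matches = TOTAL_PATTERN.findall(ocr_text)
    if not matches:
        return None
    # Normalise a decimal comma to a decimal point before parsing.
    return Decimal(matches[-1].replace(",", "."))


if __name__ == "__main__":
    sample = "Coffee 2.50\nSubtotal 10.00\nTOTAL EUR 12,50\n"
    print(find_total(sample))
```

Taking the last match is a design guess: on many receipts the final "total" line is the amount actually paid, but real OCR output is noisier than this sample and would likely need a more tolerant pattern.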
    • DomainTool: Simplify ranged properties discovery · 559a14f9
      Vincent Pelletier authored
      Avoid iterating over all columns known to catalog to then restrict to a
      single table by using SQLCatalog API.
      Only check for one range column suffix as code anyway relies on the triplet
      of columns to be consistently present. Document this in the code and get
      rid of now-unneeded range_column_set mechanism.
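
The idea of checking a single range column suffix can be sketched as follows. This is an illustration only, not the DomainTool code and not the SQLCatalog API: the `_range_min` suffix, the function name, and the column names are assumptions.

```python
# Assumed suffix; the commit message does not name the actual suffixes.
RANGE_MIN_SUFFIX = "_range_min"


def find_ranged_properties(column_names):
    """Return the base names of columns that have a range-min companion.

    Only one suffix is checked, on the assumption that the other columns
    of each triplet (base and range-max) are always present alongside it.
    """
    return sorted(
        name[:-len(RANGE_MIN_SUFFIX)]
        for name in column_names
        if name.endswith(RANGE_MIN_SUFFIX)
    )


if __name__ == "__main__":
    # Hypothetical columns of a single table.
    columns = [
        "uid",
        "quantity", "quantity_range_min", "quantity_range_max",
        "price", "price_range_min", "price_range_max",
    ]
    print(find_ranged_properties(columns))
```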
2. 06 Jul, 2017 13 commits
3. 05 Jul, 2017 8 commits
4. 04 Jul, 2017 5 commits
5. 03 Jul, 2017 9 commits