web_renderjs_ui: use lxml to extract data-i18n messages

The previous regular-expression-based approach sometimes failed to extract
messages properly. Using an XML parser simplifies the code and fixes several
messages that were not extracted properly, such as messages containing ", []
or {}.
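
As a rough illustration of the parser-based idea (not the actual script in
this change), the sketch below collects data-i18n attribute values from
well-formed markup. The function name extract_i18n_messages and the sample
markup are hypothetical. It uses lxml when available and otherwise falls back
to the standard library's ElementTree, which offers the same fromstring/iter/get
calls used here.

```python
try:
    from lxml import etree  # the parser named in this commit
except ImportError:  # assumption: fall back to the stdlib so the sketch still runs
    import xml.etree.ElementTree as etree


def extract_i18n_messages(markup):
    """Return the data-i18n attribute values found in markup, in document order.

    Hypothetical helper for illustration; expects well-formed XML/XHTML.
    """
    root = etree.fromstring(markup)
    messages = []
    # iter("*") walks the root and every descendant element.
    for element in root.iter("*"):
        value = element.get("data-i18n")
        if value:
            messages.append(value)
    return messages


# Hypothetical sample with the characters mentioned above: ", [] and {}.
SAMPLE = """<div>
  <h1 data-i18n="Say &quot;hello&quot;">Say "hello"</h1>
  <p data-i18n="Items [1] of {total}">Items</p>
  <span>not translated</span>
</div>"""

if __name__ == "__main__":
    for message in extract_i18n_messages(SAMPLE):
        print(message)
```

Because the parser decodes attribute values itself, the escaped quotes come
back as plain " characters and brackets or braces need no special handling.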

This also fixes some problems when looking for message sources:
  - archived web pages were sometimes used instead of published ones
  - messages from gadgets implemented as page templates/OFS files were not
    extracted.

A few more unit tests for the scripts involved in this process are added.