web_renderjs_ui: use lxml to extract data-i18n messages

The previous regular-expression-based approach sometimes failed to extract
messages properly. Using an XML parser simplifies the code and fixes several
messages that were not extracted properly, such as messages containing ",
[] or {}.
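
As a rough illustration of the parser-based idea (not the actual code of
this change), the sketch below collects `data-i18n` attribute values with
lxml. The `extract_messages` helper, the `SAMPLE` markup and the
`NAIVE_PATTERN` regex are all hypothetical, made up for this example; the
regex is not the one that was removed.

```python
import re

from lxml import html

# Hypothetical gadget markup, invented for this illustration.
SAMPLE = """
<html>
  <body>
    <h1 data-i18n="Say &quot;hi&quot;">Say "hi"</h1>
    <p data-i18n="Items [1] and {2}">Items [1] and {2}</p>
    <a data-i18n='Total: {count}'>Total: {count}</a>
  </body>
</html>
"""

# Hypothetical naive pattern: only handles double-quoted attributes
# and does not decode HTML entities.
NAIVE_PATTERN = re.compile(r'data-i18n="([^"]*)"')


def extract_messages(source):
    """Return the data-i18n attribute values found in source, in document order."""
    root = html.fromstring(source)
    # The parser handles quoting and entity decoding, so each value
    # comes back as the message text itself.
    return [str(value) for value in root.xpath("//*[@data-i18n]/@data-i18n")]


if __name__ == "__main__":
    print(extract_messages(SAMPLE))
    print(NAIVE_PATTERN.findall(SAMPLE))
```

With this sample, the parser returns the three decoded messages, while the
naive pattern misses the single-quoted attribute and leaves `&quot;`
undecoded in the first one.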

This also fixes some problems when looking for message sources:
  - archived web pages were sometimes used instead of published ones
  - messages from gadgets implemented as page templates/OFS files were not
    extracted.

A few more unit tests for the scripts involved in this process are added.