Commit 1840bc00 authored by Jérome Perrin

knowledge_pad: ignore feedparser.NonXMLContentType

When the server replies with a non-XML content type, feedparser attempts
to parse the response anyway, but sets the bozo flag.

https://universal-feedparser.readthedocs.io/en/latest/#handling-incorrectly-declared-media-types

If parsing fails because the feed is not a valid RSS XML stream,
feedparser fails in other ways, so we can safely ignore this error.

This should repair the recent failures of testFeedReaderGadget now that
Le Monde has changed the configuration of their website:

     $ curl -sD -  -o /dev/null https://www.lemonde.fr/rss/une.xml | grep Content-Type
     Content-Type: text/html;charset=UTF-8

/reviewed-on !1027
parent 7e2725fa
Pipeline #7457 passed with stage
@@ -32,6 +32,7 @@ def getRssDataAsDict(context, url, username=None, password=None):
 # some bozo exceptions can be ignored
 if not isinstance(d.bozo_exception, (
   feedparser.CharacterEncodingOverride,
+  feedparser.NonXMLContentType,
 )):
   return {'status': -5}
 if d.status == 401:
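The filtering logic in this hunk can be sketched in isolation. This is a minimal illustration, not the code of getRssDataAsDict: the two exception classes below are local stand-ins named after `feedparser.CharacterEncodingOverride` and `feedparser.NonXMLContentType` so that the sketch runs without feedparser installed, and `bozo_status`, `IGNORABLE_BOZO_EXCEPTIONS` and the `SimpleNamespace` objects are hypothetical names used only to mimic a parse result carrying `bozo` and `bozo_exception`.

```python
from types import SimpleNamespace


# Stand-ins for feedparser.CharacterEncodingOverride and
# feedparser.NonXMLContentType (assumption: real code would use
# the classes exported by feedparser instead).
class CharacterEncodingOverride(Exception):
    pass


class NonXMLContentType(Exception):
    pass


# Bozo exceptions that only signal a sloppy server, not a broken feed.
IGNORABLE_BOZO_EXCEPTIONS = (CharacterEncodingOverride, NonXMLContentType)


def bozo_status(parsed):
    """Return -5 when the bozo error cannot be ignored, else None."""
    if getattr(parsed, "bozo", 0) and not isinstance(
        getattr(parsed, "bozo_exception", None), IGNORABLE_BOZO_EXCEPTIONS
    ):
        return -5
    return None


# A feed served as text/html but otherwise parseable: tolerated.
html_typed = SimpleNamespace(
    bozo=1, bozo_exception=NonXMLContentType("text/html is not an XML media type")
)
# A feed that failed for another reason: still reported as an error.
malformed = SimpleNamespace(bozo=1, bozo_exception=ValueError("not well-formed"))

print(bozo_status(html_typed))
print(bozo_status(malformed))
```

Keeping the tolerated exceptions in a single tuple means that accepting another harmless bozo condition is a one-line change, which is the shape of the change made in this commit.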