Commit d743c185 authored by Vincent Pelletier

apachedex: Tolerate non-ascii URLs.

Otherwise, if `url` contains non-ascii chars, startswith will fail with
an error like:
  UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 84: ordinal not in range(128)
because 'http' is unicode. So byte-ify it to avoid this transcoding.
parent 1cff57c4
@@ -1556,7 +1556,7 @@ def main():
         no_url_lines += 1
         continue
       url = url_match.group('url')
-      if url.startswith('http'):
+      if url.startswith(b'http'):
         url = splithost(splittype(url)[1])[1]
       url = get_url_prefix(match, url)
       for site, prefix_match, action in site_list:
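
The failure mode described in the commit message can be reproduced in isolation. The sketch below is written for Python 3, where bytes and text are never mixed implicitly, so the ASCII decoding that Python 2 performed behind the scenes when a byte string met a unicode prefix is spelled out explicitly. The example URL and the helper name `implicit_coercion_startswith` are hypothetical and are not part of apachedex.

```python
# Sketch of the Python 2 failure mode, emulated under Python 3.
# Assumption: in Python 2, byte_str.startswith(u'http') first decoded
# byte_str as ASCII; here that implicit step is written out by hand.

# Hypothetical URL containing a UTF-8 encoded 'e acute' (bytes 0xc3 0xa9).
url = 'http://example.com/caf\u00e9'.encode('utf-8')


def implicit_coercion_startswith(raw, prefix):
    """Emulate the implicit coercion: decode the bytes as ASCII, then compare."""
    return raw.decode('ascii').startswith(prefix)


try:
    implicit_coercion_startswith(url, 'http')
    failed = False
except UnicodeDecodeError as exc:
    # Same error family as the one quoted in the commit message.
    failed = True
    print(exc)

# Comparing bytes with a bytes prefix involves no transcoding at all,
# which is what the b'http' change relies on.
is_http = url.startswith(b'http')
print(is_http)
```

Keeping the comparison on the bytes side leaves the log data untouched, rather than decoding every URL with a guessed encoding just to test a four-character prefix.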