Utility that calls wget and varnishlog to extract HTTP headers and reports every
response that does not comply with the expected caching policy.

This utility is configured through a configuration file such as:

[web_checker]
url = http://www.example.com/
working_directory = /home/me/tmp/crawled_content
varnishlog_binary_path = varnishlog
email_address = me@example.com
smtp_host = localhost
debug_level = debug

[header_list]
Last-Modified = True
Cache-Control = max-age=300
                max-age=3600
Vary = Accept-Language, Cookie, Accept-Encoding
       Accept-Language, Cookie
       Accept-Language,Cookie,Accept-Encoding
       Accept-Language,Cookie
Expires = True

[header url=.*/sitemap]
Last-Modified = True

[header content-type=.*/javascript]
Last-Modified = True
Cache-Control = max-age=3600
Expires = True

[no_header content-type=(image/.*|text/css)]
Vary = None

[erp5_extension_list]
prohibited_file_name_list = WebSection_viewAsWeb
                            Base_viewHistory
                            list
prohibited_folder_name_list = web_page_module
                              document_module



with
  url : website to check
  working_directory : directory where fetched data will be downloaded
  varnishlog_binary_path : path to the varnishlog binary
  email_address : email address to send the results to
  smtp_host : SMTP host to use
  debug_level : log level of this utility (debug => very verbose,
                                           info => normal,
                                           warning => nothing)

  header_list : key == header name.
                value: if it equals True, the header must be present in the response;
                       if it is a tuple (several values, one per line), the header value
                       must satisfy at least one of the proposed values

  erp5_extension_list: optional section.
    prohibited_file_name_list: checks that no link redirects to a prohibited form
      such as WebSection_viewAsWeb, Base_viewHistory, list, ...
    prohibited_folder_name_list: useful to check that links do not redirect to the
      specified modules, such as web_page_module, document_module, ...
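
The rules described above can be illustrated with a short Python sketch. It is
not the code of web_checker: the function names (load_config, allowed_values,
header_failures, is_prohibited) are hypothetical, and exact string comparison
of header values and of URL path segments is an assumption made for the example.

```python
import configparser
from urllib.parse import urlsplit

# Excerpt of the configuration shown earlier in this README.
SAMPLE = """\
[header_list]
Last-Modified = True
Cache-Control = max-age=300
                max-age=3600

[erp5_extension_list]
prohibited_file_name_list = WebSection_viewAsWeb
                            Base_viewHistory
                            list
prohibited_folder_name_list = web_page_module
                              document_module
"""


def load_config(text):
    """Parse the configuration text, keeping header names as written."""
    parser = configparser.RawConfigParser()
    parser.optionxform = str  # do not lowercase option names
    parser.read_string(text)
    return parser


def allowed_values(raw):
    """Return None for 'True' (presence only), else the accepted values."""
    values = [line.strip() for line in raw.splitlines() if line.strip()]
    return None if values == ["True"] else values


def header_failures(policy, response_headers):
    """List (header, reason) pairs for headers that break the policy."""
    # Header names are case-insensitive, so compare them lowercased.
    lowered = {k.lower(): v for k, v in response_headers.items()}
    failures = []
    for name, raw in policy.items():
        value = lowered.get(name.lower())
        accepted = allowed_values(raw)
        if value is None:
            failures.append((name, "missing"))
        elif accepted is not None and value not in accepted:
            # Assumption: a value "satisfies" a proposal when it is equal to it.
            failures.append((name, "unexpected value: " + value))
    return failures


def is_prohibited(url, file_names, folder_names):
    """True if the URL path ends with a prohibited file name or
    contains a prohibited folder name (assumed matching rule)."""
    parts = [p for p in urlsplit(url).path.split("/") if p]
    if not parts:
        return False
    return parts[-1] in file_names or any(p in folder_names for p in parts)
```

For example, header_failures(dict(load_config(SAMPLE)["header_list"]), headers)
returns an empty list when every configured header is present in headers with
an accepted value.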


This utility requires wget >= 1.12
and a callable varnishlog.
The utility must be run on the same server where Varnish is running.
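
The wget requirement can be verified before a run. The following Python sketch
is only an illustration and is not part of web_checker: the function names are
hypothetical, and it assumes that the first line of "wget --version" starts
with "GNU Wget <version>", as GNU Wget 1.x prints.

```python
import re
import subprocess


def parse_wget_version(banner):
    """Extract the version as a tuple of ints from `wget --version` output."""
    match = re.search(r"GNU Wget (\d+(?:\.\d+)*)", banner)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def wget_is_recent_enough(binary="wget", minimum=(1, 12)):
    """Return True if the given wget binary reports at least `minimum`."""
    try:
        result = subprocess.run(
            [binary, "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # Binary missing, not executable, or exited with an error.
        return False
    version = parse_wget_version(result.stdout)
    return version is not None and version >= minimum
```

Tuples are compared element by element, so (1, 21, 3) passes the check while
(1, 11, 4) does not.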

web_checker reads the varnishlog output to detect whether a query goes to the backend.