Utility that calls wget and varnishlog to extract HTTP headers and reports every
response that does not follow the expected caching policy.

This utility is configurable through a configuration file like::

  [web_checker]
  url = http://www.example.com/
  working_directory = /home/me/tmp/crawled_content
  varnishlog_binary_path = varnishlog
  email_address = me@example.com
  smtp_host = localhost
  debug_level = debug

  [header_list]
  Last-Modified = True
  Cache-Control = max-age=300
                  max-age=3600
  Vary = Accept-Language, Cookie, Accept-Encoding
         Accept-Language, Cookie
         Accept-Language,Cookie,Accept-Encoding
         Accept-Language,Cookie
  Expires = True

  [header url=.*/sitemap]
  Last-Modified = True

  [header content-type=.*/javascript]
  Last-Modified = True
  Cache-Control = max-age=3600
  Expires = True

  [no_header content-type=(image/.*|text/css)]
  Vary = None

  [erp5_extension_list]
  prohibited_file_name_list = WebSection_viewAsWeb
                              Base_viewHistory
                              list
  prohibited_folder_name_list = web_page_module
                                document_module


with::

  url : website to check
  working_directory : directory where fetched data is stored
  varnishlog_binary_path : path to the varnishlog binary
  email_address : email address the results are sent to
  smtp_host : SMTP host used to send the email
  debug_level : log level of this utility (debug => very verbose,
                                           info => normal,
                                           warning => nothing)

  header_list : key == header id.
                value: if it equals True, the header must be present in the response;
                       if it is a tuple, the header value must satisfy at least one of the proposed values

  erp5_extension_list: Optional section.
    prohibited_file_name_list: checks that no link points to a prohibited form
      such as WebSection_viewAsWeb, Base_viewHistory, list, ...
    prohibited_folder_name_list: useful to detect links that point to the
      specified modules, such as web_page_module, document_module, ...
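
The header_list rules described above can be illustrated with a short Python
sketch. This is not web_checker's own code: the function names
load_header_policy and check_headers are hypothetical, the sketch assumes
Python 3's configparser, and it assumes that a header value must match one of
the listed values exactly (the real utility may normalise values differently).

```python
import configparser

# Excerpt of the [header_list] section from the sample configuration above.
SAMPLE = """
[header_list]
Last-Modified = True
Cache-Control = max-age=300
                max-age=3600
Vary = Accept-Language, Cookie, Accept-Encoding
       Accept-Language, Cookie
Expires = True
"""


def load_header_policy(text):
    """Return {header name: True, or a tuple of accepted values}."""
    parser = configparser.RawConfigParser()
    parser.optionxform = str  # keep header names as written (no lower-casing)
    parser.read_string(text)
    policy = {}
    for name, raw in parser.items("header_list"):
        # Indented continuation lines are joined with newlines by configparser.
        values = tuple(line.strip() for line in raw.splitlines() if line.strip())
        policy[name] = True if values == ("True",) else values
    return policy


def check_headers(policy, response_headers):
    """Return a list of failure messages for one HTTP response."""
    # HTTP header names are case-insensitive, so compare them in lower case.
    received = {k.lower(): v.strip() for k, v in response_headers.items()}
    failures = []
    for name, expected in policy.items():
        actual = received.get(name.lower())
        if actual is None:
            failures.append("%s: header is missing" % name)
        elif expected is not True and actual not in expected:
            failures.append("%s: unexpected value %r" % (name, actual))
    return failures


if __name__ == "__main__":
    policy = load_header_policy(SAMPLE)
    print(check_headers(policy, {"Cache-Control": "no-cache"}))
```

An empty list returned by check_headers means the response satisfies every
rule; each entry otherwise names the header that is missing or has a value
outside the accepted set.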


This utility requires wget >= 1.12
and a callable varnishlog.
The utility must be run on the same server where Varnish is running.
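
These prerequisites can be verified before a run. The following Python sketch
is only an illustration and is not part of web_checker: the function names are
hypothetical, and it assumes that `wget --version` prints a banner starting
with "GNU Wget <major>.<minor>", as current GNU Wget releases do.

```python
import re
import shutil
import subprocess

MINIMUM_WGET = (1, 12)


def parse_wget_version(banner):
    """Extract (major, minor) from the output of `wget --version`."""
    match = re.search(r"GNU Wget (\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("unrecognised wget version banner: %r" % banner)
    return int(match.group(1)), int(match.group(2))


def check_prerequisites(varnishlog="varnishlog"):
    """Return a list of problems; an empty list means both tools look usable."""
    problems = []
    if shutil.which("wget") is None:
        problems.append("wget was not found on PATH")
    else:
        result = subprocess.run(
            ["wget", "--version"], capture_output=True, text=True
        )
        try:
            if parse_wget_version(result.stdout) < MINIMUM_WGET:
                problems.append("wget is older than 1.12")
        except ValueError as error:
            problems.append(str(error))
    # varnishlog may be a bare command name or a full path
    # (the varnishlog_binary_path option of the configuration file).
    if shutil.which(varnishlog) is None:
        problems.append("%s is not callable" % varnishlog)
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```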

web_checker reads the varnishlog output to detect whether a query goes to the backend.