==============
Caddy Frontend
==============

Frontend system using Caddy, based on the apache-frontend software release, allowing to rewrite and proxy URLs like myinstance.myfrontenddomainname.com to the real IP/URL of myinstance.

Caddy Frontend works using the master instance / slave instance design: a single main instance of Caddy acts as a frontend for many slaves.

This documentation covers only specific scenarios. Most of the parameters are described in `software.cfg.json <software.cfg.json>`_.

Software type
=============

Caddy frontend is available in 4 software types:

  * ``default``: the standard way to use the Caddy frontend, configuring everything with a few given parameters
  * ``custom-personal``: this software type allows each slave to edit its Caddy configuration file
  * ``default-slave``: XXX
  * ``custom-personal-slave``: XXX


About frontend replication
==========================

Slaves of the root instance are sent as a parameter to the requested frontends, which will process them. The only difference is that they will then return the would-be published information to the root instance instead of publishing it themselves. The root instance will then do a synthesis and publish the information to its slaves. The replicate instance uses only the parameter types listed below for itself and transmits the rest to the requested frontends.

These parameters are:

  * ``-frontend-type``: the type to deploy frontends with (defaults to "default")
  * ``-frontend-quantity``: the quantity of frontends to request (defaults to "1")
  * ``-frontend-i-state``: the state of frontend i
  * ``-frontend-i-software-release-url``: software release to be used for frontend i, defaults to the current software release
  * ``-frontend-config-i-foo``: frontend i will be requested with parameter foo; supported parameters are:

    * ``ram-cache-size``
    * ``disk-cache-size``

  * ``-sla-i-foo``: where "i" is the number of the concerned frontend (between 1 and "-frontend-quantity") and "foo" an SLA parameter.

For example::

  <parameter id="-frontend-quantity">3</parameter>
  <parameter id="-frontend-type">custom-personal</parameter>
  <parameter id="-frontend-2-state">stopped</parameter>
  <parameter id="-sla-3-computer_guid">COMP-1234</parameter>
  <parameter id="-frontend-3-software-release-url">https://lab.nexedi.com/nexedi/slapos/raw/someid/software/caddy-frontend/software.cfg</parameter>


will request the third frontend on COMP-1234 with the software release https://lab.nexedi.com/nexedi/slapos/raw/someid/software/caddy-frontend/software.cfg. All frontends will be of software type ``custom-personal``. The second frontend will be requested with the state stopped.

*Note*: the way slaves are transformed to a parameter avoids modifying more than 3 lines in the frontend logic.

**Important NOTE**: The way you request a slave from a replicate frontend is the same as the one you would use for the software given in "-frontend-quantity". Do not forget to use "replicate" as the software type. XXXXX So far it is not possible to do a simple request on a replicate frontend if you do not know the software_guid or another SLA parameter of the master instance. In fact we do not know yet the software type of the "requested" frontends. TO BE IMPLEMENTED
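
For instance, a replicated cluster matching the XML parameters above could be
requested from slapconsole roughly like this (an illustrative sketch only: the
partition reference is arbitrary and ``caddy_frontend`` stands for the software
release URL, as in the slave examples below)::

  instance = request(
    software_release=caddy_frontend,
    software_type="replicate",
    partition_reference='my replicated frontend',
    partition_parameter_kw={
        "-frontend-quantity": "3",
        "-frontend-type": "custom-personal",
        "-frontend-2-state": "stopped",
        "-sla-3-computer_guid": "COMP-1234",
    }
  )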

How to deploy a frontend server
===============================

This is to deploy an entire frontend server with a public IPv4. If you want to use an already deployed frontend to make your service available via IPv4, skip to the "Examples" section below.

First, you will need to request a "master" instance of Caddy Frontend with:

  * A ``domain`` parameter where the frontend will be available

like::

  <?xml version='1.0' encoding='utf-8'?>
  <instance>
   <parameter id="domain">moulefrite.org</parameter>
  </instance>

Then, it is possible to request many slave instances of Caddy Frontend (currently only from slapconsole; the UI doesn't work yet), like::

  instance = request(
    software_release=caddy_frontend,
    partition_reference='frontend2',
    shared=True,
    partition_parameter_kw={"url":"https://[1:2:3:4]:1234/someresource"}
  )

Those slave instances will be redirected to the "master" instance, and you will see on the "master" instance the proper directives associated with all slave instances.

Finally, the slave instance will be accessible from: https://someidentifier.moulefrite.org.

About SSL and SlapOS Master Zero Knowledge
==========================================

**IMPORTANT**: One Caddy instance cannot serve more than one specific SSL site and remain compatible with obsolete browsers (e.g. IE8). See http://wiki.apache.org/httpd/NameBasedSSLVHostsWithSNI

SSL keys and certificates are sent directly to the frontend cluster in order to follow the zero knowledge principle of SlapOS Master.

*Note*: Until a master partition or slave specific certificate is uploaded, each slave is served with a fallback certificate. This fallback certificate is self-signed, does not match the served hostname and results in no usable response over HTTPS.

Obtaining CA for KeDiFa
-----------------------

KeDiFa uses caucase, so it is required to obtain the caucase CA certificate used to sign the KeDiFa SSL certificate, in order to be sure that certificates are sent to a valid KeDiFa.

The easiest way to do so is to use caucase.

On a secure and trusted box, which will be used to upload certificates to the master or slave frontend partition, install caucase (https://pypi.org/project/caucase/).

The master and slave partitions will return the key ``kedifa-caucase-url``; use it to create and start a ``caucase-updater`` service::

  caucase-updater \
    --ca-url "${kedifa-caucase-url}" \
    --cas-ca "${frontend_name}.caucased.ca.crt" \
    --ca "${frontend_name}.ca.crt" \
    --crl "${frontend_name}.crl"

where ``frontend_name`` is a frontend cluster to which you will upload the certificate (it can be just one slave).

Make sure it is automatically started when the trusted machine reboots: you want to have it running so you can forget about it. It will keep KeDiFa's CA certificate up to date when it gets renewed, so you know you are still talking to the same service as when you previously uploaded the certificate, up to the original upload.

Master partition
----------------

After requesting the master partition, it will return ``master-key-generate-auth-url`` and ``master-key-upload-url``.

Doing an HTTP GET on ``master-key-generate-auth-url`` will return an authentication token, which is used to communicate with ``master-key-upload-url``. This token shall be stored securely.

By doing an HTTP PUT to ``master-key-upload-url`` with the authentication token appended, it is possible to upload a PEM bundle of certificate, key and any accompanying CA certificates to the master.

An example session is::

  request(...)

  curl -g -X GET --cacert "${frontend_name}.ca.crt" --crlfile "${frontend_name}.crl" master-key-generate-auth-url
  > authtoken

  cat certificate.pem ca.pem key.pem > bundle.pem

  curl -g --upload-file bundle.pem --cacert "${frontend_name}.ca.crt" --crlfile "${frontend_name}.crl" master-key-upload-url+authtoken

This replaces old request parameters:

 * ``apache-certificate``
 * ``apache-key``
 * ``apache-ca-certificate``

(*Note*: They are still supported for backward compatibility, but any value sent to the ``master-key-upload-url`` will supersede the information from SlapOS Master.)
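
The same session can also be scripted, for example with Python's standard
library (a rough sketch only: unlike the curl calls above it does not check
the CRL, and the authentication token is appended to the upload URL exactly as
in the curl example)::

  import ssl
  import urllib.request

  # values returned by the master partition
  master_key_generate_auth_url = "..."
  master_key_upload_url = "..."
  # the ${frontend_name}.ca.crt file kept up to date by caucase-updater
  ctx = ssl.create_default_context(cafile="myfrontend.ca.crt")

  # 1. fetch the authentication token
  with urllib.request.urlopen(master_key_generate_auth_url, context=ctx) as response:
      authtoken = response.read().decode().strip()

  # 2. upload the PEM bundle (certificate, CA chain and key) with the token appended
  with open("bundle.pem", "rb") as f:
      bundle = f.read()
  put = urllib.request.Request(
      master_key_upload_url + authtoken, data=bundle, method="PUT")
  with urllib.request.urlopen(put, context=ctx) as response:
      print(response.status, response.reason)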

Slave partition
---------------

After requesting the slave partition, it will return ``key-generate-auth-url`` and ``key-upload-url``.

Doing an HTTP GET on ``key-generate-auth-url`` will return an authentication token, which is used to communicate with ``key-upload-url``. This token shall be stored securely.

By doing an HTTP PUT to ``key-upload-url`` with the authentication token appended, it is possible to upload a PEM bundle of certificate, key and any accompanying CA certificates to the master.

An example session is::

  request(...)

  curl -g -X GET --cacert "${frontend_name}.ca.crt" --crlfile "${frontend_name}.crl" key-generate-auth-url
  > authtoken

  cat certificate.pem ca.pem key.pem > bundle.pem

  curl -g --upload-file bundle.pem --cacert "${frontend_name}.ca.crt" --crlfile "${frontend_name}.crl" key-upload-url+authtoken

This replaces old request parameters:

 * ``ssl_crt``
 * ``ssl_key``
 * ``ssl_ca_crt``

(*Note*: They are still supported for backward compatibility, but any value sent to the ``key-upload-url`` will supersede the information from SlapOS Master.)


Instance Parameters
===================

Master Instance Parameters
--------------------------

The parameters for instances are described at `instance-input-schema.json <instance-input-schema.json>`_.

Here is some additional information about the parameters listed below:

domain
~~~~~~

Name of the domain to be used (example: mydomain.com). Subdomains of this domain will be used for the slave instances (example: instance12345.mydomain.com). It is then recommended to add a wildcard entry in the DNS for the subdomains of the chosen domain, like::

  *.mydomain.com. IN A 123.123.123.123

using the IP given by the Master Instance. "domain" is a mandatory parameter.

port
~~~~
Port used by Caddy. Optional parameter, defaults to 4443.

plain_http_port
~~~~~~~~~~~~~~~
Port used by Caddy to serve plain http (only used to redirect to https).
Optional parameter, defaults to 8080.


Slave Instance Parameters
-------------------------

The parameters for instances are described at `instance-slave-input-schema.json <instance-slave-input-schema.json>`_.

Here is some additional information about the parameters listed below:

path
~~~~
Only used if type is "zope".

Will append the specified path to the "VirtualHostRoot" of Zope's VirtualHostMonster.

"path" is an optional parameter, ignored if not specified.
Example of value: "/erp5/web_site_module/hosting/"

url
~~~
URL of the backend to use. Technically optional, but omitting it will result in a non-functioning slave.

Example: http://mybackend.com/myresource

enable_cache
~~~~~~~~~~~~

Enables HTTP cache, optional.


health-check-*
~~~~~~~~~~~~~~

This set of parameters controls how the backend checks are done. Such active checks can be really useful for the `stale-if-error` caching technique, especially when the backend is very slow to reply or to connect to.

`health-check-http-method` can be used to configure the HTTP method used to check the backend. The special method `CONNECT` can be used to check only the connection attempt.

Please be aware that `health-check-timeout` is really short by default, so if `/` on the backend is slow to reply, configure a proper path with `health-check-http-path` so that such a backend is not marked down too fast, before resorting to increasing the check timeout.

Thanks to health checks it is possible to configure a failover system. By providing `health-check-failover-url` or `health-check-failover-https-url`, a special backend can be used to reply in case the original backend replies with an error (codes like `5xx`). Note that this failover URL can be set up like `https://failover.example.com/?p=`, so that the path from the incoming request is passed as a parameter. Additionally, authentication to the failover URL is supported with `health-check-authenticate-to-failover-backend`, and SSL proxy verification with `health-check-failover-ssl-proxy-verify` and `health-check-failover-ssl-proxy-ca-crt`.

**Note**: It is important to correctly configure the failover URL response, especially if the `stale-if-error` simulation available when `enable_cache` is used is expected. In order to serve pages from the cache, the failover URL has to return an error HTTP code (like 503 SERVICE_UNAVAILABLE), so that the cached page takes precedence over the reply from the failover URL.
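
As an illustration, a slave request combining some of these checks could look
like the sketch below; the parameter names are the ones described above, while
every value (method, path, timeout, failover URL) is purely illustrative; see
`instance-slave-input-schema.json <instance-slave-input-schema.json>`_ for the
authoritative list and formats::

  instance = request(
    software_release=caddy_frontend,
    partition_reference='my frontend',
    shared=True,
    partition_parameter_kw={
        "url": "https://[1:2:3:4:5:6:7:8]:1234",
        "enable_cache": "true",
        "health-check-http-method": "GET",
        "health-check-http-path": "/health",
        "health-check-timeout": "10",  # assumed to be expressed in seconds
        "health-check-failover-url": "https://failover.example.com/?p=",
    }
  )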

Examples
========

Here are some examples of how to make your SlapOS service available through an already deployed frontend.

Simple Example (default)
------------------------

Request slave frontend instance so that https://[1:2:3:4:5:6:7:8]:1234 will be
redirected and accessible from the proxy::

  instance = request(
    software_release=caddy_frontend,
    software_type="RootSoftwareInstance",
    partition_reference='my frontend',
    shared=True,
    partition_parameter_kw={
        "url":"https://[1:2:3:4:5:6:7:8]:1234",
    }
  )


Zope Example (default)
----------------------

Request slave frontend instance using a Zope backend so that
https://[1:2:3:4:5:6:7:8]:1234 will be redirected and accessible from the
proxy::

  instance = request(
    software_release=caddy_frontend,
    software_type="RootSoftwareInstance",
    partition_reference='my frontend',
    shared=True,
    partition_parameter_kw={
        "url":"https://[1:2:3:4:5:6:7:8]:1234",
        "type":"zope",
    }
  )


Advanced example
----------------

Request slave frontend instance using a Zope backend, with the cache activated,
listening to a custom domain and redirecting to /erp5/ so that
https://[1:2:3:4:5:6:7:8]:1234/erp5/ will be redirected and accessible from
the proxy::

  instance = request(
    software_release=caddy_frontend,
    software_type="RootSoftwareInstance",
    partition_reference='my frontend',
    shared=True,
    partition_parameter_kw={
        "url":"https://[1:2:3:4:5:6:7:8]:1234",
        "enable_cache":"true",
        "type":"zope",
        "path":"/erp5",
        "domain":"mycustomdomain.com",
    }
  )

Simple Example (custom-personal)
--------------------------------

Request slave frontend instance so that https://[1:2:3:4:5:6:7:8]:1234 will be
redirected and accessible from the proxy::

  instance = request(
    software_release=caddy_frontend,
    partition_reference='my frontend',
    shared=True,
    software_type="custom-personal",
    partition_parameter_kw={
        "url":"https://[1:2:3:4:5:6:7:8]:1234",
    }
  )

Simple Cache Example - XXX - to be written
------------------------------------------

Request slave frontend instance with the cache enabled so that
https://[1:2:3:4:5:6:7:8]:1234 will be redirected and accessible from the
proxy::

  instance = request(
    software_release=caddy_frontend,
    partition_reference='my frontend',
    shared=True,
    software_type="custom-personal",
    partition_parameter_kw={
        "url":"https://[1:2:3:4:5:6:7:8]:1234",
        "domain": "www.example.org",
        "enable_cache": "True",
    }
  )

Advanced example - XXX - to be written
--------------------------------------

Request slave frontend instance using a custom configuration, with cache and
SSL certificates, listening to a custom domain and redirecting to /erp5/ so
that https://[1:2:3:4:5:6:7:8]:1234/erp5/ will be redirected and accessible
from the proxy::

  instance = request(
    software_release=caddy_frontend,
    partition_reference='my frontend',
    shared=True,
    software_type="custom-personal",
    partition_parameter_kw={
        "url":"https://[1:2:3:4:5:6:7:8]:1234",
        "enable_cache":"true",
        "type":"zope",
        "path":"/erp5",
        "domain":"example.org",
        "ssl_key":"""-----BEGIN RSA PRIVATE KEY-----
  XXXXXXX..........XXXXXXXXXXXXXXX
  -----END RSA PRIVATE KEY-----""",
        "ssl_crt":"""-----BEGIN CERTIFICATE-----
  XXXXXXXXXXX.............XXXXXXXXXXXXXXXXXXX
  -----END CERTIFICATE-----""",
        "ssl_ca_crt":"""-----BEGIN CERTIFICATE-----
  XXXXXXXXX...........XXXXXXXXXXXXXXXXX
  -----END CERTIFICATE-----""",
        "ssl_csr":"""-----BEGIN CERTIFICATE REQUEST-----
  XXXXXXXXXXXXXXX.............XXXXXXXXXXXXXXXXXX
  -----END CERTIFICATE REQUEST-----""",
    }
  )

Promises
========

Note that in some cases promises will fail:

 * not possible to request frontend slave for monitoring (monitoring frontend promise)
 * no slaves present (configuration promise and others)
 * no cached slave present (configuration promise and others)

This is a known issue and shall be tackled soon.

KeDiFa
======

An additional partition with KeDiFa (Key Distribution Facility) is by default requested on the same computer as the master frontend partition.

By adding keys like ``-sla-kedifa-<key>`` to the request, it is possible to provide SLA information for the kedifa partition. For example, to put it on computer ``couscous``, it shall be ``-sla-kedifa-computer_guid: couscous``.

Also, ``-kedifa-software-release-url`` can be used to override the software release for the kedifa partition.
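
For example, a hedged slapconsole sketch of a master request pinning the kedifa
partition to computer ``couscous`` and overriding its software release (the
software release URL below is only a placeholder)::

  instance = request(
    software_release=caddy_frontend,
    partition_reference='my frontend cluster',
    partition_parameter_kw={
        "domain": "moulefrite.org",
        "-sla-kedifa-computer_guid": "couscous",
        "-kedifa-software-release-url": "https://example.com/alternate/software.cfg",
    }
  )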

Notes
=====

It is not possible with SlapOS to listen on ports <= 1024, because the
processes are not run as root.

Solution 1 (iptables)
---------------------

It is a good idea then to go on the node where the instance is and set some
``iptables`` rules like (if using default ports)::

  iptables -t nat -A PREROUTING -p tcp -d {public_ipv4} --dport 443 -j DNAT --to-destination {listening_ipv4}:4443
  iptables -t nat -A PREROUTING -p tcp -d {public_ipv4} --dport 80 -j DNAT --to-destination {listening_ipv4}:8080
  ip6tables -t nat -A PREROUTING -p tcp -d {public_ipv6} --dport 443 -j DNAT --to-destination {listening_ipv6}:4443
  ip6tables -t nat -A PREROUTING -p tcp -d {public_ipv6} --dport 80 -j DNAT --to-destination {listening_ipv6}:8080

Where ``{public_ipv[46]}`` is the public IP of your server, or at least the LAN IP that your NAT forwards to, and ``{listening_ipv[46]}`` is the private IP (like 10.0.34.123) that the instance is using and sending as a connection parameter.

Additionally, in order to access the server from itself, such entries are needed in the ``OUTPUT`` chain (as the internal packets won't appear in the ``PREROUTING`` chain)::

  iptables -t nat -A OUTPUT -p tcp -d {public_ipv4} --dport 443 -j DNAT --to {listening_ipv4}:4443
  iptables -t nat -A OUTPUT -p tcp -d {public_ipv4} --dport 80 -j DNAT --to {listening_ipv4}:8080
  ip6tables -t nat -A OUTPUT -p tcp -d {public_ipv6} --dport 443 -j DNAT --to {listening_ipv6}:4443
  ip6tables -t nat -A OUTPUT -p tcp -d {public_ipv6} --dport 80 -j DNAT --to {listening_ipv6}:8080

Solution 2 (network capability)
-------------------------------

It is also possible to directly allow the service to listen on ports 80 and 443 using the following commands::

  setcap 'cap_net_bind_service=+ep' /opt/slapgrid/$CADDY_FRONTEND_SOFTWARE_RELEASE_MD5/go.work/bin/caddy
  setcap 'cap_net_bind_service=+ep' /opt/slapgrid/$CADDY_FRONTEND_SOFTWARE_RELEASE_MD5/parts/6tunnel/bin/6tunnel

Then specify in the master instance parameters, as in the example below:

 * set ``port`` to ``443``
 * set ``plain_http_port`` to ``80``
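
A slapconsole sketch of such a master request (the domain is illustrative and
the string values mirror the other examples in this document)::

  instance = request(
    software_release=caddy_frontend,
    partition_reference='my frontend cluster',
    partition_parameter_kw={
        "domain": "moulefrite.org",
        "port": "443",
        "plain_http_port": "80",
    }
  )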

Authentication to the backend
=============================

The cluster generates a CA served by caucase, available through the ``backend-client-caucase-url`` return parameter.

Then each slave configured with ``authenticate-to-backend`` set to true will use a certificate signed by this CA while accessing the https backend (see the sketch below).

This allows backends to:

 * restrict access only from some frontend clusters
 * trust values (like ``X-Forwarded-For``) sent by the frontend
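
A hedged sketch of such a slave request, using the ``authenticate-to-backend``
parameter described above (the boolean value format is an assumption; check
`instance-slave-input-schema.json <instance-slave-input-schema.json>`_)::

  instance = request(
    software_release=caddy_frontend,
    partition_reference='my frontend',
    shared=True,
    partition_parameter_kw={
        "url": "https://[1:2:3:4:5:6:7:8]:1234",
        "authenticate-to-backend": "true",
    }
  )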

Technical notes
===============

Profile development guidelines
------------------------------

Keep the naming in instance profiles:

 * ``software_parameter_dict`` for values coming from software
 * ``instance_parameter_dict`` for **local** values generated by the instance, except ``configuration``
 * ``slapparameter_dict`` for values coming from SlapOS Master

Instantiated cluster structure
------------------------------

Instantiating caddy-frontend results in a cluster of various partitions:

 * master (the controlling one)
 * kedifa (contains kedifa server)
 * frontend-node-N, which contains the running processes to serve sites - this partition can be replicated with the ``-frontend-quantity`` parameter

It means sites are served in the ``frontend-node-N`` partitions, and each such partition is structured as:

 * Caddy serving the browser [client-facing-caddy]
 * (optional) Apache Traffic Server for caching [ats]
 * Haproxy as a way to communicate to the backend [backend-facing-haproxy]
 * some other additional tools (6tunnel, monitor, etc)

In case of slaves without cache (``enable_cache = False``) the request will travel as follows::

  client-facing-caddy --> backend-facing-haproxy --> backend

In case of slaves using cache (``enable_cache = True``) the request will travel as follows::

  client-facing-caddy --> ats --> backend-facing-haproxy --> backend

Usage of Haproxy as a relay to the backend allows much better control of the backend, removes the hassle of checking the backend from Caddy and allows future developments like client SSL certificates to the backend or even health checks.

Kedifa implementation
---------------------

The `Kedifa <https://lab.nexedi.com/nexedi/kedifa>`_ server runs in the kedifa partition.

Each `frontend-node-N` partition downloads certificates from the kedifa server.

Caucase (exposed by ``kedifa-caucase-url`` in the master partition parameters) is used to handle certificates for authentication to the kedifa server.

If ``automatic-internal-kedifa-caucase-csr`` is enabled (it is by default), scripts running on the master partition simulate a human operator to sign certificates for each frontend-node-N node.

Support for X-Real-Ip and X-Forwarded-For
-----------------------------------------

X-Forwarded-For and X-Real-Ip are transmitted to the backend, but only for IPv4 access to the frontend. In case of IPv6 access, the provided IP will be wrong, because 6tunnel is used.

Automatic Internal Caucase CSR
------------------------------

The cluster is composed of many instances, which land on separate partitions, so some way is needed to bootstrap trust between the partitions.

There are two ways to achieve it:

 * use the default, Automatic Internal Caucase CSR, which replaces the human signing of CSRs against the internal caucases with an automatic bootstrap; this leads to some issues, described later
 * switch to manual bootstrap, which requires a human to create and manage a user certificate (with caucase-updater) and then sign new frontend nodes appearing in the system

The issues during automatic bootstrap are:

 * a rogue or hacked SlapOS Master can result in rogue frontend nodes being added to the cluster; they will be trusted, so it will be possible to fetch all certificates and keys from Kedifa or to log in to backends
 * when a new node is added there is a short window during which a rogue actor is able to trick the automatic signing and have their own node added

In both cases promises will fail on the node which is not able to get signed, but in the case of Kedifa the damage has already happened (certificates and keys are compromised). So if the cluster administrator wants to stay on the safe side, both automatic bootstraps shall be turned off.
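
For operators preferring the manual bootstrap, a hedged sketch of a master
request disabling the KeDiFa-side automatic bootstrap is shown below; only the
``automatic-internal-kedifa-caucase-csr`` switch is named in this document, so
the boolean value format (and the name of any second switch) has to be checked
in `software.cfg.json <software.cfg.json>`_::

  instance = request(
    software_release=caddy_frontend,
    partition_reference='my frontend cluster',
    partition_parameter_kw={
        "domain": "moulefrite.org",
        # assumption: string boolean, as elsewhere in this document
        "automatic-internal-kedifa-caucase-csr": "false",
    }
  )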

How the automatic signing works
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Keeping in mind the following structure:

 * instance with caucase: ``caucase-instance``
 * N instances which want to get their CSR signed: ``csr-instance``

In ``caucase-instance`` a caucase user is created by automatically signing one user certificate, which allows signing service certificates.

The ``csr-instance`` creates a CSR, extracts the ID of the CSR, exposes it via HTTP and asks caucase on ``caucase-instance`` to sign it. The ``caucase-instance`` checks that the exposed CSR ID matches the one sent to caucase and signs it using the created user.