Hardcoded regular expressions with backtracking issues:
-[Repository name validation](https://gitlab.com/gitlab-org/gitlab/-/issues/220019)
-[Link validation](https://gitlab.com/gitlab-org/gitlab/-/issues/218753), and [a bypass](https://gitlab.com/gitlab-org/gitlab/-/issues/273771)
-[Entity name validation](https://gitlab.com/gitlab-org/gitlab/-/issues/289934)
-[Validating color codes](https://gitlab.com/gitlab-org/gitlab/commit/717824144f8181bef524592eab882dd7525a60ef)
Consider the following example application, which defines a check using a regular expression. A user entering `user@aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa!.com` as the email on a form will hang the web server.
...
...
@@ -141,22 +163,32 @@ class Email < ApplicationRecord
defdomain_matches
errors.add(:email,'does not match')ifemail=~DOMAIN_MATCH
end
end
```
### Mitigation
GitLab has `Gitlab::UntrustedRegexp` which internally uses the [`re2`](https://github.com/google/re2/wiki/Syntax) library.
By utilizing `re2`, we get a strict limit on total execution time, and a smaller subset of available regex features.
#### Ruby
GitLab has [`Gitlab::UntrustedRegexp`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/untrusted_regexp.rb)
which internally uses the [`re2`](https://github.com/google/re2/wiki/Syntax) library.
`re2` does not support backtracking so we get constant execution time, and a smaller subset of available regex features.
All user-provided regular expressions should use `Gitlab::UntrustedRegexp`.
For other regular expressions, here are a few guidelines:
- Remove unnecessary backtracking.
- Avoid nested quantifiers if possible.
- Try to be as precise as possible in your regex and avoid the `.` if something else can be used (e.g.: Use `_[^_]+_` instead of `_.*_` to match `_text here_`).
- If there's a clean non-regex solution, such as `String#start_with?`, consider using it
- Ruby supports some advanced regex features like [atomic groups](https://www.regular-expressions.info/atomic.html)
and [possessive quantifiers](https://www.regular-expressions.info/possessive.html) that eleminate backtracking
- Avoid nested quantifiers if possible (for example `(a+)+`)
- Try to be as precise as possible in your regex and avoid the `.` if there's an alternative
- For example, Use `_[^_]+_` instead of `_.*_` to match `_text here_`
- If in doubt, don't hesitate to ping `@gitlab-com/gl-security/appsec`
#### Go
An example can be found [in this commit](https://gitlab.com/gitlab-org/gitlab/commit/717824144f8181bef524592eab882dd7525a60ef).
Go's [`regexp`](https://golang.org/pkg/regexp/) package uses `re2` and isn't vulnerable to backtracking issues.
## Further Links
...
...
@@ -466,7 +498,7 @@ where you can't avoid this:
characters, for example).
- Always use `--` to separate options from arguments.
### Ruby
#### Ruby
Consider using `system("command", "arg0", "arg1", ...)` whenever you can. This prevents an attacker
from concatenating commands.
...
...
@@ -475,7 +507,7 @@ For more examples on how to use shell commands securely, consult
[Guidelines for shell commands in the GitLab codebase](shell_commands.md).
It contains various examples on how to securely call OS commands.
### Go
#### Go
Go has built-in protections that usually prevent an attacker from successfully injecting OS commands.