Commit 5ffb0718 authored by Rémy Coutable

Merge branch 'autolink-filter-text-parse' into 'master'

Improve AutolinkFilter#text_parse performance

## What does this MR do?

This MR improves the performance of `AutolinkFilter#text_parse` by using an XPath query to filter out most text nodes before they are processed in Ruby.
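As a rough illustration, the sketch below assembles the same XPath string in plain Ruby, without Nokogiri, so the expanded predicate can be inspected. `IGNORE_PARENTS` and `TEXT_QUERY` mirror the constants in the diff; `ANCESTOR_CHECKS` is a name introduced here purely for readability and is not part of the MR.

```ruby
require 'set'

# Elements whose text should never be autolinked (same list as in the diff).
IGNORE_PARENTS = %w(a code kbd pre script style).to_set

# Hypothetical helper constant (not in the MR): one "ancestor::<tag>" test
# per ignored element, joined with XPath's `or` operator.
ANCESTOR_CHECKS = IGNORE_PARENTS.map { |p| "ancestor::#{p}" }.join(' or ')

# Select text nodes that:
#   * have none of the ignored elements as an ancestor,
#   * contain '://' (so they may hold a URL-like string),
#   * do not start with 'http' or 'ftp'.
TEXT_QUERY = %Q(descendant-or-self::text()[
  not(#{ANCESTOR_CHECKS})
  and contains(., '://')
  and not(starts-with(., 'http'))
  and not(starts-with(., 'ftp'))
])

# Collapse whitespace so the query prints on a single line.
puts TEXT_QUERY.gsub(/\s+/, ' ').strip
```

In the filter itself this string is passed to `doc.xpath`, so the three per-node checks run inside the XPath engine rather than in a Ruby loop. Note that `starts-with(., 'http')` also covers `https`, which is presumably why the query needs only two prefix tests where the removed Ruby code listed three.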

## Are there points in the code the reviewer needs to double check?

Mostly the code style.

## Why was this MR needed?

Parsing text nodes is slow, mostly because the filtering of nodes happens in Ruby.

## What are the relevant issue numbers?

https://gitlab.com/gitlab-org/gitlab-ce/issues/18593

## Does this MR meet the acceptance criteria?

- [x] [CHANGELOG](https://gitlab.com/gitlab-org/gitlab-ce/blob/master/CHANGELOG) entry added
- [x] ~~[Documentation created/updated](https://gitlab.com/gitlab-org/gitlab-ce/blob/master/doc/development/doc_styleguide.md)~~
- [x] ~~API support added~~
- Tests
  - [x] ~~Added for this feature/bug~~
  - [ ] All builds are passing
- [x] Conforms to the [style guides](https://gitlab.com/gitlab-org/gitlab-ce/blob/master/CONTRIBUTING.md#style-guides)
- [x] Branch has no merge conflicts with `master` (if it does, please rebase it)
- [x] [Squashed related commits together](https://git-scm.com/book/en/Git-Tools-Rewriting-History#Squashing-Commits)

See merge request !5629
parents 7b94f23b dd35c3dd
@@ -16,6 +16,7 @@ v 8.11.0 (unreleased)
   - Add support for using RequestStore within Sidekiq tasks via SIDEKIQ_REQUEST_STORE env variable
   - Optimize maximum user access level lookup in loading of notes
   - Add "No one can push" as an option for protected branches. !5081
+  - Improve performance of AutolinkFilter#text_parse by using XPath
   - Environments have an url to link to
   - Remove unused images (ClemMakesApps)
   - Limit git rev-list output count to one in forced push check
......
@@ -31,6 +31,14 @@ module Banzai
       # Text matching LINK_PATTERN inside these elements will not be linked
       IGNORE_PARENTS = %w(a code kbd pre script style).to_set

+      # The XPath query to use for finding text nodes to parse.
+      TEXT_QUERY = %Q(descendant-or-self::text()[
+        not(#{IGNORE_PARENTS.map { |p| "ancestor::#{p}" }.join(' or ')})
+        and contains(., '://')
+        and not(starts-with(., 'http'))
+        and not(starts-with(., 'ftp'))
+      ])
+
       def call
         return doc if context[:autolink] == false
@@ -66,16 +74,11 @@ module Banzai
       # Autolinks any text matching LINK_PATTERN that Rinku didn't already
       # replace
       def text_parse
-        search_text_nodes(doc).each do |node|
+        doc.xpath(TEXT_QUERY).each do |node|
           content = node.to_html

-          next if has_ancestor?(node, IGNORE_PARENTS)
           next unless content.match(LINK_PATTERN)

-          # If Rinku didn't link this, there's probably a good reason, so we'll
-          # skip it too
-          next if content.start_with?(*%w(http https ftp))
-
           html = autolink_filter(content)

           next if html == content
......