
Wrap http text with <a> tags

javascript jquery regex string

2021-03-11 02:40:08

How do I find every word on the page that starts with http:// and wrap it in a tag?

Can I use something like a regular expression?

3 Answers

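I heartily disagree that jQuery can't be of much use in finding a solution here. Granted, you have to get down and dirty with some of the text node properties, but putting the DOM back together after you split the matched node is a bit easier with the jQuery library.

The following code is documented inline to explain the actions taken. I've written it as a jQuery plugin in case you just want to take it and move it elsewhere. This way you can scope which elements you want to convert URLs for, or you can simply use the $("body") selector.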

(function($) {
    $.fn.anchorTextUrls = function() {
        // Test a text node's contents for URLs and split and rebuild it with an anchor
        var testAndTag = function(el) {
            // Test for URLs along whitespace and punctuation boundaries (don't look too hard or you will be consumed)
            var m = el.nodeValue.match(/(https?:\/\/.*?)[.!?;,]?(\s+|"|$)/);

            // If we've found a valid URL, m[1] contains the URL
            if (m) {
                // Clone the text node to hold the "tail end" of the split node
                var tail = $(el).clone()[0];

                // Substring the nodeValue attribute of the text nodes based on the match boundaries
                el.nodeValue = el.nodeValue.substring(0, el.nodeValue.indexOf(m[1]));
                tail.nodeValue = tail.nodeValue.substring(tail.nodeValue.indexOf(m[1]) + m[1].length);

                // Rebuild the DOM inserting the new anchor element between the split text nodes
                // (.text() rather than .html() so the URL is never parsed as markup)
                $(el).after(tail).after($("<a></a>").attr("href", m[1]).text(m[1]));

                // Recurse on the new tail node to check for more URLs
                testAndTag(tail);
            }

            // Behave like a function
            return false;
        }

        // For each element selected by jQuery (returning the set keeps the plugin chainable)
        return this.each(function() {
            // Select all descendant nodes of the element and pick out only text nodes
            var textNodes = $(this).add("*", this).contents().filter(function() {
                return this.nodeType === 3;
            });


            // Take action on each text node
            $.each(textNodes, function(i, el) {
                testAndTag(el);
            });
        });
    }
}(jQuery));

$("body").anchorTextUrls(); //Sample call

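Keep in mind that, given the way I wrote this to populate the textNodes array, the method finds all descendant text nodes, not just immediate child text nodes. If you want it to replace URLs only in the text directly inside a specific selector, remove the .add("*", this) call that adds all descendants of the selected element.

Here is a fiddle example.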
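To see what the boundary regex in the plugin actually captures, it can be run against plain strings outside the DOM. The sketch below reuses the regex unchanged; `firstUrl` is a hypothetical helper introduced only for illustration, and the sample sentences are made up.

```javascript
// The boundary regex from the plugin above, unchanged.
var urlRe = /(https?:\/\/.*?)[.!?;,]?(\s+|"|$)/;

// Hypothetical helper: return the captured URL (m[1]), or null if none is found.
function firstUrl(text) {
  var m = text.match(urlRe);
  return m ? m[1] : null;
}

// A trailing comma before whitespace is left out of the capture.
console.log(firstUrl("See http://example.com/page, then reply."));

// Dots inside the URL are kept; only the final full stop is dropped.
console.log(firstUrl("Docs: https://example.com/a.html."));

// No URL at all yields null.
console.log(firstUrl("no links here"));
```

Note that the pattern allows for only one trailing punctuation character, so a URL followed by something like `).` would probably keep the closing parenthesis.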

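@Tim Don't look too closely or you will be consumed! Actually, I didn't write all of it, only the trailing boundary check. The base was borrowed from work by keevkilla on Snipplr. I should have given credit sooner.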

2021-04-21 02:40:08

Ah yes, that's much better - the previous regex rule was turning any string with a . in it into a link :) Now it works like a charm. Thanks very much!

2021-05-03 02:40:08

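@Tim For what it's worth, the regex tries to account for as much of the common punctuation or whitespace as I thought might appear at the end of a URL, so the URL doesn't have to be delimited specifically by a space.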

2021-05-04 02:40:08

Given that you've used jQuery (+ less code) and written the REGEX FROM HELL - I'm marking this as the correct answer. Thanks for your time and your JS and jQ brilliance! :D

2021-05-06 02:40:08

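@Tim I just made a small but important edit to the textNodes selector. I needed to use .add() instead of .find() to make sure the top-level text node children are included in the markup. If you only want text nodes from the selected elements and not from their children, just remove the .add() call. Also, as it turns out, my boundary regex works better than the long-winded contrived mess, since it finds the limits of the URL well and only needs the starting point.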

2021-05-12 02:40:08

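This is one of the few things jQuery doesn't directly help you with. You basically have to walk the DOM tree and examine the text nodes (nodeType === 3). If you find a text node containing the target text you want to wrap ("http://.....", whatever rules you want to apply), you split the text node (using splitText) into three parts (the part before the string, the part that is the string, and the part after the string), then put the a element around the second of those.

That sounds fairly complicated, but it isn't really that bad. It's just a recursive descent walker function (for working through the DOM), a regex match to find the things you want to replace, and then a few calls to splitText, createElement, insertBefore, and appendChild.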
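The three-way split described above can be pictured on a plain string before bringing the DOM into it. This is only an analogue under stated assumptions: `splitThree` is a hypothetical helper, the sample text is made up, and in the browser the two cuts would be made with splitText on the text node rather than with slice.

```javascript
// Hypothetical helper: cut a string into the part before the match,
// the match itself, and the part after it.
function splitThree(text, start, length) {
  return {
    before: text.slice(0, start),
    target: text.slice(start, start + length),
    after: text.slice(start + length)
  };
}

var text = "Visit http://example.com today";

// A deliberately simple pattern, used here only to locate the match.
var m = /http:\/\/[^ ]+/.exec(text);

// m.index is where the match starts; m[0].length is how long it is.
var parts = splitThree(text, m.index, m[0].length);
console.log(parts.before); // text that stays in the original node
console.log(parts.target); // text that would go inside the a element
console.log(parts.after);  // text that ends up in the following node
```

The same two numbers, the match index and the match length, are what the DOM version passes to its two splitText calls.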

Here's an example that searches for a fixed string; just add your regex matching for "http://":

walk(document.body, "foo");

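function walk(node, targetString) {
  var child, next;

  switch (node.nodeType) {
    case 1: // Element
      // Remember the next sibling before handling the child:
      // handleText may remove the child or insert a wrapper
      // after it, and the wrapper must not be walked again.
      for (child = node.firstChild;
           child;
           child = next) {
        next = child.nextSibling;
        walk(child, targetString);
      }
      break;

    case 3: // Text node
      handleText(node, targetString);
      break;
  }
}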

function handleText(node, targetString) {
  var start, targetNode, followingNode, wrapper;

  // Does the text contain our target string?
  // (This would be a regex test in your http://... case)
  start = node.nodeValue.indexOf(targetString);
  if (start >= 0) {
    // Split at the beginning of the match
    targetNode = node.splitText(start);

    // Split at the end of the match
    followingNode = targetNode.splitText(targetString.length);

    // Wrap the target in an element; in this case, we'll
    // use a `span` with a class, but you'd use an `a`.
    // First we create the wrapper and insert it in front
    // of the target text.
    wrapper = document.createElement('span');
    wrapper.className = "wrapper";
    targetNode.parentNode.insertBefore(wrapper, targetNode);

    // Now we move the target text inside it
    wrapper.appendChild(targetNode);

    // Clean up any empty nodes (in case the target text
    // was at the beginning or end of a text node)
    if (node.nodeValue.length == 0) {
      node.parentNode.removeChild(node);
    }
    if (followingNode.nodeValue.length == 0) {
      followingNode.parentNode.removeChild(followingNode);
    }
  }
}

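Live example

Update: The above didn't handle multiple matches in the same text node (doh!). And oh, what the heck, I did a regexp match while I was at it. You will have to adjust the regexp, and probably do some post-processing on each match, because what's here is too simplistic. But it's a start: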

// The regexp should have a capture group that
// will be the href. In our case below, we just
// make it the whole thing, but that's up to you.
// THIS REGEXP IS ALMOST CERTAINLY TOO SIMPLISTIC
// AND WILL NEED ADJUSTING (for instance: what if
// the link appears at the end of a sentence and
// it shouldn't include the ending punctuation?).
walk(document.body, /(http:\/\/[^ ]+)/i);

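function walk(node, targetRe) {
  var child, next;

  switch (node.nodeType) {
    case 1: // Element
      // Remember the next sibling before handling the child:
      // handleText may remove the child or insert new nodes
      // after it, and those must not be walked again.
      for (child = node.firstChild;
           child;
           child = next) {
        next = child.nextSibling;
        walk(child, targetRe);
      }
      break;

    case 3: // Text node
      handleText(node, targetRe);
      break;
  }
}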

function handleText(node, targetRe) {
  var match, targetNode, followingNode, wrapper;

  // Does the text contain a match? Keep looping while the
  // remaining text still does. (targetRe must not use the
  // `g` flag, or its lastIndex would carry over between nodes.)
  match = targetRe.exec(node.nodeValue);
  while (match) {
    // Split at the beginning of the match
    targetNode = node.splitText(match.index);

    // Split at the end of the match.
    // match[0] is the full text that was matched.
    followingNode = targetNode.splitText(match[0].length);

    // Wrap the target in an `a` element.
    // First we create the wrapper and insert it in front
    // of the target text. We use the first capture group
    // as the `href`.
    wrapper = document.createElement('a');
    wrapper.href = match[1];
    targetNode.parentNode.insertBefore(wrapper, targetNode);

    // Now we move the target text inside it
    wrapper.appendChild(targetNode);

    // Clean up the leading node if it is empty (the target
    // text was at the beginning of the text node)
    if (node.nodeValue.length === 0) {
      node.parentNode.removeChild(node);
    }

    // Continue with the text after the match, if any;
    // an empty trailing node is removed and ends the loop
    node = followingNode;
    if (node.nodeValue.length === 0) {
      node.parentNode.removeChild(node);
      match = null;
    } else {
      match = targetRe.exec(node.nodeValue);
    }
  }
}

Live example
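The comment on the regexp above warns about links at the end of a sentence picking up the closing punctuation. One possible post-processing step, sketched here on plain strings, is to trim trailing punctuation from each match. Everything in it is an assumption made for illustration: the `findUrls` helper, the looser pattern that also accepts https, and the choice of which characters count as trailing punctuation.

```javascript
// Hypothetical helper: list every URL-like match in a string with its
// position, dropping trailing sentence punctuation from each one.
function findUrls(text) {
  // Assumption: a deliberately loose pattern. The `g` flag is safe here
  // because the regex is created fresh on every call.
  var re = /https?:\/\/[^\s"<>]+/g;
  var results = [];
  var m;

  while ((m = re.exec(text)) !== null) {
    // Strip characters that usually belong to the sentence, not the URL.
    var url = m[0].replace(/[.,;:!?]+$/, "");
    results.push({ index: m.index, url: url });
  }
  return results;
}

var sample = "Go to http://a.example/x. Or https://b.example/y?q=1, ok";
findUrls(sample).forEach(function (hit) {
  console.log(hit.index + ": " + hit.url);
});
```

In the DOM version, the trimmed length rather than `match[0].length` would be the value to pass to the second splitText call, so that the punctuation stays outside the link.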

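@Tim: That's what I meant about how you'd have to extend it to support regex matching or similar. I was only addressing how to find text and wrap it in an element; as noted in the code comments, matching the http pattern you want (probably with a regex) was left as an exercise for the reader. :-)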

2021-04-18 02:40:08

@TJ: Haha, okay, fair enough :) I think I can work it out from here anyway - thanks for your help.

2021-04-19 02:40:08

So, uh, can we get some more votes on this one? It's a metric ton better than my little "this has been solved."

2021-04-24 02:40:08

@TJ: Sorry, man... changing it to this only wraps the "http://" text and not the rest of the link: walk(document.body, "http://"); Please advise..! Many thanks

2021-04-27 02:40:08

Tim: I gave you time to extend it, and you did :) Awesome. Thanks TJ!! If I could upvote more, I would.

2021-04-28 02:40:08

I'm not sure in practice, but you could try:

$('a[href^="http://"]').each(function() {
    // perform your task
});
