I am using the following to detect links and their ranges in a string:
let toParse = """
<td>Here's a few links which I want to test on:<p>https://news.ycombinator.com/ask</p>
<p>https://news.ycombinator.com/show</p>
<p>https://news.ycombinator.com/lists</p>
Will it work?
</td>
"""
print("Parse: \(toParse)")
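do {
    let detector = try NSDataDetector(types: NSTextCheckingResult.CheckingType.link.rawValue)
    // NSRange counts UTF-16 code units, so derive it from the string rather than from toParse.count
    let fullRange = NSRange(toParse.startIndex..., in: toParse)
    let matches = detector.matches(in: toParse, options: [], range: fullRange)
    for match in matches {
        // Link matches normally carry a URL; skip any that do not instead of force-unwrapping
        guard let url = match.url else { continue }
        print("url: \(url) at \(match.range.location), \(match.range.length)")
    }
} catch {
    debugPrint(error.localizedDescription)
}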
This prints:
url: https://news.ycombinator.com/ask%3C/p%3E at 50, 36
url: https://news.ycombinator.com/show%3C/p%3E at 90, 37
url: https://news.ycombinator.com/lists%3C/p%3E at 131, 38
As you can see, the detector is including the closing </p> tag (percent-encoded as %3C/p%3E) in each link. How can I prevent this?
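The documentation (https://developer.apple.com/documentation/foundation/nsdatadetector) says: "You should only use NSDataDetector on natural language text". HTML markup is not natural language, so the detector has no reason to treat </p> as the end of a URL.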
You can load your HTML in a web view and enable Data Detectors on it, or convert the HTML to plain text with NSAttributedString first and run the detector on the result.
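
A minimal sketch of the NSAttributedString route, assuming an Apple platform where UIKit or AppKit is available. The function name detectLinks(inHTML:) is my own, not an API, and the ranges it returns refer to the converted plain text, not to offsets in the original HTML.

```swift
import Foundation
#if canImport(UIKit)
import UIKit
#elseif canImport(AppKit)
import AppKit
#endif

// Hypothetical helper: strips the markup by importing the HTML into an
// NSAttributedString, then runs NSDataDetector on the resulting plain text.
// The HTML importer should be called on the main thread.
func detectLinks(inHTML html: String) throws -> [(url: URL, range: NSRange)] {
    let attributed = try NSAttributedString(
        data: Data(html.utf8),
        options: [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ],
        documentAttributes: nil)

    // Plain text with the tags removed; this is what the detector sees.
    let text = attributed.string

    let detector = try NSDataDetector(types: NSTextCheckingResult.CheckingType.link.rawValue)
    let fullRange = NSRange(text.startIndex..., in: text)

    return detector.matches(in: text, options: [], range: fullRange)
        .compactMap { match -> (url: URL, range: NSRange)? in
            guard let url = match.url else { return nil }
            return (url: url, range: match.range)
        }
}

// Usage with a shortened version of the string from the question.
do {
    let html = "<td>Links:<p>https://news.ycombinator.com/ask</p><p>https://news.ycombinator.com/show</p></td>"
    for link in try detectLinks(inHTML: html) {
        print("url: \(link.url) at \(link.range.location), \(link.range.length)")
    }
} catch {
    debugPrint(error.localizedDescription)
}
```

Because the tags are gone before detection runs, the URLs should come back without the trailing %3C/p%3E. If you need positions in the original HTML, you would have to map them back yourself, for example by searching for each detected URL string in the source.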