Java / Android HTML custom tag parser

I'm trying to figure out a way to parse an HTML file that contains custom tags of the form:

[custom tag="id"]

Here's an example of a file I'm working with:

<p>This is an <em>amazing</em> example. </p>
<p>Such amazement, <span>many wow.</span> </p>
<p>Oh look, a wild [custom tag="amaze"] appears.</p>
We need maor embeds <a href="http://youtu.be/F5nLu232KRo"> bro

What I would like (in an ideal world) is to get back a list of elements:

List foundElements = [text, custom tag, text, link, text]

Where the elements in the above list contain:

Text:

<p>This is an <em>amazing</em> example. </p>
<p>Such amazement, <span>many wow.</span> </p>
<p>Oh look, a wild

Custom tag:

[custom tag="amaze"]

Link:

<a href="http://youtu.be/F5nLu232KRo">

Text:

 appears.</p>We need maor embeds

What I've tried:

  1. Jsoup
    Jsoup is great and works perfectly for HTML. The issue is that I can't define custom tags that open with "[" and close with "]". Correct me if I'm wrong.
  2. Jericho
    Again, like Jsoup, Jericho works great, except for defining custom tags: you're required to use "<".
  3. Java Regex
    This is the option I really don't want to go for. It's not reliable, and it involves a lot of brittle string manipulation, especially when you're matching against many regexes.

Last but not least, I'm looking for a performance-oriented solution, as this runs on an Android client.

All suggestions welcome!
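
To make the desired output concrete, here is a rough sketch of one direction I'm considering: a hand-written single-pass scanner that uses no regex and no HTML library. Everything named here (CustomTagScanner, Kind, Segment, scan) is made up for illustration. It assumes a custom tag starts with "[custom " and ends at the next "]", a link starts with "<a " and ends at the next ">", and neither closing character appears inside an attribute value. It is not a real HTML parser and may well miss cases.

```java
import java.util.ArrayList;
import java.util.List;

public class CustomTagScanner {

    /** Kinds of segment the scanner emits (illustrative names). */
    public enum Kind { TEXT, CUSTOM_TAG, LINK }

    /** One segment of the input: its kind plus the raw source text. */
    public static final class Segment {
        public final Kind kind;
        public final String raw;

        Segment(Kind kind, String raw) {
            this.kind = kind;
            this.raw = raw;
        }

        @Override
        public String toString() {
            return kind + ":" + raw;
        }
    }

    // Assumed opening markers; matching is case-sensitive.
    private static final String CUSTOM_OPEN = "[custom ";
    private static final String LINK_OPEN = "<a ";

    /** Splits the input into text, custom-tag, and link segments in one pass. */
    public static List<Segment> scan(String input) {
        List<Segment> out = new ArrayList<>();
        int textStart = 0; // start of the pending text run
        int i = 0;
        while (i < input.length()) {
            Kind kind = null;
            char close = 0;
            if (input.startsWith(CUSTOM_OPEN, i)) {
                kind = Kind.CUSTOM_TAG;
                close = ']';
            } else if (input.startsWith(LINK_OPEN, i)) {
                kind = Kind.LINK;
                close = '>';
            }
            int end = (kind == null) ? -1 : input.indexOf(close, i);
            if (end < 0) {
                // No marker here, or the marker is unterminated: treat as text.
                i++;
                continue;
            }
            if (i > textStart) {
                out.add(new Segment(Kind.TEXT, input.substring(textStart, i)));
            }
            out.add(new Segment(kind, input.substring(i, end + 1)));
            i = end + 1;
            textStart = i;
        }
        if (textStart < input.length()) {
            out.add(new Segment(Kind.TEXT, input.substring(textStart)));
        }
        return out;
    }
}
```

Calling CustomTagScanner.scan(html) on the last two lines of the example file above yields five segments in the order text, custom tag, text, link, text. The text segments still contain ordinary HTML, so they could be handed to Jsoup afterwards if needed.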
