Separate list element while scraping from web with commas


I'm scraping data from a web page. One div contains several li elements, which the page renders like this:

 Job Description:
• Developing application programming interfaces (APIs) to support mobile functionality
• Keeping up to date with the terminology, concepts and best practices for coding mobile apps
• Using and adapting existing web applications for apps
• Working closely with colleagues to constantly innovate app functionality and design

This is the part of my scraping code that handles that section (job and jobTitle are JSON arrays):

Elements ele3=doc.select("div.job-sections div[itemprop=description] section#st-jobDescription");
for (Element element3 : ele3.select("div[itemprop=responsibilities] ul")) {
     String job_description=element3.select("li").text();
     job.put(jobTitle.put(new JSONObject().put("description",job_description)));
}

The output looks like this:

{"description" : "Developing application programming interfaces (APIs) to support mobile functionality Keeping up to date with the terminology, concepts and best practices for coding mobile apps Using and adapting existing web applications for apps Working closely with colleagues to constantly innovate app functionality and design"}

But I want each li element to be a separate entry, separated by commas, so the output should look like this:

{"description" : ["Developing application programming interfaces (APIs) to support mobile functionality", "Keeping up to date with the terminology, concepts and best practices for coding mobile apps", "Using and adapting existing web applications for apps", "Working closely with colleagues to constantly innovate app functionality and design"]}

How can I solve this? Can anyone help? Thanks.


There is 1 best solution below


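You need to change the way you store the job responsibilities. Calling text() on the result of select("li") returns the combined text of all matched elements as one string, and you then put that string into a JSON object. The type you want is a JSON array with one entry per li.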

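// Option 1: JSON array (matches the desired output)
// Needs: org.json.JSONArray, org.jsoup.nodes.Element, org.jsoup.select.Elements

// Select each <li> individually instead of the enclosing <ul>
Elements responsibilityElements = ele3.select("div[itemprop=responsibilities] ul li");

JSONArray responsibilities = new JSONArray();

for (Element responsibilityElement : responsibilityElements) {
    // text() on a single Element returns only that item's text
    responsibilities.put(responsibilityElement.text());
}

// Assumes job is a JSONObject. If job is a JSONArray, as in the question,
// use job.put(new JSONObject().put("description", responsibilities)) instead.
job.put("description", responsibilities);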

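// Option 2: a single comma-separated string
// Needs: java.util.ArrayList, java.util.List

// Scoped to the responsibilities list; doc.select("ul li") would match every list item on the page
Elements responsibilityElements = ele3.select("div[itemprop=responsibilities] ul li");

List<String> lines = new ArrayList<>();

for (Element responsibilityElement : responsibilityElements) {
    lines.add(responsibilityElement.text());
}

// Note: items that contain commas themselves cannot be told apart in the joined string
String description = String.join(", ", lines);
// Assumes job is a JSONObject
job.put("description", description);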
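
One caveat with the single-string variant: the second bullet in the question already contains a comma, so a string joined with ", " cannot be split back into the original four items. Here is a minimal sketch of that effect using only the JDK. The class and method names are made up for illustration and are not part of jsoup or org.json.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; names are illustrative only.
class JoinAmbiguityDemo {

    // Joins the item texts the same way the single-string variant does.
    static String joinItems(List<String> items) {
        return String.join(", ", items);
    }

    // Naively splits the joined string back on the same separator.
    static List<String> splitItems(String joined) {
        return Arrays.asList(joined.split(", "));
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList(
            "Developing application programming interfaces (APIs) to support mobile functionality",
            "Keeping up to date with the terminology, concepts and best practices for coding mobile apps",
            "Using and adapting existing web applications for apps",
            "Working closely with colleagues to constantly innovate app functionality and design");

        String joined = joinItems(items);
        List<String> roundTrip = splitItems(joined);

        System.out.println(items.size());      // prints 4 (original items)
        System.out.println(roundTrip.size());  // prints 5 (the second item was cut in two)
    }
}
```

If the consumer of the JSON needs the individual items, the JSON array variant avoids this problem because each item stays a separate string.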