How can I scrape Google search results with Jsoup?


This worked a few weeks ago, but now I get a page asking me to accept or reject cookies instead of the search results.

My code:

    // Needs: org.jsoup.Jsoup, org.jsoup.Connection, org.jsoup.nodes.Document,
    // java.io.IOException, java.time.LocalDate, java.util.*, java.util.regex.*
    String useragent = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36";
    String urlbas = "https://www.google.com/search";
    String url = urlbas + "?q=Vienna&num=100&start=1";
    // arres will contain the result
    List<String> arres = new ArrayList<>();
    try {
        // Connect once to collect the cookies Google sets.
        Connection.Response res = Jsoup.connect(urlbas).execute();
        Map<String, String> cookies = res.cookies();

        // Override the consent cookie. put() rather than replace(), because
        // replace() does nothing when the key is not already in the map.
        String ckey = "CONSENT";
        String cval = "YES+shp.gws-" + LocalDate.now().toString().replace("-", "") + "-0-RC2.en+FX+374";
        cookies.put(ckey, cval);

        Document doc = Jsoup
                .connect(url)
                .userAgent(useragent)
                .timeout(5000)
                .cookies(cookies)
                .ignoreHttpErrors(true)
                .get();

        // The code assumes each result URL sits between ";url=" and "&amp;ved=" in the HTML.
        String pattern1 = ";url=";
        String pattern2 = "&amp;ved=";
        String txt = doc.body().html();
        Pattern p = Pattern.compile(Pattern.quote(pattern1) + "(.*?)" + Pattern.quote(pattern2));
        Matcher m = p.matcher(txt);
        while (m.find()) {
            // adding to the result
            arres.add(m.group(1));
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
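
One detail in the cookie handling is worth checking separately: Map.replace only changes a value whose key is already in the map, while Map.put also adds a missing key. The sketch below illustrates the difference with plain java.util maps. The class name ConsentCookieDemo, the helper names, and the cookie value "YES+example" are made up for illustration, and it does not show whether a CONSENT cookie is enough to get past Google's consent page.

```java
import java.util.HashMap;
import java.util.Map;

class ConsentCookieDemo {
    // Sets the cookie with Map.replace: changes the map only if the key already exists.
    static Map<String, String> withReplace(Map<String, String> cookies, String key, String value) {
        Map<String, String> copy = new HashMap<>(cookies);
        copy.replace(key, value);
        return copy;
    }

    // Sets the cookie with Map.put: adds the key if missing, overwrites it otherwise.
    static Map<String, String> withPut(Map<String, String> cookies, String key, String value) {
        Map<String, String> copy = new HashMap<>(cookies);
        copy.put(key, value);
        return copy;
    }

    public static void main(String[] args) {
        // Hypothetical case: the first response carried no CONSENT cookie at all.
        Map<String, String> fromServer = new HashMap<>();
        // prints false
        System.out.println(withReplace(fromServer, "CONSENT", "YES+example").containsKey("CONSENT"));
        // prints true
        System.out.println(withPut(fromServer, "CONSENT", "YES+example").containsKey("CONSENT"));
    }
}
```

If the first response does not set a CONSENT cookie, the replace variant therefore sends the second request without one.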

The variable txt should contain the HTML of the search results page, but instead it contains the request to accept cookies.
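
To separate the extraction step from the consent problem, the regex part can be run on a hard-coded string. This is a minimal sketch using only java.util.regex; the class name UrlExtractDemo, the method extractUrls, and the sample markup are made up for illustration and are not real Google output.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlExtractDemo {
    // Collects every substring found between ";url=" and "&amp;ved=" (non-greedy match).
    static List<String> extractUrls(String html) {
        Pattern p = Pattern.compile(Pattern.quote(";url=") + "(.*?)" + Pattern.quote("&amp;ved="));
        Matcher m = p.matcher(html);
        List<String> urls = new ArrayList<>();
        while (m.find()) {
            urls.add(m.group(1));
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up fragment for illustration only.
        String sample = "<a href=\"/url?sa=t&amp;url=https://example.org/a&amp;ved=abc\">A</a>"
                + "<a href=\"/url?sa=t&amp;url=https://example.com/b&amp;ved=def\">B</a>";
        // prints [https://example.org/a, https://example.com/b]
        System.out.println(extractUrls(sample));
    }
}
```

If the pattern works on such a fragment but the list stays empty for the real response, the page that came back is most likely the consent page rather than the results.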

The solution does not need to use Jsoup; anything that works is fine with me.
