Python googlesearch library: how to stop "+" or "%20" signs from being added to the query behind the scenes?


I am using the googlesearch library in Python to scrape Google search results. If I type the query into Chrome I get the results I expect, but when I send the same search string through googlesearch I do not get the same results. Unwanted "%20" or "+" signs are added to the query behind the scenes, so I am unable to scrape the results.

from googlesearch import search
from urllib.parse import quote

query = input("Enter the company name:")
z = '-intitle:"profiles" -inurl:"dir/ " site:ch.linkedin.com/in/ OR site:ch.linkedin.com/pub/ Current: "' + query
y = quote(z)
search_results = list(search(y, num_results=10))

for i, result in enumerate(search_results, start=1):
    print(f"Result{i}:{result}")

There are 2 best solutions below

1
Alessio Liu On

I ran into this problem recently while handling file names and fetching them with PHP. When I fetched a file name, it came back as " examplefilename.txt", with a space at the beginning of the string. I essentially had to replace every space with a dash ("-"), because when the string is URL-encoded each space is replaced with "%20".

I believe you can fix this by taking the search query, replacing each space in the string with "+", and then appending the query to the URL to get the desired result.

e.g:

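url = "https://www.google.com/search?q="
raw_query = input("Search query>")
# Replace each space with "+" so the query can be appended to the URL
processed_query = raw_query.replace(" ", "+")
full_url = url + processed_query
# Rest of your code goes here
# For the input "stack overflow", full_url is https://www.google.com/search?q=stack+overflow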

The same applies in reverse if you start from a URL: you can split it into the base URL and the query, then replace every "%20" or "+" in the query with a space.
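
As a rough sketch of that reverse direction, the standard library can split a URL and decode its query without any manual string replacement. The URL below is a made-up example, and the variable names are only illustrative.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL, used only for illustration.
example_url = "https://www.google.com/search?q=stack+overflow"

# urlsplit separates the address from the query string ("q=stack+overflow").
parts = urlsplit(example_url)

# parse_qs decodes "+" and "%20" back into spaces and returns a dict of lists.
params = parse_qs(parts.query)
search_text = params["q"][0]

print(search_text)  # stack overflow
```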

Alternatively, you can use urllib.parse.
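
A minimal sketch of what that could look like, assuming you are building the URL yourself rather than handing the query to a library that encodes it for you. quote_plus() turns spaces into "+", quote() turns them into "%20", and unquote_plus() reverses either form.

```python
from urllib.parse import quote, quote_plus, unquote_plus

raw_query = "stack overflow"

# quote_plus encodes spaces as "+", the usual form inside a query string.
plus_encoded = quote_plus(raw_query)

# quote encodes spaces as "%20" instead.
percent_encoded = quote(raw_query)

# Build the full URL by hand from the encoded query.
url = "https://www.google.com/search?q=" + plus_encoded

# unquote_plus turns both "+" and "%20" back into spaces.
round_trip = unquote_plus(plus_encoded)

print(plus_encoded)     # stack+overflow
print(percent_encoded)  # stack%20overflow
print(url)              # https://www.google.com/search?q=stack+overflow
print(round_trip)       # stack overflow
```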

0
HaX.Alvin On

You don't need to call quote() yourself.
The googlesearch module makes the request with requests.get(url, params={"q": term}) rather than requests.get(url + f"?q={term}").
This means the query is already URL-encoded for you (the equivalent of quote_plus()), so calling quote() first encodes it a second time.
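
The effect of encoding twice can be seen without any network access. The sketch below uses urllib.parse.urlencode as a stand-in for the encoding an HTTP client applies to params; that substitution is an assumption made for illustration, not a copy of the googlesearch code.

```python
from urllib.parse import quote, urlencode

term = "stack overflow"

# Passing the raw term: the space is encoded once, as "+".
encoded_once = urlencode({"q": term})

# Passing an already-quoted term: the "%" in "%20" is itself encoded
# as "%25", so the server receives the literal text "stack%20overflow".
encoded_twice = urlencode({"q": quote(term)})

print(encoded_once)   # q=stack+overflow
print(encoded_twice)  # q=stack%2520overflow
```

So in the question's code, passing z straight to search() instead of quote(z) should avoid the double encoding.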