Open ClueWeb09 WARC file with Python 3


I would like to open the ClueWeb09 WARC files in Python 3. I was able to open them in Python 2 using this library, but I need to move to Python 3 because I depend on other libraries that are only available there.

I have tried to adapt this code to Python 3, but I did not get a working solution. I have also tried the warcio and warc3-wet libraries, but neither of them works with the ClueWeb09 format.

My final goal is to extract some features from this collection.
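To make the problem concrete, here is a minimal sketch of the kind of hand-rolled reader I am after, using only the standard library. It assumes each record follows the general WARC layout: a `WARC/...` version line, `Name: value` header lines, a blank line, then `Content-Length` bytes of payload. The names `iter_warc_records` and `read_clueweb` are made up for this example, and I have not verified it against the real collection, whose records may deviate from this layout (for example with an inaccurate `Content-Length`).

```python
import gzip


def iter_warc_records(stream):
    """Yield (headers, body) pairs from a binary stream of WARC-style records.

    Assumption: every record has a correct Content-Length header.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.startswith(b"WARC/"):
            continue  # skip blank separators between records
        headers = {"version": line.strip().decode("latin-1")}
        # Read "Name: value" header lines until the blank line.
        while True:
            line = stream.readline()
            if not line or line in (b"\r\n", b"\n"):
                break
            name, sep, value = line.partition(b":")
            if sep:
                key = name.strip().decode("latin-1").lower()
                headers[key] = value.strip().decode("latin-1")
        length = int(headers.get("content-length", 0))
        body = stream.read(length)  # raw bytes; decoding is left to the caller
        yield headers, body


def read_clueweb(path):
    """Hypothetical helper: iterate over the records of one gzip-compressed file."""
    with gzip.open(path, "rb") as stream:
        yield from iter_warc_records(stream)
```

Usage would be something like `for headers, body in read_clueweb(path): ...`, with the payload kept as bytes so that feature extraction can choose the decoding per document.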
