How can I get the Freebase Easy dataset as one structured file?


I downloaded the Freebase Easy dataset (3.3 GB). I want to explore it by looking up the types of some entities, e.g. Germany (types in Freebase: location, country, land, ...).

How can I concatenate the three files in the download to get the full dataset?

Answered by Tom Morris

The files (facts.txt, freebase-links.txt, scores.txt) all use the same format, so they can simply be concatenated. On a Unix-like system, you could use the command:

cat facts.txt freebase-links.txt scores.txt > all.txt

Alternatively, you could keep everything compressed with something like:

unzip -ca freebase-easy-latest.zip \*.txt | gzip > freebase-easy-all.txt.gz

An example entry would look like this:

$ unzip -ca freebase-easy-latest.zip \*.txt | grep $'^B\t'
B   prominence-score    1758.0  .
B   freebase-entity <http://rdf.freebase.com/ns/m.0560cf>   .
B   Transit System  New York City Subway    .
B   is-a    Topic   .
B   is-a    Transit Line    .
B   kg/object_profile/prominent_type    Transit Line    .

where the first line is from scores.txt, the second line from freebase-links.txt, and the remainder from facts.txt.
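
Once the files are concatenated, looking up the types of an entity is a matter of filtering on the first two columns. The following is only a sketch: it assumes the columns are tab-separated (subject, predicate, object, then a trailing "."), as the grep pattern above suggests, and that types are recorded with the is-a predicate as in the example entry. The function name entity_types and the sample file are hypothetical, introduced here for illustration.

```shell
# Print every type (the object of an "is-a" line) recorded for one entity.
# Assumption: columns are tab-separated as subject, predicate, object, "."
entity_types() {
    # $1 = entity name, $2 = path to the concatenated file
    awk -F'\t' -v name="$1" '$1 == name && $2 == "is-a" { print $3 }' "$2"
}

# Small sample in the same layout as the entry shown above (temporary file).
sample=$(mktemp)
printf 'B\tprominence-score\t1758.0\t.\n'  > "$sample"
printf 'B\tis-a\tTopic\t.\n'              >> "$sample"
printf 'B\tis-a\tTransit Line\t.\n'       >> "$sample"

# List the types of the entity named "B" in the sample.
entity_types B "$sample"
```

Against the real data you would pass the entity name and all.txt instead, for example entity_types Germany all.txt. One caveat: awk interprets backslash escapes in values passed with -v, so an entity name containing a backslash would need extra care.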