I am using PyLucene to build a search system, and I am testing it on TREC data. I have successfully written the indexer and searcher code, and now I want to use TREC topics to evaluate the system. For this there is a class named `TrecTopicsReader` that reads queries from a TREC-formatted topics file, but its `readQueries(BufferedReader reader)` method needs a `BufferedReader` for the topics file passed to it.
How can I do this in PyLucene? `BufferedReader` is not available in the PyLucene JCC build.
After waiting for someone to answer, I also asked this question on the PyLucene developer mailing list.
Andi Vajda replied there, and I am posting the answer here on Andi's behalf.
Quoting Andi:
More information:
In PyLucene's Makefile you will find this line:
`GENERATE=$(JCC) $(foreach jar,$(JARS),--jar $(jar)) \`
Within that definition there should be a line like `--package java.io`. Add the class you want (`BufferedReader`) to it so that JCC makes it available to the Python code. Then compile and install PyLucene again. (You can find information about compilation and installation in PyLucene's documentation, or you can also use this.)
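The Makefile edit described above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function name `add_java_io_classes` and the sample Makefile fragment are hypothetical, and it assumes the classes after `--package java.io` are listed by fully qualified name (for example `java.io.StringReader`). Check your own Makefile before relying on it.

```python
import re


def add_java_io_classes(makefile_text, class_names):
    """Insert class names right after the first '--package java.io' entry.

    Hypothetical helper: assumes classes are listed by fully qualified name.
    Names that already occur in the text are skipped, so a second run
    changes nothing.
    """
    missing = [name for name in class_names if name not in makefile_text]
    if not missing:
        return makefile_text
    # Match '--package java.io' only when followed by whitespace, so that
    # an entry such as '--package java.io.something' is not touched.
    pattern = re.compile(r"(--package java\.io)(?=\s)")
    if not pattern.search(makefile_text):
        raise ValueError("no '--package java.io' entry found")
    return pattern.sub(
        lambda m: m.group(1) + " " + " ".join(missing),
        makefile_text,
        count=1,
    )


# Hypothetical fragment; a real PyLucene Makefile lists more packages.
sample = (
    "GENERATE=$(JCC) $(foreach jar,$(JARS),--jar $(jar)) \\\n"
    "           --package java.io java.io.StringReader \\\n"
    "           --python lucene\n"
)
patched = add_java_io_classes(sample, ["java.io.BufferedReader"])
print(patched)
```

Editing the Makefile by hand works just as well; the helper only makes explicit which line changes and what is added to it.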
Also, to make a `BufferedReader` object from a file you will need `FileReader`, so add that as well.
Just for completeness: after adding this line my `GENERATE` will look like:
Doing this alone doesn't suffice. `TrecTopicsReader` is part of the benchmark API, and the Lucene benchmark library is not included in the installed libs by default, so you also have to compile it. To compile and install benchmark, you have to modify the build.xml inside the main lucene folder (the one where the benchmark folder is present) and then include the resulting jar in the main Makefile so that it is installed into the Python libs as an egg.
build.xml: You have to make three modifications. For simplicity, follow `jar-test-framework`: wherever it appears, create a similar pattern for `jar-benchmark`. The three changes you have to make are:
1) Replace
`<target name="package" depends="jar-core, jar-test-framework, build-modules, init-dist, documentation"/>`
with
`<target name="package" depends="jar-core, jar-test-framework, jar-benchmark, build-modules, init-dist, documentation"/>`
2) For the rule
replace it with
3) Add the following target/rule after the target named `jar-test-framework`
Makefile: Here too you have to make three modifications. For simplicity, follow
`HIGHLIGHTER_JAR` and add similar rules for `BENCHMARK_JAR`. The three changes you have to make are:
1) Find `JARS+=$(HIGHLIGHTER_JAR)` and add `JARS+=$(BENCHMARK_JAR)` after it in a similar manner.
2) Find `HIGHLIGHTER_JAR=$(LUCENE)/build/highlighter/lucene-highlighter-$(LUCENE_VER).jar` and add `BENCHMARK_JAR=$(LUCENE)/build/benchmark/lucene-benchmark-$(LUCENE_VER).jar` after this line in a similar manner.
3) Find the rule `$(ANALYZERS_JAR):` and add another rule for `$(BENCHMARK_JAR):` after it.
For completeness, here are my final Makefile and build.xml files.
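The first two Makefile changes are plain line insertions, so they can be sketched in Python as below. The helper `insert_after` and the two-line sample are hypothetical and stand in for the real Makefile. The third change (the `$(BENCHMARK_JAR):` rule) is left out on purpose, because its body has to mirror the existing rules in your own Makefile.

```python
def insert_after(text, marker, new_line):
    """Insert new_line after the first line that starts with marker.

    Hypothetical helper: returns the text unchanged if new_line is already
    present, so running it twice is harmless.
    """
    lines = text.splitlines()
    if new_line in lines:
        return text
    for index, line in enumerate(lines):
        if line.strip().startswith(marker):
            lines.insert(index + 1, new_line)
            return "\n".join(lines) + "\n"
    raise ValueError("marker not found: " + marker)


# Hypothetical two-line fragment of the Makefile.
sample = (
    "HIGHLIGHTER_JAR=$(LUCENE)/build/highlighter/"
    "lucene-highlighter-$(LUCENE_VER).jar\n"
    "JARS+=$(HIGHLIGHTER_JAR)\n"
)

# Change 2: define BENCHMARK_JAR right after HIGHLIGHTER_JAR.
patched = insert_after(
    sample,
    "HIGHLIGHTER_JAR=",
    "BENCHMARK_JAR=$(LUCENE)/build/benchmark/"
    "lucene-benchmark-$(LUCENE_VER).jar",
)
# Change 1: add the jar to JARS right after the highlighter entry.
patched = insert_after(
    patched, "JARS+=$(HIGHLIGHTER_JAR)", "JARS+=$(BENCHMARK_JAR)"
)
print(patched)
```

After these edits and the new rule are in place, rebuilding and reinstalling PyLucene is still required before the benchmark classes become visible from Python.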