sbt-assembly and Lucene "An SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene94' does not exist." exception

OS: Ubuntu 22.10
Java: openjdk version "19.0.1" 2022-10-18
Scala: 2.13.10
Apache Lucene: 9.4.2

I took the Lucene documentation example and converted it to a Scala program:

package test

import org.apache.lucene.analysis.standard.StandardAnalyzer
import org.apache.lucene.document.{Document, Field, TextField}
import org.apache.lucene.index.{DirectoryReader, IndexWriter, IndexWriterConfig}
import org.apache.lucene.queryparser.classic.QueryParser
import org.apache.lucene.search.{IndexSearcher, Query, ScoreDoc}
import org.apache.lucene.store.FSDirectory

import java.nio.file.{Files, Path}

object Test extends App {
  val analyzer: StandardAnalyzer = new StandardAnalyzer()
  val indexPath: Path = Files.createTempDirectory("tempIndex")
  val directory: FSDirectory = FSDirectory.open(indexPath)

  val config: IndexWriterConfig = new IndexWriterConfig(analyzer)
  val iwriter: IndexWriter = new IndexWriter(directory, config)
  val doc: Document = new Document()
  val text: String = "This is the text to be indexed."
  doc.add(new Field("fieldname", text, TextField.TYPE_STORED))
  iwriter.addDocument(doc)
  iwriter.close()

  // Now search the index:
  val ireader: DirectoryReader = DirectoryReader.open(directory)
  val isearcher: IndexSearcher = new IndexSearcher(ireader)

  // Parse a simple query that searches for "text":
  val parser: QueryParser = new QueryParser("fieldname", analyzer)
  val query: Query = parser.parse("text")
  val hits: Array[ScoreDoc] = isearcher.search(query, 10).scoreDocs

  assert(hits.length == 1)

  // Iterate through the results:
  for (i <- hits.indices) {
    val hitDoc = isearcher.doc(hits(i).doc)
    assert("This is the text to be indexed.".equals(hitDoc.get("fieldname")))
  }

  ireader.close()
  directory.close()

  println("The end!")
}

If I use the following sbt file:

ThisBuild / version := "0.1.0-SNAPSHOT"

ThisBuild / scalaVersion := "2.13.10"

lazy val root = (project in file("."))
  .settings(
    name := "Test"
  )

val luceneVersion = "9.4.2"

libraryDependencies ++= Seq(
  "org.apache.lucene" % "lucene-core" % luceneVersion,
  "org.apache.lucene" % "lucene-queryparser" % luceneVersion
)

Building it gives me this error:

[error] Deduplicate found different file contents in the following:
[error]   Jar name = lucene-core-9.4.2.jar, jar org = org.apache.lucene, entry target = module-info.class
[error]   Jar name = lucene-queries-9.4.2.jar, jar org = org.apache.lucene, entry target = module-info.class
[error]   Jar name = lucene-queryparser-9.4.2.jar, jar org = org.apache.lucene, entry target = module-info.class
[error]   Jar name = lucene-sandbox-9.4.2.jar, jar org = org.apache.lucene, entry target = module-info.class 

So I added the following to the sbt file:

assembly / assemblyMergeStrategy  := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _ => MergeStrategy.first
}

After that, the program compiled and ran fine:

sbt "runMain test.Test"

But when I create a fat jar and run it, I get an exception.

plugins.sbt:

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "2.1.0")

Running the jar:

java -cp target/scala-2.13/Test-assembly-0.1.0-SNAPSHOT.jar test.Test
Exception in thread "main" java.lang.ExceptionInInitializerError
    at org.apache.lucene.codecs.Codec.getDefault(Codec.java:141)
    at org.apache.lucene.index.LiveIndexWriterConfig.<init>(LiveIndexWriterConfig.java:128)
    at org.apache.lucene.index.IndexWriterConfig.<init>(IndexWriterConfig.java:145)
    at test.Test$.delayedEndpoint$test$Test$1(Test.scala:17)
    at test.Test$delayedInit$body.apply(Test.scala:12)
    at scala.Function0.apply$mcV$sp(Function0.scala:42)
    at scala.Function0.apply$mcV$sp$(Function0.scala:42)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
    at scala.App.$anonfun$main$1(App.scala:98)
    at scala.App.$anonfun$main$1$adapted(App.scala:98)
    at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:575)
    at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:573)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:933)
    at scala.App.main(App.scala:98)
    at scala.App.main$(App.scala:96)
    at test.Test$.main(Test.scala:12)
    at test.Test.main(Test.scala)
Caused by: java.lang.IllegalArgumentException: An SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene94' does not exist.  You need to add the corresponding JAR file supporting this SPI to your classpath.  The current classpath supports the following names: []
    at org.apache.lucene.util.NamedSPILoader.lookup(NamedSPILoader.java:113)
    at org.apache.lucene.codecs.Codec$Holder.<clinit>(Codec.java:58)
    ... 17 more

So, what did I do wrong? Thanks.

Answer by Dmytro Mitin:

case PathList("META-INF", xs @ _*) => MergeStrategy.discard means that you're discarding every META-INF directory along with all of its contents. This is dangerous: the dependencies lucene-core and lucene-sandbox keep service files in META-INF/services, and Lucene discovers its codecs through those files, which is why the fat jar reports an empty list of supported names. Be more selective about what you discard. Try discarding only the Java 9+ module-info.class files:

assembly / assemblyMergeStrategy := {
  case x if x.endsWith("module-info.class") => MergeStrategy.discard
  case x =>
    val oldStrategy = (assembly / assemblyMergeStrategy).value
    oldStrategy(x)
}

or at least keep the META-INF/services subdirectories:

assembly / assemblyMergeStrategy := {
  case PathList("META-INF", "services", xs @ _*) => MergeStrategy.concat
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case _ => MergeStrategy.first
}
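
To check whether either strategy actually kept the service files, you can list what ended up under META-INF/services in the assembled jar. The following is only a minimal sketch using the JDK's java.util.jar.JarFile; the object name ListServiceFiles and the method serviceEntries are made up for illustration, and the default path is simply the jar name from the question.

```scala
import java.util.jar.JarFile
import scala.jdk.CollectionConverters._

// Hypothetical helper, not part of sbt-assembly or Lucene.
object ListServiceFiles {

  /** Names of all non-directory entries under META-INF/services/ in the given jar, sorted. */
  def serviceEntries(jarPath: String): List[String] = {
    val jar = new JarFile(jarPath)
    try {
      jar.entries().asScala
        .filter(e => !e.isDirectory && e.getName.startsWith("META-INF/services/"))
        .map(_.getName)
        .toList // materialize before the jar is closed
        .sorted
    } finally jar.close()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: the default is the assembly jar from the question; pass another path as the first argument.
    val path = args.headOption.getOrElse("target/scala-2.13/Test-assembly-0.1.0-SNAPSHOT.jar")
    val entries = serviceEntries(path)
    if (entries.isEmpty) println(s"No META-INF/services entries found in $path")
    else entries.foreach(println)
  }
}
```

Run it after sbt assembly. With the original discard-everything strategy the list should come back empty, which matches the empty list of names in the exception. Once the service files are kept, you would expect an entry named after the SPI type from the error message, META-INF/services/org.apache.lucene.codecs.Codec.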

Drools fat jar nullpointer KieServices

Run Drools Kie project from fat jar