Auto-Labeling in Document AI with Custom Extractor: Schema Requirement Issue


I am using Document AI with a Custom Extractor. When I create a new Custom Extractor, it offers to manage my dataset.

I expect that doing so will automatically create label names for the documents I upload for this task.

Also, it offers "Auto-label". I hope that this will even automatically generate the label names for me, guaranteeing some kind of consistency between different Custom Extractors.

I checked the "hint" button shown next to it, and it confirmed my thoughts:

[Screenshot: hint text for the Auto-label option]

When I enable auto-labeling, I am asked to select a 'Version'. The only version I can select in this case is 'pretrained-foundation-model-v1.0-2023-08-22'.

[Screenshot: the 'Version' selector for auto-labeling]

I do this because I expect the foundation model to be capable of assigning label names to my documents automatically.

My documents upload fine, but then I am shown this message:

{
  "name": "projects/xxxxxxxxx/locations/xxxxxxx/operations/xxxxxxx",
  "done": true,
  "result": "error",
  "response": {},
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.documentai.uiv1beta3.ImportDocumentsMetadata",
    "commonMetadata": {
      "state": "FAILED",
      "createTime": "202x-xxx-xxT01:xx:45.367220Z",
      "updateTime": "202x-xxx-xxT01:xx:57.243001Z",
      "resource": "projects/xxxxxxx/locations/xxxxxx/processors/xxxxxxxxx/dataset"
    },
    "totalDocumentCount": 142
  },
  "error": {
    "code": 3,
    "message": "No valid schema provided for processing.",
    "details": []
  }
}
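
For reference, the numeric code in the "error" block follows the google.rpc.Code enumeration, where 3 means INVALID_ARGUMENT. That suggests the import was rejected because of what was submitted (apparently a dataset without a usable schema) rather than because of a server-side failure. Below is a minimal sketch of how such an operation response could be summarized, using only the Python standard library. The helper name summarize_operation and the trimmed sample payload are illustrative assumptions, not part of any Document AI client.

```python
import json

# Small subset of google.rpc.Code values; 3 is INVALID_ARGUMENT.
RPC_CODES = {
    0: "OK",
    3: "INVALID_ARGUMENT",
    5: "NOT_FOUND",
    7: "PERMISSION_DENIED",
    13: "INTERNAL",
}


def summarize_operation(raw: str) -> str:
    """Return a one-line summary of a long-running operation response.

    Hypothetical helper for illustration; it only inspects the JSON fields
    that appear in the response shown above.
    """
    op = json.loads(raw)
    if not op.get("done"):
        return "operation still running"
    error = op.get("error")
    if error is None:
        return "operation succeeded"
    code = error.get("code")
    name = RPC_CODES.get(code, f"code {code}")
    state = op.get("metadata", {}).get("commonMetadata", {}).get("state", "UNKNOWN")
    return f"{state}: {name}: {error.get('message', '')}"


# Trimmed copy of the response above (identifiers left as placeholders).
sample = """
{
  "name": "projects/xxxxxxxxx/locations/xxxxxxx/operations/xxxxxxx",
  "done": true,
  "metadata": {
    "commonMetadata": {"state": "FAILED"},
    "totalDocumentCount": 142
  },
  "error": {
    "code": 3,
    "message": "No valid schema provided for processing.",
    "details": []
  }
}
"""

print(summarize_operation(sample))
# prints: FAILED: INVALID_ARGUMENT: No valid schema provided for processing.
```

Read this way, the failure points at the import request itself: the documents were accepted for upload, but the auto-label step had no schema to label against.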

What do I have to do to resolve this error?
