I read this blog post about the recently published Document AI - BigQuery integration. I want to configure this setup completely using Terraform. An important step in the blog post is the configuration of BigQuery remote models (see screenshot 1).
Since there is no dedicated Terraform resource for this (as far as I know), I want to create these models using a BigQuery job.
The simplified Terraform configuration I have written looks like this:
resource "google_bigquery_dataset" "dataset" {
  dataset_id = "my_dataset"
  project    = "<PROJECT_ID>"
  location   = "EU"
}

resource "google_bigquery_connection" "connection" {
  connection_id = "my-connection"
  project       = "<PROJECT_ID>"
  location      = "EU"
  cloud_resource {}
}

# Grant the connection's service account the roles it needs:
# read access to the bucket and viewer access to Document AI.
resource "google_project_iam_member" "connection_permission_grant" {
  for_each = toset([
    "roles/storage.objectViewer",
    "roles/documentai.viewer",
  ])
  role    = each.key
  project = "<PROJECT_ID>"
  member  = "serviceAccount:${google_bigquery_connection.connection.cloud_resource[0].service_account_id}"
}

# Create a remote model to register the DocAI processor in BigQuery.
resource "google_bigquery_job" "job" {
  depends_on = [
    google_project_iam_member.connection_permission_grant,
  ]
  job_id   = "my-register-model-job"
  project  = "<PROJECT_ID>"
  location = "EU"
  labels   = {}
  query {
    query = <<-EOT
      CREATE OR REPLACE MODEL `${google_bigquery_dataset.dataset.dataset_id}.my-remote-model`
      REMOTE WITH CONNECTION `${google_bigquery_connection.connection.id}`
      OPTIONS (
        remote_service_type='CLOUD_AI_DOCUMENT_V1',
        document_processor='<PROCESSOR_ID>'
      );
    EOT
    default_dataset {
      dataset_id = "my_dataset"
      project_id = "<PROJECT_ID>"
    }
    create_disposition = ""
    write_disposition  = ""
  }
}

First I create a connection, whose service account I then grant the "roles/documentai.viewer" role. Then I create a BigQuery job that uses the connection to create the remote model.
All resources are created successfully, but the job returns an error (see screenshot 2).
According to the error, the service account is missing the "roles/documentai.viewer" role. However, when I look up the service account in IAM, the role is listed (see screenshot 3).
How can this be? Is there a synchronization problem here? It looks as if the job is executed before the role assignment has taken effect. If I redeploy the job with a new job ID after the initial terraform apply (i.e. the service account has already been created), everything works as expected and the models are created correctly.
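To test the timing theory, one option I can think of is to put an explicit pause between the IAM grants and the job. The following is only a minimal sketch: it assumes the hashicorp/time provider and its time_sleep resource, the resource name wait_for_iam_propagation is made up for illustration, and the 120s duration is a guess rather than a documented propagation time.

```hcl
# Assumption: the hashicorp/time provider is available in this configuration.
terraform {
  required_providers {
    time = {
      source = "hashicorp/time"
    }
  }
}

# Hypothetical helper: wait after the IAM grants are created so that they
# have time to take effect. The 120s value is a guess, not a documented figure.
resource "time_sleep" "wait_for_iam_propagation" {
  depends_on = [
    google_project_iam_member.connection_permission_grant,
  ]
  create_duration = "120s"
}
```

With this in place, the depends_on of google_bigquery_job.job would reference time_sleep.wait_for_iam_propagation instead of the IAM resource directly, so the job only starts once the pause has elapsed. If the error disappears with the delay, that would support the synchronization explanation.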
Thanks in advance for any help!
br, Brian


