How to manage costs by department


We are in the process of developing a data lake to consolidate and manage data across various departments using a single GCP project.

Our objective is to ingest and process JSON data through Pub/Sub and Dataflow, then store it in tables designed for each data type within a single BigQuery dataset. For instance, our configuration looks like this:

project-id.common_dataset.table_for_datatype1
project-id.common_dataset.table_for_datatype2

We're interested in exploring methods to allocate the costs incurred by Pub/Sub, Dataflow, and BigQuery usage to individual departments.

If the JSON data ingested by Pub/Sub includes department-specific labels (for instance, {"department": "marketing", ...}), is there a good way to estimate the costs incurred for processing and storing each department's data through Pub/Sub, Dataflow, and BigQuery?

Example

Messages received by Pub/Sub include the department, as follows:

{"department":"marketing", "log_type":"type1", "value":1}
{"department":"product1", "log_type":"type1", "value":1}
{"department":"marketing", "log_type":"type2", "amount":10, "date":"2024-03-06"}
{"department":"product1", "log_type":"type2", "amount":20, "date":"2024-03-05"}
{"department":"product2", "log_type":"type2", "amount":20, "date":"2024-03-04"}

For log_type=type1, the BigQuery table is project-id.common_dataset.type1:

department value
marketing 1
product1 1

For log_type=type2, the table is project-id.common_dataset.type2:

department amount date
marketing 10 2024-03-06
product1 20 2024-03-05
product2 20 2024-03-04
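The routing step in our Dataflow job maps each message to its destination table by log_type; a simplified sketch of that mapping (PROJECT_ID and DATASET are placeholders for our real values):

```python
import json

PROJECT_ID = "project-id"
DATASET = "common_dataset"

def destination_table(message_bytes: bytes) -> str:
    """Pick the BigQuery table for a Pub/Sub message based on its log_type."""
    record = json.loads(message_bytes)
    log_type = record["log_type"]  # e.g. "type1" or "type2"
    return f"{PROJECT_ID}.{DATASET}.{log_type}"

msg = b'{"department":"marketing", "log_type":"type1", "value":1}'
print(destination_table(msg))  # project-id.common_dataset.type1
```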

What we ultimately want is an estimated monthly cost per department for Pub/Sub, Dataflow, and BigQuery, like this:

department Pub/Sub Dataflow BigQuery
marketing 100 USD 100 USD 100 USD
product1 1000 USD 1000 USD 1000 USD
product2 500 USD 500 USD 500 USD
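If we measured per-department data volume, one simple estimation we could apply is splitting each service's total monthly bill in proportion to those volumes (a hypothetical sketch with made-up numbers; real bills also include fixed and per-job components this ignores):

```python
def allocate(total_cost: float, bytes_by_dept: dict) -> dict:
    """Split a service's monthly bill across departments,
    proportionally to bytes processed per department."""
    total_bytes = sum(bytes_by_dept.values())
    return {dept: round(total_cost * b / total_bytes, 2)
            for dept, b in bytes_by_dept.items()}

# Illustrative byte counts per department for one service:
pubsub_bytes = {"marketing": 100, "product1": 1000, "product2": 500}
print(allocate(1600.0, pubsub_bytes))
# {'marketing': 100.0, 'product1': 1000.0, 'product2': 500.0}
```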