Description
As a Python beginner, I would sincerely appreciate some guidance.
I use opentelemetry-python to manually instrument HTTP metrics in my Python project. The project runs under gunicorn and sends its metrics data to opentelemetry-exporter.
However, I find that opentelemetry-exporter receives multiple, conflicting sets of data from the different gunicorn workers.
My setup is as follows:
1、OTLP exporter
The OTLP exporter receives the data and exports it to Prometheus.
It is configured by a file called collector-config.yaml with the following content:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
processors:
  memory_limiter:
    check_interval: 5s
    limit_percentage: 85
    spike_limit_percentage: 25
  batch/metrics:
    timeout: 200ms
    send_batch_size: 8192
    send_batch_max_size: 8192
exporters:
  prometheus:
    endpoint: "0.0.0.0:8073"
    resource_to_telemetry_conversion:
      enabled: false
service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch/metrics]
      exporters: [prometheus]
2、My Flask web demo
SDK setup:
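import logging
import os

from opentelemetry import metrics
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import (
    OTLPMetricExporter as OTLPMetricExporterGRPC,
)
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader

# OTLP_EXPORTER_ENDPOINT is a constant defined elsewhere in my project (not shown).
class OtlpConfig:
    def __init__(self):
        # Try several spellings of the variable name, then fall back to localhost.
        endpoint = (os.getenv(OTLP_EXPORTER_ENDPOINT)
                    or os.getenv(OTLP_EXPORTER_ENDPOINT.lower())
                    or os.getenv(OTLP_EXPORTER_ENDPOINT.lower().replace('_', '.'))
                    or "http://127.0.0.1:4317")
        metrics_exporter = OTLPMetricExporterGRPC(endpoint)
        reader = PeriodicExportingMetricReader(exporter=metrics_exporter, export_interval_millis=15,
                                               export_timeout_millis=5)
        meter_provider = MeterProvider(
            metric_readers=[reader]
        )
        # This runs once per gunicorn worker, so each worker gets its own provider.
        metrics.set_meter_provider(meter_provider)
        meter = metrics.get_meter(name="flyme-fcop-python-meter-sdk")
        self.meter = meter
        logging.info("Starting OTLP successfully")

    def get_meter(self):
        return self.meter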
I start the app with gunicorn, using two workers:
gunicorn --workers=2 -b 0.0.0.0:8080 main:app
It runs with 2 workers:
[2024-03-21 10:15:55 +0000] [1] [INFO] Starting gunicorn 21.2.0
[2024-03-21 10:15:55 +0000] [1] [DEBUG] Arbiter booted
[2024-03-21 10:15:55 +0000] [1] [INFO] Listening at: http://0.0.0.0:8080 (1)
[2024-03-21 10:15:55 +0000] [1] [INFO] Using worker: gthread
[2024-03-21 10:15:55 +0000] [8] [INFO] Booting worker with pid: 8
[2024-03-21 10:15:55 +0000] [9] [INFO] Booting worker with pid: 9
I then requested my instrumented HTTP endpoint 3 times. OtlpConfig was initialized twice, once for each of the 2 workers.
Of the 3 requests in total, 1 landed on the worker with pid 8 and 2 landed on the worker with pid 9.
As a result, opentelemetry-exporter receives separate data from each gunicorn worker.
The data shown at the Prometheus endpoint can differ from one visit to the same address to the next, as follows:
First request to http://127.0.0.1:8073/metrics:
http_server_request_seconds_sum{job="unknown_service",url="/three/app/process"} 0.30804229132064576
http_server_request_seconds_count{job="unknown_service",url="/three/app/process"} 1
Second request to http://127.0.0.1:8073/metrics:
http_server_request_seconds_sum{job="unknown_service",url="/three/app/process"} 0.5080454164143
http_server_request_seconds_count{job="unknown_service",url="/three/app/process"} 2
When Prometheus pulls the data from opentelemetry-exporter, the results are miscalculated: the time series moves up and down instead of increasing steadily the way normal data does.
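To illustrate what I think is happening, here is a minimal simulation in plain Python. It is only a sketch under my own assumption that the Prometheus endpoint keeps the most recent value per unique label set; the `replay` function and the `worker` label are hypothetical names I made up for this example, not part of any OpenTelemetry API.

```python
def replay(exports):
    """Return the series table after each export (last write wins per label set)."""
    table = {}
    snapshots = []
    for labels, count in exports:
        table[tuple(sorted(labels.items()))] = count
        snapshots.append(dict(table))
    return snapshots


base = {"job": "unknown_service", "url": "/three/app/process"}

# (worker pid, cumulative request count held by that worker's own SDK),
# matching my test: pid 8 served 1 request, pid 9 served 2.
worker_exports = [(8, 1), (9, 2), (8, 1), (9, 2)]

# Case 1: both workers export under identical labels, so they share one
# series and each export overwrites the value left by the other worker.
shared = replay([(base, count) for _pid, count in worker_exports])
flapping = [sum(snapshot.values()) for snapshot in shared]

# Case 2: a hypothetical per-worker label keeps the two series apart,
# so the per-worker counts can be summed to the real total.
tagged = replay(
    [({**base, "worker": str(pid)}, count) for pid, count in worker_exports]
)
stable_total = sum(tagged[-1].values())

print(flapping)
print(stable_total)
```

In case 1 the visible count alternates between the two workers' values, which matches the up-and-down series I see. In case 2 the total is the sum over both workers.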
Questions
1、Does opentelemetry-exporter group the data it receives by worker by default? Does it have an aggregation plugin that resolves this?
2、If there is no such solution, I will try appending the worker id to my metric tags so that I can tell the data apart. However, reporting will then be delayed, because the Prometheus server can only collect all metrics after pulling {workers} * {interval} times.
3、The Prometheus Python client suggests using multiprocess mode: https://prometheus.github.io/client_python/multiprocess/
and OpenTelemetry suggests https://opentelemetry.io/docs/specs/otel/metrics/data-model/#overlap-resolution
Has anybody else encountered this?
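For question 2, what I have in mind is roughly the sketch below: a gunicorn config file that sets up the MeterProvider in the `post_fork` server hook and gives each worker its own `service.instance.id` resource attribute. The service name `my-flask-demo`, the `worker-<pid>` format, and the helper `worker_resource_attributes` are my own placeholders. I assume the Prometheus exporter then exposes the instance id as a separate label per worker (or that `resource_to_telemetry_conversion` would need to be enabled), but I have not verified this.

```python
# gunicorn.conf.py (sketch, not tested against a real deployment)

SERVICE_NAME = "my-flask-demo"  # hypothetical service name


def worker_resource_attributes(pid):
    """Build resource attributes that differ per worker process."""
    return {
        "service.name": SERVICE_NAME,
        "service.instance.id": f"worker-{pid}",  # hypothetical id format
    }


def post_fork(server, worker):
    """gunicorn server hook: runs inside each worker right after the fork."""
    # Imported here so the SDK is only loaded and configured in the workers.
    from opentelemetry import metrics
    from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import (
        OTLPMetricExporter,
    )
    from opentelemetry.sdk.metrics import MeterProvider
    from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
    from opentelemetry.sdk.resources import Resource

    resource = Resource.create(worker_resource_attributes(worker.pid))
    exporter = OTLPMetricExporter(endpoint="http://127.0.0.1:4317")
    reader = PeriodicExportingMetricReader(exporter)
    metrics.set_meter_provider(
        MeterProvider(resource=resource, metric_readers=[reader])
    )
```

I would then start the app with gunicorn -c gunicorn.conf.py --workers=2 -b 0.0.0.0:8080 main:app and sum the per-worker series on the Prometheus side.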
I am eager to learn from the seasoned individuals here and would be grateful for any guidance or solutions provided. Thank you all, esteemed experts, for your invaluable assistance!