Is it a bad idea to manage 1000+ alert configs in bazel?

I have something like 1000 alerts in YAML files that we parse with Python to spit out machine-readable files, which are then ingested downstream by compilers. I want to make the system easier to work with, and I think there are benefits to be had by moving the config from YAML into Bazel (which is already used extensively by others on the project).

I figured Bazel would be a good fit since rules/providers would offer clear, documented inputs and we wouldn't need to invoke some kind of additional generator. A lot of people I talk to seem to think this is abusing Bazel in some way, but I'm confused by that. Bazel just takes pieces of data and manipulates them, similar to what a generator would do, with the added benefit of caching that data when it doesn't change. It also integrates more nicely with the rest of the build system and should let us do more complicated/comprehensive checks sooner.

Am I wrong for thinking I can use Bazel for this? It just feels right.

1 Answer

justhecuke

TL;DR: I don't suggest using Bazel for this, but if you really want to, you can. While it is good that you are asking a "should I do this?" question, the details seem to focus more on "can I do this?"

The issue here is not "can Bazel do the work" -- it probably can, depending on the specifics of the code generation -- but "would the average developer be confused by what is going on" and "what happens if we scale it up".

Bazel is a build orchestration tool. It tells other tools what to do. You expect Bazel "code" to focus on build configuration: connecting tools with input source files to generate output files in a DAG that fosters parallel execution. An action, the basic building block of Bazel's execution, typically says "take this data from here, put it through this transformation, and make the output artifact available to others." You don't expect Bazel itself to contain the input data that gets handed directly to tools to generate output files.
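For concreteness, a minimal action in a BUILD file often looks like a genrule (this is just an illustrative sketch; the file and tool names are invented):

# The shape of a typical action: input -> tool -> output artifact.
genrule(
  name = "render_config",
  srcs = ["config.in"],
  outs = ["config.out"],
  cmd = "$(location //tools:render) $< > $@",
  tools = ["//tools:render"],
)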

It is probably possible to put your data directly into bazel with something like:

# Modular (recommended)
my_alarm(
  name = "alarm",
  trigger = "expr",
  alert_emails = ["[email protected]", "[email protected]"],
)

my_alarm_binary(
  name = "alarm_binary",
  deps = [":alarm", "//some/other/product:alarm"],
)

# Monolithic (avoid)
my_alarm_binary(
  name = "alarm_binary",
  config = [
    # you can create an `alarm` wrapper if you want
    struct(
      name = "something",
      trigger = "expr",
      alert_emails = ["[email protected]", "[email protected]"],
    ),
    ...
  ]
)
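Note that under the hood, my_alarm needs a rule and a provider defined in a .bzl file -- roughly something like this hypothetical sketch (the provider fields mirror the attributes above):

# alarm.bzl -- hypothetical sketch of the rule behind my_alarm.
AlarmInfo = provider(
  doc = "Configuration for a single alarm.",
  fields = ["trigger", "alert_emails"],
)

def _my_alarm_impl(ctx):
  # Expose this alarm's config to downstream rules like my_alarm_binary.
  return [AlarmInfo(
    trigger = ctx.attr.trigger,
    alert_emails = ctx.attr.alert_emails,
  )]

my_alarm = rule(
  implementation = _my_alarm_impl,
  attrs = {
    "trigger": attr.string(mandatory = True),
    "alert_emails": attr.string_list(),
  },
)

This is exactly the kind of non-trivial Starlark logic that someone on your team ends up owning.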

But you run into a distinct problem of complexity creep as you add more and more features and your config becomes more nested and more vertical. I've seen this sort of thing happen before and it typically results in unreadable BUILD files and the creation of new, empty packages simply to organize what is going on into separate BUILD files. You'll then create a set of .bzl files to collect common logic and constants together. Over time, you'll add specialized logic to help with certain groups of alarms that isn't common. And then you'll migrate your alarm framework and re-implement your configs.

Without knowing precisely what you are trying to accomplish, it is hard to say whether or not it is something I would support were you to write me a design doc or send a code review my way. A lot depends on implementation details, your use case, where the use cases might grow in the future, and what the other developers are willing to do.

Some questions to consider:

  • What is the problem you are solving? From your question, this sounds like a solution in search of a problem.
    • There's nothing about a YAML file which prevents documentation of inputs. You can consider these alternatives if you don't like YAML:
      • Migrate from YAML to text protobuf
      • Compile YAML to protobuf at build-time
    • You are invoking a new generator: the one you're implementing in bazel
  • How will you test these configurations? (One lightweight option is sketched after this list.)
  • If there is an error somewhere, will you get a good error message out that makes it easy for the developer to figure out a fix?
  • Who will maintain these alarms?
  • Are the other developers capable of maintaining, modifying, and testing non-trivial Bazel and Starlark (formerly Skylark) logic?
  • Is Starlark flexible enough to handle changes in alarm frameworks in the future? It would be quite a backfire if Starlark's limitations prevented you from changing your alarm backend and you had to migrate out of Bazel.
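On the testing question above: if the configs do live in Bazel, one lightweight starting point is bazel-skylib's build_test, which verifies that the listed targets actually build (a sketch, assuming bazel-skylib is available in your workspace):

load("@bazel_skylib//rules:build_test.bzl", "build_test")

# A smoke test: fails if :alarm_binary fails to build. It says nothing
# about whether the alarms themselves behave correctly.
build_test(
  name = "alarms_build_test",
  targets = [":alarm_binary"],
)

Semantic checks (does the trigger expression parse? do the emails resolve?) still require real test code on top of this.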

For the point about thousands of targets in a single package: that actually does introduce performance concerns. Yes, you have a cache, but if you run a query over that package, it can still take a while for Bazel to complete the operation. Likewise, a wildcard pattern like bazel test //my/code/here/... that also touches your package causes additional work. Basically, anything that requires enumerating the list of targets in that package can cause performance issues.

While it is unlikely that a few thousand new targets will make an appreciable difference, things can quickly get out of hand if you don't keep some discipline about the number of targets you add, if you use overly broad patterns, or if you use a lot of aspects.

Also, the caching would apply to processing the YAML files in the first place, no? You wouldn't need to re-run the generator if the YAML hasn't changed, so you get the caching benefits without re-implementing anything in Starlark/Bazel.
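Concretely, that hybrid looks like wrapping your existing Python generator in a genrule (a sketch; the target and flag names here are hypothetical):

py_binary(
  name = "alert_generator",
  srcs = ["alert_generator.py"],
)

# Bazel re-runs this only when the YAML or the generator changes,
# so you get the caching for free.
genrule(
  name = "generated_alerts",
  srcs = glob(["alerts/*.yaml"]),
  outs = ["alerts.out"],
  cmd = "$(location :alert_generator) --out $@ $(SRCS)",
  tools = [":alert_generator"],
)

The YAML stays the source of truth, and Bazel only orchestrates.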