I'm reading/watching anything I can about color management/color science and something that's not making sense to me is the scene-referred and display-referred workflows. Isn't everything display-referred, because your monitor is converting everything you see into something it can display?
While reading this article, I came across this image:
So, if I understand this right to follow a linear workflow, I should apply an inverse power function to any imported jpg/png/etc files that contain color data, to get it's gamma to be linear. I then work on the image, and when I'm ready to export, say to sRGB and save it as a png, it'll bake in the original transfer function.
But, even while it's linear, and I'm working on it, is't my monitor converting everything I see to what I can display? Isn't it basically applying it's own LUT? Isn't there already a gamma curve that the monitor itself is applying?
Also, from input to output, how many color space conversions take place, say if I'm working in the ACEScg color space. If I import a jpg texture, I linearize it and bring it into the ACEScg color space. I work on it, and when I render it out, the renderer applies a view transform to convert it from ACEScg to sRGB, and then also what I'm seeing is my monitor converting then from sRGB to my monitor's own ICC profile, right (which is always happening since everything I'm seeing is through my monitor's ICC profile)?
Finally, if I add a tone-mapping s curve, where does that conversion sit on that image?

I'm not sure your question is about programming, and the question has not much relevance to the title.
In any case:
light (photons) behave linearly. The intensity of two lights is the sum of the intensity of each light. For this reason a lot of image mangling is done in linear space. Note: camera sensors have often a near linear response.
eyes see nearly as with a gamma exponent of 2. So for compression (less noise with less bit information) gamma is useful. By accident also the CRT phosphors had a similar response (else the engineers would have found some other methods: in past such fields were done with a lot of experiments: feed back from users, on many settings).
Screens expects images with a standardized gamma correction (now it depends on the port, setting, image format). Some may be able to cope with many different colour spaces. Note: now we have no more CRT, so the screen will convert data from expected gamma to the monitor gamma (and possibly different value for each channel). So a sort of a LUT (it may just be electronically done, so without the T (table)). Screens are setup so that with a standard signal you get expected light. (There are standards (images and methods) to measure the expected bahavious, but so ... there is some implicit gamma correction of the gamma corrected values. It was always so: on old electronic monitor/TV technicians may get an internal knob to regulate single colours, general settings, etc.)
Note: Professionals outside computer graphic will use often opto-electronic transfer function (OETF) from camera (so light to signal) and the inverse electro-optical transfer function (EOTF) when you convert a signal (electric) to light, e.g. in the screen. I find this way to call the "gamma" show quickly what it is inside gamma: it is just a conversion between analogue electrical signal and light intensity.
The input image has own colour space. You now assume a JPEG, but often you have much more information (RAW or log, S-log, ...). So now you convert to your working colour space (it may be linear, as our example). If you show the working image, you will have distorted colours. But you may not able to show it, because you will use probably more then 8-bit per channel (colour). Common is 16 or 32bits, and often with half-float or single float).
And I lost some part of my answer (after last autosave). The rest was also complex, but the answer is already too long. In short. You can calibrate the monitor: two way: the best way (if you have a monitor that can be "hardware calibrated"), you just modify the tables in monitor. So it is nearly all transparent (it is just that the internal gamma function is adapted to get better colours). You still get the ICC, but for other reasons. Or you get the easy calibration, where the bytes of an image are transformed on your computer to get better colours (in a program, or now often by operating system, either directly by OS, or by telling the video card to do it). You should careful check that only one component will do colour correction.
Note: in your program, you may save the image as sRGB (or AdobeRGB), so with standard ICC profiles, and practically never as your screen ICC, just for consistency with other images. Then it is OS, or soft-preview, etc. which convert for your screen, but if the image as your screen ICC, just the OS colour management will see that ICC-image to ICC-output will be a trivial conversion (just copying the value).
So, take into account that at every step, there is an expected colour space and gamma. All programs expect it, and later it may be changed. So there may be unnecessary calculation, but it make things simpler: you should not track expectations.
And there are many more details. ICC is also use to characterize your monitor (so the capable gamut), which can be used for some colour management things. The intensions are just the method the colour correction are done, if the image has out-of-gamut colours (just keep the nearest colour, so you lose shade, but gain accuracy, or you scale all colours (and you expect your eyes will adapt: they do if you have just one image at a time). The evil is in such details.