I am writing software that performs a lot of image operations / compositing on a lot of (potentially big) images.
Multithreading helps a lot speed-wise, but Qt does not allow using multiple QPainters on the same image at the same time.
So I would have to do my image operations / compositing in each thread on a copy and then blit it back, which reduces the performance by a lot (depending on the use case, obviously).
So I came up with an idea which seems to work, but feels extremely hacky.
I get the target image data pointer (QImage::bits) and provide it to the worker threads.
In each worker thread I create a new QImage from the provided pointer. This means no copy and no blitting. It seems to work fine, as long as I make sure each pixel/tile is only worked on in one thread and I don't detach the target image.
My question is: Is this safe, and are there any other issues that could arise from this approach?
Example code
    QImage source = ...;
    QImage target = ...;
    QPainter::CompositionMode compositionMode = QPainter::CompositionMode_SourceOver;
    // calculate tiles (rect is the region to process, tileSize the tile edge length in pixels)
    QList<QRect> tiles;
    for (int y = rect.top(); y < rect.top() + rect.height(); y += tileSize) {
        for (int x = rect.left(); x < rect.left() + rect.width(); x += tileSize) {
            // clip the last tile of each row/column to the bounds of rect
            QRect tile(
                x, y,
                x + tileSize > rect.left() + rect.width() ? rect.left() + rect.width() - x : tileSize,
                y + tileSize > rect.top() + rect.height() ? rect.top() + rect.height() - y : tileSize
            );
            tiles.append(tile);
        }
    }
    // Get target pixel pointer and do threaded operation on each tile
    uchar *targetPix = target.bits();
    auto target_size = target.size();
    auto targetFormat = target.format();
    QList<int> lol = QtConcurrent::blockingMapped(tiles, [&target_size, &targetFormat, &source, targetPix, &compositionMode](const QRect &r) {
        // wrap the shared pixel buffer in a new QImage: no copy, no detach
        QImage tile_target(targetPix, target_size.width(), target_size.height(), targetFormat);
        QPainter p(&tile_target);
        p.setCompositionMode(compositionMode);
        // do your image operations here. For now we just do a simple draw
        p.drawImage(r.topLeft(), source, r);
        return 1; // In reality this would return sensible data ;)
    });
(This example increased the speed in my test by about 4.6 times btw. Depends on the operation and system of course.)
Short answer
This is indeed tricky (but that's often needed when you want bleeding-edge performance), but it should be thread safe, or at least possible to make thread safe, for certain operations. Of course, this depends on the operations you perform on tile_target. It's up to you to ensure that you never access bits outside the assigned tile (i.e. the portion of tile_target outside the rect r).
Some considerations
Ensure you only access the bits of the assigned tile
As tile_target refers to the whole image, it's up to you to ensure you don't access bits outside this target tile. Some problematic cases:
Possible solution?: One option that allows reading and/or writing the neighboring bits of a tile is to divide your image into stripes and process it in two steps:
This procedure allows you to modify half of each neighboring stripe (useful for anti-aliasing) or, if nobody writes to them, to read all the bits of the next and/or previous stripe (useful for filtering purposes).
This shouldn't decrease the efficiency significantly if you create enough stripes to keep all your CPU(s) busy (i.e. typically twice the number of threads your CPU(s) support).
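The two steps are not spelled out above. One plausible reading, assumed here, is to process the even-numbered stripes concurrently first and the odd-numbered stripes afterwards, so that two stripes running at the same time are never neighbors. A minimal sketch in plain C++, using std::thread on a row counter instead of QImage and QtConcurrent (processInTwoPasses, demoRowTouchCounts and stripeRows are hypothetical names, and one thread per stripe stands in for a real thread pool):

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper: calls work(firstRow, lastRowExclusive) once per stripe.
// Pass 0 runs stripes 0, 2, 4, ... and pass 1 runs stripes 1, 3, 5, ...
// so stripes that run concurrently are always separated by an idle stripe.
void processInTwoPasses(int height, int stripeRows,
                        const std::function<void(int, int)> &work)
{
    assert(height > 0 && stripeRows > 0);
    const int stripeCount = (height + stripeRows - 1) / stripeRows;
    for (int pass = 0; pass < 2; ++pass) {
        std::vector<std::thread> workers;
        for (int s = pass; s < stripeCount; s += 2) {
            const int first = s * stripeRows;
            const int last = std::min(first + stripeRows, height);
            workers.emplace_back(work, first, last);
        }
        for (std::thread &t : workers)
            t.join(); // barrier: finish this pass before starting the next
    }
}

// Usage sketch: every stripe adds 1 to its own rows and to one row of each
// neighboring stripe (a stand-in for anti-aliasing spill-over).
std::vector<int> demoRowTouchCounts(int height, int stripeRows)
{
    assert(stripeRows >= 2); // spill of 1 row per side must fit in half a stripe
    std::vector<int> rows(height, 0);
    processInTwoPasses(height, stripeRows, [&rows, height](int first, int last) {
        const int from = std::max(first - 1, 0);
        const int to = std::min(last + 1, height);
        for (int r = from; r < to; ++r)
            ++rows[r];
    });
    return rows;
}
```

In this sketch each stripe may spill at most half a stripe into its neighbors; a larger spill would let two stripes of the same pass write to the same rows.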
Should I worry about detaching?
That shouldn't be an issue with your current implementation. QImage::bits already detaches (if needed) the image (target) from any other possibly existing copy. Because you perform the concurrent operation while blocking the calling thread, the original image (target) will exist at least as long as the tile_target images exist.
Safer approach
Use a library that is dedicated to multi-threaded image processing, or that at least allows referencing a subimage.
Pass a copy of the tile (see QImage::copy) to each thread and write the result back into the original image (either using mutexes or by performing this operation in the calling thread). Depending on the computational intensity of the image operation, this extra copy may or may not be negligible. For the OP, this doesn't seem to be a viable option.
Note that those safer approaches may generate (slightly) different results from the single-threaded result in case of anti-aliasing or filtering. Those artifacts may be minimized by making the tiles as large as possible (i.e. by not creating more tiles than the number of threads your CPU supports).
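The copy-based approach can be sketched as follows, assuming a plain row-major 8-bit buffer instead of a QImage and std::async instead of QtConcurrent. Tile, copyTile, invert and processWithCopies are hypothetical names, and the invert operation is only a placeholder for the real image operation:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <vector>

// Hypothetical tile description: position and size inside the image.
struct Tile { int x, y, w, h; };

// Copy one tile out of a row-major 8-bit image (the role QImage::copy plays).
std::vector<std::uint8_t> copyTile(const std::vector<std::uint8_t> &image,
                                   int imageWidth, const Tile &t)
{
    assert(t.x >= 0 && t.y >= 0 && t.x + t.w <= imageWidth);
    std::vector<std::uint8_t> pixels(static_cast<std::size_t>(t.w) * t.h);
    for (int row = 0; row < t.h; ++row) {
        const auto src = image.begin()
            + static_cast<std::ptrdiff_t>(t.y + row) * imageWidth + t.x;
        std::copy(src, src + t.w,
                  pixels.begin() + static_cast<std::ptrdiff_t>(row) * t.w);
    }
    return pixels;
}

// Placeholder operation (an assumption): invert every pixel of the private copy.
std::vector<std::uint8_t> invert(std::vector<std::uint8_t> pixels)
{
    for (std::uint8_t &p : pixels)
        p = static_cast<std::uint8_t>(255 - p);
    return pixels;
}

// Workers only touch their private copies; the calling thread writes back.
void processWithCopies(std::vector<std::uint8_t> &image, int imageWidth,
                       const std::vector<Tile> &tiles)
{
    std::vector<std::future<std::vector<std::uint8_t>>> results;
    for (const Tile &t : tiles)
        results.push_back(std::async(std::launch::async, invert,
                                     copyTile(image, imageWidth, t)));

    for (std::size_t i = 0; i < tiles.size(); ++i) {
        const Tile &t = tiles[i];
        const std::vector<std::uint8_t> pixels = results[i].get();
        for (int row = 0; row < t.h; ++row)
            std::copy(pixels.begin() + static_cast<std::ptrdiff_t>(row) * t.w,
                      pixels.begin() + static_cast<std::ptrdiff_t>(row + 1) * t.w,
                      image.begin()
                          + static_cast<std::ptrdiff_t>(t.y + row) * imageWidth + t.x);
    }
}
```

Because the write-back happens in the calling thread after all copies have been taken, no mutex is needed here; the price is one extra copy in and one extra copy out per tile.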
Use GPU
Image processing is often much faster when using your GPU, especially for things like filtering. But that's not something QImage supports out of the box.