@autoreleasepool without Foundation or CoreFoundation (libobjc only)


NOTE: All references to libobjc are referring to Apple's runtime. I'll deal with the GNU runtime later.


I'm trying to get a handle on what exactly @autoreleasepool does at runtime so that I can use it in my foundation-less framework.

I've been able to hook into @(#), @[] and @{} to return instances of my number, array and dictionary classes — all derived from my root class and easier to hack as all the plumbing happens in objc — but @autoreleasepool seems to be handled differently by the compiler.

Rather than just injecting calls to [[NSAutoreleasePool alloc] init] and [pool release] which I could maybe swizzle in the objc layer, the compiler injects calls to two private C functions in the runtime: objc_autoreleasePoolPush() and objc_autoreleasePoolPop()... and for whatever reason, those C functions do not in turn call [[NSAutoreleasePool alloc] init] and [pool release].

I had previously thought that calling objc_autoreleasePoolPush() was actually creating a new pool and pushing it onto the pool stack, but the return value is 0x01, which is maybe some sort of sentinel/placeholder value — it's definitely not an instance of NSAutoreleasePool.

Anyway, what I need is to either:

A: intercept allocation/initialization/deallocation of NSAutoreleasePools injected during compilation of @autoreleasepool so that I can alloc/init/dealloc instances of my own autorelease pool class

or

B: implement a separate, post-compilation binary patching step that overwrites calls to these C functions with calls to my own corresponding function addresses

Any ideas on either of these options?

The Dreams Wind

...those C functions do not in turn call [[NSAutoreleasePool alloc] init] and [pool release].

A good part of the Objective-C runtime library has been rewritten in C++ in recent versions, and the autorelease pool block is no exception: the use of NSAutoreleasePool was completely replaced by the functions you found, which under the hood call static methods of the AutoreleasePoolPage C++ class. This means that nowadays @autoreleasepool has nothing to do with NSAutoreleasePool apart from their shared history.

I had previously thought that calling objc_autoreleasePoolPush() was actually creating a new pool and pushing it onto the pool stack, but the return value is 0x01...

According to the clang documentation, objc_autoreleasePoolPush() returns an opaque "handle" to a new autorelease pool. An "opaque handle" in this case is just a void pointer that identifies the pool to the runtime (it is not an instance of NSAutoreleasePool), so you don't get access to any interface through it; you can only pass it back as an argument to other functions of the same family, such as objc_autoreleasePoolPop().
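To make the "opaque handle" idea concrete, here is a minimal sketch in plain C of a push/pop pair that hands out such handles. It is a toy model under my own assumptions, not how libobjc implements AutoreleasePoolPage: every toy_* name is hypothetical, the stack has a fixed size, and "releasing" an object only bumps a counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; every toy_* name is invented for illustration. */
#define TOY_POOL_CAPACITY 64

static void  *toy_stack[TOY_POOL_CAPACITY]; /* objects and pool boundaries */
static size_t toy_top;                      /* index of the next free slot */
static size_t toy_released;                 /* how many objects were drained */

/* Stand-in for sending -release to an object; here it only counts. */
static void toy_release(void *object) {
    (void)object;
    toy_released++;
}

/* Open a pool: store a boundary marker and return its slot as the handle. */
void *toy_pool_push(void) {
    assert(toy_top < TOY_POOL_CAPACITY);
    toy_stack[toy_top] = NULL;              /* NULL marks a pool boundary */
    return &toy_stack[toy_top++];
}

/* Register an object with the innermost open pool. */
void toy_pool_add(void *object) {
    assert(object != NULL && toy_top < TOY_POOL_CAPACITY);
    toy_stack[toy_top++] = object;
}

/* Close a pool: release everything added since the matching push,
   newest first, including the contents of any nested pools. */
void toy_pool_pop(void *handle) {
    void **boundary = handle;
    assert(boundary >= toy_stack && boundary < toy_stack + toy_top);
    assert(*boundary == NULL);
    while (toy_stack + toy_top > boundary + 1) {
        void *object = toy_stack[--toy_top];
        if (object != NULL) {               /* skip nested boundaries */
            toy_release(object);
        }
    }
    toy_top = (size_t)(boundary - toy_stack); /* drop the boundary itself */
}

/* Accessors so callers can inspect the model's state. */
size_t toy_pool_depth(void)    { return toy_top; }
size_t toy_pool_released(void) { return toy_released; }
```

The caller never looks inside the handle; it only passes it back to toy_pool_pop(), which is the same contract the compiler relies on for objc_autoreleasePoolPush() and objc_autoreleasePoolPop().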

what I need is to either:

A: intercept allocation/initialization/deallocation of NSAutoreleasePools injected during compilation of @autoreleasepool so that I can alloc/init/dealloc instances of my own autorelease pool class

The only way to force an @autoreleasepool block to use NSAutoreleasePool is to compile the Objective-C code for a legacy runtime version (macosx-10.6 or earlier).

or

B: implement a separate, post-compilation binary patching step that overwrites calls to these C functions with calls to my own corresponding function addresses

I'm not an expert in that area, but it seems to be the only viable option if you want to inject your own functionality in place of @autoreleasepool blocks.

However, for debugging purposes you can take advantage of how clang handles ODR violations for C functions, which is an undefined-behaviour scenario: the linking step does not fail, but which function gets called is not guaranteed. You certainly should not use such code in any serious project.


@import ObjectiveC;
#include <stdio.h>

#pragma mark Autorelease functions

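// Same name as the runtime's push function: logs the call and returns a
// dummy non-NULL handle (the address of the static variable itself).
void *objc_autoreleasePoolPush(void) {
    printf("AR New init\n");
    static void *handle = &handle;
    return handle;
}
// Same name as the runtime's pop function: logs the call, releases nothing.
void objc_autoreleasePoolPop(void *handle) {
    printf("AR New drain\n");
}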

#pragma mark Autorelease object

@interface NSAutoreleasePool : NSObject

- (void)drain;

@end

@implementation NSAutoreleasePool

- (instancetype)init {
    if (self = [super init]) {
        printf("AR Old init\n");
    }
    return self;
}

- (void)drain {
    printf("AR Old drain\n");
}

@end

#pragma mark Main

int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSObject *object = [[NSObject new] autorelease];
    }
    return 0;
}

To summarise, if you build this code for a legacy runtime, it will use the custom NSAutoreleasePool in place of @autoreleasepool:

% clang -ObjC -fmodules -lobjc -fobjc-runtime=macosx-10.6 -o main main.m
% ./main 
AR Old init
AR Old drain

With the latest runtime, the program will use the redefined functions (again, this is just for demo purposes, avoid using such an approach in your code):

% clang -ObjC -fmodules -lobjc -o main main.m                          
% ./main                                     
AR New init
AR New drain

EDIT

For the ODR part, I was originally convinced that linking with lld could end up in undefined behaviour, since two symbols with the same name exist in the program. However, after digging a little deeper into the question, it looks like the rules are well-defined (at least for the llvm linker):

SymbolTable

SymbolTable is basically a hash table from strings to Symbols with logic to resolve symbol conflicts. It resolves conflicts by symbol type.

  • If we add Defined and Undefined symbols, the symbol table will keep the former.
  • If we add Defined and Lazy symbols, it will keep the former.
  • If we add Lazy and Undefined, it will keep the former, but it will also trigger the Lazy symbol to load the archive member to actually resolve the symbol.
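The three quoted rules can be sketched as a small C function. This is only my reading of the quoted text, not lld's actual code: the sym_* names are hypothetical, and pairs the quote does not mention (two symbols of the same kind) are reported as not covered rather than guessed at.

```c
#include <assert.h>

/* Hypothetical model of the quoted rules; the enum order doubles as the
   priority (Undefined < Lazy < Defined). */
typedef enum { SYM_UNDEFINED, SYM_LAZY, SYM_DEFINED } sym_kind;

typedef struct {
    sym_kind kept;         /* which of the two symbols stays in the table */
    int      load_archive; /* nonzero if the Lazy symbol must load its archive member */
    int      covered;      /* zero if the quoted rules say nothing about this pair */
} sym_resolution;

/* Resolve a conflict between two symbols that share one name. */
sym_resolution sym_resolve(sym_kind a, sym_kind b) {
    sym_resolution result = { a, 0, 0 };
    if (a == b) {
        return result;     /* same kind twice: outside the quoted rules */
    }
    result.kept = (a > b) ? a : b;
    result.covered = 1;
    /* With two different kinds, Lazy can only win against Undefined,
       which is exactly the case that triggers loading the archive member. */
    result.load_archive = (result.kept == SYM_LAZY);
    return result;
}
```

For example, sym_resolve(SYM_LAZY, SYM_UNDEFINED) keeps the Lazy symbol and sets load_archive, while sym_resolve(SYM_DEFINED, SYM_LAZY) keeps the Defined symbol and loads nothing.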

Whether this is reliable outside the macOS / LLVM ecosystem is well beyond my expertise.