Reassemble Python bytecode to source (CTF challenge)

I have a text file containing disassembled Python bytecode, which is part of the output you'd get from running python -m dis file.py. My goal is to reconstruct the source from this bytecode.

I've seen a similar question asked here, but the answers focus on tools that (as I understand it) only work if the bytecode file contains all the necessary information (Python bytecode version, timestamp, flags, etc.).

Disassembly listing (the contents of my text file):

##################################################
 15           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('loading application')
              4 CALL_FUNCTION            1
              6 POP_TOP

 17           8 LOAD_GLOBAL              1 (magic)
             10 LOAD_CONST               2 ('8934')
             12 LOAD_GLOBAL              2 (get_flag)
             14 CALL_FUNCTION            0
             16 CALL_FUNCTION            2
             18 STORE_FAST               0 (d)

 19          20 LOAD_GLOBAL              0 (print)
             22 LOAD_FAST                0 (d)
             24 CALL_FUNCTION            1
             26 POP_TOP
             28 LOAD_CONST               0 (None)
             30 RETURN_VALUE
None
##################################################
  4           0 LOAD_CONST               1 ('k\\PbYUHDAM[[VJlVAMVk[VWQE')
              2 RETURN_VALUE
None
##################################################
  7           0 LOAD_CONST               1 (b'')
              2 STORE_FAST               2 (out)

  9           4 LOAD_GLOBAL              0 (range)
              6 LOAD_GLOBAL              1 (len)
              8 LOAD_FAST                1 (f)
             10 CALL_FUNCTION            1
             12 CALL_FUNCTION            1
             14 GET_ITER
        >>   16 FOR_ITER                46 (to 64)
             18 STORE_FAST               3 (i)

 10          20 LOAD_FAST                2 (out)
             22 LOAD_GLOBAL              2 (bytes)
             24 LOAD_GLOBAL              3 (ord)
             26 LOAD_FAST                1 (f)
             28 LOAD_FAST                3 (i)
             30 BINARY_SUBSCR
             32 CALL_FUNCTION            1
             34 LOAD_GLOBAL              3 (ord)
             36 LOAD_FAST                0 (k)
             38 LOAD_FAST                3 (i)
             40 LOAD_GLOBAL              1 (len)
             42 LOAD_FAST                0 (k)
             44 CALL_FUNCTION            1
             46 BINARY_MODULO
             48 BINARY_SUBSCR
             50 CALL_FUNCTION            1
             52 BINARY_XOR
             54 BUILD_LIST               1
             56 CALL_FUNCTION            1
             58 INPLACE_ADD
             60 STORE_FAST               2 (out)
             62 JUMP_ABSOLUTE           16

 12     >>   64 LOAD_FAST                2 (out)
             66 RETURN_VALUE
None

What I've tried
I've tried some of the tools suggested in similar questions, such as uncompyle6, pycdc and pyc-xasm.

However, as I understand it, these tools expect a .pyc or disassembled Python file with all the header information (Python bytecode version, timestamp, flags, etc.). My file does not have this, so the tools only gave me errors. I mention this because I don't fully understand how to use these tools, so I may have missed something that would solve my problem. If I did, I would appreciate help with that as well.

My current solution
I'm currently trying to reconstruct the source by working out what each opcode does, following the docs at https://docs.python.org/3/library/dis.html, and writing the corresponding Python code. So far I've been able to reproduce the code up to the second return statement with the Python code below.

test.py

def bla():
    print("loading app")
    d = magic("8934", get_flag())
    print(d)

def magic():
    return "k\\PbYUHDAM[[VJlVAMVk[VWQE"

Output from python -m dis test.py:

Disassembly of <code object bla at 0x7fea8a11e240, file "test.py", line 5>:
  6           0 LOAD_GLOBAL              0 (print)
              2 LOAD_CONST               1 ('loading app')
              4 CALL_FUNCTION            1
              6 POP_TOP

  7           8 LOAD_GLOBAL              1 (magic)
             10 LOAD_CONST               2 ('8934')
             12 LOAD_GLOBAL              2 (get_flag)
             14 CALL_FUNCTION            0
             16 CALL_FUNCTION            2
             18 STORE_FAST               0 (d)

  8          20 LOAD_GLOBAL              0 (print)
             22 LOAD_FAST                0 (d)
             24 CALL_FUNCTION            1
             26 POP_TOP
             28 LOAD_CONST               0 (None)
             30 RETURN_VALUE

Disassembly of <code object magic at 0x7fea8a11e2f0, file "test.py", line 11>:
 12           0 LOAD_CONST               1 ('k\\PbYUHDAM[[VJlVAMVk[VWQE')
              2 RETURN_VALUE

However, I'm having trouble reproducing the Python code that matches the opcodes in the last block (blocks are separated by rows of '#'). I've matched some of the opcodes to the correct Python instructions, but the argument values still don't match, and the resulting Python code obviously makes no sense so far.

function get_flag:

def get_flag():
    out = b""
    for i in range(len(f)):
        out += bytes([ord(f[i]) ^ ord(k[i % len(k)])])
    return out

dis output of function get_flag

Disassembly of <code object get_flag at 0x7fea8a11e3a0, file "test.py", line 15>:
 16           0 LOAD_CONST               1 (b'')
              2 STORE_FAST               0 (out)

 17           4 LOAD_GLOBAL              0 (range)
              6 LOAD_GLOBAL              1 (len)
              8 LOAD_GLOBAL              2 (f)
             10 CALL_FUNCTION            1
             12 CALL_FUNCTION            1
             14 GET_ITER
        >>   16 FOR_ITER                46 (to 64)
             18 STORE_FAST               1 (i)

 18          20 LOAD_FAST                0 (out)
             22 LOAD_GLOBAL              3 (bytes)
             24 LOAD_GLOBAL              4 (ord)
             26 LOAD_GLOBAL              2 (f)
             28 LOAD_FAST                1 (i)
             30 BINARY_SUBSCR
             32 CALL_FUNCTION            1
             34 LOAD_GLOBAL              4 (ord)
             36 LOAD_GLOBAL              5 (k)
             38 LOAD_FAST                1 (i)
             40 LOAD_GLOBAL              1 (len)
             42 LOAD_GLOBAL              5 (k)
             44 CALL_FUNCTION            1
             46 BINARY_MODULO
             48 BINARY_SUBSCR
             50 CALL_FUNCTION            1
             52 BINARY_XOR
             54 BUILD_LIST               1
             56 CALL_FUNCTION            1
             58 INPLACE_ADD
             60 STORE_FAST               0 (out)
             62 JUMP_ABSOLUTE           16

 19     >>   64 LOAD_CONST               2 ('')
             66 RETURN_VALUE

Specifically, I need help understanding how the bytecode arguments (for example, the index after LOAD_FAST or the count after CALL_FUNCTION) affect the corresponding Python code, so that I can reverse the bytecode more reliably. I hope my question and goals are clear. All help will be appreciated.
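For reference, here is a minimal sketch of how the argument column can be read. It assumes a listing in the same format as above, with only a few lines of the third block copied in. The helpers local_slots and inferred_params are hypothetical names made up for this illustration. The idea is that the number after LOAD_FAST/STORE_FAST is a slot index into the function's local variable table, and CPython places parameters in the first slots. A name that is read before it is ever stored is therefore most likely a parameter, although this is a heuristic and not a guarantee.

```python
import re

# A few lines copied from the third block of the listing (not the full block).
LISTING = """
  7           0 LOAD_CONST               1 (b'')
              2 STORE_FAST               2 (out)
  9           4 LOAD_GLOBAL              0 (range)
              8 LOAD_FAST                1 (f)
             18 STORE_FAST               3 (i)
 10          20 LOAD_FAST                2 (out)
             36 LOAD_FAST                0 (k)
"""

# Captures: offset, opcode name, numeric argument, resolved name in parentheses.
LINE_RE = re.compile(r"(\d+)\s+([A-Z_]+)\s+(\d+)\s+\((.*)\)\s*$")


def local_slots(listing):
    """Return (slots, read_first): slot index -> name, and names read before any store."""
    slots, stored, read_first = {}, set(), []
    for line in listing.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue  # blank lines and opcodes without a resolved name
        _offset, opname, arg, name = match.groups()
        if opname == "STORE_FAST":
            slots[int(arg)] = name
            stored.add(name)
        elif opname == "LOAD_FAST":
            slots[int(arg)] = name
            if name not in stored and name not in read_first:
                read_first.append(name)
    return slots, read_first


def inferred_params(listing):
    """Heuristic: locals read before being stored, ordered by slot index."""
    slots, read_first = local_slots(listing)
    slot_of = {name: index for index, name in slots.items()}
    return sorted(read_first, key=slot_of.get)


print(inferred_params(LISTING))
```

Applied to the third block, this suggests a signature with k in slot 0 and f in slot 1, while out and i are ordinary locals. In a similar way, in the CPython versions that emit CALL_FUNCTION, its argument is the number of positional arguments taken from the stack, so CALL_FUNCTION 2 at offset 16 of the first block corresponds to a call with two arguments.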

There are 1 best solutions below

Prikshit kamal On

Answer To Security Valley's "Weird Code" CTF

You just need to convert the disassembled bytecode back into a .py file. However, you have to do it manually, because uncompyle6 and similar tools do not work on an incomplete .pyc file.

Here is the .py code for the bytecode in the question:

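def get_flag():
    # second block: returns the encoded flag string
    return "k\\PbYUHDAM[[VJlVAMVk[VWQE"


def magic(k, f):
    # third block: XOR each character of f with the repeating key k
    out = b""
    for i in range(len(f)):
        out += bytes([ord(f[i]) ^ ord(k[i % len(k)])])
    return out


def hello():
    # first block: decode the flag with the key '8934' and print it
    print("loading application")
    d = magic('8934', get_flag())
    print(d)


hello()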
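
One way to check a reconstruction like this, as a rough sketch: the magic function is repeated so the snippet runs on its own, and its code object is compared with the slot numbers in the listing. The co_argcount and co_varnames attributes and dis.dis are standard CPython. The opcode names printed by dis.dis depend on the interpreter version, so on a newer Python they may not match the listing exactly even when the source is right.

```python
import dis


def magic(k, f):
    out = b""
    for i in range(len(f)):
        out += bytes([ord(f[i]) ^ ord(k[i % len(k)])])
    return out


code = magic.__code__
print(code.co_argcount)  # number of positional parameters
print(code.co_varnames)  # parameters first, then the other locals, in slot order

# Compare by eye with the third block of the listing; opcode names vary by version.
dis.dis(magic)

# XOR with the same key is its own inverse, so applying magic twice round-trips.
sample = "hello"
once = magic("8934", sample)
twice = magic("8934", once.decode("latin-1"))
print(twice == sample.encode("latin-1"))
```

The slot order in co_varnames should line up with the LOAD_FAST/STORE_FAST indices in the listing (k, f, out, i), which is a quick sanity check that the parameters are in the right order.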

I hope this helps :)