In the Linux kernel, when the processor context switches from one thread to another, the state of the registers is saved into the PCB (process control block), and some more bookkeeping is done to ensure that the exact state can be loaded again.
This saving and loading of registers to and from kernel memory takes some CPU cycles. So is this time counted under User CPU, System CPU, or somewhere else?
That time is definitely supposed to be under System CPU. Any time spent in system calls and interrupt handling should be under System CPU, not User CPU. User CPU is the time spent in user mode executing the instructions of the running ELF binary and any supporting libraries, and nothing else. Even the kernel-side work of carrying out I/O counts as System CPU.
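If you want to see the split for yourself, here is a minimal sketch in Python using `os.times()`, which reports the user and system CPU time charged to the current process. The helper names (`cpu_delta`, `user_heavy`, `syscall_heavy`) are made up for illustration, and reading from `/dev/zero` is just an assumed cheap way to issue many system calls on Linux.

```python
import os

def cpu_delta(work):
    """Run work() and return (user_seconds, system_seconds) charged to this process."""
    start = os.times()
    work()
    end = os.times()
    return end.user - start.user, end.system - start.system

def user_heavy():
    # Pure computation: runs almost entirely in user mode.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

def syscall_heavy():
    # Many small reads: every os.read() enters the kernel.
    # Assumption: /dev/zero exists (true on Linux).
    fd = os.open("/dev/zero", os.O_RDONLY)
    try:
        for _ in range(200_000):
            os.read(fd, 4096)
    finally:
        os.close(fd)

if __name__ == "__main__":
    for name, work in (("user_heavy", user_heavy), ("syscall_heavy", syscall_heavy)):
        user_s, system_s = cpu_delta(work)
        print(f"{name}: user={user_s:.3f}s system={system_s:.3f}s")
```

On a typical Linux box I would expect the first loop to be dominated by user time and the second to show a noticeably larger system share, but the exact numbers depend on the machine and the timer resolution.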
Looking at the documentation in Section 1.8, we see
Of course, context switches access kernel-level data, not userland data. Thus this code runs in kernel mode, and, to the extent that the documentation is accurate, we can be confident that it is counted as system time.
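A rough way to check this empirically is to force a lot of context switches and watch where the CPU time lands. The sketch below is only an illustration under some assumptions: it uses `resource.getrusage` to read user time, system time, and context-switch counts for the current process, and it bounces a byte between a parent and a forked child over two pipes so that each side keeps blocking and being rescheduled. The function names (`cpu_snapshot`, `ping_pong`, `measure`) are hypothetical. Note that the system time it reports also includes the `read`/`write` system calls themselves, so it does not isolate the cost of the switch alone.

```python
import os
import resource

def cpu_snapshot():
    """Return (user_s, system_s, voluntary_switches, involuntary_switches) for this process."""
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return ru.ru_utime, ru.ru_stime, ru.ru_nvcsw, ru.ru_nivcsw

def ping_pong(rounds):
    """Bounce one byte between parent and child so each side blocks and gets rescheduled."""
    p2c_r, p2c_w = os.pipe()  # parent -> child
    c2p_r, c2p_w = os.pipe()  # child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: echo every byte back, then exit without running parent cleanup.
        os.close(p2c_w)
        os.close(c2p_r)
        for _ in range(rounds):
            os.write(c2p_w, os.read(p2c_r, 1))
        os._exit(0)
    # Parent: send a byte, then block until the child answers.
    os.close(p2c_r)
    os.close(c2p_w)
    for _ in range(rounds):
        os.write(p2c_w, b"x")
        os.read(c2p_r, 1)
    os.waitpid(pid, 0)
    os.close(p2c_w)
    os.close(c2p_r)

def measure(rounds=10_000):
    """Return the change in (user_s, system_s, voluntary, involuntary) across a ping-pong run."""
    before = cpu_snapshot()
    ping_pong(rounds)
    after = cpu_snapshot()
    return tuple(a - b for a, b in zip(after, before))

if __name__ == "__main__":
    user_s, system_s, voluntary, involuntary = measure()
    print(f"user={user_s:.3f}s system={system_s:.3f}s "
          f"voluntary={voluntary} involuntary={involuntary}")
```

If the voluntary switch count climbs with `rounds` while user time stays comparatively small, that is at least consistent with the switching overhead being accounted as system time.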