all APIs exported from a dll are rerouted (taking care with forwarded exports etc.). each rerouted API has a 'pre' and an 'after' portion, and between them is the 'live' portion...
like this:
pre code -> output param info, stack, whatever (if it's an API you're interested in and you've coded a handler for it)
real ('live') code -> simply pushes the params again (if any) and calls the API
after code -> the API has been called, so log info or run the handler etc. (preserving registers); the code then returns (fixing up the stack and cleaning up)
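the three portions above can be sketched in a few lines. this is only a model of the control flow in python, not the real thing (which lives in asm inside the target process); `reroute`, `get_tick_count` and the `log` list are made-up names for illustration.

```python
import functools

def reroute(api, pre=None, after=None):
    """Wrap `api` so every call goes pre -> real -> after."""
    @functools.wraps(api)
    def stub(*args, **kwargs):
        if pre is not None:
            pre(api.__name__, args, kwargs)            # pre code: dump params
        result = api(*args, **kwargs)                  # real code: pass params on, call the api
        if after is not None:
            after(api.__name__, args, kwargs, result)  # after code: log / run handler
        return result                                  # caller sees the untouched result
    return stub

# hypothetical stand-in for an exported api
def get_tick_count(scale):
    return 1000 * scale

log = []
hooked = reroute(
    get_tick_count,
    pre=lambda name, args, kwargs: log.append(("pre", name, args)),
    after=lambda name, args, kwargs, result: log.append(("after", name, result)),
)
value = hooked(3)
```

the caller only ever sees `hooked`, and it behaves exactly like the original apart from the two log entries.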
problems -> multi threading. several threads can be inside your handler at the same time, so keep per-call state in locals on the stack rather than in globals. it also generally needs to be done in asm (so it's clean and tidy, and small).
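a rough illustration of the "locals on the stack" point, again in python rather than asm. everything the stub needs for one call sits in local variables, so each thread has its own copy; only the shared log is guarded by a lock. `real_api`, `stub` and `results` are hypothetical names.

```python
import threading

results = []
results_lock = threading.Lock()   # only the shared log needs protecting

def real_api(x):
    return x * 2

def stub(x):
    # per-call state lives in locals, so threads cannot trample each other
    seen_arg = x
    ret = real_api(x)
    record = (seen_arg, ret)
    with results_lock:
        results.append(record)
    return ret

threads = [threading.Thread(target=stub, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

if `seen_arg` were a global instead, two threads entering the stub together could log each other's params.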
after you've got it all working, it's pretty damned nice
every export should have its own unique address, making import table fixing etc a doddle
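why unique addresses help, as a toy model: if each export has its own stub, a pointer found in an import table maps straight back to one export name. here `id()` stands in for an address, and `exports`, `make_stub`, `resolve` and `iat` are all invented for the example.

```python
def make_stub(name, target):
    # a fresh function object per export, i.e. a unique 'address' each
    def stub(*args):
        return target(*args)
    stub.export_name = name
    return stub

# hypothetical export table of some dll
exports = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b}

stubs = {name: make_stub(name, fn) for name, fn in exports.items()}
stub_to_name = {id(stub): name for name, stub in stubs.items()}

def resolve(pointer):
    """Name the export that a stub pointer from an import table belongs to."""
    return stub_to_name[id(pointer)]

iat = [stubs["sub"], stubs["add"]]    # made-up import table holding stub pointers
names = [resolve(p) for p in iat]
```

with one shared stub for everything, the reverse lookup would be ambiguous and you'd have to work out the target some other way.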
once you've 'targeted' the APIs you're after, you can code record/playback portions, making the call do whatever you like...
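a minimal sketch of the record/playback idea, assuming the simplest possible scheme: replay results in the order they were recorded, without matching on the params. `Recorder` and `read_serial` are hypothetical, and this version is not thread safe.

```python
class Recorder:
    def __init__(self, api):
        self.api = api
        self.tape = []          # (args, result) per recorded call
        self.mode = "record"
        self.pos = 0

    def __call__(self, *args):
        if self.mode == "record":
            result = self.api(*args)          # real call, result kept on tape
            self.tape.append((args, result))
            return result
        # playback: the real api is never touched
        _recorded_args, result = self.tape[self.pos]
        self.pos += 1
        return result

calls = []

def read_serial(slot):          # hypothetical api we want to replay
    calls.append(slot)
    return "SN-%d" % slot

hooked = Recorder(read_serial)
first = [hooked(1), hooked(2)]
hooked.mode = "playback"
replayed = [hooked(1), hooked(2)]
```

during playback `read_serial` is not called at all, which is the whole point: the caller gets the recorded answers back.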
simple in theory, hard to get done, but once done it's probably the most powerful system you can have. it needs no anti-debugging workarounds and pretty much has complete control of the process (the code doesn't use the debug APIs, for example. in 2k or higher all the handlers and rerouting are local to the process, i.e. not global on the system, so they won't be 'seen' by anti-debug code and so on, provided it's coded well of course...)
the record/playback does work for some protections...
many methods to get it done...
1. dll injection (can get very messy)
2. 'fixing' windows file protection and patching the dlls themselves, using events/flags to enable/disable the handler code. tons of work but pretty damned safe. generally if you're doing this you've got to be good, and if you want to be safe, use vmware
3. both of the above
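the events/flags part of method 2 could look something like this in outline. the assumption here is a single on/off flag checked at the top of the stub; `threading.Event` just stands in for whatever event object or flag the real code would use, and the names are invented.

```python
import threading

handler_enabled = threading.Event()   # cleared = stub is a plain pass-through
seen = []

def real_api(x):
    return x + 1

def stub(x):
    if handler_enabled.is_set():      # handler code only runs while the flag is set
        seen.append(x)
    return real_api(x)                # the real api is always called

a = stub(1)              # flag clear: nothing logged
handler_enabled.set()
b = stub(2)              # flag set: logged
handler_enabled.clear()
c = stub(3)              # flag clear again
```

with the flag clear, the patched dll behaves like the original for every process that loads it, which is what makes the patching approach bearable.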
only shit thing is when you have it done, lots of lamers ask you for it..... (and none should get it)