This isn't my area, just some armchair UI design thoughts - feel free to rip them apart!
When it was just blinking "tap and drag" text, test subjects kept tapping the center of the screen and asking why it won't move.
When I added some text and an indicator showing where to tap (bottom corners), two of the test subjects eventually understood what to do. Others said "I don't like reading, I want to understand right away".
After that, I made the current sketch and programmed it. I also made it so that some enemies spawn behind you, indicated with red arrows, so the player knows that they have to move back in some cases and that they can only move within a limited area.
I tested it with two test subjects. One of them kept saying "I don't understand what to do" and kept swiping the screen. The second test subject, who was trying it for the first time, looked at the bottom corner and understood what to do perfectly, without any guidance, and also killed all the spawned enemies that were indicated with arrows.
What about anticipating common user behaviour and building animations to help train them? For example, if they tap or swipe at the centre of the screen, show them a dedicated animation to try to get them to tap the corner, e.g. flashing arrows of some kind.
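To make that idea concrete, here is a rough sketch of the detection logic only, in Python because the thread doesn't say what engine the game uses. Everything in it is hypothetical: the `HintTrainer` name, the definition of "centre" as the middle 50% of each axis, and the threshold of two misplaced touches are assumptions for illustration, not anything from the actual game.

```python
from dataclasses import dataclass


@dataclass
class HintTrainer:
    """Counts consecutive touches in the screen centre and signals when to hint."""
    width: float
    height: float
    center_fraction: float = 0.5   # assumed: middle 50% of each axis is "centre"
    misses_before_hint: int = 2    # assumed: hint after two centre touches in a row
    misses: int = 0

    def in_center(self, x: float, y: float) -> bool:
        # Margins left over on each side once the central region is removed.
        margin_x = self.width * (1 - self.center_fraction) / 2
        margin_y = self.height * (1 - self.center_fraction) / 2
        return (margin_x <= x <= self.width - margin_x
                and margin_y <= y <= self.height - margin_y)

    def on_touch(self, x: float, y: float) -> bool:
        """Return True when the corner-arrow animation should be played."""
        if self.in_center(x, y):
            self.misses += 1
        else:
            self.misses = 0  # player touched near the edge, stop nagging
        return self.misses >= self.misses_before_hint
```

The game would call `on_touch` for every tap or swipe start and play the flashing arrows whenever it returns `True`. Resetting the counter on a touch near the edge means the hint goes away as soon as the player does the right thing.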
I didn't want to limit the player to just one side. You can touch anywhere on the screen and it becomes a "virtual joystick".
But yes, I highlighted that area and put a dragging finger there, because people kept tapping and dragging the "character" and blocking their field of view with their hands.
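For anyone unfamiliar with the "touch anywhere" joystick described above, this is roughly how such a control tends to work. It is a minimal sketch in Python under my own assumptions, not the poster's code: the `VirtualJoystick` class, its method names, and the `max_radius` value are all made up for illustration.

```python
import math


class VirtualJoystick:
    """Floating joystick: wherever the finger lands becomes the stick's centre."""

    def __init__(self, max_radius: float = 80.0):
        self.max_radius = max_radius  # assumed drag distance (px) for full deflection
        self.origin = None            # set on touch down, cleared on touch up

    def touch_down(self, x: float, y: float) -> None:
        self.origin = (x, y)

    def touch_move(self, x: float, y: float) -> tuple:
        """Return a direction vector with each component in [-1, 1]."""
        if self.origin is None:
            return (0.0, 0.0)
        dx = x - self.origin[0]
        dy = y - self.origin[1]
        dist = math.hypot(dx, dy)
        if dist == 0:
            return (0.0, 0.0)
        # Clamp the drag length to max_radius, then normalise to unit range.
        scale = min(dist, self.max_radius) / self.max_radius / dist
        return (dx * scale, dy * scale)

    def touch_up(self) -> None:
        self.origin = None
```

Because the origin is set wherever the finger first lands, nothing in the control itself stops a player from starting the drag on top of the character, which is why the highlighted area and the dragging-finger animation matter.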
I think it might be a poor idea to allow the user to control the character with the centre of the screen, given that this will result in them being unable to tell what is happening due to their hand obscuring the screen. I'd strongly consider limiting the touchable area to the outskirts, otherwise you're just reinforcing their preconceived ideas by giving positive feedback in the centre.
So now, everyone puts their finger there.
Where? Do you mean your new animation is testing much better?