Your plan sounds good; I have nothing to add there. Growing your game along with your skills and experience is always a good way to learn. That is how I did it with game engine tech.
In a prototype for one of the indie studios I worked for, I once built a responsibility system. Even if that sounds too complicated for you right now, it is worth thinking about. The system replaced a god-class approach with something flexible, where it was easy to add new actions to the game. We treated the player as an authority that has certain desires, for example using a door, picking up an item or purchasing something. The authority tells the system about that desire, and the system looks for interaction points in reach. The system then tells those interaction points that something wants to interact with them. Each interaction point receives the instance of the authority that wishes to interact and the kind of interaction (left mouse button, right mouse button), evaluates whether the interaction is valid, and then either fulfills or rejects the desire. This worked well for a simple door and for picking up an item, but a merchant was more difficult: the merchant had to open an inventory first and then accept or reject an offer, which makes it an ongoing interaction rather than a single request. You can start with the simple cases first.
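To make the idea more concrete, here is a minimal sketch of such a system in Python. It is not the code from that prototype; every name in it (`Authority`, `InteractionPoint`, `InteractionSystem`, `Door`, `Pickup`) is made up for illustration, positions are one-dimensional to keep it short, and it assumes the system stops at the first interaction point that accepts the desire.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Interaction(Enum):
    PRIMARY = auto()    # e.g. left mouse button
    SECONDARY = auto()  # e.g. right mouse button


@dataclass
class Authority:
    """Anything that can express a desire: the player, or an NPC."""
    name: str
    position: float  # 1D position, an assumption to keep the sketch short
    inventory: list = field(default_factory=list)


class InteractionPoint:
    def __init__(self, position):
        self.position = position

    def interact(self, authority, kind):
        """Return True if the desire is fulfilled, False if it is rejected."""
        raise NotImplementedError


class Door(InteractionPoint):
    def __init__(self, position, required_key=None):
        super().__init__(position)
        self.required_key = required_key
        self.is_open = False

    def interact(self, authority, kind):
        if kind is not Interaction.PRIMARY:
            return False
        if self.required_key and self.required_key not in authority.inventory:
            return False  # locked, and the authority has no key
        self.is_open = not self.is_open
        return True


class Pickup(InteractionPoint):
    def __init__(self, position, item):
        super().__init__(position)
        self.item = item
        self.taken = False

    def interact(self, authority, kind):
        if self.taken or kind is not Interaction.PRIMARY:
            return False
        authority.inventory.append(self.item)
        self.taken = True
        return True


class InteractionSystem:
    def __init__(self, reach=1.5):
        self.reach = reach
        self.points = []

    def register(self, point):
        self.points.append(point)

    def request(self, authority, kind):
        """Offer the desire to each point in reach; return the one that accepts."""
        for point in self.points:
            in_reach = abs(point.position - authority.position) <= self.reach
            if in_reach and point.interact(authority, kind):
                return point
        return None  # nothing in reach, or every point rejected the desire


system = InteractionSystem()
system.register(Pickup(0.5, "key"))
system.register(Door(5.0, required_key="key"))

player = Authority("player", position=0.0)
system.request(player, Interaction.PRIMARY)  # the key is in reach and gets picked up
player.position = 4.0
system.request(player, Interaction.PRIMARY)  # the door is in reach and opens
```

The player code never mentions doors or items. Adding a new action means adding a new interaction point class, and nothing else has to change.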
If I had to implement that system again today, I'd consider using some reactive pattern wrapped into an actor model. It makes everything quite easy, as you don't have to implement every possible interaction in the player's code, and NPCs could perform the same interactions as well. https://gist.github.com/staltz/868e7e9bc2a7b8c1f754
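As a rough illustration of the actor idea, the sketch below gives every participant a mailbox and lets it react only to the messages it receives. It is single-threaded plain Python with no reactive library, so treat it as a toy under those assumptions; the `Actor`, `Merchant` and `run` names and the message kinds are all invented for this example. It also shows how the ongoing merchant interaction turns into a short exchange of messages.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    sender: "Actor"
    kind: str
    payload: object = None


class Actor:
    """Base actor: owns a mailbox and reacts to one message at a time."""

    def __init__(self, name):
        self.name = name
        self.mailbox = deque()
        self.log = []

    def send(self, target, kind, payload=None):
        target.mailbox.append(Message(self, kind, payload))

    def receive(self, msg):
        # Default reaction: just remember what came in.
        self.log.append((msg.kind, msg.payload))


class Merchant(Actor):
    def __init__(self, name, prices):
        super().__init__(name)
        self.prices = prices

    def receive(self, msg):
        if msg.kind == "open_shop":
            self.send(msg.sender, "shop_opened", dict(self.prices))
        elif msg.kind == "offer":
            item, gold = msg.payload
            ok = item in self.prices and gold >= self.prices[item]
            self.send(msg.sender, "accepted" if ok else "rejected", item)


def run(actors):
    """Deliver messages until every mailbox is empty."""
    pending = True
    while pending:
        pending = False
        for actor in actors:
            while actor.mailbox:
                actor.receive(actor.mailbox.popleft())
                pending = True


customer = Actor("customer")  # could be the player or an NPC
merchant = Merchant("merchant", {"sword": 10})
customer.send(merchant, "open_shop")
customer.send(merchant, "offer", ("sword", 12))
customer.send(merchant, "offer", ("sword", 3))
run([customer, merchant])
# customer.log now holds the replies: shop_opened, accepted, rejected
```

The merchant does not care who is asking, which is why the same interaction works for the player and for NPCs without extra code.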
For the engine, you have two options. Either use something that can handle 2D and 3D at the same time, so you don't have to rewrite the entire source code later, or start in a specific language without a game engine and make sure the engine you move to later also uses that language, so you can copy your gameplay code over. C#, for example, allows this: you can start with plain C# and write a console application (a text-driven game), then later copy your code over to Unity 3D. There are, however, a lot of possible alternatives, so have a look at the language you want to use first.
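Copying gameplay code over only works if that code does not depend on how the game is displayed. Here is a small sketch of that split, written in Python for brevity although the same idea applies to C#; the `Game` class, its `handle` method and the `run_console` function are hypothetical names for this example.

```python
class Game:
    """Pure gameplay state: no printing, no input handling, no engine imports."""

    def __init__(self):
        self.room = "hall"
        self.exits = {
            "hall": {"north": "vault"},
            "vault": {"south": "hall"},
        }

    def handle(self, command):
        """Apply one text command and return the message to show."""
        words = command.strip().lower().split()
        if len(words) == 2 and words[0] == "go":
            target = self.exits[self.room].get(words[1])
            if target is None:
                return "You cannot go that way."
            self.room = target
            return f"You enter the {target}."
        return "Unknown command."


def run_console(game, lines, out=print):
    """Text front end; an engine integration would replace only this part."""
    for line in lines:
        out(game.handle(line))


run_console(Game(), ["go north", "go west"])
```

When you move to an engine, `Game` stays as it is and only the front end is rewritten.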