Non-Markov Decision Processes and Reinforcement Learning