Neural Combinatorial Optimization with Reinforcement Learning In PyTorch
PyTorch implementation of Neural Combinatorial Optimization with Reinforcement Learning. I have implemented the basic RL pretraining model with greedy decoding from the paper. An implementation of the supervised learning baseline model is available here. Instead of a critic network, I got my results below on TSP from using an exponential moving average critic. The critic network is simply commented out in my code right now. From correspondence with a few others, it was determined that the exponential moving average critic […]
Read more