The reinforcement learning agent receives a reward of -1 if it dies and a reward of 0 otherwise.

The network has two convolutional streams. For the first stream, the frame is cropped to a square and resized to 60 by 60 pixels. Since the small triangle controlled by the player is barely visible in the first stream, a zoomed-in input is given to the second stream. For the fully connected part of the network, the feature maps of both streams are flattened and concatenated. See `utils.Network` for more implementation details.

Additionally, a threshold function is applied to the frame such that the walls and the player belong to the foreground and everything else belongs to the background. See `superhexagon.SuperHexagonInterface._preprocess_frame` for more implementation details.

The hyperparameters used can be found at the bottom of `trainer.py` below `if __name__ == '__main__':`.

All six Rainbow extensions have been evaluated:

- Double Q-Learning and Dueling Networks did not improve the performance.
- n-step returns significantly decreased the performance.
- Prioritized experience replay performs better at first; however, after roughly 300,000 training steps, the agent trained without prioritized experience replay performs better.
- The distributional approach significantly increases the performance of the agent. Distributional RL with quantile regression gives similar results.
- Noisy networks facilitate the exploration process. However, the noise is turned off after 500,000 training iterations.

In order to efficiently train the agent, a C++ library was written. This library serves two functions: firstly, it efficiently retrieves the frames and sends them to the Python process; secondly, it intercepts the system calls used to get the system time so that the game can be run at a desired speed. To do so, the library injects a DLL into the game's process. This DLL hooks into the OpenGL function wglSwapBuffers as well as the system calls timeGetTime, GetTickCount, GetTickCount64, and RtlQueryPerformanceCounter. The function wglSwapBuffers is called every time the game finishes rendering a frame in order to swap the back and front buffers. The wglSwapBuffers hook first locks the game's execution.
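The preprocessing described above (crop to a square, resize to 60 by 60, threshold into foreground and background) can be sketched as follows. This is an illustrative approximation, not the actual `superhexagon.SuperHexagonInterface._preprocess_frame`; the threshold value and the nearest-neighbor resize are assumptions.

```python
import numpy as np

def preprocess_frame(frame: np.ndarray, out_size: int = 60, threshold: int = 100) -> np.ndarray:
    """Sketch: center-crop to a square, nearest-neighbor resize to
    out_size x out_size, then binarize so walls/player become foreground (1)."""
    h, w = frame.shape[:2]
    side = min(h, w)
    y0, x0 = (h - side) // 2, (w - side) // 2
    square = frame[y0:y0 + side, x0:x0 + side]
    # nearest-neighbor resize via index sampling
    idx = np.arange(out_size) * side // out_size
    small = square[idx][:, idx]
    # threshold: walls and player -> 1, everything else -> 0
    return (small > threshold).astype(np.float32)

# example on a fake grayscale frame
frame = np.random.randint(0, 255, (480, 768), dtype=np.uint8)
obs = preprocess_frame(frame)
```

The binarization removes the pulsating background colors of the game, so the network only sees geometry.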
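The two-stream architecture can be sketched in PyTorch as below. Layer sizes, channel counts, and the action count are illustrative assumptions, not those of `utils.Network`; only the overall shape (two convolutional streams whose flattened features are concatenated before the fully connected head) follows the text.

```python
import torch
import torch.nn as nn

class TwoStreamNet(nn.Module):
    """Sketch of a two-stream CNN: one stream sees the full cropped/resized
    frame, the other a zoomed-in view around the player triangle."""
    def __init__(self, n_actions: int = 3):
        super().__init__()
        def stream():
            return nn.Sequential(
                nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2), nn.ReLU(),
                nn.Flatten(),
            )
        self.full, self.zoom = stream(), stream()
        feat = 32 * 13 * 13  # 60x60 input -> 28x28 -> 13x13 feature maps
        self.head = nn.Sequential(nn.Linear(2 * feat, 256), nn.ReLU(),
                                  nn.Linear(256, n_actions))

    def forward(self, full_frame, zoomed_frame):
        # flatten both streams' feature maps and concatenate them
        f = torch.cat([self.full(full_frame), self.zoom(zoomed_frame)], dim=1)
        return self.head(f)

q = TwoStreamNet()(torch.zeros(2, 1, 60, 60), torch.zeros(2, 1, 60, 60))
```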
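The quantile-regression variant mentioned above (QR-DQN) replaces the categorical distributional loss with the quantile Huber loss. A minimal NumPy sketch for a single state-action pair, independent of the project's actual training code:

```python
import numpy as np

def quantile_huber_loss(pred_quantiles, target_quantiles, kappa=1.0):
    """Quantile Huber loss as in QR-DQN (sketch).
    pred_quantiles: (N,) predicted quantile values.
    target_quantiles: (M,) samples of the target return distribution."""
    n = len(pred_quantiles)
    taus = (np.arange(n) + 0.5) / n                       # quantile midpoints
    # pairwise TD errors u[i, j] = target_j - pred_i
    u = target_quantiles[None, :] - pred_quantiles[:, None]
    huber = np.where(np.abs(u) <= kappa,
                     0.5 * u ** 2,
                     kappa * (np.abs(u) - 0.5 * kappa))
    # asymmetric weighting pushes each prediction toward its quantile
    weight = np.abs(taus[:, None] - (u < 0).astype(float))
    return (weight * huber).mean()

loss_zero = quantile_huber_loss(np.zeros(4), np.zeros(5))
loss_pos = quantile_huber_loss(np.array([0.0, 0.0]), np.array([1.0]))
```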
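Noisy networks replace linear layers with layers whose weights carry learned Gaussian noise; turning the noise off, as the text describes after 500,000 iterations, then reduces the layer to its mean weights. A sketch of such a factorized-Gaussian noisy layer (Fortunato et al.), with illustrative initialization, not the project's own implementation:

```python
import math
import torch
import torch.nn as nn

class NoisyLinear(nn.Module):
    """Factorized-Gaussian noisy linear layer (sketch). Setting
    noise_enabled = False falls back to the deterministic mean weights."""
    def __init__(self, in_features, out_features, sigma0=0.5):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        bound = 1 / math.sqrt(in_features)
        self.mu_w = nn.Parameter(
            torch.empty(out_features, in_features).uniform_(-bound, bound))
        self.sigma_w = nn.Parameter(
            torch.full((out_features, in_features), sigma0 * bound))
        self.mu_b = nn.Parameter(torch.zeros(out_features))
        self.sigma_b = nn.Parameter(torch.full((out_features,), sigma0 * bound))
        self.noise_enabled = True

    @staticmethod
    def _f(x):  # noise-shaping function f(x) = sign(x) * sqrt(|x|)
        return x.sign() * x.abs().sqrt()

    def forward(self, x):
        if not self.noise_enabled:
            return x @ self.mu_w.t() + self.mu_b
        eps_in = self._f(torch.randn(self.in_features))
        eps_out = self._f(torch.randn(self.out_features))
        w = self.mu_w + self.sigma_w * torch.outer(eps_out, eps_in)
        b = self.mu_b + self.sigma_b * eps_out
        return x @ w.t() + b

layer = NoisyLinear(4, 2)
layer.noise_enabled = False
y = layer(torch.zeros(1, 4))
```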