Archived · Public · Tags: python, ai, genetic

Using AI in Video Games (1024)

This is an older project which started February 8th, 2020.

Updated over 6 years ago


Problem

The goal of this project was to develop a deep learning model capable of autonomously playing the game 1024. The game served as a dynamic, structured environment in which to test an advanced genetic algorithm (GA). The aim was for an agent to reach and exceed the target tile value of 1024, as a demonstration of problem solving through machine learning.

Investigation Notes

The project spanned from February 8th, 2020, to February 23rd, 2020. The development process encountered several technical limitations.

  1. The core challenge was the lack of a standardized simulation environment (simulator) at the time. This made testing model performance difficult: the only way to check whether an agent's weights translated into gameplay success or failure was to run the agent in the live game.
  2. The entire system was restricted by hardware constraints (an Intel i5-6500 CPU). The available GPU capacity precluded the use of large, complex deep learning models, and model parameters and sizes had to be kept small enough to fit within the limited RAM.
  3. Automated interaction with the game relied on Selenium web automation through geckodriver, so that agents could interact with the browser environment as a human user would.
  4. The developed models were linear Convolutional Neural Networks (CNNs) that processed 34 input features, used only two layers, and kept file sizes minimal (approximately 172 KB per model).
  5. To evaluate the robustness of the system, a large initial population of 1000 distinct agents was tested across the training cycles.
  6. Future development path: Subsequent work, logged on August 8th, 2020, focused on building a dedicated simulator environment to overcome the real-time testing limitations encountered during the initial phase.

Process

The agent's decision-making combines a genetic algorithm with a deep learning model.

A genetic algorithm is an optimization method inspired by biological evolution. It starts with a population of randomly generated candidate solutions, evaluates how good each one is using a fitness function, and then repeatedly combines and mutates the best solutions to create improved generations. Over time, the algorithm tends to evolve solutions that perform well for the given problem.
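The loop described above can be illustrated on a toy problem. The following is a minimal sketch, not the project's code: it evolves bit strings to maximize the number of 1 bits (the classic OneMax problem). The population size, mutation rate, truncation selection, and single-point crossover are all assumptions chosen for the illustration.

```python
import random

def fitness(genome):
    """Toy fitness function: the number of 1 bits in the genome."""
    return sum(genome)

def evolve(pop_size=50, genome_len=20, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # Start from a population of random candidate solutions.
    population = [[rng.randint(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Rank by fitness and keep the top half as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        # Parents survive unchanged, so the best fitness never decreases.
        population = parents + children
    return max(population, key=fitness)

best = evolve()
print(fitness(best), "of", len(best), "bits set")
```

Keeping the parents in the next generation is one simple way to stop a good solution from being lost to an unlucky mutation.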

The core process involves initializing models and running iterative training cycles:

  1. Initialization: A population of 1000 agent models is generated with random starting weights.
  2. Iterative Training Loop (Epochs): For a set number of epochs, the following cycle repeats:
     • State Capture: The current state of the game board is captured by reading input features (34 features) via geckodriver.
     • Prediction/Action: This feature array is fed through the Linear CNN model (2 layers). The network outputs determine the agent's next move (in 1024, sliding the tiles up, down, left, or right).
     • Fitness Evaluation: A score is calculated based on the actions taken and the resulting board state changes.
  3. Selection and Mutation: After the agents execute a round of actions, the GA selects the best-performing models ("parents"). New models are generated through mutation (introducing small, random changes to weights) and crossover (combining weights from two successful parents).
  4. Convergence: This cycle continues until the performance stabilizes or the maximum number of iterations is reached, resulting in an optimized set of weights representing the evolved agent strategy.
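The initialization and prediction steps can be sketched as follows. This is an illustration under stated assumptions rather than the project's model: it uses two plain dense linear layers with no activation and does not reproduce any convolutional structure. The hidden width `N_HIDDEN`, the weight range, and the mapping of four outputs to the moves up, down, left, and right are hypothetical. Only the 34 input features, the two layers, and the population of 1000 come from the write-up.

```python
import random

N_FEATURES = 34   # input features, as stated in the write-up
N_HIDDEN = 16     # hypothetical hidden width (not given in the source)
MOVES = ["up", "down", "left", "right"]  # assumed output mapping

def random_layer(rng, n_in, n_out):
    """One dense layer: n_out rows of n_in weights, plus one bias per row."""
    return {
        "w": [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
        "b": [rng.uniform(-1, 1) for _ in range(n_out)],
    }

def random_agent(rng):
    """A two-layer linear model with random starting weights."""
    return [random_layer(rng, N_FEATURES, N_HIDDEN),
            random_layer(rng, N_HIDDEN, len(MOVES))]

def apply_layer(layer, x):
    """Compute w @ x + b for one layer using plain lists."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(layer["w"], layer["b"])]

def choose_move(agent, features):
    """Run the features through both layers and pick the highest-scoring move."""
    x = features
    for layer in agent:
        x = apply_layer(layer, x)
    return MOVES[max(range(len(MOVES)), key=lambda i: x[i])]

rng = random.Random(0)
population = [random_agent(rng) for _ in range(1000)]
features = [rng.random() for _ in range(N_FEATURES)]  # stand-in for a captured board state
print(choose_move(population[0], features))
```

In the project the feature vector would come from the browser via geckodriver; here it is random data so that the example runs on its own.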

The following algorithm was created:

1. Initialize 1000 agents.
	Some models are given their own easily identifiable image.
2. for i = 1 to MAX_ITERATIONS do
	for each agent in models do
		Capture the current state of the game board (by reading the
		34 input features via geckodriver).

		Feed this feature array through the Linear CNN model (2 layers).

		Use the network outputs to determine the agent's movement parameters.

		Capture the new state produced by that output, including a score.

	Compare the models against one another by score and splice them.

Note on implementation details: The system manages model weights (board.wboard) and fitness scores (board.score). Weight updates and mutations use random probability checks and selection logic so that superior genetic traits (those of higher-scoring models) are more likely to be passed down, guiding the overall population toward higher performance.
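One way the selection, splicing, and mutation logic could look is sketched below. It is an assumption-laden illustration, not the project's implementation: agents are plain dicts whose `"weights"` and `"score"` keys stand in for board.wboard and board.score, selection is roulette-wheel (score-proportional), crossover is uniform, and the mutation rate and noise scale are hypothetical. Scores are assumed to be non-negative.

```python
import random

def select_parent(rng, agents):
    """Roulette-wheel selection: a higher score means a higher pick probability."""
    # The small offset keeps zero-score agents selectable.
    weights = [agent["score"] + 1e-6 for agent in agents]
    return rng.choices(agents, weights=weights, k=1)[0]

def crossover(rng, w1, w2):
    """Uniform crossover: each weight comes from either parent with equal chance."""
    return [x if rng.random() < 0.5 else y for x, y in zip(w1, w2)]

def mutate(rng, weights, rate=0.05, scale=0.1):
    """With probability `rate`, nudge a weight by Gaussian noise."""
    return [w + rng.gauss(0, scale) if rng.random() < rate else w
            for w in weights]

def next_generation(rng, agents):
    """Build a new population of the same size from score-weighted parents."""
    children = []
    for _ in range(len(agents)):
        p1 = select_parent(rng, agents)
        p2 = select_parent(rng, agents)
        child = mutate(rng, crossover(rng, p1["weights"], p2["weights"]))
        children.append({"weights": child, "score": 0})  # score is reset until re-evaluated
    return children

rng = random.Random(1)
agents = [{"weights": [rng.uniform(-1, 1) for _ in range(8)],
           "score": rng.randint(0, 500)}
          for _ in range(10)]
new_agents = next_generation(rng, agents)
print(len(new_agents), len(new_agents[0]["weights"]))
```

Score-proportional selection still gives weak agents a small chance to reproduce, which preserves diversity in the population. A stricter scheme that keeps only the top performers converges faster but can get stuck early.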

Conclusion

The main limitation of this project was the lack of an integrated simulation framework, which prevented continuous, repeatable testing and validation outside of actual gameplay sessions. As a result, the models were not able to evolve despite numerous attempts.

Future work is focused on developing a dedicated simulator to decouple model training from real-time game execution. This would allow for massive-scale parallel testing, enabling the study of larger population sizes and more complex neural network architectures, ultimately advancing the understanding of AI application in structured environments.
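As a rough idea of what the core of such a simulator involves, the sketch below implements the slide-and-merge rule on a board stored as a list of rows, with 0 for an empty cell. It is an illustration only and does not describe the simulator that was later built. It assumes the same merge rule as 2048 (each tile merges at most once per move, and the score gained equals the value of the merged tile) and leaves out random tile spawning and game-over detection.

```python
def slide_row_left(row):
    """Slide one row left, merging equal neighbours once. Returns (new_row, gained)."""
    tiles = [t for t in row if t != 0]
    merged, gained, i = [], 0, 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            merged.append(tiles[i] * 2)
            gained += tiles[i] * 2
            i += 2  # skip the tile that was consumed by the merge
        else:
            merged.append(tiles[i])
            i += 1
    return merged + [0] * (len(row) - len(merged)), gained

def transpose(board):
    return [list(col) for col in zip(*board)]

def mirror(board):
    return [row[::-1] for row in board]

def move(board, direction):
    """Apply one move to a square board. Returns (new_board, gained)."""
    # Reduce every direction to a left slide, then undo the transform.
    if direction == "left":
        rows = board
    elif direction == "right":
        rows = mirror(board)
    elif direction == "up":
        rows = transpose(board)
    elif direction == "down":
        rows = mirror(transpose(board))
    else:
        raise ValueError(f"unknown direction: {direction}")
    slid = [slide_row_left(row) for row in rows]
    new_rows = [row for row, _ in slid]
    gained = sum(g for _, g in slid)
    if direction == "right":
        new_rows = mirror(new_rows)
    elif direction == "up":
        new_rows = transpose(new_rows)
    elif direction == "down":
        new_rows = transpose(mirror(new_rows))
    return new_rows, gained

board = [
    [2, 0, 0, 2],
    [0, 4, 4, 0],
    [2, 0, 0, 0],
    [0, 0, 0, 2],
]
new_board, gained = move(board, "left")
print(new_board, gained)
```

Because a function like this needs no browser, many agents could be evaluated in parallel and at far higher speed than through web automation.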