Archived● Publicpythonaigenetic

Pathfinding Genetic AI (I)

This is an older project that was created to better understand how AI path finding works.

Updated almost 7 years ago


Problem

In different environments, the only data that a robot has access to is sensory data. The challenge was to determine if a simple supervised Convolutional Neural Network could be used to navigate a highly complex, structured environment. The environment in this case is a urban map fill with dynamic obstacles and traps.

The goal was to reach a designated coordinate without explicit, continuous human supervision (i.e., end-to-end learning).

The environment used for testing consisted of an 800px by 600px simulated city grid. The map included intricate boundaries and "trap" zones designed to trap an agent. The environment is simulated in a library called pygame.

An example of the environment is provided for reference below.

Preview

Investigation Notes

The initial profiling and development cycle revealed several critical technical milestones concerning the architecture and computational requirements.

  1. The computer was limited to an Intel i5-6500 CPU. The GPU could not run large models. Models were limited to the RAM. This created a significant constraint for the project.
  2. The output of each model used a steering angle and acceleration (matching a car robot).
  3. A convolutional neural network (CNN) structure was successfully created using a simple 1D array in numpy. Data from the agent learned features suitable for navigation decisions.
  4. Incorporating sensory data (simulating LiDAR (light detection and ranging)) in 4 directions greatly enhanced model performance for this problem. Distance measurements provided the network with contextual information about nearby obstacles and traversable space, allowing agents to make decisions based on immediate environmental geometry rather than just global map coordinate data itself.
  5. The integration of a Genetic Algorithm (GA) proved effective for this problem. Using a specific network configuration (six convolutional layers with an internal width parameter of nine), the algorithm was able to achieve successful pathfinding to a selected destination within approximately 11000 epochs.
  6. The underlying CNN architecture was flexible, accommodating training across multiple different activation equations (e.g., ReLU, Sigmoid, Tanh).
  7. (Important finding): The current model architecture developed and trained on a single map instance had poor generalization performance. The model had issues on entirely different map layouts or starting coordinates without training for the specific environment.

Process

The overall navigation mechanism combines a genetic algorithm with a convolutional neural network (CNN).

A genetic algorithm (GA) is an optimization method inspired by biological evolution. It starts with a population which make randomized decisions for some solution, evaluates how good each solution is using a fitness function, and then repeatedly combines and mutates the best neural networks with the goal to create improved generations. Over time, the algorithm tends to evolve solutions that perform well for the given problem.

Instead of relying on classical pathfinding heuristics (like an A* algorithm which is expensive but finds an exact path each time), each agent uses a "fitness" score that is determined by the agent moving around the environment.

The following algorithm was created:

1. 1000 agents are initialized. 
	Some models are given their own image which is easily identifiable.

2. for i = 1, MAX_ITERATIONS do
	Each model is loaded into an initial position (NW part of the map).

	For agent in models:
		The algorithm fires a series of rays at various angles.
		
		Distance data from the rays as well as the distance to the target are
		concatenated into an array.

		The array is fed through a series of neural network matrices.

		The output from the series of matrices determines the speed and angular
		velocity of the agent.

		A score is calculated.
	Each model is compared against each other and spliced.

When the algorithm starts, models are created in the top left portion of the map. There are many different agents which are unable to leave the starting area. There are many different "types" of agents (fully randomized). Different neural networks are defined as their own neural network family tree.

This is what the environment and agents look like when the algorithm begins:

Preview

When the algorithm converges, agents converge to one single "type". There are still a significant number of agents that are not able to traverse the environment, however, the best model is able to navigate the environment.

This is what the environment and agents look like when the algorithm finds a solution:

Preview

Results

  1. The successful training cycle used a population of 1000 agents, with each individual model taking 5kB of RAM. 1000 agents were sufficient for this problem.
  2. The GA algorithm was able to converge, which resulted in successfully finding a functional navigation route within an estimated 10,000 total iterations. The initial score was roughly 0 and the final score was 841.731 (average of 499.487).
  3. The model strongly depends on its training environment and does not support generalization. It currently operates as an "environmental expert" for one specific map. Generalizing the learned weights to function correctly on entirely different maps or starting points remains a significant technical limitation which requires more robust forms of learning.

Conclusion

On July 15th, 2019, this project successfully established a working proof-of-concept, demonstrating the feasibility of using GA-driven CNNs for complex, non-linear pathfinding tasks. The algorithm proved that deep learning can encode sophisticated environmental understanding far beyond simple reactive rulesets.

To transition this research into deployable systems, future work must focus on solving the generalization constraint. This requires developing modular architectures capable of:

  1. Utilizing pre-trained weights from one map and fine-tuning them with minimal data from a new map instance.
  2. Incorporating meta-learning techniques to allow the model to learn how to adapt its pathfinding strategy based on the known characteristics (e.g., urban density, presence of water) of an unseen map.