| --- |
| language: en |
| tags: |
| - machine-learning |
| - reinforcement-learning |
| - sokoban |
| - planning |
| license: apache-2.0 |
| --- |
| |
| # Trained learned planners |
|
|
| This repository contains the trained networks from the paper ["Planning behavior in a recurrent neural network that |
| plays Sokoban"](https://openreview.net/forum?id=T9sB3S2hok), presented at the ICML 2024 Mechanistic Interpretability |
| Workshop. |
|
|
| To load and use the networks, please refer to the [learned-planner |
| repository](http://github.com/alignmentresearch/learned-planner), and possibly to the [training |
| code](https://github.com/AlignmentResearch/train-learned-planner). |
|
|
| # Model details |
|
|
| ## Hyperparameters: |
|
|
| See `model/*/cp_*/cfg.json` for the hyperparameters that were used to train a particular run. |
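
The hyperparameter files can also be collected programmatically. The following is a minimal sketch, assuming the repository has been downloaded to a local directory and that `model` in the pattern stands for a model-type directory such as `drc33`. The function name `load_run_configs` and the default glob pattern are illustrative assumptions and are not part of the learned-planner code.

```python
import json
from pathlib import Path


def load_run_configs(root, pattern="*/*/cp_*/cfg.json"):
    """Return {checkpoint directory (relative to root): parsed cfg.json}.

    `pattern` is an assumed reading of `model/*/cp_*/cfg.json`, with the
    leading `model` component treated as a wildcard over model types.
    """
    root = Path(root)
    configs = {}
    for cfg_path in sorted(root.glob(pattern)):
        with cfg_path.open() as f:
            # Key each config by the checkpoint directory that contains it.
            configs[cfg_path.parent.relative_to(root).as_posix()] = json.load(f)
    return configs


if __name__ == "__main__":
    # Print the hyperparameter names found for every checkpoint under ".".
    for checkpoint, cfg in load_run_configs(".").items():
        print(checkpoint, sorted(cfg))
```

Keying the result by checkpoint directory keeps runs and checkpoints distinguishable when several are stored side by side.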
|
|
| ## Best Models: |
|
|
| The best model of each type is stored in the following directories: |
| - DRC(3, 3): `drc33/bkynosqi/cp_2002944000` |
| - DRC(1, 1): `drc11/eue6pax7/cp_2002944000` |
| - ResNet: `resnet/syb50iz7/cp_2002944000` |
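
For scripts that need these locations, the list above can be written as a small lookup table. This is a sketch only: the helper names `best_checkpoint` and `checkpoint_number` are hypothetical, the directories are assumed to be relative to the root of a local copy of this repository, and reading the number after `cp_` as a training step count is an assumption.

```python
import re
from pathlib import PurePosixPath

# Best checkpoint directory per model type, copied from the list above.
BEST_CHECKPOINTS = {
    "drc33": "drc33/bkynosqi/cp_2002944000",
    "drc11": "drc11/eue6pax7/cp_2002944000",
    "resnet": "resnet/syb50iz7/cp_2002944000",
}


def _normalize(model_type):
    """Turn names such as 'DRC(3, 3)' or 'ResNet' into 'drc33' or 'resnet'."""
    return re.sub(r"[^a-z0-9]", "", model_type.lower())


def best_checkpoint(model_type, repo_root="."):
    """Return the best checkpoint directory for a model type (hypothetical helper)."""
    key = _normalize(model_type)
    if key not in BEST_CHECKPOINTS:
        raise ValueError(f"unknown model type: {model_type!r}")
    return PurePosixPath(repo_root) / BEST_CHECKPOINTS[key]


def checkpoint_number(path):
    """Extract the integer that follows 'cp_' in a checkpoint directory name."""
    return int(PurePosixPath(path).name.split("_", 1)[1])


if __name__ == "__main__":
    for name in ("DRC(3, 3)", "DRC(1, 1)", "ResNet"):
        path = best_checkpoint(name)
        print(name, path, checkpoint_number(path))
```

Actually loading the weights is left to the learned-planner repository; this sketch only resolves paths.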
|
|
| ## Parameter counts: |
|
|
| - DRC(3, 3): 1,285,125 (1.29M) |
| - DRC(1, 1): 987,525 (0.99M) |
| - ResNet: 3,068,421 (3.07M) |
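
As a quick sanity check on the figures above, the short sketch below reformats the exact counts in millions and compares model sizes. The helper names `in_millions` and `relative_size` are illustrative; only the counts themselves come from this model card.

```python
# Exact parameter counts, copied from the list above.
PARAMETER_COUNTS = {
    "DRC(3, 3)": 1_285_125,
    "DRC(1, 1)": 987_525,
    "ResNet": 3_068_421,
}


def in_millions(count):
    """Format a parameter count as millions with two decimals, e.g. '1.29M'."""
    return f"{count / 1e6:.2f}M"


def relative_size(model, baseline):
    """Ratio of the parameter count of `model` to that of `baseline`."""
    return PARAMETER_COUNTS[model] / PARAMETER_COUNTS[baseline]


if __name__ == "__main__":
    for name, count in PARAMETER_COUNTS.items():
        ratio = relative_size(name, "DRC(3, 3)")
        print(f"{name}: {count:,} ({in_millions(count)}), {ratio:.2f}x DRC(3, 3)")
```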
|
|
| ## Training dataset: |
|
|
| The networks were trained on the [Boxoban set of levels by DeepMind](https://github.com/google-deepmind/boxoban-levels). |
|
|
| # Citation |
|
|
| If you use these neural networks, please cite our work: |
|
|
| ```bibtex |
| @inproceedings{TODO: add your citation here, |
| title={Planning behavior in a recurrent neural network that plays Sokoban}, |
| author={Your Authors}, |
| booktitle={ICML 2024 Mechanistic Interpretability Workshop}, |
| year={2024}, |
| url={https://openreview.net/forum?id=T9sB3S2hok} |
| } |
| ``` |
|
|