6 Comments
Mar 21, 2023 · Liked by Ethan

Kind of interesting that it couldn't understand your grid representation of the game state. It probably doesn't fully grok the physical locations of the characters on the screen, and that those locations can be used to convey information.

Is it any good at ASCII art? If it can do good, original ASCII art, then it should also be capable of understanding your grid representation.

Anyway, great read. Hope you post more.

author

Yeah, I should have asked it to come up with its own graphical representation of the board. Let me see about that...

author

I made another update showing my attempts to get it to see the board graphically, rather than as a list of moves, but it seems it can't quite figure it out.


Neat. Based on that one screenshot, it did seem to grok what you were trying to get it to do. Yes, it was incomplete, and on a quick skim it seems to have made two additional errors, but it looks accurate for the first several moves, which is actually better than I thought it would be able to do.

It may be limited in how much it can do in one query -> answer cycle. Perhaps some sort of chain of thought method that stepped it through the game sixish moves at a time would allow it to do better.
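To make the stepping idea concrete, here is a rough sketch of how a move list could be split into batches of six, with the previous grid carried into each new prompt. Everything in it is an assumption for illustration: the function names, the prompt wording, and the placeholder move labels are made up, and no model is actually called.

```python
from typing import Iterator, List


def chunk_moves(moves: List[str], size: int = 6) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` moves."""
    for start in range(0, len(moves), size):
        yield moves[start:start + size]


def build_prompt(board: str, batch: List[str]) -> str:
    """Hypothetical prompt wording: show the last grid, then the next moves."""
    return (
        "Current board:\n" + board + "\n"
        "Apply these moves in order, then print the updated board:\n"
        + ", ".join(batch)
    )


# Placeholder move labels, not taken from the article.
moves = [f"move{i}" for i in range(1, 15)]
batches = list(chunk_moves(moves))

# The first prompt starts from a stand-in for the initial grid; later
# prompts would reuse whatever grid the model printed in the previous step.
first_prompt = build_prompt("(starting grid)", batches[0])
```

Each answer would then be fed back in as the `board` for the next batch, so the model only has to track six or so moves per query.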

I wonder what's going on those other times where the grid seems to bear no relationship to the game or even to any plausible game.

author
Mar 24, 2023 · edited Mar 24, 2023 · Author

I actually omitted an important part of that conversation which would have been helpful to add: I wasn't able to get GPT to print anything reasonable unless I also showed it the correct output for a given board state, and from there it would just parrot that original input with minor changes.

I edited the article to be clearer about my experimentation, and added an example of the output GPT would send before it had been shown any correct output as an example.


Ah, I did get the wrong impression then. Thanks for clarifying.
