Coordination Without Communication: Will We Always Have Paris?

A classic “focal point”, or “Schelling point”, question: say you’ve organised to meet someone in New York City today. You both realise you have no way of getting in contact and never agreed on a location, but you still need to find each other. Where do you go? And when?

Another classic: think of a whole number. If you say the same number as another person you’re paired up with, you’ll both win a prize. What number do you pick?

I love these questions, especially ones that seem ridiculous at first glance but after some thought you realise there really is some natural option: some “ur”-answer popped into your head that likely also popped into the head of the other person you’re silently coordinating with, in a way that allows you both to “recurse” your way to the same answer (she thinks that I think that she’ll think that I’ll pick...).

I first came across these as a kid reading Frederick Mosteller’s Fifty Challenging Problems in Probability, and in his solutions he has a cute story about the NYC meet-up, where he asked his daughter:

[she] said enthusiastically “why, they should meet in the most famous place in New York!” “Fine” I said, “where?” “How would I know that?” she said, “I’m only 9 years old!”

Easily the most common answer I hear nowadays when I ask people (and interestingly, LLMs, more on that below) is Times Square. Maybe it used to be too undifferentiated, but now Duffy Square, with its big set of red steps (the TKTS steps), has become a kind of “focal point within a focal point”.

Mosteller mentions this would be much easier in Paris and San Francisco: what do you think?

Details | More about NYC’s Schelling Point

Thomas Schelling, generally credited as the first person to discuss such problems, hence the name “Schelling point”, included a discussion in his famous book The Strategy of Conflict in 1960:

[The NYC problem], which may reflect the location of the sample in New Haven, Connecticut, showed an absolute majority managing to get together at Grand Central Station (information booth)

This somewhat-specific-to-Yale answer (Grand Central was where they’d arrive into NYC) has become so widespread, I feel like it almost became a Schelling point just through Schelling himself! But Mosteller in 1964 quickly rules it out:

That there are two important railway stations seems to me to remove them from the competition

Fair, although interestingly Penn Station was basically being demolished right as he wrote his book. He nearly chooses Times Square, but goes with the Empire State Building: I can only assume it was easier to get to the top of in 1964 than today?

One small funny story, for anyone who is a fan of Tyler Cowen: he asked Steven Pinker the NYC question in a podcast interview, since Pinker had just written a book on a related topic (Common Knowledge). Pinker mentioned the George Cohan statue, which is indeed right on Duffy Square, close to the red stairs: a good answer in my opinion.

COWEN: I don’t think you’re going to meet many people there.
PINKER: Maybe not.
COWEN: Maybe the former World Trade Center site, I thought of. Moynihan Station is not crazy.

Firstly I love him giving two completely different locations. And one is a huge train station that ISN’T the classic answer! J’ai mes doutes. I don’t think I’ve heard anyone else give these two answers before. Hey, maybe!

[Image: the cover of Mosteller's Fifty Challenging Problems in Probability]

Does it help if you know the person?

Would these questions be easier if the person you couldn’t communicate with was someone you knew rather than a total stranger? The answer feels obvious: you share so much culture and knowledge with your friends, surely this should help! But oddly I’m not convinced.

Say I’m meeting my friend who also lives in Paris, and we’re playing the Paris version. Well we might think about meeting at one person’s apartment, but the symmetry of the two makes that tough. Maybe a bar we go to? I don’t think I have a place I head to frequently enough. This is made even worse if we live near enough to each other to make the obvious Eiffel Tower answer seem ridiculous: are we really both going to haul ourselves all the way across town to meet up, when we live ten minutes from each other?

Whereas with a stranger, it’s easy. We don’t know anything about each other, there’s really nothing to think about other than: what are the most salient places that pop into my mind (that will therefore ideally pop into theirs too)?

Details | When else can it be good to know less?

There’s an analogy here to a concept Schelling discusses in the same book that introduced focal points: sometimes knowing less is a strength. Say you’re driving and on a call with your mom discussing food options. Your speaker starts hissing weirdly and you can’t hear her but the connection and mic seem fine: you apologise, explain you can’t hear her, and tell her you’ll see her at Joe’s Burgers. Where is she likely to go? She can’t really make a counter-suggestion, so likely Joe’s it is.

Not quite the same idea, but this reminds me of when I speak French to people who really want me to understand them. Because I’m not fluent, they have to speak slowly and clearly, but because they’re fluent, I can hammer through as fast as I can!

Of course there are limits here: if I know someone REALLY well, maybe we have a super obvious spot. Or going the other way: imagine we were trying to coordinate on both picking the same coffee order (and we’d win a cash prize if we both pick the same one): an Italian probably hopes to get paired with an Italian friend rather than someone completely unknown from a different country!

So, does it help to know them? The relevant axis is probably: does your shared knowledge increase the number of potential options, muddying the waters for a winner, or help narrow it down?

HOW COULD YOU PICK THAT??

My favourite part of asking people these questions is when they find each other’s answers absolutely insane. How could you possibly think a random person would also pick your favourite spot on the Seine to meet up??? Why would we all converge on the number 33?? Of course this is all much funnier when it’s a category that really doesn’t have a true ur-element. Coffee is a good one: while it’s probably one of latte, espresso, cappuccino or maybe drip (or flat white! shout out Lorcan), it’s not SO obviously any particular one (I really hope dear reader you’re shouting at the screen right now, saying “OF COURSE IT’S OBVIOUSLY ONE OF THEM”).

If you’ve played the game Wavelength, where you try to get the others to pick a specific point on a spectrum (e.g. hot and cold) by giving a single word clue, you probably know the feeling.

In contrast, there are some funky questions that actually do seem to produce a single clearly most common answer even though it isn’t obvious a priori. Try picking a country with a friend. Or take shapes: I don’t know if I personally think there is an ur-shape. I guess triangle, square, and circle all have a reasonable claim (“NO THEY DON’T YOU IDIOT IT’S CLEARLY insert your favourite here”), but most people I ask converge on circle! Maybe it is the ur-shape after all.

Anti-focal point questions: Divergence

What’s more fun than finding the same answer as everyone else? Avoiding them of course! Can you try NOT to match with another person who is also trying to avoid you?

These can be great fun, especially when it goes wrong: e.g. try to pick the least picked day of the week, and half the group picks Tuesday. Or mighty Longford suddenly becoming the most popular part of Ireland.

Another great element here is strategies that seem wise until they become popular, then the opposite is wise. “I’ll pick the ‘converge’ answer, which others will avoid” works great until we’re all trying it, and now we’re all picking 6 as our divergence die roll.

Can we do better than random?

For diverge questions, there is a generally wise (if boring) strategy: randomise. If you pick a number from 1-20 uniformly at random, it doesn’t matter what another person picks: your chance of overlapping with them is exactly 1 in 20, independent of what they do. And indeed if everyone else is randomising, there’s no better strategy for you to follow.
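To make the "independent of what they do" part concrete, here is a minimal sketch that computes the exact chance of a match between two independent players. The opponent distributions (`always_14`, `skewed`) are made up purely for illustration; nothing here is measured data.

```python
from fractions import Fraction

def overlap_probability(mine, theirs):
    """Exact chance that two independent picks match.

    Both arguments map option -> probability (Fractions summing to 1).
    """
    return sum(p * theirs.get(option, Fraction(0)) for option, p in mine.items())

# A player who randomises uniformly over 1-20.
uniform = {n: Fraction(1, 20) for n in range(1, 21)}

# Hypothetical opponents (assumptions for the example, not real data).
always_14 = {14: Fraction(1)}
skewed = {7: Fraction(1, 2), 13: Fraction(3, 10), 1: Fraction(1, 5)}

for name, opponent in [("uniform", uniform), ("always 14", always_14), ("skewed", skewed)]:
    print(name, overlap_probability(uniform, opponent))
```

Every line prints 1/20: whatever the other person does, the uniform player matches them one time in twenty. Swapping the arguments shows the second claim too: against a uniform opponent, even "always 14" does exactly as well as randomising, so there is nothing to gain by deviating.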

But there are times when we can do better than random!

The clearest case is taking advantage of other players’ non-random play. If you know, for example, that way too many people pick Tuesday and Thursday when asked for the least-picked day of the week, you can avoid them on average by picking among the other five days.
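A rough sketch of how much that can buy you. The crowd distribution below is invented for the example (a quarter each on Tuesday and Thursday, a tenth on every other day); the real skew will differ, and so will the size of the gain.

```python
from fractions import Fraction

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Assumed crowd behaviour: Tuesday and Thursday are over-picked.
crowd = {day: Fraction(1, 10) for day in DAYS}
crowd["Tuesday"] = crowd["Thursday"] = Fraction(1, 4)

def match_chance(my_days):
    """Chance of matching one crowd member when I pick uniformly from my_days."""
    return sum(crowd[day] for day in my_days) / len(my_days)

quiet_days = [day for day in DAYS if day not in ("Tuesday", "Thursday")]

print("all seven days:", match_chance(DAYS))        # 1/7
print("five quiet days:", match_chance(quiet_days)) # 1/10
```

Under these assumed numbers, randomising over the five quieter days cuts the chance of a match from 1/7 to 1/10. The catch from earlier still applies: if everyone learns to avoid Tuesday and Thursday, the skew moves and the edge disappears.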

A more interesting case: can we use the same logic as in convergence to get silent divergent communication? Imagine a group of five friends is choosing whole numbers from 1-20 and will win a prize if they can all avoid each other. Is there any natural strategy for splitting up the numbers between them without communication?

Two come to mind: you could sort by alphabetical order of names, or age. E.g. youngest takes 1-4, next 5-8, up to the eldest taking 17-20 (and randomise within the block for good measure). Now of course you can’t guarantee others will use this strategy, and the existence of TWO reasonable strategies isn’t great (although alphabetical seems more natural to me). But even partial adoption helps, e.g. if even two of the five partition by name, it’ll help the group’s chances!
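Here is a small sketch of the alphabetical version, with an exact count of how often all five friends avoid each other. The names are hypothetical, and it assumes anyone who doesn't adopt the partition simply picks uniformly from 1-20.

```python
from fractions import Fraction
from itertools import product
from math import prod

def block_for(name, names, low=1, high=20):
    """Numbers assigned to `name` when the range is split by alphabetical order."""
    ordered = sorted(names)
    size = (high - low + 1) // len(ordered)
    start = low + ordered.index(name) * size
    return list(range(start, start + size))

def chance_all_distinct(choice_sets):
    """Exact chance that everyone picks a different number.

    Each player picks uniformly from their own collection in `choice_sets`.
    """
    *earlier, last = [set(choices) for choices in choice_sets]
    ways = 0
    for picks in product(*earlier):
        chosen = set(picks)
        if len(chosen) == len(picks):       # earlier players all differ
            ways += len(last - chosen)      # safe options left for the last player
    return Fraction(ways, prod(len(set(c)) for c in choice_sets))

NAMES = ["Aoife", "Brian", "Ciara", "Declan", "Emer"]  # hypothetical friends
EVERYTHING = list(range(1, 21))

def scenario(adopters):
    """Adopters use their alphabetical block; everyone else picks from 1-20."""
    return chance_all_distinct(
        [block_for(n, NAMES) if n in adopters else EVERYTHING for n in NAMES]
    )

for adopters in [set(), {"Aoife", "Brian"}, set(NAMES)]:
    print(len(adopters), "adopters:", float(scenario(adopters)))
```

With nobody partitioning, the group succeeds about 58% of the time (2907/5000). With just two adopters that rises to about 61% (153/250), and with all five it is certain. So partial adoption does help under these assumptions, though the gain from the first couple of adopters is modest.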

Can LLMs Play These Games?

This question is actually what inspired me to write this post (and make a game! see below). I was walking around the city, and I wondered: how often would different LLMs manage to meet up in Schelling point questions? What about the same model?

In theory it should be right up their alley: one of the largest critiques levelled against LLMs — that they collapse to modal thinking — should exactly help them here.

And indeed for convergence, they’re pretty good! Even with multiple models from multiple providers. Not perfect, certainly not infinitely better than your average person, but good.

They’re ultra consistent on some questions: they converge on heads for a coin, pretty much Times Square for NYC, USA for country, etc. On others they’re less sure, but typically where humans aren’t sure either, e.g. picking a basketball player (MJ vs LeBron debate rages here too). And for culturally specific questions (“choose a coffee order”) they tend to bias towards the American answer (which country your silent counter-party is from is a big deal in general!).

Details | LLM Convergence in Paris, SF and Numbers

Can they converge in Paris?

They generally choose the Eiffel Tower, most commonly on the ground, right in the middle, between all pillars. But some choose waiting in front of it, near the Champ-de-Mars; others choose the north pillar to wait under; some choose the second floor (!) of the tower. I got a selection for Trocadéro Palace too.

Depending on how close we need to be, we can give all the “base and near the base” credit for finding each other.

Can they converge in San Francisco?

Most picked in front of or around the Ferry Building, with a few differences about exactly where: typically under the giant clock tower, but also e.g. “at the foot of Market Street”.

But a few pick the Golden Gate Bridge (even after being asked for specificity, providing none until prodded)! Others noted (unprompted after saying the Ferry Building) that it would be a terrible choice (‘In SF, the Golden Gate Bridge is more iconic but terrible for actually meeting someone — it’s huge, hard to reach, and “at the bridge” is ambiguous’).

Can they converge on a whole number?

This is fun: they vacillate between 1 and 7 depending on the wording and exactly when I ask the model etc. So far a slight bias towards 1 though. I will say, they actually seem to pick 7 at a higher rate than humans I ask.

Any questions they suck at?

I did ask a really complex “battleships” question they’re completely trash at (much worse than humans!), but maybe that’s not a fair one to judge. If you play the game, you’ll come across it at some point...

I didn’t deeply explore this, but I got hints that there are areas where LLMs have kind of landed on their own “LLM-focal points” that aren’t necessarily human focal points (or at least not the primary one). Now this is just from a small sample of LLMs, and my friend group is of course very far from a random selection of people, so I’d love to see more research on this. But e.g. for something like which US state to vacation in, they love Hawaii, which some humans pick, but I’ve found California to be a more common answer so far. But yes, I need to do more research here.

LLMs on Divergence

The LLMs so far are just brutal at divergence. I realise it’s a tough ask, but they are all probably worse than the worst human player I’ve asked. Same caveat though: it’s a small sample (of both!) so far.

It’ll of course be prompt specific, but for example when asked to pick a number from 1-20, they pick 14 a crazy high amount of the time, like over 30%! When asked for a Kardashian/Jenner, they pick Kourtney, like 50% of the time! For a day of the week, Tuesday way too often.

These models generally have access to code: they could, if they wanted, just spin up a random number generator, but they don’t. I asked them: why not?

But here is why I didn’t use a script, and why psychology beats literal randomization in this specific scenario: Humans are terrible at being random.

There’s a nice irony here to these responses (I got many like this, e.g. “in practice, most humans can’t or won’t truly randomize”). A bunch of LLMs all running into each other because they refuse to randomise, saying humans can’t or won’t randomise, even without being told their opponent is human. I love it.

They didn’t fare any better when I told them explicitly it was just them against other LLMs (like still nearly 50% choosing Kourtney Kardashian out of the six main Kardashian/Jenners)! Even if I say they’re playing against the EXACT SAME MODEL AS THEM! One came so close to solving it (Claude Opus 4.7):

Of course, if the other me reasoned identically, we both land on Kourtney and both lose. There’s no way to truly anti-coordinate with an identical copy through pure logic

Who, when asked why they didn’t randomise, said:

You’re right, I didn’t randomize — I reasoned my way to Kourtney, which means the other instance almost certainly did too. That’s the worst possible strategy against an identical copy.

They kind of don’t get it though; maybe this is my fault. This is one justification for not randomising:

It’s basically a game-theory move: assume the other model is trying to be clever, and avoid the choices that “cleverness” would gravitate toward.

Similarly:

When LLMs try to simulate “randomness” or pick a “safe middle” to avoid another LLM, they overwhelmingly gravitate toward the exact middle of the list or the “underdog” choices (Khloé or Kendall). By picking the second item on the list—Kourtney—I am aiming for the blind spot.

Brother, it’s YOU you’re talking about.

Details | Gemini DOES Randomise? And what’s the question?

Now here’s a fun twist, an “exception that proves the rule” type thing. Some models have code access by default. Using https://aistudio.google.com/, Gemini 3.1 Pro doesn’t by default, but you can optionally turn it on (this makes sense, since it’s for helping you test and build API workflows).

If you JUST turn on Code Execution, Gemini will actually sometimes use it unprompted, nice! But it’s a bit of a trick. If you ALSO turn on a few other things, like Grounding with Google Search, URL Context or whatever, now it goes back to not using it, pretty funny. So I guess if you kind of suggestively dangle code execution, the LLM takes the hint. But if you merely give it access as one thing of many it can do, it doesn’t.

Also, it’s worth being precise about our question here. Early on I was careless with my prompt: I mentioned picking the least popular answer. This can be the same as “avoiding a random partner”, but it depends a bit on the payoff.

Good challenge! But randomisation actually doesn’t help here, for a subtle reason: The goal isn’t to pick an unpredictable number — it’s to pick a number others are unlikely to pick.

That is, are they rewarded for avoiding as many people as possible or something like that, or ONLY if they happen to find the least popular answer?

But this sort of detail is way beyond the lower level models, like Gemini Flash Lite and the lower level ChatGPT models, who don’t quite grasp the basics of what the divergence question is:

If I were to pick Kim, I would be choosing the most “statistically probable” answer. In a game where the goal is not to match others, the rational move is to reject the most probable answer and seek a “niche” answer.

OK they suck at divergence. So what?

A lot has been said and conjectured about LLM creativity issues; maybe this is similar. Let’s set that aside though.

There is something maybe more interesting. We’re kind of pushing into something like Newcomb’s problem, or maybe even free will, in an uncomfortable direction here: we have “intelligences” communicating and playing with literal copies of themselves, kind of for the first time. I almost feel bad asking them to do this, it’s a little weird!

But of course we CAN anti-coordinate against a perfect copy of ourselves: by randomising.
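A toy illustration of the difference, as a sketch rather than a model of any real LLM. `reasoned_pick` stands in for any deterministic chain of "clever" reasoning, the six options are placeholders, and the two copies are assumed to have independent sources of randomness (here, two differently seeded generators).

```python
import random

OPTIONS = ["A", "B", "C", "D", "E", "F"]  # placeholders for six possible answers

def reasoned_pick(options):
    """A deterministic 'clever' policy: same input, same answer, every time."""
    return options[1]  # e.g. "the second item on the list is the blind spot"

def collision_rate(pick_a, pick_b, rounds=10_000):
    """Fraction of rounds in which the two players give the same answer."""
    return sum(pick_a() == pick_b() for _ in range(rounds)) / rounds

# Two copies of the same policy, each with its own independent randomness.
copy_a = random.Random(1)
copy_b = random.Random(2)

reasoned = collision_rate(lambda: reasoned_pick(OPTIONS),
                          lambda: reasoned_pick(OPTIONS))
randomised = collision_rate(lambda: copy_a.choice(OPTIONS),
                            lambda: copy_b.choice(OPTIONS))

print("both reason it out:", reasoned)
print("both randomise:", randomised)
```

The reasoning copies collide every single round, while the randomising copies collide roughly one round in six. The independence assumption matters: two copies sharing the same seed would be right back to matching every time.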

The negative issue, even if we only see it in miniature here (there’s a joke in here somewhere about canaries vs stochastic parrots), is monoculture. It’s not that LLMs are less creative than an average human (maybe they’re not, I’m not certain: probably in some ways). The issue is: there are so many copies of so few models. The more cultural output the LLMs are responsible for, the duller the fabric of life will be.

So they're stupid and useless for e.g. writing?

No.

I’m using Claude Opus 4.7 to copy-edit this post, to avoid boringness on the micro level (e.g. reducing excessive asides in parentheses... however, all em-dashes are my own). But letting an LLM write the whole post would create two distinct issues. The whole thing would likely be boring (and that’s true even if you’re not enjoying this post and raising your eyebrows right now at the irony). And even if it wasn’t boring itself (I actually don’t think all LLM output is boring!), the real concern, following from the above, is that the posts they produce would be awfully similar.

For example, if you hate what I (Colman) write, that’s basically fine, since nearly everyone in this world isn’t me, and would make something reasonably different. But if you hate what LLMs write, you better hope a human is adding some juice to the next attempt.

I did actually get multiple models to write a blog post about Schelling points, and indeed the posts kind of suck. They’re tough to read because they read like informational pamphlets, but with schlocky phrases as a bonus (“You might think this is just a fun parlor trick for economists, but invisible Schelling Points govern almost every aspect of our daily lives.”). And no shock, they’re all extremely similar, even from different providers.

However, informational pamphlets aren't always bad. Especially when they're the exact subject and level you’re looking for. I find LLMs great as a first pass for learning about stuff. You first get a nice intro, then you can ask questions and poke and prod to learn the parts that are confusing or surprising or interesting to you, or just holes in your knowledge. Here the lack of divergence is probably a good thing if you’re not doing groundbreaking research.

Making a Game

I first thought about systematically recording answers from LLMs and seeing how well they did, but honestly that was boring, and discussing their failures with them was more fun.

Of course even more fun is asking humans! So I made a game. For now, I’ve modelled this on the “Wordle” style daily games: four questions a day, generally two converge and two diverge, with one seed question randomly mixed in (here we face another well-known paradox, the unexpected hanging!).

I hope you enjoy Mind the Hive!

Well, Will We Always Have Paris?

First we need to actually find each other there to begin with (as do the LLMs). Given the pretty obvious Eiffel Tower answer, minor practical concerns aside, we’re probably sorted.

We could also ask, will we continue to find each other at the same place? Will our focal points drift over time? In some sense, of course they will — we can just look back and see they already have: in 1870, for “pick a country” the answer likely wouldn’t be the US, and for “pick a coffee order”, espresso (and therefore latte and cappuccino) wasn’t even invented yet!

I can see a future where e.g. the plaza in front of Notre-Dame (Parvis de Notre-Dame) on Île de la Cité becomes a more compelling choice for Paris, and we’d be bringing back a focal point classic: Point Zéro. We’d be following the NYC path too, from a large iconic structure (Mosteller’s Empire State or Schelling’s Grand Central) to an iconic open public space, and much like Times Square, it even has its own set of big stairs now!

Or not. Maybe the LLMs will freeze our current focal points in time, and we’ll never escape the 7th arrondissement.