The anatomy of an Open Tibia server

I spent a weekend bringing a dead game server back to life. The nostalgia wore off fast. What stuck was how much real, complete networked-systems engineering is packed into a binary smaller than a phone photo, and how much of it you can only see once you start pulling it apart.

Open Tibia servers are open-source reimplementations of the backend for Tibia, a 2D MMORPG from the late 90s. Strip away the medieval sprites and you’re left with an honest little game engine: a world held entirely in memory, a custom binary protocol, Lua-scripted creatures, and one process keeping track of everyone connected. I had the C++ source — 42 files, about 16k lines — so this isn’t guesswork. Let me walk through what’s in there.

The whole universe fits in four lines

The first thing I do with unfamiliar code is find where everything gets created. Here it’s almost comically compact — the entire world is a handful of globals:

Items   Item::items;     // every item type, loaded from items.xml
Map     gmap;            // the game world
Spells  spells(&gmap);   // the spell system, bound to the map
LuaScript g_config;      // config.lua is itself run through a Lua state

That tells you the design philosophy before you read a single function. No service layer, no dependency injection. There’s a world, a thing-dictionary, and some rules.

The world is just memory

There is no live database. The world is a 3D grid of tiles, loaded from XML at boot and held in RAM for the life of the process:

Map
 └─ Tile[x][y][z]
      └─ Thing            (base class)
           ├─ Item        (the sword on the floor)
           └─ Creature    (anything that acts)
                ├─ Player
                └─ Npc     (monsters + shopkeepers)

Everything that exists in the world is a Thing sitting on a Tile. A player and a dragon are the same kind of object at the root. Almost every interesting operation starts the same way: getTile(x, y, z) to find the square, then look at what’s on it. Hold that pattern in your head, because it’s also where the bodies are buried (more on that below).
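To make that lookup pattern concrete, here is a minimal sketch of the hierarchy. The class names, getTile and getThingStackPos come from the source; the storage (a std::map keyed by coordinates), the makeTile helper and the -1 "not found" convention are my own assumptions for illustration, not how the engine necessarily does it.

```cpp
#include <algorithm>
#include <map>
#include <tuple>
#include <vector>

// Root of the hierarchy: everything in the world is a Thing.
struct Thing { virtual ~Thing() = default; };
struct Item : Thing {};
struct Creature : Thing {};
struct Player : Creature {};
struct Npc : Creature {};

// One square of the world and the stack of Things resting on it.
struct Tile {
    std::vector<Thing*> things;  // non-owning pointers; ownership is out of scope here

    // Index of a thing in this tile's stack, or -1 if it is not on this tile
    // (the -1 convention is an assumption of this sketch).
    int getThingStackPos(const Thing* thing) const {
        auto it = std::find(things.begin(), things.end(), thing);
        return it == things.end() ? -1 : static_cast<int>(it - things.begin());
    }
};

class Map {
public:
    // Hypothetical helper: create the tile if it is missing and return it.
    Tile& makeTile(int x, int y, int z) {
        return tiles_[std::make_tuple(x, y, z)];
    }

    // Look up a tile; returns nullptr when nothing exists at that position.
    Tile* getTile(int x, int y, int z) {
        auto it = tiles_.find(std::make_tuple(x, y, z));
        return it == tiles_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::tuple<int, int, int>, Tile> tiles_;  // assumed storage, not the engine's
};
```

The detail worth noticing is the return type of getTile: a raw pointer that can be null, which every caller is free to ignore.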

Accounts and characters persist as flat XML files, read at login and written at logout. For a game whose real state is “who is standing where, holding what,” a database would be overkill.

A hand-rolled binary protocol

Client and server speak a custom binary protocol over plain TCP, traditionally on port 7171. No HTTP, no JSON. Every message looks like this:

┌────────────────┬──────────┬───────────────────────────┐
│ 2 bytes: length│ 1: opcode│ payload (opcode-specific)  │
└────────────────┴──────────┴───────────────────────────┘

Read two bytes for the length, then read that many bytes; the first byte of that body is the opcode, which tells you what kind of message it is. The abstraction over all of it is a single class that’s really just a cursor over a byte buffer:

unsigned short x = msg.GetU16();   // read 2 bytes, advance the cursor
unsigned short y = msg.GetU16();
unsigned short item = msg.GetU16();
// ...
msg.AddByte(0x0A);                 // writing works the same way in reverse

That’s the whole protocol once it clicks: two parties agreeing on the order of bytes. Movement, turning, a line of speech, an attack — each is an opcode followed by a few of these reads.
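A cursor like that is small enough to sketch in full. GetU16 and AddByte are the names used in the snippet above; the class name Message, the GetByte, AddU16 and Framed helpers, the bounds checks, and the little-endian byte order are assumptions of this sketch rather than something I am quoting from the engine. I also assume the length prefix counts the opcode plus payload, which matches the "read that many bytes" description.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// A read/write cursor over a byte buffer (hypothetical class name).
class Message {
public:
    Message() = default;
    explicit Message(std::vector<std::uint8_t> bytes) : buf_(std::move(bytes)) {}

    // Read one byte and advance the cursor.
    std::uint8_t GetByte() {
        if (pos_ >= buf_.size()) throw std::out_of_range("read past end of message");
        return buf_[pos_++];
    }

    // Read two bytes, low byte first (little-endian is assumed here).
    std::uint16_t GetU16() {
        const std::uint16_t lo = GetByte();
        const std::uint16_t hi = GetByte();
        return static_cast<std::uint16_t>(lo | (hi << 8));
    }

    // Writing is the same idea in reverse: append and move on.
    void AddByte(std::uint8_t b) { buf_.push_back(b); }

    void AddU16(std::uint16_t v) {
        AddByte(static_cast<std::uint8_t>(v & 0xFF));
        AddByte(static_cast<std::uint8_t>(v >> 8));
    }

    // Body prefixed with its 2-byte length, ready to go on the wire.
    std::vector<std::uint8_t> Framed() const {
        if (buf_.size() > 0xFFFF) throw std::length_error("message too long for a 2-byte length");
        std::vector<std::uint8_t> out;
        out.push_back(static_cast<std::uint8_t>(buf_.size() & 0xFF));
        out.push_back(static_cast<std::uint8_t>(buf_.size() >> 8));
        out.insert(out.end(), buf_.begin(), buf_.end());
        return out;
    }

private:
    std::vector<std::uint8_t> buf_;
    std::size_t pos_ = 0;  // read cursor
};
```

Writing an opcode byte and a 16-bit value, calling Framed(), and feeding everything after the two length bytes into a second Message gives back the same values in the same order. Nothing in the bytes says what they mean; only the agreed order does.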

Following one packet through the machine

Here’s the path a single action takes, which I pieced together from the call stack:

TCP byte stream
      │
      ▼
ConnectionHandler(thread)         one thread per connected client
      │
      ▼
Protocol70::ReceiveLoop()         read length, read body, peek opcode
      │
      ▼
dispatch on opcode  ──►  mutate the world (move/turn/say on the Map)
      │
      ▼
getSpectators(range)              who is close enough to see this?
      │
      ▼
send each of them an update

The shape never changes: decode the action, change the world, then tell everyone nearby. That last step is the whole multiplayer illusion — the server decides who can see what and ships each client only its slice of reality.
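The "tell everyone nearby" step is easy to sketch. Only the name getSpectators comes from the source; the signature, the Client struct, the notifyNearby helper, the rectangular range and the same-floor rule are simplifying assumptions of mine, and the outbox vector stands in for a real socket write.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

struct Position { int x; int y; int z; };

// Hypothetical stand-in for a connected player.
struct Client {
    std::string name;
    Position pos;
    std::vector<std::string> outbox;  // stands in for "send over the socket"
};

// Everyone on the same floor whose position lies within rangeX/rangeY of the centre.
// (Same-floor only is a simplification made for this sketch.)
std::vector<Client*> getSpectators(std::vector<Client>& clients,
                                   const Position& centre,
                                   int rangeX, int rangeY) {
    std::vector<Client*> seen;
    for (Client& c : clients) {
        const bool sameFloor = c.pos.z == centre.z;
        const bool inX = std::abs(c.pos.x - centre.x) <= rangeX;
        const bool inY = std::abs(c.pos.y - centre.y) <= rangeY;
        if (sameFloor && inX && inY) seen.push_back(&c);
    }
    return seen;
}

// Last step of the pipeline: queue the update for each spectator.
// Returns how many clients were notified.
std::size_t notifyNearby(std::vector<Client>& clients, const Position& where,
                         const std::string& update, int rangeX, int rangeY) {
    std::size_t sent = 0;
    for (Client* c : getSpectators(clients, where, rangeX, rangeY)) {
        c->outbox.push_back(update);
        ++sent;
    }
    return sent;
}
```

A client outside the range never hears about the event at all, which is what "its slice of reality" means in practice.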

You log in twice (this is the good part)

Connecting isn’t one connection. It’s two, and almost nobody remembers that until it breaks.

Client ──(1)──►  LOGIN server   : check account, send back:
                                    • character list
                                    • the IP/port of the GAME world
Client ──(2)──►  GAME world      : new connection, actually enter the game

On the original game this let the company run logins and game worlds on different machines. On a private server it’s usually one process answering both, but the two-step handshake still happens. Which is exactly why a misconfigured server leaves you stuck forever on “connecting to the world”: step 1 succeeded, you saw your characters, and then the server handed your client a world address it couldn’t reach. The first connection reports success; the second one fails in silence. I watched it happen, and the fix had nothing to do with passwords — the server was advertising the wrong IP in step 1.

Encryption, eventually

The early protocol barely had any. Reading the engine, there is no RSA, no session cipher — the account number and password arrive as plain bytes. Anyone on the path could read them. Later versions bolted on RSA to encrypt the login packet and a lightweight block cipher (XTEA) for the in-game session. That bolt-on is why mismatched client and server versions fail so confusingly: the bytes are simply garbage to one another, with no error that explains why. Poke at old protocols long enough and you get a fossil record of software learning, in real time, that confidentiality is a feature.

The monsters think in Lua

Creature behaviour isn’t hardcoded. Each NPC gets its own Lua state, and the engine calls into it on a timer:

luaState = lua_open();
luaL_openlibs(luaState);
luaL_dofile(luaState, "data/npc/scripts/lib/npc.lua");
luaL_dofile(luaState, scriptname.c_str());   // this NPC's brain
// ... and C++ functions are exposed back to the script:
lua_register(luaState, "selfTurn", NpcScript::luaActionTurn);

Every game tick, the world calls each nearby creature’s onThink, which runs Lua, which calls back into C++ functions like selfTurn or selfSay. It’s a genuinely nice piece of design hiding in old code: the engine is the physics, Lua is the personality, and the boundary between them is a handful of registered functions.

The assumptions you can’t see until you move it

This is the reverse-engineering lesson I’ll keep. The code is full of assumptions that were invisible because they were always true on the machine it grew up on.

It builds file paths and then lowercases them:

std::string filename = "data/players/" + name + ".xml";
std::transform(filename.begin(), filename.end(), filename.begin(), tolower);

On Windows, where the original lived, the filesystem doesn’t care about case, so Cyclops.xml and cyclops.xml are the same file. On Linux they are two different names: if the file on disk is Cyclops.xml, the lowercased lookup for cyclops.xml quietly finds nothing. The file “doesn’t exist,” the load returns a null, and the code that assumed a valid object marches on with a null in hand.

Reading a crash like a map

Which brings me to my favourite moment. After fixing the file loading, the server booted, let me in, then died the instant a monster near me tried to turn around. A debugger backtrace told the whole story in one screen:

Map::checkPlayer()          ← periodic tick while a player is in-world
 └ Npc::onThink()
    └ NpcScript::onThink()   ← runs the monster's Lua brain
       └ lua_call()
          └ luaActionTurn()  ← Lua told the NPC to turn
             └ Npc::doTurn()
                └ Map::creatureTurn()   💥 SIGSEGV

And the exact line:

int stackpos = getTile(creature->pos.x, creature->pos.y, creature->pos.z)
                 ->getThingStackPos(creature);   // getTile() returned NULL

There’s the getTile(...)-> pattern again, with no check between the lookup and the dereference. The author assumed a creature is always standing on a valid tile. Usually true. Not always. And in C++ a null dereference isn’t a polite exception you can catch — it’s the whole process gone in an instant, taking everyone’s session with it. The fix is one guard:

Tile* t = getTile(creature->pos.x, creature->pos.y, creature->pos.z);
if (!t) return;               // the safety check the original never needed
int stackpos = t->getThingStackPos(creature);

Most of the work in reviving one of these isn’t features. It’s adding the checks that were never necessary until the ground shifted underneath the code — a newer compiler, a case-sensitive disk, an edge case the original data never produced.

Why bother at all

Because it’s a complete system you can actually finish reading. Real protocol design, real in-memory state, real networking, Lua scripting, and a cast of latent bugs that only a change of scenery could surface — all in a codebase small enough to understand in an afternoon. No framework is doing the hard parts and hiding them from you. You see the bytes, you see where they go, and when it breaks you can read the crash like a map and walk straight to the bug. That’s worth more than most tutorials.