💼 DSN Traveller

Travelling the Matrix network, for Science!


The DSN Traveller travelled the Matrix network under the name of @dsn-traveller:dsn-traveller.dsn.scc.kit.edu and wrote a travel report of what it saw. 📝

In case you have encountered Travis Ralston's Matrix Traveler Bot, yes, they're quite similar.

Why? Who are you?

I want to find out whether the Matrix network scales, i.e. whether everything still works when the network is ten times bigger, or ten times more decentralized. I'm Florian Jacob, a scientific staff member at the Karlsruhe Institute of Technology. This research question was the core of my Master's thesis at the Decentralized Systems and Network Services Research Group, and it remains part of my ongoing research. 📑

Research Results

I presented my results in the form of the poster A Glimpse of the Matrix: Scalability Issues of a New Message-Oriented Data Synchronization Middleware at the 2019 International Middleware Conference at UC Davis, California. I also published an Extended Tech Report of the poster and the Anonymized Raw Data used in the publications.

Source Code

The source code is available under AGPLv3+.

The bot was written in Rust using ruma-client from the Ruma project.

What was happening?

The bot measures “how big” matrix.org and other Matrix servers are, and how centralized or distributed the Matrix network currently is. To do that, it needs to be part of many rooms, as room membership can't be queried without being in the room.

So first, the DSN Traveller (bot) will try to join all public rooms it can find. This phase will probably finish around the end of May 2018.

The bot will then query which servers are in the rooms, and how many users from each server are there. While that is still in progress, the bot will stay in the rooms until I'm sure everything has gone right. This avoids having to rejoin all rooms in case something went wrong, which would clutter the room histories and stress the federation.

Because it anonymizes the collected data, the bot can't do partial scanning of the network by joining and leaving batches of rooms. The resulting data could not be merged together, and therefore the bot needs to be in all rooms at once.

After data gathering is finished, the bot will leave all rooms again. If all goes well, this should happen by no later than the middle of June 2018.

How did this work?

The DSN Traveller tries to get a rough overview of how the Matrix network is structured today. It records how many rooms it finds, how many users and servers take part in those rooms, and how they relate to each other, meaning how many users a server has and how many rooms it is part of.
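The counts described above can be sketched in Rust roughly as follows. This is an illustration under assumptions, not the bot's actual code: `server_of`, `NetworkStats` and `summarize` are hypothetical names, the input is assumed to be a map from room ID to the list of member user IDs, and user IDs are assumed to have the usual `@localpart:server` form with no `:` in the localpart.

```rust
use std::collections::{HashMap, HashSet};

/// Extracts the server name from a user ID of the form `@localpart:server`.
/// The split happens at the first ':' so that a port in the server name is kept.
fn server_of(user_id: &str) -> Option<&str> {
    let rest = user_id.strip_prefix('@')?;
    let (_, server) = rest.split_once(':')?;
    if server.is_empty() { None } else { Some(server) }
}

/// Aggregate view of the observed room-server-user network (hypothetical type).
#[derive(Debug, Default)]
struct NetworkStats {
    rooms: usize,
    users_per_server: HashMap<String, usize>,
    rooms_per_server: HashMap<String, usize>,
}

/// Counts distinct users per server and distinct rooms per server.
fn summarize(memberships: &HashMap<String, Vec<String>>) -> NetworkStats {
    let mut users: HashMap<&str, HashSet<&str>> = HashMap::new();
    let mut rooms: HashMap<&str, HashSet<&str>> = HashMap::new();
    for (room_id, members) in memberships {
        for user_id in members {
            // User IDs that do not parse are skipped
            if let Some(server) = server_of(user_id) {
                users.entry(server).or_default().insert(user_id.as_str());
                rooms.entry(server).or_default().insert(room_id.as_str());
            }
        }
    }
    NetworkStats {
        rooms: memberships.len(),
        users_per_server: users.into_iter().map(|(s, u)| (s.to_string(), u.len())).collect(),
        rooms_per_server: rooms.into_iter().map(|(s, r)| (s.to_string(), r.len())).collect(),
    }
}
```

Using sets rather than counters means a user who is in several rooms is counted only once for their server.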

All room, server and user IDs are pseudonymized immediately after they are received, using a hash function that is randomly chosen on each trip. After the Traveller has finished a trip, and before storing it on disk, the resulting room-server-user network is anonymized by hashing each ID together with an individual random value (a salt) which is immediately thrown away. 🧂
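As a rough illustration of these two steps, here is a minimal Rust sketch. It is not the bot's actual code: `TripPseudonymizer` and `anonymize` are hypothetical names, and the standard library's randomly keyed SipHash (`RandomState`) merely stands in for whatever hash function and salt source the bot really uses. A real implementation would want a cryptographic hash and a proper random number generator.

```rust
use std::collections::hash_map::RandomState;
use std::collections::HashMap;
use std::hash::{BuildHasher, Hash, Hasher};

/// Per-trip pseudonymizer: a keyed hash whose random keys are chosen at construction.
struct TripPseudonymizer {
    keyed_hash: RandomState,
}

impl TripPseudonymizer {
    fn new() -> Self {
        Self { keyed_hash: RandomState::new() }
    }

    /// Within one trip, the same ID always maps to the same pseudonym.
    fn pseudonym(&self, id: &str) -> u64 {
        let mut hasher = self.keyed_hash.build_hasher();
        id.hash(&mut hasher);
        hasher.finish()
    }
}

/// Re-hashes every pseudonym in a list of edges with its own throwaway salt.
/// The pseudonym-to-result mapping lives only inside this function and is
/// dropped on return, so the output cannot be traced back to the input.
fn anonymize(edges: &[(u64, u64)]) -> Vec<(u64, u64)> {
    let mut mapping: HashMap<u64, u64> = HashMap::new();
    let mut anon = |pseudonym: u64| -> u64 {
        *mapping.entry(pseudonym).or_insert_with(|| {
            // A fresh RandomState plays the role of the individual salt (assumption)
            let salt = RandomState::new();
            let mut hasher = salt.build_hasher();
            pseudonym.hash(&mut hasher);
            hasher.finish()
        })
    };
    edges.iter().map(|&(a, b)| (anon(a), anon(b))).collect()
}
```

Because neither the per-trip keys nor the salts are stored, two runs map the same ID to unrelated values. The structure of the network survives, but the outputs of separate runs cannot be matched against each other.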

This room-server-user network observed by the DSN Traveller will be fed into my simulation of the Matrix network as an anchor point to see what happens when I change the network's size and decentrality, to answer the main question of my thesis.

Why does the bot stay around?

So that it doesn't have to rejoin every room after each adjustment I make to the data collection part of the bot. Once the bot's development is finished, it could join all rooms, collect data, and then leave all rooms without idling in between.

How to make the bot join a room

If you want the bot to visit you, you can invite the bot into your room. As the DSN Traveller does not travel continuously, though, it will not join the room immediately, but on its next trip. 💼

How to make the bot leave a room

If you don't want to have the bot in your room, please pardon the inconvenience, and just throw it out! 😓

You can make it leave immediately by:

  1. Kicking it (someone could still invite it back) 📤
  2. Banning it (if you'd like it to stay gone forever) ❌

The bot does record when it gets kicked/banned and will remove any applicable nodes from the gathered data.

As I don't require exact data at all, and an approximation to the Matrix network is sufficient for what I want to do, this won't affect my research significantly.
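Removing nodes after a kick or ban could look roughly like the following Rust sketch. It rests on assumptions: `Gathered` and `forget_room` are hypothetical names, the gathered data is modelled as a map from room to member set over already pseudonymized IDs, and "applicable nodes" is read as the room itself plus any users that were only seen in that room.

```rust
use std::collections::{HashMap, HashSet};

/// Gathered room-to-members graph; all IDs are assumed to be pseudonyms already.
struct Gathered {
    members_by_room: HashMap<u64, HashSet<u64>>,
}

impl Gathered {
    /// Drops a room the bot was kicked or banned from.
    /// Returns the users that no longer appear in any remaining room.
    fn forget_room(&mut self, room: u64) -> HashSet<u64> {
        // Removing an unknown room is a no-op that yields an empty set
        let removed = self.members_by_room.remove(&room).unwrap_or_default();
        let still_seen: HashSet<u64> =
            self.members_by_room.values().flatten().copied().collect();
        removed.difference(&still_seen).copied().collect()
    }
}
```

Since users are only stored as members of rooms in this model, users seen nowhere else vanish together with the room. The return value just makes that visible.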


Any questions? 🤔 Feel free to ask me (@florian:dsn.tm.kit.edu) in #traveller:dsn.tm.kit.edu! 💡