About this website

I'm Diego Dorn, a French math student, currently (2021) in the third year of my bachelor's at EPFL. I find it fascinating to see how order and meaning can emerge from just a sequence of random bytes.

I like randomness, and I've used it in quite a few projects since I learned how to code. I started with a few guessing games, but what I really liked was creating unexpected things, thanks to randomness.

At some point, I learned a bit of computer graphics and decided that it would be very fun to generate random images, so I learned about noise functions (Perlin, simplex...). I also tried to use randomness to add value, rather than chaos, to the games I was coding. Last year, I found out that I could also generate fractals in a completely random fashion! This is how @TheFracalBot was created, a Twitter bot that posts a new, unique fractal every day. Later, I made thefractal.space, my first website, to showcase those random fractals.

Coming back to the present: I was recovering from appendicitis and a bit bored, so I tried to improve a simple random word generator I had made long ago. The goal was threefold:

Sadly, I have not yet achieved the third goal, because I'm having trouble getting data from fanfiction websites.

During those few days, I coded a lot of different word and sentence generators. Not all of them were fun (generating Bible verses was less fun than expected), but my friends, my family and I had a hell of a good time with them. That's the reason I wanted to share them ;)

This also gave me an excuse to learn a bit more about web technologies, since I knew close to nothing about them. This is how I discovered Tailwind, PythonAnywhere, DNS, the workings of JavaScript promises and how to make a REST API... I don't fancy coding for the web that much, but it makes sharing what you do so easy that it is very gratifying, and there are lots of interesting things to learn.

This website is therefore a playground for me to discover an unknown world, so don't expect everything to work, to be coherent or well designed... I'm trying my best tho!

How does it work?

All the generators work in the same way. The words or sentences are generated in three steps:

One important parameter is the length of the sequence. For each generator, I fix an integer N and fill the table with all the sequences of N consecutive words in the corpus, along with their counts. I found out later that those are the N-grams of the corpus, which are used in a wide variety of areas, mostly in natural language processing, but also in computational biology (DNA/protein sequencing), linguistics, data compression, communication theory...
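
As a rough illustration of that counting step, here is a minimal Python sketch. The function name `count_ngrams` and the toy corpus are made up for this example and are not taken from the site's source code.

```python
from collections import Counter

def count_ngrams(tokens, n):
    """Count every run of n consecutive tokens in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Toy corpus, invented for the example.
corpus = "the cat saw the cat sleep".split()
table = count_ngrams(corpus, 2)

# The pair ("the", "cat") appears twice, every other pair once.
for ngram, count in table.most_common():
    print(" ".join(ngram), count)
```

The same function works on letters instead of words: passing `list("banana")` counts letter N-grams.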

In order to generate interesting proverbs, it is important to choose the right N. The bigger N is, the more we capture the rules of the French language, and the better constructed our sentences are. However, we want to generate new proverbs, and this happens when two proverbs have N-1 consecutive words in common but a different N-th word. This way, there is a choice, a possible bifurcation. The thing is, if N is large, this happens less often. Also, it takes more space to store N-grams.
On the other hand, if N is low, there are a lot of bifurcations, and most proverbs we generate will be new... but nonsensical.

> Mieux vaut ne juge pas une chose bien pour faire naître ? ne me faire dire, le long sérieux.

Here, N = 2, and we see that every pair of consecutive words fits together, but overall it is a mess. With N = 4, almost every generated proverb already exists, so I generate them with N = 3. It is not perfect, but it works often enough to be fun!
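
To make the trade-off concrete, here is a small Python sketch of one way to generate sentences from such a table. It is only an illustration under my own assumptions: the names `build_table` and `generate`, the `<s>` and `</s>` boundary markers and the two English toy proverbs are all invented for the example, and the real code may work differently.

```python
import random
from collections import Counter, defaultdict

START, END = "<s>", "</s>"  # hypothetical boundary markers

def build_table(proverbs, n):
    """Map each context of n-1 words to a Counter of the words that follow it."""
    table = defaultdict(Counter)
    for proverb in proverbs:
        words = [START] * (n - 1) + proverb.split() + [END]
        for i in range(len(words) - n + 1):
            context = tuple(words[i:i + n - 1])
            table[context][words[i + n - 1]] += 1
    return table

def generate(table, n, rng=random):
    """Walk the table from the start context, picking each next word by its count."""
    context = (START,) * (n - 1)
    out = []
    while True:
        followers = table[context]
        word = rng.choices(list(followers), weights=list(followers.values()))[0]
        if word == END:
            return " ".join(out)
        out.append(word)
        context = (context + (word,))[1:]  # keep only the last n-1 words

# Toy corpus, invented for the example.
proverbs = ["better late than never", "better safe than sorry"]
print(generate(build_table(proverbs, 2), 2))  # may mix both proverbs
print(generate(build_table(proverbs, 3), 3))  # always one of the originals
```

With N = 2, the word "than" is a bifurcation, so "better late than sorry" can come out. With N = 3, the contexts "late than" and "safe than" each have a single follower, so only the two original proverbs are possible.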

For more details, you can find the source code on GitLab. The idea is very simple, but the code is a bit more complex because I wanted it to work with any depth and be very generic (the exact same code generates words or sentences!). It means that the probability table is an N-dimensional array (in practice, a dict), generic over N, and most functions are recursive.
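
To give an idea of what a nested, depth-agnostic table can look like, here is a short Python sketch. It is not the code from the repository: `add_ngram` and `lookup` are hypothetical helpers written for this page, and the layout of the real table may differ.

```python
def add_ngram(table, ngram):
    """Recursively record one n-gram in a nested dict; the leaves hold counts."""
    head, *rest = ngram
    if not rest:
        table[head] = table.get(head, 0) + 1
    else:
        add_ngram(table.setdefault(head, {}), rest)

def lookup(table, prefix):
    """Recursively follow a prefix down the nested dict."""
    if not prefix:
        return table
    head, *rest = prefix
    return lookup(table[head], rest)

# Letters of a word; a list of words would work exactly the same way.
word = "banana"
n = 3
table = {}
for i in range(len(word) - n + 1):
    add_ngram(table, word[i:i + n])

print(lookup(table, "an"))   # what can follow "an", with counts
print(lookup(table, "ana"))  # how many times "ana" was seen
```

Neither function mentions N: the depth of the dict is simply the length of the sequences that were inserted.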

Credit where it's due

First of all, a big Merci to my brother, Félix, for his time and advice. Everything I know about front-end is thanks to him.

I'd like to thank @HommeViande for giving me this great idea.

A big thank-you to PythonAnywhere for hosting the website, and for being so nice and competent. If you want to host a Python website, run bots or any other experiment, you should definitely check them out! And they have a great free plan...

Thanks to Heroicons for the nice icons I use.

API

There is a very simple API to generate proverbs at www.therandom.space/api/proverb. You can replace proverb with alsace or film to generate different things.

$ curl https://www.therandom.space/api/proverb
>
{
    "quote": "L'hypocrisie est un fleuve.",
    "likes": 1
}

Note that all endpoints accept a raw argument, in which case they return plain text instead of JSON.

$ curl "http://www.therandom.space/api/film?raw=true"
>
Twelve o'clocked

It is also possible to get the full best of, with all kinds (films, proverbs...).

$ curl https://www.therandom.space/api/bestof
>
{
    "0": {
        "quote": "Une seule hirondelle ne fait pas cuire le riz.",
        "likes": 8,
        "kind": "proverb"
    },
    "1": {
        "quote": "Jimmy door wars: the seven years",
        "likes": 5,
        "kind": "film"
    },
    ...
}

It is also possible to filter the bestof to have only one kind:

$ curl "https://www.therandom.space/api/bestof?kind=alsace"
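
If you would rather call the API from Python than from curl, something like the sketch below should do, using only the standard library. The helper names `api_url`, `fetch_json` and `fetch_raw` are made up for this example, and it assumes the responses are UTF-8 encoded.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://www.therandom.space/api"

def api_url(endpoint, **params):
    """Build the URL for an endpoint such as "proverb", "film" or "bestof"."""
    query = urlencode(params)
    return f"{BASE_URL}/{endpoint}" + (f"?{query}" if query else "")

def fetch_json(endpoint, **params):
    """Call an endpoint and decode its JSON response."""
    with urlopen(api_url(endpoint, **params), timeout=10) as response:
        return json.load(response)

def fetch_raw(endpoint):
    """Call an endpoint with raw=true and return the plain text (assumed UTF-8)."""
    with urlopen(api_url(endpoint, raw="true"), timeout=10) as response:
        return response.read().decode("utf-8")

# Example calls (they need network access, so they are left commented out):
# print(fetch_json("proverb")["quote"])
# print(fetch_raw("film"))
# print(fetch_json("bestof", kind="alsace"))
```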

Contact

You can contact me on Twitter @LeQuanta or by email: diego at lama-corp.space.
If you had a good time, or wish to encourage me to create more random things, you can support me by donating on PayPal:

Buy my next chocolate