What they are, what the options are, and why they matter

Image by Martin Thoma

Hash functions take arbitrarily many bytes as input and produce a fixed-length string as output. The string typically looks completely random, but the same input always generates the same output. Different inputs also typically produce different outputs, but more about that later.
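These properties can be checked with a few lines of Python. This is a minimal sketch that uses SHA-256 from the standard library as a stand-in; the choice of SHA-256 and the helper name sha256_hex are assumptions made for illustration only.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 10_000)

# Fixed-length output: 256 bits are 64 hex characters, whatever the input size.
print(len(short_digest), len(long_digest))

# Deterministic: hashing the same input again gives the same digest.
print(sha256_hex(b"a") == short_digest)

# These two different inputs give different digests.
print(short_digest != long_digest)
```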

After reading this article you will know three different applications of hash functions. All of them are crucial for modern software development. Let’s go!

A trivial hash function

Let’s say we want a hash function that takes arbitrary-length input and generates a 128-bit output.

The trivial way to compute a hash would be to look at 128-bit blocks of data. If the input length is not a multiple of 128 bits, we just pad it with zeroes. …
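The padding and block-splitting step described above could look roughly like this. It is only a sketch: the names BLOCK_SIZE and split_into_blocks are hypothetical, and how the blocks are then combined into one 128-bit value is deliberately left out.

```python
from typing import List

BLOCK_SIZE = 16  # 128 bits are 16 bytes


def split_into_blocks(data: bytes) -> List[bytes]:
    """Pad `data` with zero bytes to a multiple of 128 bits, then split it."""
    remainder = len(data) % BLOCK_SIZE
    if remainder:
        # Fill the last, incomplete block with zero bytes.
        data += b"\x00" * (BLOCK_SIZE - remainder)
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


blocks = split_into_blocks(b"hello world, this is a test")
print(len(blocks), [len(block) for block in blocks])
```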

What they are and what the differences are

Image by Martin Thoma

HTTPS, SSL, and TLS are all related to encrypted (“secure”) internet connections. The problem they solve is that a man in the middle could read the data you receive or send. It is clearly an issue when you log in to your bank or when you send messages via Twitter / Facebook that should be private. Similarly, you might not want people to know what you are interested in or what you don’t know when you use Wikipedia.

SSL is short for Secure Sockets Layer. It was first publicly released in 1995 as version 2. …

An Introduction to Blockchain, Bitcoin ₿, and related concepts

An example of a blockchain. Image by Martin Thoma.

Bitcoin recently crossed $40,000 for the first time, so it’s in the news again. Bitcoin is just the best-known cryptocurrency. It is one application of a blockchain. In this article, I will walk you through some core concepts of blockchain and cryptocurrencies. This article is written for beginners and is a bit fluffy in some areas. There will be follow-up articles to address that. Let’s start!

The Idea of a Ledger

Easy recipe, baking 25 cookies

Image by the Author

I love these simple cookies. They are not super dry like many others, and they are easy to bake. Let's get started!

Ingredients for 25 cookies

  • 150g butter
  • 100g sugar
  • 1 package (ca. 8g) of vanilla sugar
  • 1 egg
  • 250g wheat flour
  • 2 tablespoons of chocolate powder (cocoa powder)

Equipment


  • Oven
  • Fridge
  • Baking tray and baking paper
  • Mixing bowl and hand mixer
  • Kitchen scale
  • 2 bowls/plates which fit into the fridge and can each hold at least half of the dough

Instructions


  1. Mix the butter, sugar, vanilla sugar, and egg.
  2. Add the wheat flour and mix until the dough is smooth.
  3. Remove half of the dough. …

Are you up to date with Python's type annotation development?

Image by Martin Thoma

Python keeps developing via Python Enhancement Proposals (PEPs). PEP 586 added a feature I love: Literal Types 😍

After reading this article you will know how to use that feature.

If you happen to be new to type annotations, I recommend reading my introduction to type annotations in Python.

Literal Annotation Example

The Literal type needs at least one parameter, e.g. Literal["foo"]. That means that the annotated variable can only take the string "foo". If you want to accept "bar" as well, you need to write Literal["foo", "bar"].
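As a minimal sketch of the two-value case (Python 3.8 or newer), the alias Mode and the function describe below are hypothetical names chosen for illustration. Note that Literal is enforced by static type checkers such as mypy, not at runtime.

```python
from typing import Literal, get_args

# Hypothetical alias: only the strings "foo" and "bar" are accepted.
Mode = Literal["foo", "bar"]


def describe(mode: Mode) -> str:
    """Return a short description of the selected mode."""
    return f"mode is {mode}"


print(describe("foo"))

# get_args exposes the allowed values, e.g. for a manual runtime check.
print(get_args(Mode))

# A type checker would reject describe("baz"); plain Python would still run it.
```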

A complete example:

from typing import Literal
# If you are on Python 3.7 and upgrading is not an option,
# $ pip install typing_extensions
# to install the backport. Then import it:
# from typing_extensions import…

Learn the difference

Photo by JJ Ying on Unsplash

The terms “decentralized” and “distributed” sound extremely similar and people sometimes use them interchangeably. It might depend on the context, but I see “decentralized” as mostly about decision making, control, and who is in charge. I associate “distributed” with location: where something is stored, computed, or done, how we keep latency low, and how we create services that can deal with natural disasters.

A decentralized system has no single element that makes the decisions. Decisions are made across more than one person or organization.

A distributed system computes or stores things across more than one machine or data center.

Internet Services

Level-up your YAML knowledge to write cleaner YAML files


YAML is a file format commonly used for data serialization. A plethora of projects use YAML files for configuration, such as Docker Compose, pre-commit, Travis CI, AWS CloudFormation, ESLint, Kubernetes, Ansible, and many more. Knowing the features of YAML helps you with all of them.

Let’s cover the basics first: YAML is a superset of JSON (source). Every valid JSON file is also a valid YAML file. This means you have all of the types you expect: integers, floats, strings, booleans, and null, as well as sequences and maps. …

What it is and why it’s done

Photo by Waldemar Brandt on Unsplash

In software engineering, vendoring is the act of including third-party software directly in your product. The alternative is to let a dependency management tool install it.

One reason for vendoring is to avoid version conflicts when the dependencies are installed. For example, your amazing software could require the dependency A in version 3.2.1, but the user needs A==3.4.0. The version that amazing needs then conflicts with the version the user needs for something else. This is especially a problem if your software is a library: the user might need A==3.4.0 in their project and amazing as well. …
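The conflict described above can be made concrete with a toy check over exact version pins. This is only an illustrative sketch: find_conflicts and the requirer name user-project are hypothetical, and real dependency managers resolve version ranges rather than comparing exact pins like this.

```python
from typing import Dict, List, Tuple

# Each pin is (who requires it, package name, exact version).
Pin = Tuple[str, str, str]


def find_conflicts(pins: List[Pin]) -> Dict[str, Dict[str, str]]:
    """Return the packages that are pinned to more than one distinct version."""
    versions_by_package: Dict[str, Dict[str, str]] = {}
    for requirer, package, version in pins:
        versions_by_package.setdefault(package, {})[requirer] = version
    return {
        package: by_requirer
        for package, by_requirer in versions_by_package.items()
        if len(set(by_requirer.values())) > 1
    }


pins = [
    ("amazing", "A", "3.2.1"),
    ("user-project", "A", "3.4.0"),
    ("user-project", "B", "1.0.0"),
]
conflicts = find_conflicts(pins)
print(conflicts)
```

If amazing vendored its own copy of A, it would no longer declare A as an installable dependency, so the first pin would disappear and the conflict with it.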

Photo by Ray Hennessy on Unsplash

It’s New Year’s Eve and, as always, I am trying to finish some things and make some plans for next year.

Review of 2020

The year 2020 was crazy. It was dominated mainly by the COVID-19 pandemic, but also by a lot of crazy news about Trump and from the US. There were also bushfires in Australia and California, the Atlantic hurricane season, the Pacific typhoon season, a locust infestation in East Africa, and severe floods in Jakarta. Oh, and there was Brexit, the killing of George Floyd, and the Wirecard scandal. It was an exhausting year.

There were also positive things in 2020:

  • Climate Change: Due to COVID-19 we had lockdowns. Due to the lockdowns, we got a 7% reduction in CO2 emissions. Some of those effects might be permanent: More people taking bikes, more people working from home. …

A data scientist's yearly review

Photo by Micheile Henderson on Unsplash

Many banks allow you to get a CSV dump of your transactions. I’ve downloaded my transaction data for this year and I will walk you through my analysis. I will only look at what I spend money on. I will not look at my income and I will not write about my savings plan/investments.

All code can be found on GitHub (link), in case you’re interested in Streamlit or Pydantic.

About me: The Data Generating Process


Martin Thoma

I’m a Software Engineer with a focus on Data Science and Machine Learning. I have over 10 years of experience with Python. https://www.linkedin.com/in/martin-thoma/
