Unique Identifiers In Event-Driven Architecture

October 27, 2020

In event-driven architecture, unique identifiers are everywhere: at the entity level, the event level, the process level, and so on. In this blog post, I will show you a de facto standard for generating, persisting, serializing and deserializing IDs natively across Kafka, Java and JPA.

Which Standard Exactly?

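I am referring to version 4 of the UUID as defined in RFC 4122. In its canonical textual representation, a UUID has 32 hexadecimal characters and 4 hyphens in the format 8-4-4-4-12 (for example: 7d444840-9dc0-11d1-b245-5ffdce74fad2). Note that this example only illustrates the layout: the leading 1 of its third group marks it as a version 1 UUID, whereas a version 4 UUID has a 4 in that position.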

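Its payload is 16 bytes (128 bits). In a version 4, variant 1 UUID, 122 of those bits are randomly generated. Strictly speaking this does not guarantee global uniqueness, but it makes a collision so improbable that the generated identifier can be treated as unique in practice.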
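The version and variant fields can be inspected directly in Java. The following is a minimal sketch; the class name `UuidLayoutDemo` and its helper method are only illustrative. One thing to be aware of: `java.util.UUID.variant()` reports the RFC 4122 variant as the number 2 (the value of the leading variant bits `10`), which is the same variant this article calls variant 1.

```java
import java.util.UUID;

class UuidLayoutDemo {

    // 128 bits in total, minus 4 version bits and 2 variant bits.
    static int randomBitCount() {
        return 128 - 4 - 2;
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();

        System.out.println("id          = " + id);
        // randomUUID() produces version 4 (randomly generated) UUIDs.
        System.out.println("version     = " + id.version());
        // Java reports the RFC 4122 variant as 2.
        System.out.println("variant     = " + id.variant());
        System.out.println("random bits = " + randomBitCount());
    }
}
```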

Uniqueness Guaranteed By... A Random Number!? Really?

When I first heard about using randomly generated UUIDs as entity identifiers, I was concerned: should I expect a collision one day, and should I provide handling logic for it? 🤔

So, I went deeper into the subject: What, actually, are the odds?

A 122-bit random number offers 2^122 ≈ 5.3 * 10^36 possible values, which seems like a lot. But taking into account the birthday paradox known from cryptography, the number of generated values needed for a 50% collision chance suddenly shrinks to roughly the square root of the number of possible values!

Having a 50% chance of a collision after roughly 2^61 ≈ 2.3 * 10^18 generated values could seem a bit scary: Hey, I don't want collisions at all, right?

Therefore, I compared this probability with the yearly odds of dying from an asteroid or comet impact, which are under 1:250,000 or ≈ 1:2^18...

So the formula for the number of generated values n at which the chance of a collision reaches those odds, as written here, would be n ≈ square root of (2 * 2^122 * 2^-18) = square root of (2^105) ≈ 5.7 * 2^50 ≈ 6.4 * 10^15
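The arithmetic above can be reproduced with a few lines of Java. This is only a sketch of the approximation n ≈ square root of (2 * N * p), which holds for small probabilities p; the class and method names are illustrative and not part of any library.

```java
class CollisionBoundDemo {

    // Approximate number of generated values at which the collision
    // probability reaches `probability`, for identifiers with
    // `randomBits` random bits. Valid for small probabilities only.
    static double generationsForProbability(int randomBits, double probability) {
        double possibleValues = Math.pow(2.0, randomBits); // N = 2^randomBits
        return Math.sqrt(2.0 * possibleValues * probability);
    }

    public static void main(String[] args) {
        // Yearly asteroid odds from the text, rounded to 1:2^18.
        double asteroidOdds = Math.pow(2.0, -18);
        double n = generationsForProbability(122, asteroidOdds);

        // Prints a value of about 6.4 * 10^15.
        System.out.println("n = " + n);
    }
}
```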

🔔 ... And, I finally got it: I am about as likely to be killed by an asteroid this year as I am to get a collision in my code after generating roughly 6 quadrillion (6.4 * 10^15) UUID v4 values! 🤣

How To Serialize It Natively In Kafka?

Since the Apache Avro 1.9.0 specification (published on 14 May 2019), it is possible to define the UUID natively in the event's schema through Avro IDL (AVDL):

@namespace("com.yourcompany.api.domainevent.yourdomain.dto")
protocol ItemProtocol {
    record ItemAvroDto {
        uuid id;
    }
}

🔔 If you use an older schema registry, you can serialize the UUID in its canonical textual representation as a 'string'!
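On the Java side, the string fallback comes down to a round trip through the canonical text form. A minimal sketch, assuming the field is carried as a plain string; the class name and the `toWire`/`fromWire` helpers are hypothetical names for illustration:

```java
import java.util.UUID;

class UuidStringRoundTrip {

    // Canonical 8-4-4-4-12 text form, always 36 characters.
    static String toWire(UUID id) {
        return id.toString();
    }

    // Parses the text form back into a strongly typed UUID.
    static UUID fromWire(String text) {
        return UUID.fromString(text);
    }

    public static void main(String[] args) {
        UUID original = UUID.randomUUID();
        String wire = toWire(original);
        UUID restored = fromWire(wire);

        System.out.println(wire + " (" + wire.length() + " chars)");
        System.out.println("round trip ok: " + original.equals(restored));
    }
}
```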

How To Serialize It Natively In Java?

Java has supported the UUID type (java.util.UUID) since Java 5:

UUID id = UUID.randomUUID();

🔔 Using the 'UUID' type instead of the 'String' type for unique identifiers in your Java classes gives you additional code robustness: Strong types are always better than weak types!
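One practical effect of the strong type is that malformed input is rejected at the boundary instead of travelling through the system as an arbitrary string. A minimal sketch; `ItemIdParsing` and `isValidId` are illustrative names, and `UUID.fromString` should not be mistaken for a strict format validator, since it may accept some non-canonical inputs:

```java
import java.util.UUID;

class ItemIdParsing {

    // Returns true if the (non-null) text can be parsed as a UUID.
    static boolean isValidId(String raw) {
        try {
            UUID.fromString(raw);
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown by UUID.fromString for text it cannot parse.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidId(UUID.randomUUID().toString()));
        System.out.println(isValidId("not-a-uuid"));
    }
}
```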

How To Serialize It In PostgreSQL with JPA?

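import java.util.UUID;
import javax.persistence.Entity;
import javax.persistence.Id;
import lombok.Data;

@Data   // Lombok: generates getters, setters, equals/hashCode and toString
@Entity
public class Item {
    @Id
    private UUID id = UUID.randomUUID(); // assigned in the application
}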

[Screenshot: PostgreSQL column definitions]
[Screenshot: PostgreSQL table data]

🔔 The 'uuid' type is 16 bytes and it offers the most performant way for storing unique identifiers in PostgreSQL.

🔔 As you can see, PostgreSQL shows UUIDs in a human-readable way, which is practical for copy-paste, for example!

How To Serialize It In Oracle with JPA?

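import java.util.UUID;
import javax.persistence.Entity;
import javax.persistence.Id;
import lombok.Data;

@Data   // Lombok: generates getters, setters, equals/hashCode and toString
@Entity
public class Item {
    @Id
    private UUID id = UUID.randomUUID(); // assigned in the application
}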

[Screenshot: Oracle column definitions]
[Screenshot: Oracle table data]

🔔 The 'RAW' type is 16 bytes and it offers the most performant way for storing unique identifiers in Oracle (at the moment of writing of this article).

🔔 Unfortunately, the values are not human-readable. Below you can find a solution for this.
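As a side note, a raw 16-byte value can also be converted by hand when debugging. The sketch below assumes the bytes are laid out as the most significant 64 bits followed by the least significant 64 bits, both big-endian; the actual layout depends on your JPA provider and driver, so treat this as an assumption to verify. The class and method names are illustrative.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

class UuidBytesDemo {

    // Packs a UUID into 16 bytes: most significant half first.
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    // Rebuilds the UUID from the same 16-byte layout.
    static UUID fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        return new UUID(buffer.getLong(), buffer.getLong());
    }

    // Upper-case hex string, similar to how raw binary values are often displayed.
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02X", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        byte[] raw = toBytes(id);

        System.out.println("uuid  = " + id);
        System.out.println("bytes = " + raw.length);
        System.out.println("hex   = " + toHex(raw));
        System.out.println("back  = " + fromBytes(raw));
    }
}
```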

How To Serialize It In MySQL with JPA?

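import java.util.UUID;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import lombok.Data;

@Data   // Lombok: generates getters, setters, equals/hashCode and toString
@Entity
public class Item {
    @Id
    @Column(length = 16) // 16 bytes, exactly the size of a UUID
    private UUID id = UUID.randomUUID();
}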

[Screenshot: MySQL column definitions]
[Screenshot: MySQL table data]

🔔 The 'binary' type is 16 bytes and it offers the most performant way for storing unique identifiers in MySQL (at the moment of writing of this article).

🔔 You need to explicitly set the @Column length in JPA for MySQL. Without it, the column size would be 255 and each value would be padded with many trailing zeros!

🔔 Unfortunately, the values are not human-readable. Below you can find a solution for this.

How To Serialize It In Oracle And MySQL In Human-readable Format?

If it is not essential for you to have maximum possible database performance, you could serialize UUIDs in a human-readable format and get the advantage of being able to copy/paste or export them with ease.

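import java.util.UUID;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import lombok.Data;
import org.hibernate.annotations.Type;

@Data
@Entity
public class Item {
    @Id
    @Type(type = "uuid-char") // Hibernate: store the canonical text form
    @Column(length = 36)      // 32 hex characters + 4 hyphens
    private UUID id = UUID.randomUUID();
}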

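Oracle

[Screenshot: Oracle column definitions, human-readable format]
[Screenshot: Oracle table data, human-readable format]

MySQL

[Screenshot: MySQL column definitions, human-readable format]
[Screenshot: MySQL table data, human-readable format]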

Summary

🔔 The source code is available here:

About the author: Andrey Zahariev Stoev

Loves software craftsmanship and systems thinking. Passionate about travel, languages and cultural diversity exploration.
