Unique Identifiers In Event-Driven Architecture
In event-driven architecture, unique identifiers are everywhere: at the entity level, at the event level, at the process level, and so on. In this blog post, I will show you a de facto standard for easily generating, persisting, serializing and deserializing IDs natively in Kafka, Java and JPA.
Which Standard Exactly?
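I am referring to version 4 of the UUID as defined in RFC 4122. In its canonical textual representation, a UUID has 32 hexadecimal characters and 4 hyphens in the format 8-4-4-4-12 (for example: 7d444840-9dc0-11d1-b245-5ffdce74fad2; this particular sample is a version-1 UUID, whereas in a version-4 UUID the third group starts with the digit 4).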
Its payload is 16 bytes (128 bits). In a version-4, variant-1 UUID, 122 of those bits are randomly generated. This makes the generated identifier globally unique for all practical purposes.
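As a quick sanity check of this layout, here is a minimal Java sketch that inspects a freshly generated UUID. The class name UuidLayoutDemo is made up for illustration. Note that java.util.UUID.variant() reports the RFC 4122 variant as the number 2 (the value of the variant bits "10"), which is the same variant this post calls variant-1.

```java
import java.util.UUID;

final class UuidLayoutDemo {

    /** True if the UUID is a random (version 4) UUID in the RFC 4122 variant. */
    static boolean isRandomRfc4122(UUID id) {
        // java.util.UUID reports the RFC 4122 (Leach-Salz) variant as 2
        return id.version() == 4 && id.variant() == 2;
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        String text = id.toString();

        // Canonical form: 32 hex characters plus 4 hyphens
        System.out.println(text + " has " + text.length() + " characters");
        System.out.println("random RFC 4122 UUID: " + isRandomRfc4122(id));

        // The sample from the text carries version 1 in its third group ("11d1")
        UUID sample = UUID.fromString("7d444840-9dc0-11d1-b245-5ffdce74fad2");
        System.out.println("sample version: " + sample.version());
    }
}
```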
Uniqueness Guaranteed By... A Random Number!? Really?
When I first heard about using randomly generated UUIDs as entity identifiers, I was concerned: should I expect a collision one day, and should I provide handling logic for it? 🤔
So, I went deeper into the subject: What, actually, are the odds?
A 122-bit random number offers 2^122 ≈ 5.3 * 10^36 possible values, which seems like a lot. But taking into account the birthday paradox known from cryptography, the number of generated values needed for a 50% collision chance suddenly shrinks to roughly the square root of the number of possible values!
Having a 50% chance of collision after about 2^61 (roughly 10^18) generated values could seem a bit scary: Hey, I don't want collisions at all, right?
Therefore, I compared this probability with the yearly odds of dying from an asteroid or comet impact, which are below 1:250,000, or ≈ 1:2^18...
So the formula to calculate the number of generated values n at which the chance of collision reaches those odds, as written here, would be n ≈ square root of (2 * 2^122 * 2^-18) = square root of (2^105) ≈ 5.7 * 2^50 ≈ 6.4 * 10^15
🔔 ... And, I finally got it: I am about as likely to be killed by an asteroid this year as to have a collision in my code after generating about 6 quadrillion (6.4 * 10^15) identifiers with UUID v4! 🤣
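The arithmetic can be reproduced with a few lines of Java. This is only a sketch of the usual birthday approximation p ≈ n^2 / (2N), solved for n, which is reasonable for small p; the class and method names are invented for this example.

```java
final class BirthdayBound {

    /**
     * Approximate number of random draws n after which the collision
     * probability reaches p, using p ≈ n^2 / (2N) with N = 2^bits.
     */
    static double drawsForProbability(int bits, double p) {
        return Math.sqrt(2.0 * Math.pow(2.0, bits) * p);
    }

    public static void main(String[] args) {
        // Yearly asteroid odds from the text, rounded to a power of two
        double asteroidOdds = Math.pow(2.0, -18);

        // 122 random bits in a version-4 UUID
        double draws = drawsForProbability(122, asteroidOdds);
        System.out.println("UUIDs to generate before reaching those odds: " + draws);
    }
}
```

The result should land in the same ballpark as the 6.4 * 10^15 quoted above.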
How To Serialize It Natively In Kafka?
Since the Apache Avro 1.9.0 specification (published on 14 May 2019), it is possible to define the UUID natively in the event's schema through Avro IDL (AVDL):
@namespace("com.yourcompany.api.domainevent.yourdomain.dto")
protocol ItemProtocol {
  record ItemAvroDto {
    uuid id;
  }
}
🔔 If you use an older schema registry you could serialize it in its canonical textual representation as a 'string'!
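For the string fallback, the conversion on the Java side is a plain round trip through the canonical text form. A minimal sketch, where UuidStringCodec is a hypothetical helper and not part of Avro or Kafka:

```java
import java.util.UUID;

final class UuidStringCodec {

    /** Canonical 36-character form, suitable for a 'string' field. */
    static String encode(UUID id) {
        return id.toString();
    }

    /** Parses the canonical form back into a UUID. */
    static UUID decode(String text) {
        return UUID.fromString(text);
    }

    public static void main(String[] args) {
        UUID original = UUID.randomUUID();
        UUID restored = decode(encode(original));

        // The round trip is lossless, so both values are equal
        System.out.println(original.equals(restored));
    }
}
```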
How To Serialize It Natively In Java?
Java has supported the UUID type (java.util.UUID) since Java 5:
UUID id = UUID.randomUUID();
🔔 Using the 'UUID' type instead of the 'String' type for unique identifiers in your Java classes gives you additional code robustness: Strong types are always better than weak types!
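One way to see the robustness argument: with a UUID-typed field or parameter, malformed input is rejected once, at the boundary where text is parsed, instead of travelling through the code as an arbitrary String. The sketch below uses a hypothetical ItemIdParser helper; only UUID.fromString and its IllegalArgumentException come from the JDK.

```java
import java.util.Optional;
import java.util.UUID;

final class ItemIdParser {

    /** Returns the parsed UUID, or empty if the text is not a valid UUID string. */
    static Optional<UUID> tryParse(String text) {
        try {
            return Optional.of(UUID.fromString(text));
        } catch (IllegalArgumentException e) {
            // Thrown by UUID.fromString for text it cannot parse
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("7d444840-9dc0-11d1-b245-5ffdce74fad2").isPresent());
        System.out.println(tryParse("12345").isPresent());
    }
}
```

Keep in mind that UUID.fromString is fairly lenient about some shortened forms, so stricter validation may still be needed if the exact canonical format matters.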
How To Serialize It In PostgreSQL with JPA?
@Data
@Entity
public class Item {
    @Id
    private UUID id = UUID.randomUUID();
}
🔔 The 'uuid' type is 16 bytes and it offers the most performant way for storing unique identifiers in PostgreSQL.
🔔 PostgreSQL also displays UUIDs in a human-readable way, which is practical for copy-paste, for example!
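The 16 bytes in question are simply the two 64-bit halves of the UUID. The persistence layer normally handles this conversion for you; the hypothetical UuidBytes helper below only illustrates the layout, assuming big-endian order with the most significant bits first.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

final class UuidBytes {

    /** Packs the UUID into 16 bytes, most significant bits first. */
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    /** Rebuilds the UUID from exactly 16 bytes. */
    static UUID fromBytes(byte[] bytes) {
        if (bytes.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + bytes.length);
        }
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        long mostSignificant = buffer.getLong();
        long leastSignificant = buffer.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        byte[] raw = toBytes(id);
        System.out.println(raw.length + " bytes vs " + id.toString().length() + " characters");
        System.out.println(id.equals(fromBytes(raw)));
    }
}
```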
How To Serialize It In Oracle with JPA?
@Data
@Entity
public class Item {
    @Id
    private UUID id = UUID.randomUUID();
}
🔔 The 'RAW' type takes 16 bytes here and offers the most performant way of storing unique identifiers in Oracle (at the time of writing).
🔔 Unfortunately the
How To Serialize It In MySQL with JPA?
@Data
@Entity
public class Item {
    @Id
    @Column(length = 16)
    private UUID id = UUID.randomUUID();
}
🔔 The 'binary' type takes 16 bytes here and offers the most performant way of storing unique identifiers in MySQL (at the time of writing).
🔔 You need to set the @Column length explicitly in JPA for MySQL. Without it, the column size would default to 255 and each value would be padded with many trailing zeros!
🔔 Unfortunately the
How To Serialize It In Oracle And MySQL In Human-readable Format?
If it is not essential for you to have maximum possible database performance, you could serialize UUIDs in a human-readable format and get the advantage of being able to copy/paste or export them with ease.
@Data
@Entity
public class Item {
    @Id
    @Type(type = "uuid-char")
    @Column(length = 36)
    private UUID id = UUID.randomUUID();
}
Summary
- UUID v4 is practically collision-free: do not be afraid of using it everywhere!
- Use Avro 1.9.0+ Schema for your Kafka Events to take advantage of UUID!
- Use UUID instead of String in your Java code for better code robustness!
- Enjoy the good support for UUID serialization in PostgreSQL!
- Choose wisely between performance and human-readability for UUID serialization in Oracle and MySQL!
🔔 The source code is available here: