In this article we are going to talk about designing a unique ID generator for distributed systems.
Unique ID Generator we are going to design will have following characteristis:
- IDs will only consist numerical values
- IDs must fit in 64-bit
- IDs should be ordered by date
- System should be able to generate 10k IDs/second
- Multi-master replication
- UUID (Universally Unique Identifier)
- Ticket Server
- Twitter snowflake approach
![](https://ankit-rana.com/wp-content/uploads/2023/02/1_MzAEsrReluu55EH3cYHpKw.jpeg)
Multi-master replication method
UUID (Universally Unique Identifier)
Ticket Server
![](https://ankit-rana.com/wp-content/uploads/2023/02/133216916-d3379b10-3fc5-4a90-b023-cc62654e2727-1024x393.png)
Twitter’s Snowflake approach
Twitter’s unique ID generation system is known as snowflake and it meets our unique ID generator’s requirements. Istead of generating ID directly, ID is divided into different sections.
![](https://ankit-rana.com/wp-content/uploads/2023/02/1_-C_nBTbEZaxcvxTr2cQ_nQ-1024x311.png)
Timestamp: Milliseconds since the epoch or custom epoch.
Worker ID: we can use the unique worker ID assigned to each computer for distinction. The number of bits is determined by the maximum number of workers in the cluster.
Thread ID: To maximize throughput, multi-threading is used to leverage the parallelism offered by modern multi-core CPU machines. Why not process ID? Inter-process communication is more expensive, we want to put all generated ID into a shared in-memory buffer that can be accessed by the RPC thread.
Local counter: even on a 20-year-old computer, a single thread can generate two timestamps in one millisecond. Hence, we need a local counter that further distinguishes two UIDs. When the counter is full, the thread goes to sleep throughout the current millisecond.
![](https://ankit-rana.com/wp-content/uploads/2023/02/Untitled-1.png)
For our use case we can also use the above mentioned design which consist of following:
- Sign bit: Will store either 0 or 1 1-bit
- Timestamp: Milliseconds 41-bits
- Data center Id: ID or the data center. 5-bits i.e. 2^5 = 32 datacenters
- MachineId : It of the machine 5-bits i.e 2^5 = 32 machines per datacenter
- Sequence Number: For every ID generated on machine/process, the sequence number is incremented by 1. The number reset to 0 every millisecond.