How can a fake UUID be used to prevent duplicate data?

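When events are written to storage, the data sometimes has to be re-imported. To keep the same records from being imported more than once, each record can be given a fake UUID that is derived from its own fields, so the same record always gets the same ID.

This approach relies on one guarantee: within a single unit of time precision, a given user produces at most one log entry. The user can also be combined with the event type (or other fields) to form a different unique key.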
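The precondition above can be illustrated with a minimal sketch. It is not the implementation from this page: `import_events`, the in-memory `store` hash, and the millisecond precision are all assumptions chosen for illustration. The point is only that a key built from the truncated time, the user, and the event type makes a second import of the same batch a no-op.

```ruby
# Hypothetical importer: `store` stands in for a table with a unique ID column.
def import_events(store, rows, precision: 1000)
  rows.each do |row|
    # Truncate the timestamp to the chosen precision (ticks per second).
    tick = (row[:time].to_r * precision).floor
    key = [tick, row[:user], row[:event]]
    store[key] ||= row # a repeated key is ignored instead of duplicated
  end
  store
end

# Two events at the same instant from two different users (made-up data).
rows = [
  { time: Time.at(1_600_243_536, 545_000), user: 123, event: 92 },
  { time: Time.at(1_600_243_536, 545_000), user: 124, event: 92 },
]

store = {}
import_events(store, rows)
import_events(store, rows) # re-import the same batch
puts store.size
```

If two distinct events from the same user could fall into the same tick, the second one would be dropped, which is why the guarantee matters.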

The implementation is as follows:

Ruby

Each identifier must fit within a fixed bit width; a value that is too wide is masked and loses its high bits.

require "time"

# Build a custom UUID from the time, event type, realm, server and user.
# Field widths (128 bits in total):
#   event_time  60 bit  (time_low, time_mid and the low 12 bits of time_hi_and_version)
#   version      2 bit  (always 0b01; the top two bits of that nibble carry realm bits 12-13)
#   event_sign  12 bit
#   realm       14 bit
#   server      12 bit
#   user        28 bit
def fakeUUID(event_time, event_sign, realm, server, user)
  # Whole seconds as 100 ns ticks since the UUID epoch (1582-10-15), plus the
  # sub-second part in 0.1 ms units.
  ts = event_time.to_i * 10_000_000 + event_time.nsec / 100_000 + 0x01B2_1DD2_1381_4000

  time_low = ts & 0xFFFFFFFF # 32 bit
  time_mid = (ts >> 32) & 0xFFFF # 16 bit
  time_hi_and_version = (ts >> 48) & 0x0FFF # 12 bit
  time_hi_and_version |= 0x1000 # version 1
  time_hi_and_version |= (realm & 0x3000) << 2 # realm bits 12-13 go above the version bits

  event_sign &= 0xFFF # 12 bit
  realm &= 0xFFF # low 12 bits
  server &= 0xFFF # 12 bit
  user &= 0xFFFFFFF # 28 bit

  ary = [
    time_low, time_mid, time_hi_and_version,
    (event_sign << 4) | ((realm >> 8) & 0xF), # 3 hex digits of event_sign + 1 of realm
    ((realm & 0xFF) << 8) | ((server >> 4) & 0xFF), # 2 of realm + 2 of server
    ((server & 0xF) << 28) | user, # 1 of server + 7 of user
  ]

  "%08x-%04x-%04x-%04x-%04x%08x" % ary
end

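# Decode a UUID produced by fakeUUID back into its fields.
def fakeUUIDInfo(uuid)
  segs = uuid.split("-").map { |e| e.to_i(16) }
  ts = segs[0] + (segs[1] << 32) + ((segs[2] & 0x0FFF) << 48) - 0x01B2_1DD2_1381_4000
  # fakeUUID stores whole seconds as 100 ns ticks plus a sub-second part in
  # 0.1 ms units, so split with integer arithmetic instead of float division.
  seconds, frac = ts.divmod(10_000_000)
  event_time = Time.at(seconds, frac * 100) # second argument is in microseconds
  {
    "event_time" => event_time.iso8601(3),
    "event_sign" => segs[3] >> 4,
    "realm" => ((segs[2] & 0xC000) >> 2) | ((segs[3] & 0xF) << 8) | ((segs[4] >> 40) & 0xFF),
    "server" => (segs[4] >> 28) & 0xFFF,
    "user" => segs[4] & 0xFFFFFFF,
  }
end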


if __FILE__ == $0
  require "test/unit"

  class UtilsTest < Test::Unit::TestCase
    def test_uuid
      time_str = "2020-09-16T08:05:36.545Z"
      event_time = Time.parse(time_str).utc.localtime("+08:00")
      uuid = fakeUUID(event_time, 92, 14621, 1, 123)
      puts uuid, time_str

      # Round trip: every field should decode to the value that was encoded.
      info = fakeUUIDInfo(uuid)
      info.each { |k, v| puts "%s: %s" % [k, v] }
      assert_equal(event_time, Time.parse(info["event_time"]))
      assert_equal([92, 14621, 1, 123], info.values_at("event_sign", "realm", "server", "user"))

      fakeUUIDInfo("76eaa97c-d5ce-11eb-00a6-9f07e0003570").each do |k, v|
        puts "%s: %s" % [k, v]
      end
    end
  end
end
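
The listing masks each identifier to its field width, so an out-of-range value silently collides with a smaller one instead of failing. One way to guard against that is to validate the inputs before generating the UUID. The sketch below is an assumption, not part of the original code: `FIELD_BITS` and `check_field_widths` are hypothetical names, and the widths are read off the masks and shifts in `fakeUUID`.

```ruby
# Bit widths implied by the masks and shifts in fakeUUID.
FIELD_BITS = { event_sign: 12, realm: 14, server: 12, user: 28 }.freeze

# Hypothetical helper: raise instead of letting the masks truncate silently.
def check_field_widths(fields)
  fields.each do |name, value|
    bits = FIELD_BITS.fetch(name)
    unless value.is_a?(Integer) && value >= 0 && value < (1 << bits)
      raise ArgumentError, "#{name}=#{value.inspect} does not fit in #{bits} bits"
    end
  end
  fields
end

# These values all fit, so the call returns normally.
check_field_widths(event_sign: 92, realm: 14621, server: 1, user: 123)

begin
  # server needs 13 bits here, one more than the field allows.
  check_field_widths(event_sign: 92, realm: 14621, server: 4096, user: 123)
rescue ArgumentError => e
  puts e.message
end
```

Calling such a check at import time turns a silent ID collision into an explicit error, at the cost of rejecting records whose identifiers have outgrown the layout.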