Atomic UP test results.

using test-time-probe2.ko

Clock speed : cpu MHz : 3000.077

Tracing inactive

[ 125.787229] test init
[ 125.787303] test results : time per probe
[ 125.787306] number of loops : 20000
[ 125.787309] total time : 204413
[ 125.787312] test end
[ 175.660402] test init
[ 175.660475] test results : time per probe
[ 175.660479] number of loops : 20000
[ 175.660482] total time : 203468
[ 175.660484] test end
[ 179.337362] test init
[ 179.337436] test results : time per probe
[ 179.337440] number of loops : 20000
[ 179.337443] total time : 204757
[ 179.337446] test end

Res : 10.21 cycles per loop

Atomic UP, one trace, flight recorder.

[ 357.983971] test init
[ 357.988837] test results : time per probe
[ 357.988843] number of loops : 20000
[ 357.988846] total time : 12349013
[ 357.988849] test end
[ 358.718896] test init
[ 358.723049] test results : time per probe
[ 358.723053] number of loops : 20000
[ 358.723057] total time : 12332497
[ 358.723059] test end
[ 359.422038] test init
[ 359.426173] test results : time per probe
[ 359.426179] number of loops : 20000
[ 359.426182] total time : 12332535
[ 359.426185] test end

Res : 616.90 cycles per loop
205.63 ns per loop
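
The "Res" figures can be reproduced from the log lines. The sketch below
assumes that "total time" is a cycle count and that "Res" is the mean of the
three runs divided by the number of loops (both assumptions match the numbers
above, but neither is stated in the notes). The function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Mean cost of one probe, in cycles, over several runs.
 * total_cycles[i] is the "total time" line of run i (assumed to be cycles),
 * loops is the "number of loops" line.
 */
double cycles_per_loop(const unsigned long long *total_cycles, size_t runs,
                       unsigned long loops)
{
    unsigned long long sum = 0;
    size_t i;

    assert(runs > 0 && loops > 0);
    for (i = 0; i < runs; i++)
        sum += total_cycles[i];
    return (double)sum / (double)runs / (double)loops;
}

/*
 * Convert cycles to nanoseconds for a clock given in MHz.
 * 1 MHz is one cycle per microsecond, so ns = cycles * 1000 / MHz.
 */
double cycles_to_ns(double cycles, double cpu_mhz)
{
    assert(cpu_mhz > 0.0);
    return cycles * 1000.0 / cpu_mhz;
}
```

With the three flight recorder totals, 20000 loops and 3000.077 MHz, this
gives about 616.90 cycles and 205.63 ns per loop.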

Atomic SMP, one trace, flight recorder.

[ 111.694180] test init
[ 111.700191] test results : time per probe
[ 111.700198] number of loops : 20000
[ 111.700201] total time : 16925670
[ 111.700204] test end
[ 112.285716] test init
[ 112.291321] test results : time per probe
[ 112.291326] number of loops : 20000
[ 112.291329] total time : 16766633
[ 112.291332] test end
[ 112.880602] test init
[ 112.884739] test results : time per probe
[ 112.884743] number of loops : 20000
[ 112.884746] total time : 12358237
[ 112.884748] test end

Res : 767.51 cycles per loop
255.83 ns per loop

(255.83-205.63)/255.83 * 100% = 19.62 %
Atomic UP takes 19.62 % less time per loop than atomic SMP.


Difference between
cmpxchg 2967855/20000 = 148.39 cycles or 49.46 ns
cmpxchg-up 540577/20000 = 27.03 cycles or 9.01 ns
irq save/restore 12636562/20000 = 631.83 cycles or 210.60 ns



* Memory ordering

offset
written by local CPU
read by local CPU and other CPUs (reader)

commit count
written by local CPU
read by local CPU and other CPUs (reader)

consumed
written by any CPU
read by any CPU

data
written by local CPU
read by any CPU


test done in the reader :
if ( consumed < offset )
  if ( subbuf.commit_count == multiple of SUBBUFSIZE )
    read data
    inc consumed
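
The reader test can be sketched in C as a predicate. This is a literal
transcription of the pseudocode above, not the LTTng implementation: the
struct layout, the SUBBUFSIZE value and the function name are hypothetical.
The notes do not say by how much consumed is incremented, so that step is
left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SUBBUFSIZE 4096UL /* hypothetical sub-buffer size */

/* Hypothetical per-sub-buffer state; only the field the test needs. */
struct subbuf_state {
    unsigned long commit_count;
};

/*
 * Returns true when the reader may read the sub-buffer:
 * something was reserved past the consumed position, and the commit
 * count is a multiple of SUBBUFSIZE (every reserved byte was committed).
 * Note that, as in the pseudocode, a commit count of 0 is also a multiple.
 */
bool subbuf_ready(unsigned long consumed, unsigned long offset,
                  const struct subbuf_state *subbuf)
{
    assert(subbuf != NULL);
    if (consumed >= offset)
        return false;
    return subbuf->commit_count % SUBBUFSIZE == 0;
}
```

A caller would read the data and then increment consumed only when
subbuf_ready() returns true.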


We must guarantee the following ordering :
* offset
Seen from the local CPU :
offset must always be incremented before the data is written (already
consistent)

Seen from other cpus :
offset and data can be seen written out of order. Because offset is only ever
incremented, in the out of order case another CPU sees an offset lower than the
data actually written. This is harmless : the commit_count _has_ to be
incremented before the data is read, and that increment is preceded by a store
fence.

* commit_count
commit_count must always be seen by other CPUs after the data has been written.
Therefore, we must put a store fence (smp_wmb) before the commit_count write.
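
The commit_count ordering can be sketched in user space with C11 atomics.
This is only an analogue, not kernel code : a release fence stands in for
smp_wmb (it is somewhat stronger than a pure store fence), and the acquire
fence on the reader side is an assumption added here, since the notes only
describe the writer side. All names and the SUBBUFSIZE value are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SUBBUFSIZE 64UL /* hypothetical, kept small for the sketch */

struct subbuf {
    char data[SUBBUFSIZE];
    atomic_ulong commit_count;
};

/* Writer: data first, then the fence, then the commit_count update. */
void writer_commit(struct subbuf *sb, size_t off, const char *src, size_t len)
{
    assert(off + len <= SUBBUFSIZE);
    memcpy(sb->data + off, src, len);
    /* Stand-in for smp_wmb: data stores become visible before the count. */
    atomic_thread_fence(memory_order_release);
    atomic_fetch_add_explicit(&sb->commit_count, len, memory_order_relaxed);
}

/* Reader: check the count, fence, and only then touch the data. */
bool reader_try_read(struct subbuf *sb, char *dst)
{
    unsigned long cc =
        atomic_load_explicit(&sb->commit_count, memory_order_relaxed);

    /* The cc == 0 check is an addition of this sketch (nothing written). */
    if (cc == 0 || cc % SUBBUFSIZE != 0)
        return false;
    /* Assumed reader-side pairing for the writer's fence. */
    atomic_thread_fence(memory_order_acquire);
    memcpy(dst, sb->data, SUBBUFSIZE);
    return true;
}
```

If the reader observes a full commit count, the fence pairing guarantees that
it also observes the data written before that count was updated.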

* consumed
Rarely updated, so it is updated with a LOCK prefixed instruction, which also
acts as a full memory barrier.



Mathieu Desnoyers, November 2006