root/doc/devel_notes

Revision 1639:fdd2f3e64e41, 16.7 kB (checked in by klai@…, 6 months ago)

Benchmark xml parsers

Line 
11) Make a new bank account:
2
3tycoon -o UserName=bank_admin bank create_account l.zhang@hp.com 1000 a __Bank
4
52) sync for debugging:
6debugsync machine_name
7
8*) Check all input from network.
9
10*) Auctioneer handles XMLRPC, argument checking, transferring of money.
11
12*) AuctioneerAllocator handles management of resources
13
14*) VMM handles specifics of a virtualization technology
15
16*) Use assertions to check things that should never happen
17
18*) Throw exceptions for things that could happen, but shouldn't, e.g.,
19   erroneous input from command line or network
20
21*) Don't use multiple inheritance.
22
23*) Avoid inheritance where convenient.
24
25*) Test all code using automated test harnesses.
26
27
282/21 Should minimum and contingency funds be kept separate?
29     a) Separate pools: Can have very strong assurances to meet minimums
30     b) Single pool: Can spend savings to get better expected performance
31
32   Separate pools are better
33
342/21 Should have separate or shared contingency fund?
35     Shared is better to spread risk, but disk is special case because of
36     importance of persistent data and ability to shutdown use of active
37     resources
38
39     Shared, with pool bid over 1 time interval according to weights
40
412/24 What should the maximum memory of Xen VMs be?
42  * Bid's maximum memory: too large. Other VMs may prevent this from
43  being possible 
44  * Actual available capacity based on bid, etc.
45    * need to write config file before boot.
46
474/24 Magnitude of bids must be close to that of preferences and each
48other.
49
505/2 Auctioneer API: return values are a dictionary. An "error" element
51indicates an error with the value being a human readable
52message indicating the type of the error. Additional machine readable
53data can be in "error_value".
54
555/2 Should probably avoid the use of exceptions in favor of explicit
56return values. exceptions should be for unexpected situations, not
57expected ones.
58
596/6 Error handling is significantly complicated by using explicit
60return values instead of Exceptions. Explicit return values have to be
61processed all the way up the call stack even if only the top level
62handles them. Exceptions only have to handled at the top level. The
63new policy is to relax the old policy and use exceptions to signal
64error conditions while still using return values to pass the expected
65value of a function.
66
67The actual exception used should be separate from the standard system
68exceptions to distinguish anticipated error conditions from
69unanticipated error conditions. Also, the name of exception should be
70different for each module.
71
725/7 Resources have a name and type. A name specifies a specific
73resource while a type specifies a class of resources.
74
755/24 Sample contingency bid:
76
77python tycoon host bid 67.164.68.18 0 "{'CPU': (1, 0.0, 1.0, 0.0)}"
78
79"{('TCPv4Port', 30000): (1, 1.0, 1.0, 0.0)}"
80
816/9 There are several disk allocation quantities:
82    bid_min and bid_max: specified by the account's bid
83    usage: how much is actually being used by the account. Includes
84           the file system size and all other files
85    allocation: amount that the allocator gives to the account
86
876/9 The relative values of these quantities trigger different events:
88
89    allocation < bid_min: delete the account.
90      This is the definition of bid_min
91 
92    allocation < usage: try to shrink file system, otherwise delete
93      Ideally, the auctioneer would call an application specific
94      handler to delete files.
95
966/13 Root must have a contingency ratio .9999999 instead of 1-epsilon
97so that resource prices are not zero. Zero prices causes
98discontinuities in allocation like allocating more than 1.0
99resources. This has been a problem for few users or when users all use
100CR of 1.0 for a resource.
101
1026/20 What happens when the expiration time has arrived, but there are
103still funds remaining? Here are the choices:
104  1) Delete account and refund remaining funds. Harsh.
105  2) Shutdown and reset with long interval. Much kinder. Account may
106  still be garbage collected if outbid.
107Decided to do 2), but not sure if this was the right decision.
108
1096/27 Testing strategy: high level to low level. Group tests by
110area. Each test should be completely independent. This takes more
111time, but is easier to diagnose. Tests should verify results.
112
1136/27 Problems with asynchronous error handling:
114  1) have to handle both exceptions and Failures because an exception
115  may occur before the first "yield"
116  2) have to handle failures at every level because twisted error loop
117  doesn't like failures propagated more than one level. May be able to
118  fix by fixing flow code.
119
1206/27 Users are prevented from setting bids which cause their own
121accounts to be deleted or shutdown. This includes changing of the
122interval, refunding of currency, and asking for high minimums.
123
1246/27 What happens if a setquota lowers the quota below the current
125allocation? Users are marked in repquota and a grace period counts
126down, but nothing happens when it expires.
127
1289/20 command line return value = number of errors
129
13010/24 Contingency Accounting: as per previous notes, the contingency
131fund should be separate from the expected value fund, and a single
132contingency fund should be shared by all resources. This provides
133maximum predictability in resource allocation, but could allow a
134malicious user to bid up one resource and drain the contingency fund,
135causing unpredictability in another resource. This raises the related
136question of how much the weighted resources bids with shared funding
137pool model makes sense.
138
139The question here is how the contingency fund should be managed. The
140issues are idempotency of bids and simplicity of bidding. Some
141possible solutions for when to reset the contingency fund:
142
1431) with every bid. Simple to understand, difficult to manage.
144
1452) with bids that change the duration or contingency ratio. Somewhat
146   difficult to understand, but allows for simpler idempotency.
147
1483) with funding operations. Too complex; introduces a third kind of
149   bid operation (others are bid and fund).
150
15111/14 Error handling revisited 3: there are several kinds of rpc error
152mechanisms without a clear policy on how they should be used:
153
1541) a TyException passed in the "result" field of the result.
1552) a Fault that is raised
1563) an unexpected Exception that occurs
157
158#2 and #3 should be retired because they cannot be authenticated
159
16012/18 Bid semantics are confusing because of the ability to use
161default values and the precedence of different values. Here are the
162rules:
163
1641) Any value can be "None". This means that the existing value is used.
1652) However, the min_resource and min_resource_amount cannot both be
166specified (the same goes for max_resource and max_resource_amount), so
167when one is specified, it takes precedence over the earlier value.
168
1691/17 Long standing auctioneer bugs:
170     1) balance and contingency balance need to take effect on
171     immediate account_and_allocate. They were waiting for next
172     accounting cycle
173     2) balance, contingency_balance, and allocation need to be
174     undone on failed bid, fund, and refund
175     3) allocation needs to be undone on failed create_account
176
177     Solutions:
178     *) For 2 and 3: rerun allocation cycle after undoing operation
179     *) For 1: Save balance, contingency balance information, restore
180     on failure. This opens a window when the balance is altered and
181     an RPC call could alter it. Need to preclude race conditions
182     anyway.
183
1841/17 Revisiting the question of when the contingency fund should be
185recalculated, the previous solution of only doing it when duration or
186contingency ratios change seems to allow incorrect values when the
187weights change. As a result, we should recompute the contingency fund
188with every bid change. Also, the contingency accounting counters
189should be reset. As a special case, if the bid does not change at all,
190we will keep all contingency information unchanged. This is useful if
191the client sends multiple identical bids to increase reliability.
192
1931/25 Account and allocate problem: solutions to simplify handling of
194failed bids:
195    1) Make all state ephemeral and write it back after checking for
196       errors
197    2) Run another allocation pass after undoing error
198    3) Hybrid: commit accounting immediately, make allocation ephmeral,
199       only re-run allocation on error, only commit allocation late.
200
2012/7 Should a failure to get a unique resource cause a
202bid/creation/fund/refund to fail? No, putting in a failed bid for
203unique resources is useful to raise the price of that resource.
204
2052/9 Resource model: There should be a more unified model of resources
206that encompasses both non-unique and unique resources. This should make
207it easier to think about and write code. Here is a new model:
208
209Resource Type: memory, IPv4Address, etc.
210{ identifier:
211  { 'capacity': c, 'price': p, 'time': t, 'price_statistics': ps,
212    'misc': m}
213  ...
214}
215
216identifier can be any scalar or a tuple (l, r) indicating a range.
217
218The issues:
219
2201) When to combine identifiers into a range?
221
2222) When to split them?
223
2243) How to express preferences?
225
226Possibilities:
227capacity                id              example
228min, max                any one         disk
2291.0, 1.0                specific one    port 80
2301.0, 1.0                any one         any IP address
231min, max                all             
232
233Use the existing interface and defer this issue.
234
2354) How to allocate based on preferences?
236
237Use the existing allocation algorithm and defer this issue.
238
2395/2/2007 Unique resource states:
240
2411) Allocated. These have individual entries in the status data
242   structure for keeping price statistics
2432) Recently Unallocated. These are the same as the allocated.
2443) Unallocated and old. These are coalesced with adjacent instances
245   and do not have price statistics.
246
2475/23/2007 Unique resource pricing
248
249The current second price auction for unique resources has problems:
250
2511) It is difficult to gauge demand. The price is zero most of the
252   time and only surges when someone else tries to acquire the
253   resource. In many cases, the other person will not even try to
254   acquire it because he can see that he will not meet the highest
255   bid. The inability to gauge demand prevents users from making
256   rational bids for the resource.
257
2582) A users who tries to outbid another user may not pay anything for
259   the attempt. This will happen if there are only two users. Bidding
260   should cost something to limit the damage that a malicious user can
261   cause. In this case, the damage is denying a user the use of a
262   unique resource.
263
2643) Paying zero most of the time for unique resources is unfair to the
265   provider. 
266
267Solutions:
268
2691a) Use the top bid to record current and historical prices. This will
270   give a better sense of demand. Unfortunately, this also allows an
271   attacker to easily outbid the top bidder. Publishing the top price
272   removes the sealed bid property.
273
2741b) The top bidder pays the maximum of his minimum bid, the second
275    highest bid, and the provider's minimum bid. This is the reported
276    price. Users who want to advertise a high bid must pay for that
277    privilege. The problem with this is that it is unclear what a
278    late-arriving bidder should do since he does not know what bid
279    will be required to outbid the existing bidder. He will probably
280    bid when his valuation is much greater than the advertised price.
281
2822) Failed bids always pay their bid instead of zero. This will limit
283   the damage that malicious attackers can do without overly charging
284   legitimate users.
285
2865/31/2007
287
288Bank bid:
289
290tycoon -u bank host bid tycoon-ext-02.hpl.hp.com 3y \
291  memory:1,256MiB,1GiB,10000.9 CPU:1,0.10,1,10000.9 \
292  disk:1,100GB,100GB,10000 NetworkInterfaceIncoming:1,0.5,1.0,10000.9 \
293  NetworkInterfaceOutgoing:1,0.5,1.0,10000.9 \
294  IPv4Address,204.123.34.92:1,10000
295
296sls bid:
297
298tycoon -u sls host bid tycoon-ext-03.hpl.hp.com 3y \
299  memory:1,256MiB,1GiB,10000.9 CPU:1,0.10,1,10000.9 \
300  NetworkInterfaceIncoming:1,0.5,1.0,10000.9 \
301  NetworkInterfaceOutgoing:1,0.5,1.0,10000.9 disk:1,1GiB,1GiB,10000 \
302  IPv4Address,204.123.34.101:1,10000
303
3046/15/2007
305
306Replay defense system: replay prevention is important in any system
307that does not have real time bi-directional communication. Previous
308work has used three basic techniques for replay prevention: 1)
309timestamps, and 2) random nonces with history buffers, and 3)
310incrementing nonces.
311
312With timestamps, the sender includes a timestamp in a message and the
313source verifies that the timestamp is after an offset from its own
314time. The disadvantages are that the clocks must be relatively
315synchronized and timestamped messages can still be replayed within the
316offset time.
317
318With random nones and history buffers, the sender includes a
319randomly-generated identifier in a message and the receiver compares
320it to entries in a buffer of received nonces. If the identifier is not
321in the buffer, then it is accepted. Otherwise, it is rejected. The
322disadvantage is that it requires unbounded state.
323
324With incrementing nonces, the sender simply keeps a counter that it
325increments after every communication. The receiver compares this
326counter to its largest seen counter and verifies that the new counter
327is larger than any previously received counter. The disadvantage is
328that the sender and receiver's state must be synchronized, which is
329difficult when sender use multiple machines or the sender's machines
330fails.
331
332Our solution is to combine the best properties of timestamps and
333history buffers. The sender sends a timestamp with each message. The
334receiver keeps a buffer of recently seen timestamps. If the
335transmitted timestamp is after the oldest timestamp in the buffer and
336is not in the buffer, then it is not a replay. Otherwise, it is either
337too old, or a duplicate of timestamp in the buffer. Both are
338suspicious, so the timestamp is rejected. As with a timestamp-only
339approach, this technique uses a constant amount of state. As with the
340history buffer approach, this technique can detect duplicates that are
341within the clock skew horizon.
342
343In Tycoon, this system is used for RPCs and bank receipts.
344
3459/12/2007
346
347Service location service authentication. The SLS needs to authenticate
348clients to prevent clients from hijacking other clients' addresses.
349
350This is complicated by web proxies and NATing. For example, the
351simplest solution is to require that operations on an IP address be
352from that IP address. This works for normal hosts and NAT-ed hosts,
353but not for hosts behind web proxies. Another solution is to use the
354address contained within the registration message. This works for
355normal and proxied hosts, but NAT-ed hosts usually do not know their
356externally routable address. In addition, this solution allows clients
357to "squat" on the addresses of others, preventing them from being
358registered by the real owner.
359
360There are variety of cases that the solution must handle:
361
362   A. Well-behaved, non-proxied, non-nat-ed clients.
363   B. Well-behaved, nat-ed clients. They can be contacted externally.
364   C. Well-behaved, proxied clients. They cannot be contacted externally.
365   D. Squatting clients.
366
367
368Here is a more robust (and complex) registration solution that allows
369NAT-ed and proxied clients to co-exist:
370
371   1. The client contacts the SLS with its self-known address and the
372      externally known address used at the network layer.
373
374   2. The SLS tries to connect back to both the self-known address and
375      externally known address to verify the hash.
376
377external              self-known
378conn, hash verified   conn, hash verified   if ex==self-known, accept ex
379conn, hash verified   conn, hash failed     accept ex, NAT w/external
380conn, hash verified   no conn               accept ex, NAT w/external
381conn, hash failed     conn, hash verified   accept self, proxy?
382conn, hash failed     conn, hash failed     reject
383conn, hash failed     no conn               accept self, NAT
384no conn               conn, hash verified   accept self, proxy?
385no conn               conn, hash failed     reject
386no conn               no conn               accept self, proxy
387
388In any case where a registration was accepted without callback hash
389verification, a new registration that passes callback hash
390verification supercedes it.
391
3924/1/2008 Migration
393
394Automatic VM migration is important for maintaining high reliability
395and reducing admin overhead. The basic design is for clients to
396request a migration.
397
3984/1/2008 Uploading a file system
399
400This is challenging because python's XML-RPC implementation is not
401designed to for streaming. More specifically, it assumes that all the
402call arguments can be held in memory. Since file system images are in
403the range of 300MB-2GB, this could cause paging and slow the entire
404system down. Some alternatives:
405
4061) Split the upload into multiple XML-RPC calls
407
4082) Do the upload as a separate scp operation. The process would be:
409   a) Create_account with a dummy file system name
410   b) Upload using scp, overwriting the dummy file system
411   c) Configure the file system
412
4135/30/2008 XML Parsing Benchmarks
4141.87 GHz laptop, 500,000 elements
415               Serialize Unserialize
416Unpatched         1.9878     10.8072
417cElemenTree       1.9327      7.0276
418lxml 2.0.5        1.9271      9.7098
419
420Local Variables:
421mode:text
422End:
Note: See TracBrowser for help on using the browser.