- Not used by the client software (zip)
- GUI RPC: add get_message_seqno() RPC. fixes #931
- client: error if a <file_info> in app_info.xml has any URLs
- client: don't write file_infos with no URLs to client_state.xml for anon platform project; they must be from app_info.xml
- client: restored code for project-wide backoff on file uploads and downloads. I originally added this on 30 Sept 2005 and disabled it 2 weeks later because there were reports of problems. However, we need this functionality (e.g. on GPU hosts with hundreds of files to upload, we need to back off after a few failures, not try all of them). I added messages (<file_xfer_debug>) so you can see what's going on. Fixes #932.
- client: if malloc fails in MFILE writes, exit. We don't check the return values of printf() anywhere, and it's dangerous for the client to continue if it thinks something got written that didn't. Fixes #281
- client: code cleanup for project-level file xfer backoff
- client/manager/GUI RPC: show project-level backoffs
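To illustrate the idea behind the project-level backoff entries above, here is a minimal sketch of one shared backoff state per project, so that a few failed transfers delay all of that project's remaining files instead of each file being tried in turn. The class name, the exponential doubling, and the `min_delay`/`max_delay` values are assumptions for illustration; these notes do not say how the client computes the delay.

```python
class ProjectXferBackoff:
    """Hypothetical per-project backoff shared by all of a project's transfers."""

    def __init__(self, min_delay=60.0, max_delay=4 * 3600.0):
        # Both limits are assumed values, in seconds.
        self.min_delay = min_delay
        self.max_delay = max_delay
        self.failures = 0
        self.next_ok_time = 0.0

    def on_failure(self, now):
        """Record a failed transfer; return the delay now in force."""
        self.failures += 1
        # Assumed policy: double the delay on each consecutive failure, up to a cap.
        delay = min(self.max_delay, self.min_delay * 2 ** (self.failures - 1))
        self.next_ok_time = now + delay
        return delay

    def on_success(self):
        """Any successful transfer clears the project's backoff."""
        self.failures = 0
        self.next_ok_time = 0.0

    def ok_to_start(self, now):
        """True if a new transfer for this project may start at time `now`."""
        return now >= self.next_ok_time
```

With this shape, a host with hundreds of pending uploads checks `ok_to_start()` once per project before starting any transfer, rather than discovering the failure separately for every file.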
- client: changed file upload logic
Old: each upload attempt consists of two HTTP requests: the first gets the current file size on the server, the second uploads the remainder of the file.
Problem: a) if the upload server is overloaded and requests are succeeding with probability X, then the chance of both requests succeeding is X^2. So e.g. a per-request success rate of 0.1 becomes an overall success rate of 0.01. b) the "get file size" request can be avoided in some cases.
New: if we've already queried the file size and haven't uploaded any additional bytes, don't query the file size again.
- client: if file < 8KB, upload it in its entirety and skip size check
- client: (refinement to previous checkin) don't skip file size check if file has multiple upload URLs. We might have uploaded different amounts on different servers.
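The three upload entries above can be read as a single decision about whether to send the "get file size" request. The sketch below is one such reading, not the client's actual code: the function and argument names are hypothetical, 8KB is taken to mean 8 * 1024 bytes, and it assumes the multiple-URL refinement overrides both shortcuts. The second function reproduces the success-rate arithmetic for an attempt made of several independent requests.

```python
SMALL_FILE_BYTES = 8 * 1024  # assumed meaning of "8KB"


def needs_size_query(file_size, n_upload_urls, size_known, bytes_sent_since_query):
    """Decide whether to ask the server for the current file size before uploading."""
    if n_upload_urls > 1:
        # Different servers may hold different amounts of the file, so always ask.
        return True
    if file_size < SMALL_FILE_BYTES:
        # Small file: upload it in its entirety, no size check.
        return False
    if size_known and bytes_sent_since_query == 0:
        # Nothing has changed since the last query, so the earlier answer still holds.
        return False
    return True


def attempt_success_probability(per_request_rate, n_requests):
    """Chance that all requests in one upload attempt succeed, assuming independence."""
    return per_request_rate ** n_requests
```

For example, `attempt_success_probability(0.1, 2)` is about 0.01, while skipping the size query leaves a single request and a rate of 0.1.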
- client: change the way a resource's "estimated delay" (passed to server for crude deadline check) is computed.
Old: estimated delay is the interval for which the resource is fully used (i.e., all instances busy).
Problem: this may cause unnecessary project starvation. Example: a 1-CPU machine has a month-long CPDN job with a 1-year deadline (it's not in deadline trouble). Then the CPU estimated delay will be 1 month, and the client won't get any work from projects with deadlines shorter than 1 month.
New: estimated delay is the latest time at which the resource is fully used and is being used by at least 1 job that is projected to miss its deadline under RR.
Note: this isn't precise, but I don't think we can improve it much without getting a lot more complex.
- client: 2nd try on my last checkin. We need to estimate 2 different delays for each resource type:
1) "saturated time": the time the resource will be fully utilized (new name for the old "estimated delay"). This is used to compute work requests.
2) "busy time": the time a new job would have to wait to start using this resource. This is passed to the scheduler and used for a crude deadline check.
Note: this is ill-defined; a single number doesn't suffice. But as a very rough estimate, I'll use the sum of (J.duration * J.ninstances)/ninstances over all jobs that miss their deadline under RR sim.
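The rough "busy time" estimate described above can be written out as a short sketch. The `Job` record and its field names are hypothetical stand-ins for whatever the RR simulation produces; only the formula itself, the sum of (J.duration * J.ninstances)/ninstances over jobs that miss their deadline, comes from the note.

```python
from dataclasses import dataclass


@dataclass
class Job:
    duration: float        # remaining run time, in seconds (assumed unit)
    ninstances: float      # instances of the resource this job uses
    misses_deadline: bool  # result of the RR simulation for this job


def busy_time(jobs, ninstances):
    """Rough wait before a new job could start using a resource with `ninstances` instances."""
    if ninstances <= 0:
        raise ValueError("resource must have at least one instance")
    # Only jobs projected to miss their deadline contribute to the estimate.
    total = sum(j.duration * j.ninstances for j in jobs if j.misses_deadline)
    return total / ninstances
```

In the CPDN example from the previous entry, the month-long job is not projected to miss its deadline, so it contributes nothing and the busy time for that CPU is 0.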
- Quick Updates
- lib: gcc 4.4 fix; fixes #854