
new web: http://bdml.stanford.edu/pmwiki

RccBandwidth (10 Sep 2003, AlRizzi)

-- ShaiRevzen - 27 Aug 2003

Bus Bandwidth and Latency

This section concerns the bus bandwidth limitations and the factors that determine the required bandwidth in RiSE. It also discusses the communication latency between the remote modules and the on-board main CPU.

Recent Changes

Modification	Link(s)
ShaiRevzen - 27 Aug 2003 - 01	re. bandwidth limitation
HaldunKomsuoglu - 28 Aug 2003 - 01	rough signal list
UlucSaranli - 28 Aug 2003 - 01	latency issue
HaldunKomsuoglu - 29 Aug 2003 - 01	an architecture w/o latency
ShaiRevzen - 29 Aug 2003 - 01	designing away the latency and the arbiter
HaldunKomsuoglu - 29 Aug 2003 - 02	re. accessing remote nodes

ShaiRevzen (12 Aug 2003) - I2C rates are 100 kHz ("Standard"), 400 kHz ("Fast") or 3.4 MHz ("High-Speed"; values taken from the I2C 2.1 spec). The bus overhead for 7-bit addressing mode is 1 byte + conditioning, i.e. ~10 bits, plus one ack bit per byte. This means that for "short" messages, e.g. 2 bytes, the theoretical net bit rate is at most 16/(16+10+2)=0.57 of the nominal bit rate. Several Mbps are therefore completely out of the question even for Hs-mode devices (and most are only F-mode). For 1-byte messages the limit is 8/(8+10+1)=0.42 of the nominal rate.

- HaldunKomsuoglu (26 Aug 2003) - My initial design also foresaw a worst-case upper bound of 1 Mb/sec on the net data rate, which is much more conservative than the value cited above. It is my personal opinion that the basic sensors and actuators will not require that much bandwidth. This is admittedly a rather short-sighted comment to make, especially without a draft picture of what will be in the robot. However, one should note that in the physical implementation the data packets will most definitely be larger than a single byte, and hence the bus efficiency will increase.
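The arithmetic in the two comments above can be checked with a small sketch. It assumes the overhead model stated there (about 10 bits of address and conditioning per message, plus one ack bit per payload byte); the function names and the payload sizes in the loop are illustrative only, not part of any RiSE code.

```python
# Sketch of the I2C efficiency estimate from the discussion above.
# Assumption: ~10 overhead bits per message plus 1 ack bit per payload byte.

NOMINAL_RATES_HZ = {
    "Standard": 100_000,
    "Fast": 400_000,
    "High-Speed": 3_400_000,
}


def i2c_efficiency(payload_bytes, overhead_bits=10):
    """Fraction of bus bits that carry payload in one 7-bit-addressed message."""
    if payload_bytes < 1:
        raise ValueError("payload must be at least one byte")
    data_bits = 8 * payload_bytes
    ack_bits = payload_bytes  # one ack bit per payload byte
    return data_bits / (data_bits + overhead_bits + ack_bits)


def net_rate_bps(mode, payload_bytes):
    """Upper bound on the net data rate for a given mode and message size."""
    return NOMINAL_RATES_HZ[mode] * i2c_efficiency(payload_bytes)


if __name__ == "__main__":
    # Payload sizes here are hypothetical examples.
    for n in (1, 2, 8, 32):
        eff = i2c_efficiency(n)
        hs = net_rate_bps("High-Speed", n)
        print(f"{n:2d}-byte payload: efficiency {eff:.2f}, Hs-mode net {hs / 1e6:.2f} Mbps")
```

Under this model the efficiency grows with message size but can never exceed 8/9, since every payload byte still costs one ack bit.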

HaldunKomsuoglu (26 Aug 2003) - One solution for the limited bus bandwidth of I2C is to implement multiple parallel busses, which would effectively increase the overall bandwidth. Of course, this would require a bus arbiter and cabling for each new bus. Note that all arbiters will have the same design, and therefore implementing extra busses will add very little hardware design overhead. The cabling will not be a serious problem either, and might even get simpler in the multiple-bus case, where each bus serves a group of units that are physically close.

- ShaiRevzen (26 Aug 2003) - Here I again champion the cause of hiding the details of which device is on which physical bus somewhere deep inside our library's communication protocol implementation. That way no software need be modified if we decide to split a bus because of bandwidth limitation.
  - HaldunKomsuoglu (27 Aug 2003) - My comment above does not mention anything about the API. I was merely sketching an approach that would allow us to increase the effective communication bandwidth. Certainly, the communication API should hide all the details of the hardware. This is like the IDE controllers in PCs, which support multiple parallel physical busses behind a single OS API.

AlRizzi (27 Aug 2003) Based on my back-of-the-envelope calculations, I am going to guess that we are almost stuck with having multiple busses to meet our overall data rate needs -- I'll go so far as to propose that we think about either 3 or 4 busses (3 = left, right, body; 4 = front, mid, hind, body). The added complexity should be concentrated on the central CPU and should only consist of more "bus arbiters" and a bit more software. Whatever the solution, this should be completely hidden from the software.

- HaldunKomsuoglu (27 Aug 2003) - Agreed. In order to simplify the cabling these busses should be dispatched according to physical distribution of the remote nodes (as you suggest in your note) rather than functional groups. The communication layer API can present these modules in a uniform manner without any explicit reference to their communication hardware much like the device file system in UNIX which combines multiple IDE and SCSI and other communication channels in a standard environment.

- ShaiRevzen - 27 Aug 2003 - yup, this is my impression also; which is part of my reasoning for USB as the main bus. I was envisioning the robot as a USB interconnected set of identical remote nodes, each of which has enough spare I/O capability to allow graceful upgrades and enough local processing muscle to hide most of the work from the CPU. The nice thing with USB is that each of these remote nodes can be hooked up to a development PC and it will expose its devices as peripherals for the PC. I2C can be used within a remote node, e.g. to connect a tiny "foot MCU" which reads foot contact sensors locally. This architecture also allows us to use a powerful "off the shelf" main board.
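The point several comments above agree on, that software should address remote nodes by name and never by physical bus, can be illustrated with a minimal sketch. Everything here is hypothetical: the `Bus` stand-in simply echoes the payload, and `CommLayer`, `attach`, `exchange`, the node name and the address are invented for illustration rather than taken from any existing library.

```python
# Sketch: a communication layer that hides which physical bus a node is on.
# All class, method and node names are hypothetical.


class Bus:
    """Stand-in for one physical bus; a real one would drive hardware."""

    def __init__(self, name):
        self.name = name
        self.log = []  # record of (address, payload) transfers, for inspection

    def transfer(self, address, payload):
        self.log.append((address, payload))
        return payload  # echo the payload back, purely for illustration


class CommLayer:
    """Maps logical node names to (bus, address) pairs."""

    def __init__(self):
        self._routes = {}

    def attach(self, node, bus, address):
        # Re-attaching an existing node simply moves it to the new bus.
        self._routes[node] = (bus, address)

    def exchange(self, node, payload):
        bus, address = self._routes[node]
        return bus.transfer(address, payload)


def poll_leg(comm, node):
    """Application code: refers to the node by name only."""
    return comm.exchange(node, b"\x01")


if __name__ == "__main__":
    comm = CommLayer()
    single = Bus("main")
    comm.attach("front_left_leg", single, 0x10)
    poll_leg(comm, "front_left_leg")

    # Later the bus is split by body side; only the routing table changes.
    left = Bus("left")
    comm.attach("front_left_leg", left, 0x10)
    poll_leg(comm, "front_left_leg")
    print(single.log, left.log)
```

In this sketch `poll_leg` is identical before and after the bus split, which is the property the thread asks for. Whether the routing table lives in a library, a driver or the arbiter firmware is left open.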

HaldunKomsuoglu (28 Aug 2003) - At this point I believe a rough list of possible sources and sinks would be very useful in determining a back-of-the-envelope figure for the bandwidth requirements. I have just started a page to develop a list of conceivable signals, which can be found in CommSourceSinkList. (It has a similar structure allowing comments and additions.)

UlucSaranli (28 Aug 2003) - I would like to point out one important concern regarding the various central/distributed approaches that we have developed: the latency of data acquisition due to communication protocols between distributed entities.

- UlucSaranli (28 Aug 2003) - So far, data signals have been "immediately" available in higher levels of software due to periodic acquisition and buffering of all inputs from all sensors. This has been true for the encoders, gyros, accelerometers, analog inputs as well as PWM inputs. If, as part of the distributed approach, we choose a command/response paradigm rather than periodic bursts, the situation becomes a little more complicated: the higher level control entity may need to send a request for data and wait for its arrival. We either need to provide efficient mechanisms for addressing this (i.e. blocking calls of some sort, or "interrupts") or decide that we want to go with periodic acquisition of a selected set of signal sources (which could be configured on the fly), providing the "latest" sensor reading with timestamps.

- - HaldunKomsuoglu (29 Aug 2003) - Using a command/response paradigm in the communication protocol does not necessarily result in latency. Let me sketch out a structure to illustrate this. In my mind the bus arbiter acts as an "automatic data exchanger unit" that operates at a pre-specified update rate independent of the main CPU. It periodically sweeps through all the hardware modules (remote nodes) and performs data transactions with them. Each transaction consists of sending the command structure from the arbiter to the remote node and getting the sensory structure from the remote node into the arbiter. In this scheme I propose to have a "blackboard": a memory medium through which the command and sensory structures are exchanged between the bus arbiter and the main CPU. After each transaction the corresponding fields in the blackboard are updated by the arbiter. Hence, this mechanism results in periodic updates of all the sensory data in the blackboard. Furthermore, all the hardware modules get their respective commands periodically as well. As a result, all the sensory data stored in the blackboard memory will be "immediately" available to the software modules on the main CPU. Note that this asynchronous link between the arbiter and the main CPU is very similar to the way RhIO operates as well.
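One way to picture the blackboard scheme described above is the following sketch. It is only an illustration under assumptions: the `RemoteNode`, `Blackboard` and `sweep` names are hypothetical, the "sensor" is a counter, and a real arbiter would run on its own hardware at a fixed rate rather than being called from the same program.

```python
# Sketch of an arbiter sweep that exchanges command and sensory structures
# with every remote node and mirrors them in a shared blackboard.
# All names are hypothetical; the sensor reading is faked with a counter.
from dataclasses import dataclass, field


@dataclass
class RemoteNode:
    name: str
    reading: int = 0
    last_command: int = 0

    def transact(self, command):
        """One transaction: accept a command, return the current sensory data."""
        self.last_command = command
        self.reading += 1  # pretend the sensor value changed since last sweep
        return self.reading


@dataclass
class Blackboard:
    commands: dict = field(default_factory=dict)  # written by the main CPU
    sensors: dict = field(default_factory=dict)   # written by the arbiter


def sweep(nodes, board):
    """One pass of the arbiter over all remote nodes."""
    for node in nodes:
        command = board.commands.get(node.name, 0)
        board.sensors[node.name] = node.transact(command)


if __name__ == "__main__":
    nodes = [RemoteNode("hip_0"), RemoteNode("hip_1")]
    board = Blackboard()
    board.commands["hip_0"] = 42  # main CPU posts a command
    sweep(nodes, board)           # arbiter delivers it and fetches sensor data
    print(board.sensors, nodes[0].last_command)
```

The main CPU only ever touches `board`, so reads never wait on the bus. The cost is that each value can be up to one sweep period old.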

- - - HaldunKomsuoglu (29 Aug 2003) - There are probably several implicit assumptions in my above sketch. One that I can think of is the need to synchronize the sweep loop of the bus arbiter with the ModuleManager loop (or whatever equivalent we will have) to ensure that the blackboard update is well timed with the execution of the software module updates. However, I believe this can be achieved rather easily. The second one is the assumption that all the remote nodes can be swept in 1 msec (the update period of our choice). This one depends on the bandwidth of the communication system and the size of the data structures.
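Whether a full sweep fits in the 1 msec period can be estimated once node counts and structure sizes are known. The sketch below reuses the I2C overhead model quoted earlier on this page (about 10 overhead bits per transfer and one ack bit per byte) and assumes one write transfer plus one read transfer per node. The node count and structure sizes in the example are made-up placeholders, not figures from the signal list.

```python
# Sketch: does one arbiter sweep fit in the update period?
# Assumptions: I2C-like overhead (10 bits per transfer, 1 ack bit per byte),
# one command write and one sensory read per node. Example sizes are made up.


def transaction_bits(command_bytes, sensor_bytes, overhead_bits=10):
    """Bus bits for one node: one write transfer plus one read transfer."""
    payload_bytes = command_bytes + sensor_bytes
    return 2 * overhead_bits + payload_bytes * (8 + 1)


def sweep_time_s(num_nodes, command_bytes, sensor_bytes, bit_rate_hz):
    """Time to visit every node once on a single bus."""
    return num_nodes * transaction_bits(command_bytes, sensor_bytes) / bit_rate_hz


def fits_budget(num_nodes, command_bytes, sensor_bytes, bit_rate_hz, period_s=1e-3):
    return sweep_time_s(num_nodes, command_bytes, sensor_bytes, bit_rate_hz) <= period_s


if __name__ == "__main__":
    # Hypothetical load: 6 nodes, 4-byte commands, 8-byte sensory structures.
    for rate in (400_000, 3_400_000):
        t = sweep_time_s(6, 4, 8, rate)
        print(f"{rate} Hz: {t * 1e3:.3f} ms, fits 1 ms: {fits_budget(6, 4, 8, rate)}")
```

With these placeholder numbers a single F-mode bus misses the 1 msec budget while an Hs-mode bus meets it, which is the kind of result that would push toward either faster buses or several parallel ones.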

ShaiRevzen (29 Aug 2003) - I suspect you may be implicitly assuming that we use interrupt based communication to read sensors. If the hardware supports it, it is probably better to poll in a deterministic way at a known position in the ModuleManager loop. (BTW: I think the RHexLib term Module is misleading, as it seems to imply a static subsystem rather than a periodic task -- how about using the term periodic task or ptask instead?). There are many good examples in the design of high performance communication systems that show that polling is more efficient for cases where the communication load can be anticipated. Plus it saves all those ugly timing and concurrency bugs. Thus, if the remote nodes filter and "condense" the sensor data and provide it at approximately the right rate, and the mainboard CPU has some sort of DMA engine for the TBD communication medium, we won't really need any bus arbiter. The following table shows what I envision the remote node and the main board doing in the sensor related portion of their main loop most of the time:

Remote Node Loop		Main Board Loop
Read sensors		If an update arrived --> replace the old "remote state" with the new "remote state" by pointer swapping
Update filtered states		Request future updates as needed
If update was requested for this time --> build message and send to mainboard		Do high level work

NOTE: a critical part of this idea is that the main board requests "future" updates using some common notion of time, and by doing that it ensures that data will be ready when it is needed, and that it won't take up any bus bandwidth when it is not needed.
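The two loops in the table, together with the "future update" idea in the NOTE, might look roughly like the sketch below. It is a toy simulation under assumptions: time is an integer tick shared by both sides, the filter is a simple exponential average, and the main board calls `request_update` directly instead of sending the request over a bus. All class and method names are hypothetical.

```python
# Sketch of the remote-node loop and main-board loop from the table above.
# Assumptions: a shared integer tick as the common notion of time, a toy
# low-pass filter, and direct method calls standing in for bus messages.


class RemoteNode:
    def __init__(self):
        self.filtered = 0.0
        self.requested_ticks = set()

    def request_update(self, tick):
        """Main board asks for a message at a future tick."""
        self.requested_ticks.add(tick)

    def step(self, tick, raw_sample):
        # Read sensors and update the filtered state on every tick.
        self.filtered = 0.5 * self.filtered + 0.5 * raw_sample
        # Build and send a message only if one was requested for this tick.
        if tick in self.requested_ticks:
            self.requested_ticks.discard(tick)
            return {"tick": tick, "value": self.filtered}
        return None


class MainBoard:
    def __init__(self, node, period):
        self.node = node
        self.period = period
        self.remote_state = None
        self.messages_received = 0

    def step(self, tick, incoming):
        if incoming is not None:
            # Rebinding the reference plays the role of the pointer swap.
            self.remote_state = incoming
            self.messages_received += 1
        if tick % self.period == 0:
            # Request the next update ahead of time.
            self.node.request_update(tick + self.period)
        # High level work would go here, reading self.remote_state.


if __name__ == "__main__":
    node = RemoteNode()
    main = MainBoard(node, period=4)
    for tick in range(10):
        message = node.step(tick, raw_sample=float(tick))
        main.step(tick, message)
    print(main.messages_received, main.remote_state)
```

In this run the node samples on every tick but puts a message on the "bus" only at the requested ticks, so bus traffic follows what the main board asked for rather than the node's sampling rate.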

- - HaldunKomsuoglu (29 Aug 2003) - Nope, I was not thinking of an interrupt driven mechanism. I am not sure what made you think that, but let me clarify. In the sketch above the bus arbiter acts as a polling machine that is independent of the main loop (maybe somehow synchronized with it). Its transactions are timed but not interrupt driven. It sounds like you and I agree on this, i.e., we prefer an arbitration scheme that employs static time-division multiplexing, which is both efficient and easy to code up. Furthermore, this approach would have predictable timing properties. This is the CBR ATM protocol business I talked about in one of my earlier emails.

- - HaldunKomsuoglu (29 Aug 2003) - One thing that I did not have in my original sketch was the "request for future updates" notion you describe in the NOTE under the table. Although I understand that the bandwidth would be utilized much more efficiently if the arbiter knows what to update in the next cycle, I believe this approach will open a can of worms. First of all, the firmware on the bus arbiter that performs the remote node sweep will need to be more complicated than it would have been as a dumb loop. I suppose there could be some simple ways to implement this. More importantly, however, I believe most developers will find this "requesting for future updates" business very inconvenient. I think it would be a much cleaner approach if the data is always there for the application on the main CPU to use. The drawback is a much more heavily loaded communication network, even when most of the data is not used. However, as long as the bandwidth is sufficient, that should be ok.

- - HaldunKomsuoglu (29 Aug 2003) - Though, I don't think I understand how you concluded that this approach would allow us to get rid of the arbiter.

- UlucSaranli (28 Aug 2003) - For actuators, this is less of an issue because actuator commands do not have to lock the central process. Once the command is sent, it is gone. The only potential issue here is status queries (e.g. is motor A operational?) related to self-monitoring of peripheral modules. This could be seen as a sensor and would require a blocking wait on the central CPU.

- UlucSaranli (28 Aug 2003) - This issue is present regardless of the protocol we choose for the underlying communication. It is more a choice of style for the sensory acquisition, and we need to start thinking about the pros/cons of both approaches. Given that most of our signal sources (encoders, gyros, accelerometers, temperature sensors etc.) will be periodically read for control purposes anyway, periodic update for those seems best to me. Perhaps for other sensors "on-demand" schemes would be more useful. Once again, I am aware that the underlying protocol can be request based, but it may be beneficial to provide burst mode operation as well.
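The mixed style suggested in this thread, periodic acquisition for a configurable set of signals with timestamps and on-demand reads for the rest, could be sketched as follows. The `SignalTable` class, its methods and the signal names are hypothetical, and `read_fn` stands in for whatever blocking bus read the real system would provide.

```python
# Sketch: periodic, timestamped acquisition for selected signals and
# blocking on-demand reads for everything else. All names are hypothetical.


class SignalTable:
    def __init__(self, read_fn):
        self._read = read_fn    # stand-in for a blocking read over the bus
        self._periodic = set()  # signals refreshed on every acquisition cycle
        self._latest = {}       # name -> (value, timestamp)

    def configure_periodic(self, names):
        """Select, possibly on the fly, which signals are read periodically."""
        self._periodic = set(names)

    def acquire(self, now):
        """Called once per control cycle: refresh all periodic signals."""
        for name in self._periodic:
            self._latest[name] = (self._read(name), now)

    def get(self, name, now):
        """Return (value, timestamp); only non-periodic signals hit the bus."""
        if name in self._periodic and name in self._latest:
            return self._latest[name]
        return (self._read(name), now)


if __name__ == "__main__":
    reads = []

    def fake_read(name):
        reads.append(name)
        return len(reads)

    table = SignalTable(fake_read)
    table.configure_periodic(["encoder_0", "gyro_x"])
    table.acquire(now=0.001)
    print(table.get("encoder_0", now=0.0015))  # buffered value, no bus access
    print(table.get("motor_a_status", now=0.0015))  # on-demand, blocks on the bus
```

The timestamp returned with each buffered value lets the caller judge how stale the reading is, which addresses the latency concern without forcing every signal into the periodic set.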


Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.