Summary of timing experiments with RHexLib (all averages were taken
over 60000 cycles ~ 1 minute)
Conclusions:
- There is significant variability in the computation time. All
modules, even the simplest ones, tend to have maximum execution
times higher than 70us. My belief is that this is due to poor
cache performance on our current CPU board. The execution times go
up as the executable size increases.
- The database access functions have a large number of internal copy
operations, which most likely result in significant CPU cycles
wasted. There are also a few places where there may be memory
leaks, which could cause unwanted mallocs happening, leading to
wasted CPU. These problems can in principle be fixed with some
reasonable effort.
- The basic motion control functionality, including rHio reads and
writes, encoder reads and PD control computations take an average
of 302us. With only the basic modules linked and activated, the
decomposition is as follows:
- encoderreader: 59.5us (for all 6 axes)
- positioncontrol: 73.2us (for all 6 axes)
- rhioreader: 67.1us
- rhiowriter: 45.1us
- timingtester: 7.4us
I am assuming the reason why these do not add up to 302us is
because of the profiling and module manager overhead. Also, note
that these includes a few very trivial optimizations I have made
along the way, which improved the final figure by about 50us
compared to our latest SWRI code.
- When the communications, database, logserver and the gyro,
together with their threads are activated, the average loop time
goes up to 351us, even though the updates of these modules do not
do anything significant. They only start their corresponding
threads. My impression is that this is again due to the cache
performance and possibly bus related issues due to the serial gyro
access. The timing decomposition becomes as follows:
- encoderreader: 62.7us (for all 6 axes)
- positioncontrol: 85.3us (for all 6 axes)
- rhioreader: 75.5us
- rhiowriter: 48.6us
- logserver: 2.7us
- dbmodule: 2.7us
- timingtester: 8.4us
Note that the resolution of timing is around 2us, so the small
values are not really reliable. Changing the priority of the gyro
thread does not effect the results above, which are obtained with
the gyro thread priority (27) above the main thread (25)
- The rest of the components are a lot harder to profile because
they are network related and are less deterministic. The next step
was to add the state proxy as well as all the associated state
estimator modules. In particular, the heatsink, battery SOC
monitor and the average power were activated. The most important
conclusion is that, the stateproxy::update() function takes about
100us, which I suspect is due to inefficiencies and/or bugs in the
underlying database access functions. This needs to be sorted out
before the timing data is reliable. The summary in this case is as
follows:
- stateproxy: 101.6us (period is 10us)
- heatsinkmodule: 4.9us
- battermonitor: 22.6us (includes some complex math functions)
- avgpower: 8.6us
- encoderreader: 59.3us (for all 6 axes)
- positioncontrol: 79.6us (for all 6 axes)
- rhioreader: 82.6us
- rhiowriter: 47.0us
- logserver: 3.2us
- dbmodule: 2.9us
- timingtester: 9.4us
The average loop time is 403us, but keep in mind that the state
proxy has a period of 10, so it only contributes to the average
with 0.1 weight. So, a loop time of 490us would be a more accurate
picture.
- The next experiment involves the basic walking machine with
inclination compensation (i.e. what gets run from WalkMode). The
execution time for the WalkMachine is hard to summarize due to
irregular recomputations of thr Beuhler clock profile at
transition points. However, the following list summarizes the
timings when new turning commands are issued to RHexWalker at
every cycle.
- stateproxy: 91.5us (period is 10us)
- heatsinkmodule: 5.1us
- battermonitor: 22.2us (includes some complex math functions)
- avgpower: 8.4us
- rhexwalker: 3.4us
- walkmachine: 30.3us (maxtime:256us. cache?.)
- walkpitchfilter: 32.9us (maxtime:116us)
- slopeestimator: 3.2us (maxtime:46us. almost empty update. weird!)
- encoderreader: 59.7us (for all 6 axes)
- positioncontrol: 68.0us (for all 6 axes)
- rhioreader: 82.2us (maxtime:347us cache or sync prob?)
- rhiowriter: 46.5us
- logserver: 3.2us
- dbmodule: 3.2us
- timingtester: 7.8us
The avg loop time becomes 445us, which is reasonable. However, the
maximum loop time is at 815us, which is a little distressing. Note
that the max time is not the sum of max times for all modules
because they may occur at different cycles. In fact, the sum of
max times comes up to be 2.3s, which is much larger. Final note,
the reason the position controller timing is smaller is probably
because motor drives were disabled and the legs were not moving,
reducing computation on the PD loop.
- Testing modes is significantly harder because of the
nondeterminism of the user interface. As a consequence, I have
done a much more limited version, where the mode supervisor is
added and activated. Calibration was then performed manually,
followed by a forced transition to OdometryMode, which supposedly
has a lot of state updates. At this point, the profiling was
started and run for 60000 cycles. A number of additional modules
are necessarily activated. The following list gives a summary.
- modeSupervisor: 13.2us
- stateproxy: 100.7us (period is 10us)
- heatsinkmodule: 6.0us
- battermonitor: 24.6us (includes some complex math functions)
- avgpower: 8.9us
- odometrymodule: 5.3us
- localizermodule: 90.9us
- rhexstatemodule: 5.7us
- rhexrunner: 2.4us
- walkmachine: 29.8us (maxtime:254us. cache?.)
- rotationint: 65.7us (maxtime:160us)
- encoderreader: 61.2us (for all 6 axes)
- positioncontrol: 69.6us (for all 6 axes)
- rhioreader: 75.4us
- rhiowriter: 50.4us
- logserver: 3.0us
- dbmodule: 2.9us
- timingtester: 6.0us
The average loop timing was 636us, but cycles were skipped a
number of times, with a maximum loop time of 995us (which forces
cycle skip to let the comm thread run). It looks to me like,
overall, the code is fairly well optimized (except the database
copies), but the memory architecture on the Geode is killing us.
--
UlucSaranli? - 07 Aug 2003