Trip Report
Berkeley February 24-25 2005
Paul Hubbard
hubbard at sdsc dot edu
This was a short 2-day visit on the way back from San Diego, at the
request of the site IT rep, Don Patterson. The main focus was on
assisting them with questions about NTCP, Data turbine and DAQ.
Site overview
NEES at Berkeley (http://nees.berkeley.edu/)
is a structural testing lab. From their web page:
The nees@berkeley Equipment Site is located at the University of
California, Berkeley, Richmond Field Station. The centerpieces of the
facility are the strong reaction floor and a reconfigurable reaction
wall that, together with a 4-million pound compression-tension machine,
enable testing of a great variety of structural components. The
powerful hydraulic supply, 7 long-stroke high-speed actuators,
128-channel data acquisition system and a portfolio of instruments
complete the traditional hardware side of the facility. The 1Gb/sec
network split into two sub-nets, each headed by a NEESpop, forms the
backbone of the computer hardware side of the facility. The video
sub-net support the telepresence functions of the NEES network enabling
users to remotely take part in the experiments at nees@berkeley. The
data sub-net supports the computers necessary to do geographically
distributed hybrid simulations. We developed and proof-tested a new
control algorithm that combines our 8-channel MTS controller and
digital signal processors with other NEES experimental and computer
facilities and enables conducting a continuous pseudo-dynamic test in
spite of random network delays.
Their web
page has more information and nice renderings of the lab and floors.
Here's a small picture of the press:

and one of a mixed-materials wall they're testing:

More pictures at this page.
Goals of the visit
Don wanted to
1. Learn more about the interim data strategy
2. Learn more about the data/metadata plan
3. Learn about what tools were available and planned for 1 & 2.
4. Get units working in his C-based Scramnet DAQ code
5. Get Axis video sources working into the turbine
6. Try and figure out why the NTCP C client code wouldn't run
7. Try and solve the NTP stability problems that his NEES-POP has
been having since converting to Redhat Enterprise.
8. Mahmoud Hachem wanted to get video from Firewire cameras under
LabVIEW into the turbine.
9. Talk about file to RBNB uploads from their Pacific DAQ system.
10. See if we could get their Canon EOS 1DS camera working with
JCamera.
#1 and #2 were fairly easy, though #3 is tougher: our (NEESit) plans
aren't very well documented or explained yet.
#4 was a question of updating the code with the current CVS version.
There were a couple of small bugs in his Scramnet code, but nothing
that took very long. Streaming data from Scramnet into the turbine was
quite easy. We might want to ask them for a copy of the code sometime
in the future for other sites with Scramnet or MTS.
#5 was also easy, as the AxisSource code is pretty stable. We had to
move a camera between networks. Their split data/video networks mean
that the turbine will have to span both nets to get video feeds. For
the test, we simply moved an Axis to the data network. Here's a
snapshot from the control room:

#6 took a lot of time and is still under way (case 426). It
looks like a bug in the Globus XIO library, and Lee Liming from MCS
kindly offered the help of the developer in resolving the problem. Not
yet solved as of 3/7/05.
#7 is also a puzzler. Don has a similar machine with the same OS on a
different VLAN, with an identical NTP configuration. Both read from
stratum 1 and 2 servers at Berkeley. One has major problems and loses
sync all the time; the other is flawless. Same routers, etc. After
much diagnosis, googling, and editing, we narrowed it down but didn't
have a clear solution. Possible culprits:
- Router configuration
- Bad network card
- Motherboard failure (massive clock jitter may indicate
overheating according to Net lore)
- Bad crystal on the motherboard
#8 was a fun problem. Since LabVIEW can't invoke the Java API required
to feed the turbine directly, we set up WebDAV access to the data
turbine by running it under Tomcat. We then ran a second, child,
turbine on Mahmoud's machine, and used WebDAV to push data into the
local turbine. The parent-child connection meant that the JPG files
then arrived at the main turbine, and could be viewed with a file
browser or a normal RBNB client:

This was very cool. I think we should ask Mahmoud for a copy of his
code, and include it. It'll handle any firewire camera under LabVIEW if
they've got the IMAQ-1394 toolkit. He simply saves a JPG every second
to the WebDAV folder, and that interfaces to the data turbine. Here's a
picture from the test:

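For sites without LabVIEW, the same one-JPG-per-second push can be sketched in a few lines of Python. This is not Mahmoud's code, just an illustration of the idea. The host, port, folder and file names below are made up, and it assumes the turbine's WebDAV folder accepts a plain HTTP PUT.

```python
import urllib.request


def build_put_request(base_url, frame_name, jpeg_bytes):
    """Build an HTTP PUT that writes one JPEG frame into a WebDAV folder."""
    url = base_url.rstrip("/") + "/" + frame_name
    return urllib.request.Request(
        url,
        data=jpeg_bytes,
        method="PUT",
        headers={"Content-Type": "image/jpeg"},
    )


def push_frame(base_url, frame_name, jpeg_bytes, timeout=5.0):
    """Send one frame and return the HTTP status code from the server."""
    request = build_put_request(base_url, frame_name, jpeg_bytes)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical URL and file name; substitute the real WebDAV folder.
    with open("frame.jpg", "rb") as handle:
        print(push_frame("http://localhost:8080/RBNB/camera", "frame.jpg", handle.read()))
```

Calling push_frame once a second from a loop with time.sleep(1) would roughly match what the LabVIEW program does.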
#9 is basically a solved problem. They sample at 200Hz but don't want
to stream at that rate. Pacific can export ASCII, and Don has a Perl
script to convert their ASCII to NEES format, for use with the
FileToRbnb program.
#10 (Canon with JCamera) is a non-starter. The Canon has a firewire
interface, instead of USB, so the protocol stack would have to change.
Also, they'd like to run this from Windows, which isn't possible with
our current use of jphoto/jusb. More on this below in the final section.
Notes, results and such
WebDAV
WebDAV is really, really slick. It allows programs with no RBNB
knowledge to read and write RBNB data. However, you have to run the
turbine from within Tomcat, so we have to change the init script to run
Tomcat, and then use wget/curl/Perl to start the turbine. You also have
to set an environment variable so that the turbine gets enough memory.
A bit more complicated but very possible. (This is now case 547 in
FogBUGZ.)
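As a rough sketch of what the revised startup could look like, the Python below raises the JVM heap through an environment variable and then polls a URL until Tomcat answers, which is the job wget/curl/Perl would do. The assumptions are mine: CATALINA_OPTS is the variable Tomcat's own startup scripts read, but the heap size, the URL and the idea that a single successful request is enough to start the turbine are guesses to be checked against the real install.

```python
import os
import time
import urllib.request


def tomcat_env(max_heap="512m", base_env=None):
    """Return a copy of the environment with a larger JVM heap for Tomcat."""
    env = dict(os.environ if base_env is None else base_env)
    # Append to any existing options instead of overwriting them.
    env["CATALINA_OPTS"] = (env.get("CATALINA_OPTS", "") + " -Xmx" + max_heap).strip()
    return env


def _http_status(url):
    """Fetch a URL and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.status


def wait_and_start(start_url, fetch=_http_status, attempts=30, delay=1.0):
    """Poll start_url until it returns 200; give up after `attempts` tries."""
    for _ in range(attempts):
        try:
            if fetch(start_url) == 200:
                return True
        except OSError:
            # Connection refused or an HTTP error: Tomcat is not ready yet.
            pass
        time.sleep(delay)
    return False
```

The init script would launch Tomcat with the environment from tomcat_env() and then call wait_and_start() on whatever URL brings up the turbine.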
WebDAV also resulted in us finding and solving a problem in DaqToRbnb,
see case 524.
We also used the Tomcat-based URL access, as seen on the Creare demo page.
By default, Tomcat sets up WebDAV as read-only. You have to edit the
web.xml file to enable read-write access. This is sensible from a
security point of view, but keep it in mind when answering questions.
We were able to access the DAV from Linux, Mac and XP.
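For the web.xml change mentioned above, here is a hedged Python sketch that flips a read-only flag to false. It assumes the WebDAV servlet uses an init-param named readonly, as Tomcat's stock WebDAV servlet does; the servlet name "webdav" is a placeholder, and the turbine's own web.xml may differ. It only changes a parameter that is already present, so a file that relies on the default would still need the init-param added by hand.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip any XML namespace so DTD-style and schema-style files both match."""
    return tag.rsplit("}", 1)[-1]


def enable_webdav_writes(web_xml_text, servlet_name="webdav"):
    """Set readonly=false on the named servlet; return (new_xml, change_count)."""
    root = ET.fromstring(web_xml_text)
    changed = 0
    for servlet in root.iter():
        if _local(servlet.tag) != "servlet":
            continue
        names = [c.text or "" for c in servlet if _local(c.tag) == "servlet-name"]
        if not names or names[0].strip() != servlet_name:
            continue
        for param in servlet:
            if _local(param.tag) != "init-param":
                continue
            fields = {_local(c.tag): c for c in param}
            name = fields.get("param-name")
            value = fields.get("param-value")
            if name is None or value is None:
                continue
            if (name.text or "").strip() == "readonly":
                value.text = "false"
                changed += 1
    return ET.tostring(root, encoding="unicode"), changed
```

Re-serializing with ElementTree drops comments and the DOCTYPE, so it is safer to use the output as a guide and make the one-line change by hand.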
Clock synchronization
One turbine requirement is clock sync. If the clocks are off,
the data viewers display wrong data, or no data. NTP bugs like #7 above
are very difficult to solve, and we may just want to suggest that all
sites spend $1k on an inexpensive
GPS-based clock.
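To put a number on "off", the standard NTP arithmetic for one request/reply exchange can be sketched in Python. The four timestamps are the usual ones (client send, server receive, server send, client receive); the tolerance is a made-up figure, since I don't have a stated sync requirement for the turbine.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Return (offset, round_trip_delay) in seconds for one NTP exchange.

    t0: client send time (client clock)
    t1: server receive time (server clock)
    t2: server send time (server clock)
    t3: client receive time (client clock)
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2.0
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


def clock_ok(offset, tolerance=0.1):
    """True if the clock is within `tolerance` seconds (hypothetical limit)."""
    return abs(offset) <= tolerance
```

A positive offset means the client clock is behind the server. Logging this value on both of Don's machines over a day would show how badly the bad one wanders.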
Server mirroring and robust routing
While experimenting with the turbine, I found that 2.5B3 has a new
'robust routing plugin'. This replaces the manual mirroring in
admin.jar, and is supposed to handle network failures gracefully. We
should test this and perhaps advocate it to sites like UMN/Iowa.
NTCP C client
I also ran into the OpenSEES crew while there, and like Don and Mahmoud
they asked about the status of the NTCP C client API. They need it
for OpenSEES, and there's also a robot they could use with it. Don and
I spent a half-day working on this, and were able to build but not
run the NTCP client test suite. After we talked to some of the developers
at MCS, Lee Liming volunteered some of their time to help fix
this. It looks like a bug in the underlying Globus XIO library, and
they are testing a fix now.
Jcamera, Canon and Windows
We're having similar problems at UIUC, where a newer Nikon is breaking
jusb. The real problem is that jusb and jphoto are abandoned projects,
so if we encounter problems there's no one else to fix them. Thus, we
have to decide what the development priority of jcamera is, and
allocate accordingly.
For the short term, I'm hacking jcamera as follows:
- Remove all platform-specific code (jusb, jphoto)
- Enable calling external command-line programs for zoom and
shutter control
This leaves jcamera performing
- NTCP to the ASCII plugin
- Data turbine uploads
Mahmoud is going to try and write command-line Win32 programs based on
the Canon SDK. For UIUC, gphoto2 is a possibility although it lacks
zoom control.
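The external-program approach boils down to running a command and checking its exit code. Here is a minimal sketch in Python, for illustration only; it is not the jcamera code. The gphoto2 --capture-image option is real, but the function names are made up, and a Win32 Canon tool would need its own arguments once it exists.

```python
import subprocess
import sys


def shutter_command(tool="gphoto2"):
    """Command line that asks the external tool to fire the shutter."""
    # --capture-image is gphoto2's option; other tools will differ.
    return [tool, "--capture-image"]


def run_camera_tool(command, timeout=30.0):
    """Run an external camera program; return (exit_code, stdout, stderr)."""
    result = subprocess.run(command, capture_output=True, text=True, timeout=timeout)
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in command so the sketch runs without a camera attached.
    print(run_camera_tool([sys.executable, "-c", "print('click')"]))
```

Keeping the camera-specific part in a separate executable means the platform problem moves out of jcamera: Windows sites supply a Canon SDK program and Linux sites can try gphoto2.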
This code is not ready to release yet, but I'll commit to CVS here soon
as 'lim', meaning Less Is More.