Trip Report

Berkeley February 24-25 2005

Paul Hubbard
hubbard at sdsc dot edu

This was a short 2-day visit on the way back from San Diego, at the request of the site IT rep, Don Patterson. The main focus was helping them with questions about NTCP, the data turbine, and DAQ.

Site overview

NEES at Berkeley (http://nees.berkeley.edu/) is a structural testing lab. From their web page:
The nees@berkeley Equipment Site is located at the University of California, Berkeley, Richmond Field Station. The centerpieces of the facility are the strong reaction floor and a reconfigurable reaction wall that, together with a 4-million pound compression-tension machine, enable testing of a great variety of structural components. The powerful hydraulic supply, 7 long-stroke high-speed actuators, 128-channel data acquisition system and a portfolio of instruments complete the traditional hardware side of the facility. The 1Gb/sec network split into two sub-nets, each headed by a NEESpop, forms the backbone of the computer hardware side of the facility. The video sub-net support the telepresence functions of the NEES network enabling users to remotely take part in the experiments at nees@berkeley. The data sub-net supports the computers necessary to do geographically distributed hybrid simulations. We developed and proof-tested a new control algorithm that combines our 8-channel MTS controller and digital signal processors with other NEES experimental and computer facilities and enables conducting a continuous pseudo-dynamic test in spite of random network delays.
Their web page has more information and nice renderings of the lab and floors.

Here's a small picture of the press:

thumbnail of big press

and one of a mixed-materials wall they're testing:

Wall wired up

More pictures at this page.

Goals of the visit

Don wanted to
  1. Learn more about the interim data strategy
  2. Learn more about the data/metadata plan
  3. Learn about what tools were available and planned for 1 & 2.
  4. Get units working in his C-based Scramnet DAQ code
  5. Get Axis video sources working into the turbine
  6. Try and figure out why the NTCP C client code wouldn't run
  7. Try and solve the NTP stability problems that his NEES-POP has been having since converting to Redhat Enterprise.
  8. Mahmoud Hachem wanted to get video from Firewire cameras under LabVIEW into the turbine.
  9. Talk about file to RBNB uploads from their Pacific DAQ system.
  10. See if we could get their Canon EOS 1DS camera working with JCamera.
#1 and #2 were fairly easy; #3 is tougher, because our (NEESit) plans aren't well documented or explained yet.

#4 was a question of updating the code with the current CVS version. There were a couple of small bugs in his Scramnet code, but nothing that took very long. Streaming data from Scramnet into the turbine was quite easy. We might want to ask them for a copy of the code sometime in the future for other sites with Scramnet or MTS.

#5 was also easy, as the AxisSource code is pretty stable. We had to move a camera between networks. Their split data/video networks mean that the turbine will have to span both nets to get video feeds. For the test, we simply moved an Axis to the data network. Here's a snapshot from the control room:
single JPG from Axis

#6 took a lot of time and is still under way (case 426). It looks like a bug in the Globus XIO library, and Lee Liming from MCS kindly offered the help of the developer in resolving the problem. Not yet solved as of 3/7/05.

#7 is also a puzzler. Don has a similar machine with the same OS on a different VLAN, with an identical NTP configuration. They both read from stratum 1 and 2 servers at Berkeley. One has major problems and loses sync all the time; the other is flawless. Same routers, etc. After much diagnosis, googling, and editing, we narrowed it down but didn't have a clear solution. Possible culprits:

#8 was a fun problem. Since LabVIEW can't invoke the Java API required to feed the turbine directly, we set up WebDAV access to the data turbine by running it under Tomcat. We then ran a second, child, turbine on Mahmoud's machine, and used WebDAV to push data into the local turbine. The parent-child connection meant that the JPG files then arrived at the main turbine, and could be viewed with a file browser or a normal RBNB client:

Test setup

This was very cool. I think we should ask Mahmoud for a copy of his code, and include it. It'll handle any FireWire camera under LabVIEW if they've got the IMAQ-1394 toolkit. He simply saves a JPG every second to WebDAV, and that interfaces to the data turbine. Here's a picture from the test:

B&W 1394 image
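
The once-a-second JPG push described above can be sketched in a few lines. This is only an illustration of the idea, not Mahmoud's LabVIEW code: the URL, the `grab_frame` callback and the function names are all assumptions, and the real WebDAV path depends on how Tomcat and the turbine are deployed.

```python
import time
import urllib.request

# Hypothetical WebDAV URL of the child turbine; the real host, port and
# path depend on how Tomcat and the RBNB webapp are deployed.
DAV_URL = "http://localhost:8080/RBNB/camera/frame.jpg"


def build_put_request(url, jpeg_bytes):
    """Return an HTTP PUT request that uploads one JPEG frame."""
    return urllib.request.Request(
        url,
        data=jpeg_bytes,
        method="PUT",
        headers={"Content-Type": "image/jpeg"},
    )


def push_frames(grab_frame, url=DAV_URL, period_s=1.0, count=10):
    """Upload `count` frames, one every `period_s` seconds.

    `grab_frame` is a caller-supplied function returning JPEG bytes
    (a stand-in for the LabVIEW/IMAQ capture step).
    """
    for _ in range(count):
        request = build_put_request(url, grab_frame())
        with urllib.request.urlopen(request, timeout=5) as response:
            response.read()  # drain the reply; HTTP errors raise here
        time.sleep(period_s)
```

Any program that can issue an HTTP PUT can feed the turbine this way, which is why the approach works for tools that cannot call the Java API.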

#9 is basically a solved problem. They sample at 200Hz but don't want to stream at that rate. Pacific can export ASCII, and Don has a Perl script to convert their ASCII to NEES format, for use with the FileToRbnb program.

#10 (Canon with JCamera) is a non-starter. The Canon has a FireWire interface instead of USB, so the protocol stack would have to change. Also, they'd like to run this from Windows, which isn't possible with our current use of jphoto/jusb. More on this below in the final section.

Notes, results and such

WebDAV

WebDAV is really, really slick. It allows programs with no RBNB knowledge to read and write RBNB data. However, you have to run the turbine from within Tomcat, so we have to change the init script to run Tomcat, and then use wget/curl/Perl to start the turbine. You also have to set an environment variable so that the turbine gets enough memory. A bit more complicated but very possible. (This is now case 547 in FogBUGZ.)
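
A rough sketch of that startup sequence follows. Everything specific in it is an assumption rather than something we settled on site: the Tomcat path, the use of `CATALINA_OPTS` with `-Xmx` as the memory setting, and the turbine URL are placeholders to show the shape of the init script, written here in Python instead of shell.

```python
import os
import subprocess
import time
import urllib.request

# Assumptions: Tomcat lives in CATALINA_HOME, takes its JVM flags from
# CATALINA_OPTS, and the turbine webapp answers on START_URL. None of
# these values come from the trip itself.
CATALINA_HOME = "/usr/local/tomcat"
START_URL = "http://localhost:8080/RBNB/"


def tomcat_env(base_env, max_heap="512m"):
    """Copy `base_env` and append a JVM heap limit to CATALINA_OPTS."""
    env = dict(base_env)
    existing = env.get("CATALINA_OPTS", "")
    env["CATALINA_OPTS"] = (existing + " -Xmx" + max_heap).strip()
    return env


def fetch_with_retry(url, attempts=10, delay_s=3.0):
    """Request `url` as wget/curl would, retrying while Tomcat starts."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                return response.status
        except OSError:
            if attempt == attempts - 1:
                raise
            time.sleep(delay_s)


def start_tomcat_and_turbine():
    """Launch Tomcat with the larger heap, then touch the turbine URL."""
    startup = os.path.join(CATALINA_HOME, "bin", "startup.sh")
    subprocess.run([startup], env=tomcat_env(os.environ), check=True)
    return fetch_with_retry(START_URL)
```

The retry loop is there because Tomcat's startup script returns before the web applications are ready to answer requests.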

WebDAV also resulted in us finding and solving a problem in DaqToRbnb, see case 524.

We also used the Tomcat-based URL access, as seen on the Creare demo page.

By default, Tomcat sets up WebDAV as read-only. You have to edit the web.xml file to enable read-write access. This is sensible from a security point of view, but keep it in mind when answering questions.
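
For reference, here is a minimal sketch of that edit done programmatically. It assumes the deployment uses Tomcat's stock `org.apache.catalina.servlets.WebdavServlet` with a `readonly` init-param already present in web.xml; the turbine's web application may use a different servlet class or parameter name, so check the file first. The function names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumed servlet class; verify against the web.xml actually deployed.
WEBDAV_CLASS = "org.apache.catalina.servlets.WebdavServlet"


def local(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def child_text(element, name):
    """Return the stripped text of the first child called `name`."""
    for child in element:
        if local(child.tag) == name:
            return (child.text or "").strip()
    return None


def enable_webdav_write(web_xml_text):
    """Set readonly=false on the WebDAV servlet.

    Returns the new XML text and the number of values changed.
    """
    root = ET.fromstring(web_xml_text)
    changed = 0
    for servlet in root.iter():
        if local(servlet.tag) != "servlet":
            continue
        if child_text(servlet, "servlet-class") != WEBDAV_CLASS:
            continue
        for param in servlet:
            if local(param.tag) != "init-param":
                continue
            if child_text(param, "param-name") != "readonly":
                continue
            for field in param:
                if local(field.tag) == "param-value":
                    field.text = "false"
                    changed += 1
    return ET.tostring(root, encoding="unicode"), changed
```

ElementTree drops comments and rewrites namespace prefixes when it serializes, so for a one-off change editing the file by hand is just as good; the sketch mainly documents which value has to flip.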

We were able to access the DAV from Linux, Mac and XP.

Clock synchronization

One turbine requirement is that of clock sync. If the clocks are off, the data viewers display wrong data, or no data. NTP bugs like #7 above are very difficult to solve, and we may just want to suggest that all sites spend $1k on an inexpensive GPS-based clock.
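
To make "clocks are off" concrete, here is a small sketch of the standard NTP offset and round-trip delay calculation from the four timestamps of one request/reply exchange. The function name and the numbers in the usage note are illustrative only, not measurements from the Berkeley machines.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Estimate clock offset and round-trip delay, in seconds.

    t1: client clock when the request left
    t2: server clock when the request arrived
    t3: server clock when the reply left
    t4: client clock when the reply arrived
    A positive offset means the client clock is behind the server.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay
```

For example, a client that is half a second behind its server, with 10 ms of network delay each way and 2 ms of server processing, would see timestamps of roughly 100.0, 100.51, 100.512 and 100.022, which works out to an offset of about 0.5 s and a delay of about 0.02 s.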

Server mirroring and robust routing

While experimenting with the turbine, I found that 2.5B3 has a new 'robust routing plugin'. This replaces the manual mirroring in admin.jar, and is supposed to handle network failures gracefully. We should test this and perhaps advocate it to sites like UMN/Iowa.

NTCP C client

I also ran into the OpenSEES crew while there, and like Don and Mahmoud they asked about the status of the NTCP C client API. They need it for OpenSEES, and there's also a robot they could use with it. Don and I spent half a day working on this, and were able to build but not run the NTCP client test suite. After we talked to some of the developers at MCS, Lee Liming volunteered some of their time to help fix this. It looks like a bug in the underlying Globus XIO library, and they are testing a fix now.

JCamera, Canon and Windows

We're having similar problems at UIUC, where a newer Nikon is breaking jusb. The real problem is that jusb and jphoto are abandoned projects, so if we encounter problems there's no one else to fix them. Thus, we have to decide what the development priority of jcamera is, and allocate accordingly.

For the short term, I'm hacking jcamera as follows:
  1. Remove all platform-specific code (jusb, jphoto)
  2. Enable calling external command-line programs for zoom and shutter control
This leaves jcamera performing
  1. NTCP to the ASCII plugin
  2. Data turbine uploads
Mahmoud is going to try and write command-line Win32 programs based on the Canon SDK. For UIUC, gphoto2 is a possibility although it lacks zoom control.
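
The external-program hook can be sketched as below. This is a Python illustration of the pattern only, not the jcamera code itself, and the command table is hypothetical: the entries just run the Python interpreter so the example is self-contained, where a real setup would name the Win32 or gphoto2 program instead.

```python
import subprocess
import sys

# Hypothetical mapping from camera actions to external programs. These
# stand-ins only print a message; replace them with the real commands.
COMMANDS = {
    "shutter": [sys.executable, "-c", "print('shutter fired')"],
    "zoom": [sys.executable, "-c", "import sys; print('zoom', sys.argv[1])"],
}


def run_action(action, *args, timeout_s=30):
    """Run the external program for `action` and return its stdout.

    Extra `args` are appended to the command line. A non-zero exit
    status raises subprocess.CalledProcessError.
    """
    argv = COMMANDS[action] + [str(a) for a in args]
    result = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout_s, check=True
    )
    return result.stdout.strip()
```

Keeping the camera-specific part behind a command line means the platform-specific work can be swapped out per site without touching the NTCP or turbine code.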

This code is not ready to release yet, but I'll commit it to CVS soon as 'lim', meaning Less Is More.