I'm going to be traveling to Knoxville, Tennessee in about a week to go to a big all-hands meeting for the Exascale Computing Project. While Knoxville seems like a fun city, I'm dreading the travel because of the time change and the difficulty in flying there from the Bay Area. Knoxville's airport is tiny and doesn't have many flights from this side of the country. Last year when I went to ECP my SJC to ATL flight was delayed and I was lucky to get the last seat on the last plane for the night (I had visions of renting a car and driving from Atlanta to Knoxville in the middle of the night).
While making a poster for this trip, I started thinking it'd be fun to use some of the airplane flight data in an example for Kelpie. I dusted off my datasets, learned the basics of Boost's Geometry library, and wrote some simple C++ examples that digested and analyzed my data. I then wrote a simple tool to identify flights that landed at a particular airport, and then dumped the entire day's track for those planes. The idea was that I wanted to know how far I could get from an airport without changing planes. I plotted the data in matplotlib using the plotting tool I wrote a while back.
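A rough sketch of the "which planes landed here" step, written in Python rather than the C++/Boost code described above. The record layout, the function name, the size of the box around the airport, and the altitude cutoff are all assumptions made for illustration, not the actual tool:

```python
from collections import defaultdict

# Hypothetical record layout: (icao_id, unix_time, lat, lon, altitude_ft)
def tracks_landing_near(records, airport_lat, airport_lon,
                        box_deg=0.1, max_alt_ft=1500):
    """Return {icao_id: [fixes sorted by time]} for every plane that had at
    least one low-altitude fix inside a small lat/lon box around the airport.
    The whole day's track is kept, not just the part near the airport."""
    tracks = defaultdict(list)
    for rec in records:
        tracks[rec[0]].append(rec)

    landed = {}
    for icao, fixes in tracks.items():
        fixes.sort(key=lambda r: r[1])  # order each track by time
        touched_down = any(
            abs(lat - airport_lat) <= box_deg
            and abs(lon - airport_lon) <= box_deg
            and alt <= max_alt_ft
            for _, _, lat, lon, alt in fixes
        )
        if touched_down:
            landed[icao] = fixes
    return landed
```

A box in degrees is crude (it is narrower east-west than north-south at this latitude), and the altitude cutoff would need tuning for each airport's elevation, but it is enough to separate planes that came down at the airport from ones that only flew over it.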
As the plots show, you don't have many options if you want to go west from Knoxville. I didn't put it on the poster, but if you wanted to minimize travel pain for this conference and host it near a national lab, the right place to do it is at Argonne near Chicago. They have a massive number of direct flights and are at least closer to the middle of the country. However, Chicago in February doesn't sound like the best idea to me.
It's official: I'm renaming my main project at work to FAODEL: flexible, asynchronous, object data-exchange libraries. FAODEL (pronounced fā-ō-del) comes from a simplification of the Gaelic term faodhail, which is a land bridge used to cross between islands. Here are two examples between the Monach Islands in Scotland:
Nessie, Kelpie, and Scottish Names
My main project at work for the last few years has been writing data management services for HPC applications. Sandia's I/O group previously built an RDMA portability layer called Nessie to support the Lightweight File System (LFS). I initially built a key/value store on top of Nessie. I wanted to keep the Scottish monster theme going, so I decided to call it Kelpie (a kelpie is a sea horse in Scottish mythology that drags people to a watery death). As Kelpie evolved we started adding more packages with Scottish/Gaelic terms. We named our memory manager Lunasa (a Gaelic harvest festival) and our boot services became Gutties (a cheap gym shoe in Scotland). It didn't take long for us to realize that there were a lot of issues with using Scottish/Gaelic terms to name things. First, the words are often difficult to spell and pronounce. Second, we've had trouble finding other Scottish mythical beasts we could swipe. And finally, it seems like every Scottish word has a slang meaning that would make us hesitant to use it at a conference. As such, when we talk about our different software packages, we've been referring to them by our project name, which is "Data Warehouse".
ATDM "Data Warehouse" Origins and Problems
Three years ago the labs realized that they needed to do something different if they wanted to be able to have their codes scale up to exascale computing platforms. The ATDM project was formed to develop new software infrastructure that would allow new codes to achieve better performance than MPI-based approaches. The main idea was to use overdecomposition and task-DAG programming models (aka "asynchronous many-task" or AMT) to overcome dynamic load balancing problems while improving developer productivity. Existing frameworks (e.g., Charm++, Legion) didn't fully meet our requirements, so the DARMA team set about building a new AMT API that leveraged modern C++ features and could be retargeted to run on top of different runtimes (e.g., DARMA on Charm++). From an I/O perspective, applications needed a way to allow dynamically-placed tasks to exchange data with existing mesh databases and storage tools (all of which are built on static distributions). Our project was started to serve as a way to manage AMT's data in this context. Given that other AMTs used the term "Data Warehouse" for their storage, we became ATDM's "Data Warehouse".
The problem with the term Data Warehouse is that it has a specific meaning for I/O people. In the 1970s Bill Inmon started using Data Warehouse to refer to the idea of centrally storing/indexing all of an organization's data, instead of spreading it out among many smaller databases. Inmon has written articles that point out how NoSQL people have hijacked his term (which I agree with), so I've always cringed at having to refer to ourselves as ATDM's DW. It's difficult to change a funded program's name though, once it's on the books.
Faodail and Faodhail
Recently, we've been reorganizing our code so that we can go through the official open-source release process. We've generalized our scope so we can serve more than just DARMA, so I decided it was a good time to revisit project names. I found the name "Faodail", which people say is a "lucky find, usually of a lost item". That seemed like a good fit for a key/blob service, so we started using it (it even made its way into a paper an intern wrote). There were a few problems with faodail though: (1) we found it was difficult to pronounce, (2) the internet already had some references to it, (3) taking the "aod" out of "faodail" makes it fail, and (4) Google translates faodail to "maiden" (??). It seemed like a bad idea.
I went back to the naming game and searched some more. After a lot of misses (and general disgust from my team), I noticed in WikiSource that the term right after faodail is "faodhail". The definition for faodhail is:
faodhail, ford, a narrow channel fordable at low water, a hollow in the sand retaining tide water: from N. vaðill, a shallow, a place where straits can be crossed, Shet vaadle, Eng. wade.
Looking around more, I found maps of different faodhails around Scotland. These are regions where the tides go out and leave a land bridge that lets you cross between the islands. This meaning is perfect for what our software does: we build communication libraries that let you move data between different application islands.
Faodhail was longer than Faodail and had the same problems. I finally realized that what I needed was to ditch the actual name and just make an acronym that fixed the problems. Changing to a name that was shorter and more phonetic helped a good bit. I finally worked out some words that fit: flexible, asynchronous, object data-exchange libraries (FAODEL). It's not great, but the words do relate to what we're doing. Once I convinced myself it was the right thing, it was easy to tell the team what I wanted and have the confidence to make it stick.
Not to sound like an obsessed nut, but I went out and bought a special antenna for capturing airplane data. Specifically, I bought the FlightAware 1090MHz ADS-B Antenna from Amazon. Heh. I didn't look too closely at the pictures and thought it would be walkie-talkie antenna size. When a three-foot-long box arrived with a 26 inch antenna, I realized the coax connector in the picture was actually a large N coax connector instead of the tiny SMA connector I had in mind. I didn't have the right connectors, so I had to order a special cable to try the antenna. The initial tests of the antenna inside the house were good (saw a few flights beyond 100 miles), but naturally I wondered how well it would do outside the house.
One of the nice things about working with the Pi is that I can just hook it up to a USB battery pack and take the thing wherever I want. This turned out to be great for testing the antenna. I attached the antenna to some PVC pipe, cabled it into the battery-powered Pi, and then duct taped the whole thing to the top of a ladder. While the whole thing was wobbly, I could pick up the ladder and drag it to different spots in the back yard. I'd then go to the Pi's web page from my tablet and watch the map to see how many planes I was getting.
I wasn't very rigorous about the tests, but it seemed like I got better performance when I moved the antenna from our patio to the back yard. The results seem logical because on the patio the house is still in the way of most of our air traffic (which is west and south). The downside of all this is that there isn't a good place to mount the antenna (or route its cable) on the back of the house. Plus my wireless network evaporates a few feet into the backyard. In any case, I'm going to put the antenna and the Pi in the garage for now, until I get more time to mount it properly up top.
It's been almost half a year since I got fed up with the Intel Edison and decided to switch over to the Raspberry Pi. Most of that time I've just been tinkering with it, doing things like checking out RetroPie, hooking up simple LED circuits, and using the built-in tools to get the kids interested in programming. The main thing I've wanted to do is get dump1090 running, but I haven't been too motivated to do that since the setup I have on Edison just works. A few weeks ago I downloaded the PiAware disk image from FlightAware.com and got everything running on a Pi 3 board (w/ RTL-SDR dongle). I registered my box with them and you can now go to their web page to see my statistics.
FlightAware Online Visualizations
FA has several interesting visualizations, including the above position histogram to help you figure out where your traffic is. From the above I can see that most of the planes I catch are within 50 miles of me, and that planes that are farther out are usually northwest of me. The plot also shows I'm still getting some oddball points more than 250 miles away. The last time I looked at these points I found that they must be corrupt values (usually one bad lon or lat).
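One simple way to weed out those oddball points is a great-circle distance check against the receiver's location. This is a minimal sketch, not anything FlightAware does: the function names and the 250-mile cutoff are assumptions, with the cutoff taken from the plot above.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in statute miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def drop_far_points(points, home, max_miles=250.0):
    """Keep only the (lat, lon) points within max_miles of home (lat, lon)."""
    return [(lat, lon) for lat, lon in points
            if haversine_miles(home[0], home[1], lat, lon) <= max_miles]
```

A point with one corrupted coordinate usually lands hundreds or thousands of miles away, so even a generous cutoff catches it. A bad value that happens to fall inside the cutoff would still slip through.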
One of the cool things about using FA is that I can compare my stats to other people that are close to me. Right now there are 11 other people that are within 10 miles of me. If I make an antenna change I can look at my neighbors to see if I'm catching the same planes as them. Just eyeballing the data it looks like my antenna (which is directional and pointed at Sutro Tower in SF) is missing all kinds of flights to the south of me. Also, one or two of these people are getting nearly double the flights I am.
MLAT: Using the Crowd to Find the Unfindable
The coolest feature of using FA though is that your PiAware box can work with other PiAware boxes in the area to estimate the positions of planes that aren't transmitting their coordinates. A pet peeve of mine is that many planes have ADS-B hardware, but they don't transmit their locations because they want to have some privacy (never mind that they're buzzing over my house, peering into my backyard). PiAware has a multilateration (MLAT) capability you can enable to find planes based on time difference of arrival (TDoA) information from four receivers. Basically, if four PiAware receivers hear the same message, they can use the position info for the receivers and the delay each of them reports for receiving the message to locate the plane. While it means you have to register your receiver's position with FlightAware and spend some network bandwidth transmitting data, many of those unknown planes get tagged.
Above is a snapshot of some of what the dump1090 webpage looks like now with mlat on. All of the tan (?) entries are planes that were tagged with mlat. It's very satisfying to see the added entries. Previously, it seemed like half of the planes were anonymous.
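To make the TDoA idea concrete, here is a toy solver. It is not FlightAware's algorithm: it assumes a flat 2D world (positions in km, no altitude), perfectly synchronized receiver clocks, and noise-free timestamps, and the function names are made up for illustration. The transmit time is unknown, but it cancels because only differences between arrival times are used. Treating the range to the first receiver as a third unknown turns the problem into three linear equations:

```python
C_KM_PER_US = 0.299792458  # speed of light, km per microsecond

def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def locate_2d(receivers, arrival_us):
    """Estimate (x_km, y_km) of a transmitter from four receiver positions
    [(x_km, y_km), ...] and the time each one heard the same message (in µs).

    Unknowns are x, y, and d0 (range to receiver 0). For i = 1..3, with
    delta_i = c * (t_i - t_0) = d_i - d_0:
      2(xi-x0)x + 2(yi-y0)y + 2*delta_i*d0 = xi^2 + yi^2 - x0^2 - y0^2 - delta_i^2
    """
    (x0, y0), t0 = receivers[0], arrival_us[0]
    rows, rhs = [], []
    for (xi, yi), ti in zip(receivers[1:4], arrival_us[1:4]):
        delta = (ti - t0) * C_KM_PER_US
        rows.append([2 * (xi - x0), 2 * (yi - y0), 2 * delta])
        rhs.append(xi * xi + yi * yi - x0 * x0 - y0 * y0 - delta * delta)

    d = det3(rows)
    if abs(d) < 1e-9:
        raise ValueError("receiver geometry is degenerate")

    def with_column_replaced(col):
        return [[rhs[r] if c == col else rows[r][c] for c in range(3)]
                for r in range(3)]

    # Cramer's rule for x and y; d0 is solved implicitly and not returned.
    x = det3(with_column_replaced(0)) / d
    y = det3(with_column_replaced(1)) / d
    return x, y
```

A real MLAT system has to work in three dimensions with noisy clocks, which generally means an iterative least-squares fit rather than a closed-form solve, but the inputs are the same: known receiver positions plus per-receiver arrival times.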
The PiAware setup was pretty straightforward. I just downloaded the image and wrote it out to a microSD card, and then updated settings at boot. They consolidated configuration options (e.g., wireless network settings, receiver settings, etc.) into a file called /boot/piaware-config.txt. It's a little odd to put config options in /boot, but it works ok since this is an appliance. I checked and the system will automatically try to reconnect to the network if it isn't available. That'll be handy when I want to run this full time, but shut down my network link at night. I need to port my ADS-B logging scripts over to this platform (and update them to pull the mlat data), but that's a job for later.
After figuring out how to find a convex hull for my airplane data, I went back to see how much the range has varied over time. Here's a timeline that shows the daily observations on top, and the area of the daily convex hull on the bottom:
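For reference, the hull-and-area calculation is small enough to write by hand. The sketch below uses Andrew's monotone chain algorithm plus the shoelace formula, which is a stand-in and not necessarily how the numbers in these plots were produced. It assumes the points have already been projected to flat x/y coordinates (e.g., miles east and north of the receiver), since an area computed on raw lat/lon degrees would be distorted.

```python
def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # > 0 when o -> a -> b turns counter-clockwise
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]

def polygon_area(poly):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def daily_hull_areas(points_by_day):
    """Map {day: [(x, y), ...]} to {day: area of that day's convex hull}."""
    return {day: polygon_area(convex_hull(pts))
            for day, pts in points_by_day.items()}
```

Because a single corrupt position can stretch the hull enormously, outliers need to be removed before the area means anything.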
I've had a few changes to my RTL setup over the last few years that have had an impact on range. I've had basically three different configurations in the last two years: (1) an initial version connecting a NooElec RTL-SDR straight to the antenna, (2) an updated version that added a FlightAware filter between the NooElec and the antenna, and (3) a new version that uses a FlightAware Pro USB stick that has the filter built into it. Here's how big the convex hulls were for the data points collected in each day for the three settings:
In general, the filtering does seem to boost my range a good bit. The FlightAware Pro USB stick also seems to do better than a generic tuner that's plugged into an external filter. To be fair, though, there were other things that changed in my setup over the years that may be skewing some of these numbers. At some point I inserted a splitter into my setup so I could also route the antenna to a USB DVB tuner. I think that explains the drop you see in the middle of the second setup (ignore the two gap periods; those were because the recorder wasn't running full days).
Now that my setup is a little more settled, I'd like to capture some data and then see how well my range corresponds with things like the weather (sounds like a fun science project for the kids).