Last year I wrote about how I built an ADS-B data logger to track planes using an RTL-SDR and an Intel Edison board. It was a fun project, but I eventually took it offline because I only had one Edison and running the logger meant I couldn't use the board to do other hardware experiments. I bought a second, smaller Edison board to fix the problem, but got sidetracked with other projects and didn't get a chance to finish building the new logger until recently. This post goes through some of the hardware details of building the second version of my Edison-based ADS-B logger.
Intel Edison Breakout Board
In the previous version of the logger I used the Edison Arduino board. The board is large but has a number of useful built-in features (eg, removable micro-sd storage, USB ports, Arduino pins, and a power jack). The other main Edison board Intel sells is a Breakout Board, which is very small and has just enough I/O for basic apps. I think I paid about $60 for it at Fry's. Both boards use the same Intel Edison module, which provides dual-core 32b x86, 1GB RAM, 2GB flash, 802.11, and BT. Similar to the Edison Arduino, the breakout board has two micro USB ports: one for Serial-to-USB and another that can be either a master or a slave (depending on power). The only other I/O on the breakout board is raw pads that you can solder wires to for interfacing with the Edison's micro connector.
The first problem I ran into was power. The board has four ways you can power it:
I initially hoped that I might be able to use a powered USB hub between the Edison and the RTL-SDR to power and connect both. Unfortunately, (1) hubs don't provide power to a USB master (duh) and (2) the breakout board makes the OTG USB port a slave unless you power from an external source. Since the battery port is only 3.7V (ie, not the 5V USB needs) and I didn't have a barrel connector, the only option for me was to rig something up to the J21 DC pins. I cut up the wires to an old wall wart, assembled an on/off switch for it, and then stuffed everything into an old film capsule.
The breakout board's limited I/O was also annoying because what I really wanted to do was plug in an external device for storing the data to prevent my Edison's internal storage from getting worn out. The breakout board lacked the micro-sd slot the Edison Arduino had, and with only one USB port I'd need to use a hub to plug in both the RTL-SDR and a thumb drive. I did some tests and found that when connected to DC, the breakout board could provide enough power to run a hub, the RTL-SDR, and a USB stick. However, it all felt kind of junky. I decided to go with compactness and just write the data to the Edison's internal storage. Given that Intel seems to have abandoned Edison, I'm not too concerned about the flash on these boards lasting forever anymore.
On the radio side of the project I decided to add a bandpass filter to see if it improved my range. FlightAware sells a $20 bandpass filter that attenuates everything outside of ADS-B's frequency range (1090MHz). Annoyingly, the filter has SMA connectors so I also had to buy SMA-to-MCX and SMA-to-Coax adapters. To make matters worse, I got the orientation of the filter backwards when I originally ordered the adapters, so I had to order a second set later with the genders reversed (a pair of adapters cost $10). I verified that the filter was attenuating the strength of other frequencies by booting up the gqrx app on my desktop and looking at radio stations. I don't have a way to get the real frequency response of the filter right now, but others have reported that it is wide enough to capture the family of frequencies plane watchers usually want.
I used two RTL-SDR dongles to do some visual comparisons between the filtered and unfiltered ADS-B results. I pulled up the web interfaces for dump1090 on both SDRs and then compared the number of planes each observed. In my core visibility range, both systems seemed to capture the same plane info. The planes I tracked at the edge of my visibility tended to be seen more by the filtered line. For example, the filtered line followed one plane for an additional 20 miles (and recorded more data points than the unfiltered line when it could see it). This performance wasn't always perfect though- sometimes the unfiltered line would see an incoming plane before the filtered line. I'll need to gather some data and analyze it, but my initial impression is that the filter does work, but doesn't make a huge amount of difference in my case.
Power Use and Heat
I hooked up a power meter to get a rough estimate of how much power the Edison and RTL-SDR use when dump1090 is running. An idle Edison with 802.11 enabled and no peripherals consumes less than a watt. When I hook up the RTL-SDR and enable dump1090, the power is about 3.5W. I noticed that the RTL-SDR dongle got a little warm after being on all day, so I used my cheap-o infrared thermometer to get some estimates. The thermometer said the dongle was 112 degrees F in the hottest spot (center, where the air holes are). The Edison also hit 110 degrees F (right side of the silver Edison can). When probing the thermal sensors on the device (ie, via /sys/class/thermal/thermal_zoneX/temp), I see 14, 55, and 54 degrees C (or 57, 131, 129 degrees F). That seems hot to me, but then again the Edison is passively cooled. I've been running it like this for months without problems, so it seems ok.
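For anyone who wants to poll those sensors instead of reading them by hand, here is a minimal Python sketch. It assumes the sysfs temp files report millidegrees Celsius, which is the common Linux convention but may not match every board, so the scale is a parameter. The function names are made up for illustration.

```python
import glob

def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0

def parse_zone_temp(raw, scale=1000.0):
    """Turn the text from a thermal_zone temp file into degrees C.

    Assumption: the kernel reports millidegrees C (scale=1000).
    Pass scale=1.0 if the board reports whole degrees instead.
    """
    return int(raw.strip()) / scale

def read_zones(pattern="/sys/class/thermal/thermal_zone*/temp"):
    """Return {path: degrees C} for every zone that can be read."""
    temps = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                temps[path] = parse_zone_temp(f.read())
        except (OSError, ValueError):
            continue  # some zones refuse reads or return junk
    return temps

if __name__ == "__main__":
    for path, temp_c in read_zones().items():
        print("%s: %.1f C / %.1f F" % (path, temp_c, c_to_f(temp_c)))
```

Running something like this from cron and appending the output to a file would give a rough temperature history to go with the power numbers.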
The Intel Edison is still a fine embedded board for building a data logger, but I've got to say that the breakout board's connectors disappointed me. It's stupid that they put in pads for a barrel jack but didn't populate it. Having only a single USB port also limits what I can do with the board. I'd hoped to make this a multi-purpose controller for the garage (eg, ADS-B monitor, webcam, temperature, etc), but to do that I'd need to add in a USB hub. The Raspberry Pi 3 is out now, and has built-in wireless and four USB ports. I'll probably change to that in the next version of things.
Close, but no blimp data. Last Thursday at my son's soccer practice, one of the Goodyear Blimps circled the field as it descended for a landing at the Livermore Airport. It was a little surreal, since it looked like the blimp was monitoring the practice the same way it circles big bowl games. However, blimp sightings aren't that uncommon out here. Livermore is on the fringe of the Bay Area and we have a large municipal airport with wide open spaces around it. It seems like the perfect place to launch, land, and park a blimp if you knew you were going to be visiting the area by dirigible airship.
Sunday morning I started wondering where the blimp was going while it was in the area. Since I've been running a dump1090 data logger on my Edison board for the last few weeks, I began pulling the data and parsing it for signs of dirigibles. As I was puzzling through how I might identify a blimp in the pile of points, I heard a faint buzzing sound coming from outside and realized that the blimp was at that very moment passing by my house.
Getting the ID From Dump1090
I went over to the webpage that dump1090 generates to see what aircraft were in the local area. I was disappointed to see that the map was pretty much empty nearby, which meant that the blimp wasn't transmitting position data. I looked at the list of planes and noticed that there were some planes in the area that were reporting their presence, but not identifying their position. I sorted them by altitude and found one that was cruising along at only 2,000ft with an ICAO hex ID of A4A7EF. Some searching around and I found this ID belongs to tail N4A, which is a 33-year-old (!) blimp owned by Goodyear.
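The sort-and-filter step I did by eye could be scripted along these lines. This is only a sketch: it assumes the aircraft list has already been loaded into Python dicts with hex, altitude, lat, and lon keys. The field names and the way a missing position is encoded may differ between dump1090 versions, so treat them as placeholders. Apart from the blimp's hex ID and altitude, the sample records are invented.

```python
def unlocated_by_altitude(aircraft):
    """Return aircraft that report no position, lowest altitude first."""
    hidden = [
        a for a in aircraft
        if a.get("altitude") is not None
        and (a.get("lat") is None or a.get("lon") is None)
    ]
    return sorted(hidden, key=lambda a: a["altitude"])

# Invented sample data, except for the blimp's hex ID and altitude.
sample = [
    {"hex": "a4a7ef", "altitude": 2000},
    {"hex": "abc123", "altitude": 35000, "lat": 37.7, "lon": -121.8},
    {"hex": "def456", "altitude": 12000},
]

for a in unlocated_by_altitude(sample):
    print(a["hex"].upper(), a["altitude"])
```

Anything slow and low near the top of that list is a good candidate for a manual lookup.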
Looking it up in FR24
While I was disappointed that my own logger didn't get any position info for the blimp, I knew that other aviation sites have tracks for aircraft based on other data sources. I looked it up on FlightRadar24 and found that they had logged a few flights for the blimp on Saturday:
Well, that solves the mystery: they were out here to watch Stanford play against USC (for the people back home, that's U of Southern California, not South Carolina). They circled that stadium for more than 5 hours, trying to make sense out of the whole situation. Then they went and blew some steam off in San Francisco. I'd like to think that the highlight of their trip though was watching my son's soccer team practice.
Craigslist is an interesting source of text data. In addition to providing a continuous stream of user postings from around the country in organized categories, the website stubbornly favors a plain-old-web format that's easy to retrieve and parse. I believe craigslist gets a lot of traffic from different kinds of scrapers. In addition to all the search engine crawlers, you hear stories about how individuals run scripts to continuously watch their local boards so they can be the first to snatch up free items. Craigslist blocks people that aggressively crawl the site, but otherwise lets you wander around if you put in some rational delays.
Back in September I wrote some utilities to go off and scrape job postings from craigslist, because I thought it would be interesting to see what kind of people Bay Area companies wanted. After working out how to grab the data in an unobtrusive way, I updated my script to grab tech job postings from different cities around the country. I run the script about once a week, which over the last 9 months has given me about 32k postings, totaling 470MB in text data. This post just focuses on the scraping. I'll get to the analysis later.
Craigslist puts each post as a separate web page, and uses a city/topic/post directory structure to keep things organized. While the post part of the url is unique and non-sequential, they provide an easy-to-parse index page for each topic that will give you all the urls for the posts in reverse chronological order. All one has to do is pick a city and a topic, walk through the index, and retrieve the individual posts. I put some delay in after every page I fetched to be polite. I also randomized the city list on each run to even out the data if the grabs were taking too long and needed to be cut off (though always getting ATL would have been fine for me). To help with statistics, I had the script store basic information about runs in a local sqlite database. The database helps avoid downloading the same post twice, and gives me a place to store the dates of when I first and last saw a particular post.
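The bookkeeping side of that can be sketched in a few lines of Python. This is not the actual scraper, just an illustration of the idea: the table layout, function names, url, and dates below are all made up. A real run would also sleep between fetches and shuffle the city list before walking the indexes.

```python
import sqlite3

def open_db(path=":memory:"):
    """Open (or create) the metadata database."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        " url TEXT PRIMARY KEY,"
        " first_seen TEXT NOT NULL,"
        " last_seen TEXT NOT NULL)"
    )
    return db

def record_sighting(db, url, today):
    """Note that url appeared in today's index.

    Returns True when the post is new and still needs downloading.
    """
    cur = db.execute(
        "UPDATE posts SET last_seen = ? WHERE url = ?", (today, url))
    if cur.rowcount:
        return False  # already have it; only the last-seen date moved
    db.execute(
        "INSERT INTO posts (url, first_seen, last_seen) VALUES (?, ?, ?)",
        (url, today, today))
    return True

# Hypothetical url and dates, two runs a week apart.
db = open_db()
print(record_sighting(db, "sfbay/sof/1.html", "2016-01-03"))
print(record_sighting(db, "sfbay/sof/1.html", "2016-01-10"))
print(db.execute("SELECT * FROM posts").fetchall())
```

Keying the table on the post url is what makes the "have I seen this before" check cheap: the scraper only fetches a page when record_sighting says the post is new.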
Grabs Per Day
Below is a breakdown of how many posts I grabbed for each city when I ran the scraper. Since the script only grabs posts that it hasn't seen before, the per day grabs go up and down based on how frequently I ran the script (eg, when I missed a week or two, there was more data available to grab). For this time period, the cities seem to be fairly proportional. The big job cities seem to be San Francisco, Seattle, New York, and Boston (not unexpected). C'mon Atlanta. It's like you're not even trying.
Number of Active Days per Post
Another interesting statistic for me was how long job postings remain active on craigslist. I used the "first seen" and "last seen" dates stored in my meta data to estimate the amount of time a post stays alive. The numbers are off due to the initial posts I pulled (ie, I looked at the grab date instead of the post date) and the most recent posts (ie, which have not expired yet). As the below (logscale!) plot shows, most posts stick around for about a month. However, there are a few that last as long as 80 days.
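The lifetime estimate is just date arithmetic on the two stored dates. A small sketch, assuming the dates are kept as ISO strings (the sample rows are invented):

```python
from collections import Counter
from datetime import date

def active_days(first_seen, last_seen):
    """Days between the first and last sighting of a post (ISO dates)."""
    start = date.fromisoformat(first_seen)
    end = date.fromisoformat(last_seen)
    return (end - start).days

def lifetime_histogram(rows):
    """Map active-day counts to the number of posts with that lifetime."""
    return Counter(active_days(first, last) for first, last in rows)

# Invented (first_seen, last_seen) pairs.
rows = [
    ("2016-01-03", "2016-02-02"),
    ("2016-01-03", "2016-02-02"),
    ("2016-01-10", "2016-03-30"),
]

for days, count in sorted(lifetime_histogram(rows).items()):
    print(days, count)
```

Because the scraper only runs about once a week, the estimate can't be any finer than the gap between runs. A post that shows up in a single run gets a lifetime of zero days.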
It isn't much but I put the code for this on github:
In addition to being an interesting source of data for plane statistics, the FAA registration dataset also provides address information for each plane's owner. I was curious to see who owned airplanes in my town (not just the drones), so I wrote a simple script to extract addresses in my zipcode from the database and convert them to geospatial coordinates. Below is a plot of all the registered plane owners for Livermore. I've also outlined different neighborhoods in town and colored them by how expensive their houses are. Unsurprisingly, people that own planes tend to live in wealthier neighborhoods.
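The extraction step boils down to filtering a CSV by zipcode and gluing the address fields back together. Here is a rough Python sketch of that idea. The column names (N-NUMBER, STREET, CITY, STATE, ZIP CODE) are assumptions about the FAA master file layout, the zipcode is only an example value, and the two records are fake. The resulting address strings are what would then get handed to a geocoder.

```python
import csv
import io

def owners_in_zip(master_file, zipcode):
    """Yield (n_number, address) for registrations in the given zip code."""
    for row in csv.DictReader(master_file):
        # Column names are assumed; strip padding around names and values.
        row = {k.strip(): v.strip() for k, v in row.items()
               if k is not None and v is not None}
        if row.get("ZIP CODE", "")[:5] != zipcode:
            continue
        address = "%s, %s, %s %s" % (
            row.get("STREET", ""), row.get("CITY", ""),
            row.get("STATE", ""), zipcode)
        yield row.get("N-NUMBER", ""), address

# Fake records in the assumed layout (padded fields, 9-digit zips).
fake = io.StringIO(
    "N-NUMBER,STREET,CITY,STATE,ZIP CODE\n"
    "12345  ,1 EXAMPLE ST      ,LIVERMORE,CA,945501234\n"
    "67890  ,2 SAMPLE AVE      ,OAKLAND  ,CA,946010000\n"
)

for n_number, address in owners_in_zip(fake, "94550"):
    print(n_number, address)
```

Comparing only the first five digits of the zip field keeps ZIP+4 entries from slipping through the filter.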
Livermore has a busy municipal airport on the north-west side of town, with an east-west landing strip. Planes typically approach the airport by flying west over the city, using the railroad and I580 as visual guides to locate the airport. People that live east of the airport often complain about the noise of descending planes, but the airport was there long before the houses (it was built in 1965). In general, Livermore house prices increase the farther south you get. The cheap houses (where I live, in the yellow) start at about $500k. Down in Ruby Hill they're all well over $1M.
For the above plot, I shaded different parts of town based on how expensive their houses are: the darker green the color, the more wealthy the neighborhood. The shading wasn't very scientific- I just boxed up regions by hand and then looked up what Zillow said houses were going for in the neighborhood. Sadly, I found that my yellow-ish neighborhood had zero plane owners, which was consistent with other poorer neighborhoods. I think it's interesting that most of the plane owners live south of the landing path. I'm not sure if that's because that's where the more expensive houses usually are, or if plane owners are smart enough to know not to live along the flight path.
East Bay Owners
In addition to Livermore, I pulled out data on the neighboring areas (basically all of Alameda and Contra Costa counties). Below is a snapshot of it, but you can explore the data yourself in a pannable Google map of the data.
The script I used for extracting the data is extract_by_zipcode.py, which I've put in my flight classifier repo. GeoPy needs a newer version of Python than what my CentOS 6 desktop had, so I had to build/install that as well.
While looking planes up in the FAA dataset for the previous post, I noticed some planes had zero seats, weighed under 55 pounds, and were electric powered. Drones! (or more officially, sUAS - Small Unmanned Aircraft System). I knew that the FAA was making people register their drones, but I was surprised to see them showing up with other aircraft in the FAA database. After a little reading I learned that there are actually two ways to register: (1) online through a simple, instantaneous web page or (2) by mail using the traditional paper form process. While the by-mail approach takes a few weeks, your drone gets an N number and is plugged into the database. I wrote some python scripts to pull out electric plane registration info and plot it.
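The three tells make for a simple filter. The sketch below assumes the records have already been decoded into plain values with made-up keys (seats, weight_lb, engine). The raw FAA files may store some of these as codes that need translating first, so this is an illustration of the test and not the actual script.

```python
def looks_like_drone(plane):
    """Apply the three tells: no seats, under 55 lb, electric engine."""
    return (
        plane["seats"] == 0
        and plane["weight_lb"] < 55
        and plane["engine"] == "electric"
    )

# Invented records with hypothetical keys and tail numbers.
planes = [
    {"tail": "N-EXAMPLE1", "seats": 0, "weight_lb": 3, "engine": "electric"},
    {"tail": "N-EXAMPLE2", "seats": 4, "weight_lb": 2400, "engine": "reciprocating"},
    {"tail": "N-EXAMPLE3", "seats": 0, "weight_lb": 40, "engine": "turbojet"},
]

drones = [p["tail"] for p in planes if looks_like_drone(p)]
print(drones)
```

Note that the engine test is deliberately narrow: a small unmanned aircraft with some other engine type, like the third record, would be skipped.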
3,500 drone registrations is tiny compared to the web registration numbers (more than 300k in the first month). Still, it seems like a lot to me, given that I don't see an obvious reason to go through the by-mail process. In any case, I started filtering the data to see which organizations were registering. It wasn't that difficult, since the FAA database provides a registration type that identifies whether the owner is an individual, a corporation, or a government entity.
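Tallying by owner within one registration type is a job for a counter. Here is a hedged Python sketch. The record keys, the registrant type labels, and the owner names are all invented for illustration, and the name-folding step is my own addition and not necessarily what the real tally script does.

```python
from collections import Counter

def owner_key(name):
    """Fold case, spacing, and punctuation so name variants collide."""
    return "".join(ch for ch in name.upper() if ch.isalnum())

def tally_owners(records, registrant_type, minimum=1):
    """Count drones per owner for one registrant type, largest first."""
    counts = Counter()
    display = {}
    for rec in records:
        if rec["registrant_type"] != registrant_type:
            continue
        key = owner_key(rec["owner"])
        counts[key] += 1
        display.setdefault(key, rec["owner"])  # keep first spelling seen
    return [(n, display[k]) for k, n in counts.most_common() if n >= minimum]

# Invented records with hypothetical keys.
records = [
    {"owner": "Example Hawk Inc", "registrant_type": "corporation"},
    {"owner": "Examplehawk Inc", "registrant_type": "corporation"},
    {"owner": "City Of Example", "registrant_type": "government"},
    {"owner": "Other Co LLC", "registrant_type": "corporation"},
]

for n, owner in tally_owners(records, "corporation"):
    print("%6d %s" % (n, owner))
```

The folding matters because owner names are free text. A plain tally counts two spellings of one company as two owners, which is what seems to have happened with the two PrecisionHawk entries in the company list, assuming they really are the same outfit.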
I first filtered on commercial entities, of which there were 940 different companies. Below is the complete list of companies with 10 or more drones. There are a few interesting stories here. First, Intel topped the charts with 111 drones. They seemed to all be the same AscTec Hummingbird model, which (surprise) uses an Intel Atom Z530. BNSF Railway is using the drones to inspect rail lines (why not just strap a camera to a train?). Liberty Mutual says they're using them to assess insurance claims (eg, natural disasters). San Diego Gas and Electric will do inspections of their service areas. Some companies do general "aerolytics", like this Talon Aerolytics video shows. Lockheed Martin manufactures their own drones. In addition to the electric models, their Missiles and Fire Control group has a few drones under 55 pounds that use turbojet engines. There are also some mysteries in this list. Ashfloyd LLC has little outward info for a company with so many drones, causing some people to wonder who they are.
DRONES COMPANY
------ -------------------------------
   111 Intel Corp
    93 Precisionhawk Usa Inc
    43 Ashfloyd LLC
    40 Aerovironment Inc
    23 Rotor F/X LLC
    22 Lockheed Martin Corp
    18 San Diego Gas & Electric
    17 Unmanned Innovation Inc Dba
    16 Talon Aerolytics LLC
    15 Wintec Arrowmaker
    14 Flirtey Inc
    13 Trimble Navigation Ltd
    12 BNSF Railway UAS Program
    12 Precision Hawk Usa Inc
    12 Cape Productions Inc
    12 Microsoft Corp
    12 Aerodrome LLC
    11 Hazon Solutions LLC
    11 Liberty Mutual Insurance
    11 Unconventional Concepts Inc
    10 Aerocine Ventures Inc
    10 Amazon Logistics Inc
I was a little surprised Amazon didn't have more given Amazon Prime Air. They currently have 10 drones with tail fins, and have registered four different models they've developed. They've been adding to their inventory since last year, and appear to have more in the works if you check with the FAA. Here are the counts for the different models:
Model Number  Tailfins Currently  Registered
------------------------------------------------
MK9A          0
MK021A        2                   Starting March 2015
MK23A         1                   December 2015
MK24          7                   Starting April 2015
Next, I selected on Government users, which yielded 310 organizations. They're not as exciting as people would think though- they're mostly state schools, NASA, fire departments, and law enforcement. I moved National Labs into their own category to include more schools in this list.
DRONES ORGANIZATION
------ -------------------------------
    32 Kansas State University
    22 Oregon State University
    21 Nasa Langley Research Center
    16 University Of Colorado
    14 Nasa Ames Research Center
    12 Virginia Polytechnic Institute & State University
    12 Department Of Commerce
    11 University Of Maryland Uas Test Site
    11 Georgia Institute Of Technology
    11 Cochise Community College
    10 University Of Alaska Fairbanks
    10 University Of Michigan
     9 University Of North Dakota
     8 Department Of Energy
     8 Center For Disaster Risk Policy
     7 Mississippi State University
     7 Sinclair Community College
     7 Auburn University
     6 Ohio State University
     6 Utah State University
   ...
     4 Bureau Of Alcohol Tobacco Firearms & Explosives
     3 Alameda County Sheriffs Office
I also pulled out national labs from the gov list. All of the drones I saw in this section were the same stuff consumers buy.
DRONES ORGANIZATION
------ -------------------------------
    12 Sandia National Laboratories
     4 Battelle Pacific Northwest National Laboratory
     2 Los Alamos National Laboratory
     1 National Marine Mammal Laboratory
     1 Brookhaven National Laboratory
     1 Oak Ridge National Laboratory
MIT Lincoln Laboratory also popped up in the Aircraft Reference file (which defines airplane types), but does not show up as an owner of a registered plane in the master list. Searching for the drone's manufacturer model number in the master list turned up 9 hits, though all of them had blank owner fields. There are many blanked owner fields in the dataset, so this may just be part of the registration process and not obfuscation.
The drone's name is Locust, which appears to be a micro-UAV developed by students in MIT's Beaverworks program, commissioned by LL and the USAF back in 2010. Some former students mentioned working on LOCUSTS/PERDIX in their LinkedIn pages, and that they'd designed micro-uavs that could be deployed at 30,000ft from a "cartridge mounted on a business jet". I don't know if it's related or not, but the Office of Naval Research has a video of their LOCUST (low-cost uav swarm technology). Didn't these people watch Terminator?
The above plots and data were generated with plot_drones.py and tally_drones.py, which I've put in my flight-classifier repo.