It's science fair time again and I've been looking around for something fun to do with the kids. One of the projects we found online that my younger son liked was an experiment where you see what materials block the signal of a remote controlled (RC) device the most. We dug up an RC tank, forced the controller to the forward position, and then wrapped the controller in different materials. The results were anticlimactic, though. While the fabrics didn't have any effect, the tank didn't seem to be bothered much when the controller was wrapped in foil either. Science had failed us once again.
Software-Defined Radio to the Rescue
The problem with the experiment, I realized, was that we were making a binary measurement: either the signal was blocked or it wasn't. What we needed was a way to measure how much the signal was being attenuated. It then occurred to me that our RTL-SDR stick and Gqrx could do that for us. The back of the controller had a sticker with "40MHz" on it, which is within the RTL-SDR's tuning range. I launched Gqrx, pressed up on the controller, and sure enough there was a strong signal at around 40MHz (with another peak at 41MHz). Success. We could see our enemy's signal. But could we block it?
Louis and I used duct tape to anchor the antenna to my workbench and marked off a fixed distance of three feet for the test. We used a rubber band to jam the controller in the forward position and then wrapped the controller in different materials. We recorded the dBFS signal strength off Gqrx (maybe not the right value, but it was consistent enough for comparisons). I did my best to explain what decibels were to both Louis and my wife, but in the end "more negative means less tank signal" was all they needed to know.
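For anyone who wants the arithmetic behind "more negative means less tank signal", here is a minimal sketch of how a change in dB maps to a power ratio. The two readings are made-up placeholders rather than values from our test, and the function names are just ones I picked for illustration.

```python
import math


def db_to_power_ratio(db_change):
    """Convert a change in dB into a linear power ratio."""
    return 10 ** (db_change / 10)


def power_ratio_to_db(ratio):
    """Convert a linear power ratio back into dB."""
    return 10 * math.log10(ratio)


# Hypothetical dBFS readings (NOT measured values from the experiment).
bare_dbfs = -30.0
wrapped_dbfs = -50.0

# Subtracting two dBFS readings gives the attenuation in dB.
drop_db = wrapped_dbfs - bare_dbfs
fraction_left = db_to_power_ratio(drop_db)

print(f"change: {drop_db:.1f} dB, fraction of power remaining: {fraction_left:.4f}")
```

The handy rules of thumb fall out of the same formula: every -10 dB cuts the power by a factor of ten, and about -3 dB cuts it in half. Because dBFS is relative to the receiver's full scale, only the differences between readings mean anything here, which is all the comparison needs.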
We tested 14 different scenarios. Most of the materials (wood, plastic, wax paper, rubber, paper, cardboard, and glass) didn't do anything to stop the tank's signal. Polar fleece and water had a noticeable effect. Metal did the best job of shutting the signal down.
The water test was the most interesting one for us. We wrapped the controller in a ziplock bag and then submerged it in a bucket of water. I had thought that water would completely block it, but we still saw some of the signal. When we reran the original binary experiment with the tank, we noticed that the controller's range was dramatically reduced and the tank would intermittently move.
Side Project: Building an AM Radio
The project was a good opportunity to talk to Louis about radio stations and how people transmit information wirelessly. I found I had to provide a lot of background about radio stations in general, since the kids get most of their media through the Internet these days. Taking a car ride over the hills (to Fry's) while listening to the radio ("see how the radio gets fuzzy on the other side of the hill?") helped Louis get a handle on it. After the experiment, he had a new appreciation for how NASA ran their own RC car on another planet.
After reading more web pages, I took another step and ordered a 1MHz crystal so we could build a crude AM transmitter. The circuit is clever and easy: you just power the crystal with an audio signal, and hook a long wire to the output to serve as an antenna. I tuned our radio to 1000kHz AM, held the antennas a few inches away from each other, and sure enough, there was Rush's Spirit of Radio jumping the air gap.
The rest of the project was slow writing and arts-and-crafts. Louis wrote up everything for the poster, and (we're told) was very chatty about radio signals when the reviewers came around to talk to him about his project. We now know how to stop the tank's signal. Now all we have to do is submerge the planet in water (already on it!) or start putting countries in big metal boxes.
For the last few years, the majority of my time at work has gone into developing data-management services for high-performance computing (HPC). While we still have a ways to go before an official software release, we're starting to get performance numbers and have initial support for the Cray XC40's DataWarp NVMe array. I was asked to make a poster about our work and present it at an Exascale Computing Project (ECP) meeting that took place in Knoxville, Tennessee.
ECP is a new, multi-laboratory effort in the US that is scaling scientific computing to new levels through advances in hardware, applications, and system software. The goal is to develop an exaflops computing platform in the US that can serve the HPC needs of multiple domains (eg, science, energy, manufacturing, and national security). The data-management software we've been writing for asynchronous many-task (AMT) programming models falls into the system software category, and could be adapted to fit other needs in ECP (eg, workflows, checkpointing, and analysis).
Poster presented at ECP Meeting
Last year I wrote about how I built an ADS-B data logger to track planes using an RTL-SDR and an Intel Edison board. It was a fun project, but I eventually took it offline because I only had one Edison and running the logger meant I couldn't use the board to do other hardware experiments. I bought a second, smaller Edison board to fix the problem, but got sidetracked with other projects and didn't get a chance to finish building the new logger until recently. This post goes through some of the hardware details of building the second version of my Edison-based ADS-B logger.
Intel Edison Breakout Board
In the previous version of the logger I used the Edison Arduino board. The board is large but has a number of useful built-in features (eg, removable micro-SD storage, USB ports, Arduino pins, and a power jack). The other main Edison board Intel sells is a Breakout Board, which is very small and has just enough I/O for basic apps. I think I paid about $60 for it at Fry's. Both boards use the same Intel Edison module, which provides a dual-core 32-bit x86 CPU, 1GB RAM, 2GB flash, 802.11, and BT. Similar to the Edison Arduino, the breakout board has two micro USB ports: one for Serial-to-USB and another that can be either a master or a slave (depending on power). The only other I/O on the breakout board is raw pads that you can solder wires to for interfacing with the Edison's micro connector.
The first problem I ran into was power. The board has four ways you can power it:
I initially hoped that I might be able to use a powered USB hub between the Edison and the RTL-SDR to power and connect both. Unfortunately, (1) hubs don't provide power to a USB master (duh) and (2) the breakout board makes the OTG USB port a slave unless you power from an external source. Since the battery port is only 3.7V (ie, not the 5V USB needs) and I didn't have a barrel connector, the only option for me was to rig something up to the J21 DC pins. I cut up the wires to an old wall wart, assembled an on/off switch for it, and then stuffed everything into an old film capsule.
The breakout board's limited I/O was also annoying, because what I really wanted to do was plug in an external device for storing the data so my Edison's internal storage wouldn't get worn out. The breakout board lacks the micro-SD slot the Edison Arduino has, and with only one USB port I'd need to use a hub to plug in both the RTL-SDR and a thumb drive. I did some tests and found that when connected to DC, the breakout board could provide enough power to run a hub, the RTL-SDR, and a USB stick. However, it all felt kind of junky. I decided to go with compactness and just write the data to the Edison's internal storage. Given that Intel seems to have abandoned Edison, I'm not too concerned about the flash on these boards lasting forever anymore.
On the radio side of the project I decided to add a bandpass filter to see if it improved my range. FlightAware sells a $20 bandpass filter that attenuates everything outside of ADS-B's frequency range (1090MHz). Annoyingly, the filter has SMA connectors so I also had to buy SMA-to-MCX and SMA-to-Coax adapters. To make matters worse, I got the orientation of the filter backwards when I originally ordered the adapters, so I had to order a second set later with the genders reversed (a pair of adapters cost $10). I verified that the filter was attenuating the strength of other frequencies by booting up the gqrx app on my desktop and looking at radio stations. I don't have a way to get the real frequency response of the filter right now, but others have reported that it is wide enough to capture the family of frequencies plane watchers usually want.
I used two RTL-SDR dongles to do some visual comparisons between the filtered and unfiltered ADS-B results. I pulled up the web interfaces for dump1090 on both SDRs and then compared the number of planes each observed. In my core visibility range, both systems seemed to capture the same plane info. The planes I tracked at the edge of my visibility tended to be seen more by the filtered line. For example, the filtered line followed one plane for an additional 20 miles (and recorded more data points than the unfiltered line when it could see it). The filtered line wasn't always better, though: sometimes the unfiltered line would see an incoming plane before the filtered line did. I'll need to gather some data and analyze it, but my initial impression is that the filter works without making a huge amount of difference in my case.
Power Use and Heat
I hooked up a power meter to get a rough estimate of how much power the Edison and RTL-SDR use when dump1090 is running. An idle Edison with 802.11 enabled and no peripherals consumes less than a watt. When I hook up the RTL-SDR and enable dump1090, the power is about 3.5W. I noticed that the RTL-SDR dongle got a little warm after being on all day, so I used my cheap-o infrared thermometer to get some estimates. The thermometer said the dongle was 112 degrees F in the hottest spot (center, where the air holes are). The Edison also hit 110 degrees F (right side of the silver Edison can). When probing the thermal sensors on the device (ie, via /sys/class/thermal/thermal_zoneX/temp), I see 14, 55, and 54 degrees C (or 57, 131, 129 degrees F). That seems hot to me, but then again the Edison is passively cooled. I've been running it like this for months without problems, so it seems ok.
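If you want to poll those sensors yourself, here's a rough sketch that reads every thermal zone and prints both units. It assumes the kernel reports values in millidegrees Celsius, which is the usual Linux sysfs convention but something I haven't double-checked against the Edison's image. The function names are my own.

```python
import glob


def millideg_to_c(raw):
    """Parse a sysfs temp string, assumed to be millidegrees Celsius."""
    return int(raw.strip()) / 1000.0


def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def read_thermal_zones(pattern="/sys/class/thermal/thermal_zone*/temp"):
    """Return {path: degrees C} for every thermal zone that can be read."""
    readings = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                readings[path] = millideg_to_c(f.read())
        except (OSError, ValueError):
            # Some zones refuse reads or return junk; skip them.
            continue
    return readings


for path, celsius in read_thermal_zones().items():
    print(f"{path}: {celsius:.1f} C / {c_to_f(celsius):.1f} F")
```

On a machine without any thermal zones the loop simply prints nothing. Running this from cron and appending to a file would be an easy way to see whether the temperatures creep up over a hot day.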
The Intel Edison is still a fine embedded board for building a data logger, but I've got to say that the breakout board's connectors disappointed me. It's stupid that they put in pads for a barrel jack but didn't populate it. Having only a single USB port also limits what I can do with the board. I'd hoped to make this a multi-purpose controller for the garage (eg, ADS-B monitor, webcam, temperature, etc), but to do that I'd need to add in a USB hub. The Raspberry Pi 3 is out now, and has built-in wireless and four USB ports. I'll probably change to that in the next version of things.
Close, but no blimp data. Last Thursday at my son's soccer practice, one of the Goodyear Blimps circled the field as it descended for a landing at the Livermore Airport. It was a little surreal, since it looked like the blimp was monitoring the practice the same way it circles big bowl games. However, blimp sightings aren't that uncommon out here. Livermore is on the fringe of the Bay Area and we have a large municipal airport with wide open spaces around it. It seems like the perfect place to launch, land, and park a blimp if you knew you were going to be visiting the area by dirigible airship.
Sunday morning I started wondering where the blimp was going while it was in the area. Since I've been running a dump1090 data logger on my Edison board for the last few weeks, I began pulling the data and parsing it for signs of dirigibles. As I was puzzling through how I might identify a blimp in the pile of points, I heard a faint buzzing sound coming from outside and realized that the blimp was at that very moment passing by my house.
Getting the ID From Dump1090
I went over to the webpage that dump1090 generates to see what aircraft were in the local area. I was disappointed to see that the map was pretty much empty nearby, which meant that the blimp wasn't transmitting position data. I looked at the list of planes and noticed that there were some planes in the area that were reporting their presence, but not identifying their position. I sorted them by altitude and found one that was cruising along at only 2,000ft with an ICAO hex ID of A4A7EF. After some searching around, I found that this ID belongs to tail N4A, which is a 33-year-old (!) blimp owned by Goodyear.
Looking it up in FR24
While I was disappointed that my own logger didn't get any position info for the blimp, I knew that other aviation sites have tracks for aircraft based on other data sources. I looked it up on FlightRadar24 and found that they had logged a few flights for the blimp on Saturday:
Well, that solves the mystery: they were out here to watch Stanford play against USC (for the people back home, that's U of Southern California, not South Carolina). They circled that stadium for more than 5 hours, trying to make sense out of the whole situation. Then they went and blew some steam off in San Francisco. I'd like to think that the highlight of their trip though was watching my son's soccer team practice.
Craigslist is an interesting source of text data. In addition to providing a continuous stream of user postings from around the country in organized categories, the website stubbornly favors a plain-old-web format that's easy to retrieve and parse. I believe craigslist gets a lot of traffic from different kinds of scrapers. In addition to all the search engine crawlers, you hear stories about how individuals run scripts to continuously watch their local boards so they can be the first to snatch up free items. Craigslist blocks people that aggressively crawl the site, but otherwise lets you wander around if you put in some rational delays.
Back in September I wrote some utilities to go off and scrape job postings from craigslist, because I thought it would be interesting to see what kind of people Bay Area companies wanted. After working out how to grab the data in an unobtrusive way, I updated my script to grab tech job postings from different cities around the country. I run the script about once a week, which over the last 9 months has given me about 32k postings, totaling 470MB in text data. This post just focuses on the scraping. I'll get to the analysis later.
Craigslist puts each post as a separate web page, and uses a city/topic/post directory structure to keep things organized. While the post part of the url is unique and non-sequential, they provide an easy-to-parse index page for each topic that will give you all the urls for the posts in reverse chronological order. All one has to do is pick a city and a topic, walk through the index, and retrieve the individual posts. I put some delay in after every page I fetched to be polite. I also randomized the city list on each run to even out the data if the grabs were taking too long and needed to be cut off (though always getting ATL would have been fine for me). To help with statistics, I had the script store basic information about runs in a local sqlite database. The database helps avoid downloading the same post twice, and gives me a place to store the dates of when I first and last saw a particular post.
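To make the bookkeeping concrete, here is a minimal sketch of the walk-and-record loop. It is not the actual script: the table layout, the function names, and the two callables that list a city's post URLs and fetch a post are all placeholders I made up for illustration, and the delay is an arbitrary default.

```python
import random
import sqlite3
import time


def open_db(path=":memory:"):
    """Open (or create) the metadata database. The schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS posts (
               url TEXT PRIMARY KEY,
               city TEXT,
               first_seen TEXT,
               last_seen TEXT)"""
    )
    return db


def record_sighting(db, url, city, today):
    """Update last_seen for a known post; return True if the post is new."""
    cur = db.execute("UPDATE posts SET last_seen = ? WHERE url = ?", (today, url))
    if cur.rowcount:
        return False
    db.execute("INSERT INTO posts VALUES (?, ?, ?, ?)", (url, city, today, today))
    return True


def crawl(db, cities, list_post_urls, fetch_post, today, delay_s=5.0):
    """Visit cities in random order and fetch only posts not seen before."""
    cities = list(cities)
    random.shuffle(cities)  # even out the data if a run gets cut short
    fetched = []
    for city in cities:
        urls = list_post_urls(city)  # placeholder: parses the index page
        time.sleep(delay_s)          # be polite after the index fetch
        for url in urls:
            if record_sighting(db, url, city, today):
                fetch_post(url)      # placeholder: downloads one post
                fetched.append(url)
                time.sleep(delay_s)  # be polite after every post fetch
    db.commit()
    return fetched
```

Keeping the url as the primary key is what makes the "have I seen this before?" check cheap, and storing the dates as ISO strings (eg, 2016-01-31) keeps them sortable without any extra parsing.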
Grabs Per Day
Below is a breakdown of how many posts I grabbed for each city when I ran the scraper. Since the script only grabs posts that it hasn't seen before, the per day grabs go up and down based on how frequently I ran the script (eg, when I missed a week or two, there was more data available to grab). For this time period, the cities seem to be fairly proportional. The big job cities seem to be San Francisco, Seattle, New York, and Boston (not unexpected). C'mon Atlanta. It's like you're not even trying.
Number of Active Days per Post
Another interesting statistic for me was how long job postings remain active on craigslist. I used the "first seen" and "last seen" dates stored in my metadata to estimate the amount of time a post stays alive. The numbers are off for the initial posts I pulled (ie, I looked at the grab date instead of the post date) and for the most recent posts (ie, those that have not expired yet). As the below (logscale!) plot shows, most posts stick around for about a month. However, there are a few that last as long as 80 days.
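The estimate itself is just date subtraction. A small sketch, assuming the first-seen and last-seen dates are stored as ISO strings (eg, 2016-01-31); the sample rows and the one-week bin size are invented for the example and are not from my data:

```python
from collections import Counter
from datetime import date


def active_days(first_seen, last_seen):
    """Days between the first and last time a post was seen."""
    return (date.fromisoformat(last_seen) - date.fromisoformat(first_seen)).days


def lifetime_histogram(rows, bin_days=7):
    """Count posts per lifetime bin; keys are the lower edge of each bin."""
    counts = Counter()
    for first_seen, last_seen in rows:
        days = active_days(first_seen, last_seen)
        counts[(days // bin_days) * bin_days] += 1
    return dict(sorted(counts.items()))


# Made-up (first_seen, last_seen) pairs for illustration only.
sample_rows = [
    ("2016-01-01", "2016-01-31"),
    ("2016-01-01", "2016-01-03"),
    ("2016-02-01", "2016-03-02"),
]
print(lifetime_histogram(sample_rows))
```

Since I only run the scraper about once a week, the real lifetimes are only known to within a week or so, which is why binning by week seems like a reasonable resolution.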
It isn't much but I put the code for this on github: