Bhaskar Karambelkar's Blog

Re-plotting Russian AirStrikes In Syria

 

Tags: R-stats DataViz Cartography Leaflet


My Cartography mentor Bob Rudis pointed me to a blog post visualizing Russian Air Strikes in Syria and commanded me to redo the static maps to something more interactive and easier to explore.

TL;DR

The interactive map is on RPubs. It was built with Leaflet after scraping the data with RSelenium + PhantomJS and cleaning it with dplyr. Use the layer selector at the top right to switch between base tiles. Clicking any marker shows details about that air strike.

Long Read

Data Acquisition

The data comes from crowdsourcing of the Russian Ministry of Defense’s (MOD) YouTube channel. The process and the data are described here, and the data can be found at http://russia-strikes-syria.silk.co/. The argument is that a majority of the strikes the Russian MOD claims were aimed at ISIS-held areas actually hit non-ISIS rebel areas, and as such help the Assad regime more than they fight ISIS.

The original visualization was done by copying the data into an Excel spreadsheet and mapping it with R’s ggmap package. In the interest of reproducibility, I wanted to scrape the data directly from within R. I initially tried rvest, but quickly realized that it was a no-go because the table containing the data is populated dynamically via AJAX/JavaScript. So I had to turn to RSelenium + PhantomJS as described here.

Below is the web-scraping code, and the webpage from where this data was scraped can be found here.

library(RSelenium)
library(rvest)

pJS <- phantom()
Sys.sleep(5) # give the binary a moment
remDr <- remoteDriver(browserName = 'phantomjs')
remDr$open()
remDr$navigate('http://russia-strikes-syria.silk.co/explore/table/collection/strike-id/column/date-uploaded/column/time-in-utc-uploaded/column/accuracy-of-russian-location/column/actual-location-co-ords/column/closest-location-governorate/column/claimed-location/column/claimed-targets/column/closest-location-actual/column/status/column/isis-in-the-area/column/error-type/column/notes/column/checkdesk-link/column/video-url/slice/0/1000')
Sys.sleep(2) # Some time for page to load
events <- read_html(remDr$getPageSource()[[1]]) %>%
  html_node(xpath= '//*[@id="canvas"]/div/div[3]/div[2]/div[2]/div[4]/table') %>%
  html_table()
remDr$close()
pJS$stop() # close the PhantomJS process


In short, the code starts an RSelenium + PhantomJS WebDriver and fetches the webpage containing the data. The correct table is then selected by its XPath and parsed using rvest’s html_table().
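To see the parsing step in isolation, here is a minimal sketch that runs html_table() on a small inline HTML snippet instead of the live page. The markup, the shortened XPath, and the two rows are made up for illustration. Only the read_html() / html_node() / html_table() chain mirrors the scraping code above.

```r
library(rvest)

# Hypothetical stand-in for the string returned by remDr$getPageSource()[[1]]
page_source <- '
<html><body>
  <div id="canvas">
    <table>
      <tr><th>Airstrikes</th><th>Status</th></tr>
      <tr><td>Strike A</td><td>VERIFIED</td></tr>
      <tr><td>Strike B</td><td>FALSE</td></tr>
    </table>
  </div>
</body></html>'

# Select the table by XPath, then convert it to a data frame
toy_events <- read_html(page_source) %>%
  html_node(xpath = '//*[@id="canvas"]/table') %>%
  html_table()

print(toy_events)
```

Because the parsing works on a plain HTML string, it does not matter whether that string came from PhantomJS or from any other browser driver.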

Data Preparation

To plot the data correctly I need to perform the following steps.

  • Filter out data containing invalid Lat/Long coordinates.
  • Split the single column containing Lat/Long into two columns.
  • Create a new column to be used for Popup Display when a point is clicked on the map.

Thankfully, dplyr and tidyr are more than capable of doing all this in a few simple steps, shown below.

library(dplyr)
library(tidyr)
library(stringr) # provides str_detect(), used in the filter below

events %>% filter(str_detect(`Actual location co-ords`,
                             '[0-9]+\\.[0-9]+, *[0-9]+\\.[0-9]+')) %>%
  separate(`Actual location co-ords`,c('lat','lon'),
           sep = ',', convert = TRUE, remove = TRUE ) %>%
  mutate(popup = sprintf('
    <P><center><b>%s</b></center><br/><i>Status:</i> <b>%s</b><br/><i>Date Uploaded:</i> <b>%s %s</b><br/><i>Claimed Location:</i> <b>%s</b><br/><i>Claimed Targets:</i> <b>%s</b><br/><i>Closest Governorate:</i> <b>%s</b><br/><i>Closest Actual Location:</i> <b>%s</b><br/><i>ISIS Presence:</i> <b>%s</b><br/><i>Error:</i> <b>%s</b><br/><i>Notes:</i> %s<br/><i>Description:</i> <a href="%s">%s</a> / <i>Video:</i> <a href="%s">%s</a><br/></P>',
    Airstrikes, Status,
    `Date (Uploaded)`, `Time in UTC (Uploaded)`,
    `Claimed location`, `Claimed targets`,
    `Closest location governorate`, `Closest location (actual)`,
    `ISIS in the area?`, `Error type`,
    Notes, `Checkdesk link`, Airstrikes,
    `Video URL`, Airstrikes
    )) -> events

The filter function drops all data points that don’t match the regex for the Lat/Long format. The separate function splits the ‘Actual location co-ords’ column into two columns, lat and lon. Finally, the mutate function creates the HTML that is displayed in the popup when a data point is clicked on the map.
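The same three steps can be tried on a toy data frame. This is only a sketch: the rows are invented, the popup is reduced to a single line, and the separator regex ', *' (comma plus optional spaces) is my assumption to keep the numeric conversion tidy, whereas the code above splits on ',' alone.

```r
library(dplyr)
library(tidyr)
library(stringr)

# Made-up rows; only the column names match the scraped table
toy <- data.frame(
  Airstrikes = c('Strike A', 'Strike B', 'Strike C'),
  `Actual location co-ords` = c('36.21, 37.15', 'Unknown', '35.93,36.63'),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

toy_clean <- toy %>%
  # keep only rows that look like "<number>.<number>, <number>.<number>"
  filter(str_detect(`Actual location co-ords`,
                    '[0-9]+\\.[0-9]+, *[0-9]+\\.[0-9]+')) %>%
  # split into lat/lon and convert both to numeric
  separate(`Actual location co-ords`, c('lat', 'lon'),
           sep = ', *', convert = TRUE) %>%
  # simplified popup, just to show the sprintf() pattern
  mutate(popup = sprintf('<b>%s</b><br/>%.2f, %.2f', Airstrikes, lat, lon))

print(toy_clean)
```

The row with 'Unknown' coordinates is dropped by the filter, so separate() only ever sees well-formed values.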

Data Plotting

Finally, for plotting I used the Leaflet for R library. You will need to build the library from source, as I use some new features that haven’t yet made it to CRAN. You can do this using devtools::install_github('rstudio/leaflet').
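If you are unsure whether your installed copy of leaflet is recent enough, a quick check along these lines may help. It is a sketch, not an official compatibility test: the list of function names is simply the set of newer helpers that the plotting code in this post calls.

```r
# Newer leaflet helpers that the plotting code in this post relies on
needed <- c('addAwesomeMarkers', 'awesomeIcons', 'addMiniMap')

# TRUE only if leaflet is installed and exports every function listed above
has_features <- requireNamespace('leaflet', quietly = TRUE) &&
  all(needed %in% getNamespaceExports('leaflet'))

if (!has_features) {
  message('Installed leaflet lacks some required functions; consider ',
          "devtools::install_github('rstudio/leaflet')")
}
```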

The map consists of the following elements:

  • Multiple Base Tile Maps out of which only one can be active at any given time.
  • A GeoJSON for plotting the various Administrative areas of Syria superimposed on the base map.
  • Markers for the Air Strikes.
  • A Layer Selection option.
  • A mini map to know the global context.

I also needed a way to visually distinguish between VERIFIED and FALSE strikes. VERIFIED strikes are those where the claim and the actual target tally: either the strike claimed to target ISIS and actually did, or it hit a non-ISIS area and was not claimed to have targeted ISIS. FALSE strikes are those with a discrepancy between the claimed and actual target, or between the claimed and actual location. I chose blue markers for VERIFIED and red markers for FALSE.
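The color rule on its own is a one-line vectorised ifelse(). The helper name status_color below is hypothetical and exists only for this illustration.

```r
# Hypothetical helper: blue for VERIFIED, red for anything else
status_color <- function(status) {
  ifelse(status == 'VERIFIED', 'blue', 'red')
}

toy_status <- c('VERIFIED', 'FALSE', 'VERIFIED')
toy_colors <- status_color(toy_status)
print(toy_colors)
```

Note that a missing Status (NA) yields NA rather than a color, so rows without a status would need separate handling.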

The code for plotting is shown below

library(leaflet)

if(!file.exists('./cities.json')) {
	# Syrian Cities GeoJSON downloaded from
	# http://crisis.net/projects/syria-tracker/cities.json
	# More Info @ http://blog.crisis.net/choropleth-maps-with-d3/
	download.file(url='http://crisis.net/projects/syria-tracker/cities.json', destfile='cities.json')
}
cities <- readLines('./cities.json', warn =F) %>% paste(collapse='\n')

# Leaflet Map + Various Base Tiles
events %>% leaflet() %>%
  addTiles(group="Default") %>%
  addProviderTiles('CartoDB.PositronNoLabels',group='Blank-Canvas') %>%
  addProviderTiles('OpenStreetMap.BlackAndWhite', group="OSM-BlackNWhite") %>%
  addProviderTiles('MapQuestOpen.OSM', group='MapQuest') %>%
  addProviderTiles('Stamen.TonerLite', group='Stamen-Light') %>%
  addProviderTiles('Esri.WorldStreetMap',group='Esri-1') %>%
  addProviderTiles('Esri.DeLorme',group='Esri-2') %>%
  addProviderTiles('Esri.OceanBasemap',group='Esri-3') %>%
  addProviderTiles('Esri.NatGeoWorldMap',group='NatGeo') %>%
  addProviderTiles('CartoDB.Positron',group='CartoDB-1') %>%
  addProviderTiles('CartoDB.PositronNoLabels',group='CartoDB-2') %>%
  addProviderTiles('Stamen.TonerHybrid',group='CartoDB-2') %>%
  addProviderTiles('Stamen.TonerLines',group='CartoDB-2') %>%
  addProviderTiles('CartoDB.DarkMatter',group='CartoDB-3') %>%
  addProviderTiles('CartoDB.DarkMatterNoLabels',group='CartoDB-4') %>%
  addProviderTiles('Acetate.basemap',group='Acetate') %>%
  addProviderTiles('Stamen.TonerLabels',group='Acetate') -> eventMap

# Awesome Icons with color depending on Status
icon <- awesomeIcons(icon = 'crosshairs',
                     markerColor = ifelse(events$Status == 'VERIFIED','blue','red'),
                     library = 'fa',
                     iconColor = 'black')

# Add Markers for AirStrikes and GeoJSON for Syrian Regions
eventMap %>%
  addAwesomeMarkers(
    lat=~lat, lng=~lon,
    label = ~Airstrikes, icon=icon,
    group = 'Air Strikes',
    popup = ~popup
  ) %>%
  addGeoJSON(cities, weight = 0.7, color = "#00FF00",
             stroke=T, fill = F, fillOpacity = 0.1,
             group='Syria Regions') -> eventMap

# Add a Layer Control for toggling Layers/BaseMaps
eventMap  %>%  addLayersControl(
  baseGroups = c('Default',
                 'Blank-Canvas',
                 'OSM-BlackNWhite',
                 'MapQuest',
                 'Stamen-Light',
                 'Esri-1',
                 'Esri-2',
                 'Esri-3',
                 'NatGeo',
                 'CartoDB-1',
                 'CartoDB-2',
                 'CartoDB-3',
                 'CartoDB-4',
                 'Acetate'),
  overlayGroups = c("Air Strikes", "Syria Regions"),
  options = layersControlOptions(collapsed = TRUE)
) -> eventMap

# Finally Add a Minimap and render the Map
eventMap %>% addMiniMap()


And the final map is shown below.

Or access it at Rpubs.

Conclusion

  • For web scraping dynamic data, RSelenium + PhantomJS makes a killer combo.
  • R’s leaflet library allows for easy creation of interactive maps.