Hi Dr. Kim,
We have a GIS-based data processing problem, and Dr. Chris Cherry (cc’ed on this email) suggested I reach out to you. We would really appreciate your assistance or advice on cleaning the data efficiently. The data, the issue, our objective, and the desired output are described below.
Description of data: We obtained street-level trip volume data for shared e-scooters in Nashville from Populus (a data company that hosts shared micromobility data for the Nashville Department of Transportation). Each file is in GeoJSON format, and we are interested in the monthly aggregated trip volume (the “trip_count_sum” and “daily_trip_count_average” fields) of road features based on OSM data (primary, secondary, footpath, etc.). A screenshot of a sample file is shown below:
31 data files are zipped and attached to the email. Data for 12 months are missing. We have requested the missing data and are expecting to receive them soon.
Issue: Populus ran a route map-matching algorithm that snaps e-scooter trips to all routable road features, including footpaths. However, the resolution of the GPS data is not fine enough to distinguish a footpath from the adjacent road lane.
Objective: We need to simplify the features to only the road lanes for streets (dropping footpaths) and to pathways in park areas where there are no roads (users rode e-scooters in parks too). At the same time, we want to obtain the total trip volume along each street by adding the volumes of all road features along the corridor. For example, as illustrated in the screenshot below, we want to combine the trip volumes of three road features in the corridor (the road lane and the adjacent footpath on each side). The final outcome would be the road lane only, with a trip volume equal to the sum of the trip volumes of all three features (2161 + 2286 + 2297 = 6744).
Other examples are the intersections illustrated in the screenshots below. We want to retain the highlighted road lane and drop the sidewalks and crosswalks, but add the trip volume of the dropped road features to the retained road lane. For a median-separated roadway (illustrated in the top right section of the roundabout in the second screenshot below), we can consider keeping two parallel lines for the road lanes.
Final output data: We would like the final data in a single spatial database, created by spatially merging all the individual files. The “trip_count_sum” and “daily_trip_count_average” fields can carry prefixes or suffixes that indicate the time period (based on the individual files), as illustrated in the table below.
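To make the aggregation concrete, here is a minimal Python sketch of the corridor sum we have in mind. It is only an illustration under assumptions: the `corridor_id` and `road_class` field names are hypothetical (the Populus files do not necessarily carry them), and grouping features into corridors is exactly the step we need help with. Which of the three volumes belongs to the road lane is also arbitrary here.

```python
from collections import defaultdict

# Hypothetical records: `corridor_id` and `road_class` are illustrative
# field names, not fields confirmed to exist in the Populus files.
features = [
    {"corridor_id": "A", "road_class": "footpath", "trip_count_sum": 2161},
    {"corridor_id": "A", "road_class": "primary", "trip_count_sum": 2286},
    {"corridor_id": "A", "road_class": "footpath", "trip_count_sum": 2297},
]


def collapse_corridors(features, dropped_classes=("footpath",)):
    """Keep non-dropped features and give each its corridor's summed volume."""
    # First pass: total volume of every feature in each corridor.
    totals = defaultdict(int)
    for f in features:
        totals[f["corridor_id"]] += f["trip_count_sum"]

    # Second pass: retain only the road lanes, carrying the corridor total.
    kept = []
    for f in features:
        if f["road_class"] not in dropped_classes:
            kept.append({**f, "trip_count_sum": totals[f["corridor_id"]]})
    return kept


result = collapse_corridors(features)
print(result)
```

This sketch assumes one retained road lane per corridor. A median-separated roadway with two retained lines would need its volume split between them rather than copied to both, and park pathways with no adjacent road would need to be excluded from `dropped_classes`.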
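As a rough sketch of the table layout we are after, the Python below combines monthly GeoJSON feature collections into one wide record per feature. It rests on several assumptions: the `osm_id` key, the month labels, and the sample values are all made up for illustration, and it joins on a shared feature identifier rather than performing a true spatial merge, which may still be needed if identifiers are not stable across files.

```python
def merge_monthly(collections_by_month, key="osm_id"):
    """Build one wide record per feature with month-suffixed volume fields."""
    fields = ("trip_count_sum", "daily_trip_count_average")
    merged = {}
    for month, collection in sorted(collections_by_month.items()):
        for feature in collection["features"]:
            props = feature["properties"]
            # Create the record the first time a feature is seen,
            # keeping the geometry from that first file.
            row = merged.setdefault(
                props[key], {key: props[key], "geometry": feature["geometry"]}
            )
            for field in fields:
                row[f"{field}_{month}"] = props.get(field)
    return merged


def sample_collection(trips, daily_avg):
    """Made-up single-feature collection; `osm_id` is a hypothetical key."""
    return {
        "type": "FeatureCollection",
        "features": [{
            "type": "Feature",
            "geometry": {
                "type": "LineString",
                "coordinates": [[-86.78, 36.16], [-86.77, 36.16]],
            },
            "properties": {
                "osm_id": 101,
                "trip_count_sum": trips,
                "daily_trip_count_average": daily_avg,
            },
        }],
    }


merged = merge_monthly({
    "2021_01": sample_collection(310, 10.0),
    "2021_02": sample_collection(280, 10.0),
})
print(merged[101])
```

A feature that is absent from a given month simply has no columns for that month in its record, so the missing months can be added later by rerunning the merge with the extra files.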
Thank you for your time. Please let us know if you need any further information.
Best regards,
Nitesh Shah
(he/him/his)
GATE fellow, Oak Ridge National Laboratory (ORNL)
Dept. of Civil and Environmental Engineering,
The University of Tennessee at Knoxville,
311 John D. Tickle Building,
851 Neyland Drive, Knoxville, TN 37996.
Website: www.niteshrajshah.com
Big Orange. Big Ideas.