Ben Mearns: Cloud basemaps, so elusive

Monday, December 21, 2009

I wanted to use OpenStreetMap as the base data layer for the new UD Campus Maps, but I am running into the same labeling issues as when I use a Google Maps base layer ... labels are prerendered to appear on streets, cities, etc., of course not taking into account the placement of "UD" layers (e.g. buildings). When the two are displayed together there is a very unsightly overlapping effect.

So the way around this is to use a custom renderer, which is actually quite a bit of work ... the custom renderer requires a large amount of space, not to mention an initial span of CPU time. It also requires that I create custom style specs, which address all sorts of issues, including a plethora of OSM tags, as well as all the details about base layer appearance ... all to be done manually of course.

... not to mention, the existing baselayers seem to disagree with cloud services ever so slightly

CloudMade offers a very cool service for rendering custom basemaps from OSM data, all through an efficient GUI, and they even host it for you. Trouble is their terms do not seem to permit "non-personal" use, :.(

Sooo, I can:

set up a custom render, which *might* require a lot of space/time, and *might* require a lot of work in symbolization, and will consume local resources instead of leveraging the cloud
*or* use ArcGIS server to tilecache a national basemap ... having much the same disadvantages as above, except that I have a better idea of how much time/resources it will take
*or* render on top of satellite data (which doesn't really look that good, and not very informative for directions), plus requiring me to create custom label and symbol scaling and symbology for Newark basemap details (tending to be less attractive than those already available in the cloud, and causing disuniformity with surrounding areas)
*or* I can mask the extent of all "UD layers" and use this as the default extent for the map, putting in some scale sensitivity which causes UD detailed layers to disappear when viewed at a larger "regional" scale and instead showing some extent rectangle ... then showing these layers again when at a closer zoom (having some of the same drawbacks of disuniformity/attractiveness)
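
The last option above can be sketched roughly as follows. This is only a minimal illustration, not the actual campus map code: the layer names, the `ud_extent_rectangle` placeholder, and the zoom threshold of 14 are all assumptions made for the example.

```python
# Hypothetical threshold: below this zoom level the detailed UD layers are hidden.
DETAIL_MIN_ZOOM = 14


def union_extent(extents):
    """Combine (min_lon, min_lat, max_lon, max_lat) boxes into one covering box."""
    min_lons, min_lats, max_lons, max_lats = zip(*extents)
    return (min(min_lons), min(min_lats), max(max_lons), max(max_lats))


def visible_layers(zoom, detail_layers, min_detail_zoom=DETAIL_MIN_ZOOM):
    """Return the layer names to draw at the given zoom level."""
    if zoom >= min_detail_zoom:
        # Close-in view: show the detailed campus layers.
        return list(detail_layers)
    # Regional view: show only a rectangle marking the campus extent.
    return ["ud_extent_rectangle"]
```

The box returned by `union_extent` would serve both as the default map extent and as the rectangle drawn at regional scales.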

... as of today the last option seems the best ... I sure hope it bears fruit, we need to get this project rolling!

14 comments:

John Callahan said...: Hi Ben,

Glad you started this discussion about the UD campus map. Here are a few thoughts...

I'm not sure I understand your dilemma relating to OSM base maps vs UD specific layers. The UD layers that I can think of are buildings, walkways, POIs, walking or biking trails, etc... These can all be added to the OSM "base" layer. There doesn't need to be a distinction between these types of layers.

I strongly encourage the use of OSM for the following reasons:

* OSM has a built-in rendering engine with integrated tagging system, which is completely flexible and open. See the wiki for details.

* Output maps are pre-rendered tiles. OSM creates and hosts the maps on reliable servers; just one more thing you don't have to do. Each feature maintains an edit history as well.

* OSM allows editing from many different platforms: web-based, desktop apps (JOSM), plugins (QGIS OSM editor), mobile (iPhone editors), etc... This holds for different browsers, operating systems, and skill levels.

* OSM has a large existing user base. Editors can be outside of the UD community but have knowledge of the geography.

* OSM combines many other data sets as well, such as hydro, streets, etc..., and other data that UD campus mappers don't have to maintain.

* All of the above points make OSM a true vertical integrator. Local edits on campus are merged with regional data, into state, national, and global data. Likewise, edits to state/county data adjacent to campus only help. From a DataMIL/State perspective, it would be great if edits to campus data were automatically integrated into state data, and vice versa.

* Oh yeah, OSM allows for complete data dumps for all features!

We could hold a few campus workshops on how OSM works, the data model, and a few of the editors. Maybe teach people how to use GPS devices and upload into OSM. Quarterly meetings would be nice as well.

Definitely go with pre-rendered tiles. The tiles should be available via javascript, such as TMS or similar structure to allow for the easiest, widest possible use, especially for on-campus developers. IMO, this is more important than WMS. It's the fastest, most reliable way to go with data that doesn't change all that much. This could be done via MapServer/GeoServer and TileCache or GeoWebCache.
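
For readers unfamiliar with how pre-rendered tiles are addressed, here is a minimal sketch of the usual spherical-Mercator tile arithmetic used by OSM-style "slippy" maps, plus the row flip that TMS applies. The function names and the `z/x/y.png` path layout are assumptions for illustration; a given tile server may lay out its cache differently.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) tile containing a lon/lat point, with y counted from the top."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the map edge still fall inside the tile grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def xyz_to_tms_y(y, zoom):
    """TMS counts rows from the bottom, so flip the y index."""
    return (2 ** zoom) - 1 - y


def tile_path(zoom, x, y):
    """Relative path of a tile in the common z/x/y layout (assumed here)."""
    return f"{zoom}/{x}/{y}.png"
```

Because a tile address is pure arithmetic on zoom, longitude, and latitude, any javascript client can request tiles without asking the server to render anything.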

For a custom renderer, Mapnik is a good way to go. There are a few others. You have a lot of control and output png tiles. The data would be dumps from OSM plus additional local data. It may take some time to install/configure but worth it for a long-term, sustainable project.

I wouldn't touch ArcGIS at all for this project. Arc* uses proprietary software, the data is encouraged to be in closed, binary, proprietary formats (GDB), and the same goes for symbology (.lyr) and project files (.mxd). I would suggest to keep the entire process open, from discussion to implementation. To share the process, even the code, with other universities in similar situations. To integrate as much as possible with the surrounding community (OSM) and use familiar interfaces (pre-rendered tiles, Google Maps, OpenLayers.)

- John; December 22, 2009 at 12:36 AM
Unknown said...: Thanks for your thoughts John ... I've really been needing this type of conversation.

What you're recommending is that I host campus data alongside OSM data rather than using OSM for base layers only, with UD layers hosted in a different system. I really like that concept ... had strongly considered doing something like that before, but shied away for some of the reasons below and probably others that I didn't record and are now lost.

Clearly this would require time loading data into postgres and resymbolizing/redefining all labels and scales in qgis, as well as adapting the software I've already written to this architecture (I am about a month from completion; that would add probably 1+ months to that figure)

I'm not against it if it results in a better system which can be leveraged in the long term for other worthwhile efforts on campus ... even if it puts me in a bad spot, and I have to eat my own words on ArcGIS ... the crux of this is *how long is it going to take*

Keep in mind that the UD layers positioning disagrees with OSM significantly (see image in post). That issue alone could take a while to fix (if spatial adjustment doesn't work).
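
One quick way to judge whether a simple spatial adjustment might work is to pick a few matching control points in both datasets, estimate the average shift, and look at what is left over. The sketch below assumes projected coordinates (e.g. meters) and only handles a pure translation; the function names are hypothetical, and a rotation or scale difference would need a fuller affine fit.

```python
import math


def mean_offset(pairs):
    """Average (dx, dy) shift from UD points to OSM points.

    pairs: list of ((x_ud, y_ud), (x_osm, y_osm)) control points.
    """
    n = len(pairs)
    dx = sum(osm[0] - ud[0] for ud, osm in pairs) / n
    dy = sum(osm[1] - ud[1] for ud, osm in pairs) / n
    return dx, dy


def rms_residual(pairs, offset):
    """Root-mean-square distance left over after shifting the UD points by offset."""
    dx, dy = offset
    squared = [
        (ud[0] + dx - osm[0]) ** 2 + (ud[1] + dy - osm[1]) ** 2
        for ud, osm in pairs
    ]
    return math.sqrt(sum(squared) / len(squared))
```

If the residual stays large after the shift, the disagreement is probably not a simple offset and a plain translation will not fix it.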

In March (almost a year ago), I had spoken to a gentleman at University of Maryland (Jim Purtilo) who put together a really nice OSM campus map for University of Maryland (http://map.umd.edu/map/) with a *team* of students in the CS department (not sure how that reflects on required work effort). They put together a system which downloads new edits from OSM and then caches all locally along with their local layers which are tagged accordingly ... note that if we are going to integrate UD layers with OSM we will need to *cache everything locally* ... I'm unsure of the implications for traffic and server resources, not to mention any routine CPU intensive caching (guess that could be done in the evening). I do know that this would require a local renderer (probably mapnik), but don't know the specifics about rendering for only particular scales and so forth ... if that is even possible.

There is also the need for monitoring/potential for abuse. Say someone wrecks baselayers under the campus and caching/rectifying runs ... yuck ... clearly this could be prevented by monitoring changes during rectification ... would that be time consuming? Probably not prohibitively so.

Another issue is that potentially sensitive data would be made public for lots of unintended uses ... of course there is good and bad that could come of that ... I need to figure out what the actual position is on the sensitivity of different layers (e.g. bluelights). We've been dealing with it on a case by case, so far.

Finally ... most of the benefits for OSM are in the long run ... very few (any?) UD GIS users are doing distributed web editing on geodata ... and definitely don't know QGIS or OSM ... there's a great learning opportunity there, but right now the pressure is on to just roll out attractive maps.

Has anyone on campus deployed a custom renderer or blended local layers with OSM layers or deployed a TMS? I am confident I could build this system, having loose experience with the pieces involved, but that sort of certainty often translates to big time overruns. I feel like I need to wrap this up so I can move on to other projects in the queue, and am reluctant to expand the project horizon from 1 month to 2+ months ... especially without actual experience (or second hand knowledge) that it can be done in that timeframe ... unless I can demonstrate support from the GIS/web community to justify the additional time.; December 22, 2009 at 9:58 AM
Unknown said...: ... interesting, mapnik supports shapefile input, wonder if it pulls in symbology/labeling (guessing not) ... even so, would still need to cache everything locally

... I'm going to ask Jim from UMd if he can help me get a better idea on the time it took for his team to build their system and also what kinds of hardware resources they needed; December 22, 2009 at 10:14 AM
John Callahan said...: Mapnik is the engine that does the rendering. You list the input data sources, define the rendering rules, and then run Mapnik. The output is a series of png images in a tiled structure, I believe that can be put right into a javascript mapping interface like Google Maps or OpenLayers. (It can also create PDF, PS, svg, jpeg)

You would run Mapnik nightly, weekly, maybe even hourly and it would replace your map tiles. That's it.; December 22, 2009 at 10:24 AM
Unknown said...: I understand all these parts at a high level, I just have never installed/run before, so I don't know what lurks below the surface. The only parallel I have is ArcGIS Server, which performs terribly when caching large data sets, and might get really choked up if I tried to pre-cache a detailed national dataset (e.g. OSM) ... I wanted the full national base data so I can integrate directions from anywhere in the continental US ... I don't know if mapnik allows preferential caching for certain extents. Just noticed, UMd doesn't cache outside a relatively small regional scale ... if you zoom out a couple times, it starts to look bad.

... should I ditch the integrated directions thing? I don't really like the idea of sending the user to another page for directions, but that's the whole hang up right now; December 22, 2009 at 10:51 AM
John Callahan said...: If your primary concern is to get something out quickly, then just do whatever you want. If your primary concern is to develop a solution that scales over time, space, and economics, and a method that is open (basically a self-sustaining project), then it may take time. I have no idea how long. Nothing here is very difficult but of course, will take some time to learn the ropes.

I'm not sure I understand some of your comments. If you are using OSM as your data model, then postgres, shapefiles, etc... are not necessary. All of the editing is done directly in OSM. This could be done via a web browser (extremely easy to use by anyone), or downloadable apps (JOSM, into which you can load WMS services as background), or use plugins like QGIS editors, iPhone and others. There are so many ways to edit OSM data that it works well for users of all skill levels and all platforms. The popularity of OSM has shown that.

Rendering is already done on the OSM server. You do NOT need to define any symbology, only tag your features.

OSM is only for "public" data. If there are "potentially sensitive data" then those would NOT go in OSM. If you want these on a public web app, then you would need to mash it up! Keep that data local, then either run Mapnik to create map tiles from all data sources (including downloaded OSM data), or put your local data into a WMS service and create your own mashup that displays all services.

Abuse is always something to worry about. Yes, you can receive notifications when edits are done in your area. With OSM, it gives you the ability to have numerous people watching the data as well. Take a look at other cities, towns, universities, and see if they are abused.

I'd like to see more about the spatial location comparisons between OSM and local data. OSM was initially created from TIGER data, which is of course not very spatially accurate. We would need to bring in roads from DataMIL, hydro from NHD and other GPS points of interest. One of OSM's strengths is in its accuracy and continual improvements of the data.

The system you describe that UM deployed sounds cool but not necessary. It's great to show data that you don't want in OSM (like the scenario I mentioned above with Mapnik.)

For starters, just getting the common data in OSM would be wonderful. You only need OSM data layers to make a very nice looking campus map. It should also be easy to overlay OSM data with Google Maps or any WMS service. I've done similar mashups and seen others with OSM.

With OSM, you can develop all types of class or research projects to enhance the campus map in the future.; December 22, 2009 at 10:56 AM
John Callahan said...: Directions is something in the user interface. I doubt you want to run your own real-time routing engine on such common data like roads. Why would you when we all use Google or one of the other big players? ;)

For directions, you can use the Google Maps API and pull in directions based on user input. You don't need to send people to another page, just show the directions in another div.

On the current UD visit us page, I capture the user's click and determine what feature they clicked on. This has nothing to do with the map they are viewing underneath but rather a javascript array that holds all features bounding boxes. You can do similar JS tricks.
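
The bounding-box trick described above boils down to a simple hit test. The sketch below shows the logic in Python rather than javascript, and the feature names and coordinates in the usage note are made up for illustration; it is not the code running on the visit us page.

```python
def features_at(point, feature_boxes):
    """Return the names of all features whose bounding box contains the point.

    point: (x, y) in the same coordinates as the boxes.
    feature_boxes: dict mapping name -> (min_x, min_y, max_x, max_y).
    """
    x, y = point
    return [
        name
        for name, (min_x, min_y, max_x, max_y) in feature_boxes.items()
        if min_x <= x <= max_x and min_y <= y <= max_y
    ]
```

For example, with `{"library": (0, 0, 10, 10), "gym": (20, 0, 30, 10)}`, a click at `(5, 5)` matches only `"library"`. Bounding boxes can overlap, so the function returns a list and leaves it to the caller to pick one.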

Also, there are several OSM routing engines. However, I would still go with Google until much of the TIGER data is improved in OSM, at least regionally.; December 22, 2009 at 11:05 AM
Unknown said...: to get to your questions about why local caching/rendering is necessary:

true, all non-sensitive could be directly uploaded and then edited/served from the OSM cloud ... the big drawback is that there is no control over how the data is styled ... maybe the default styles that the cloud mapnik uses are great, that's a possibility that I sort of disregarded, as I've seen some pretty poor styling decisions on the default OSM/mapnik service, but would be worth revisiting. I'm especially skeptical that the default mapnik will give reasonable labeling. If that is true, local rendering/caching would be necessary.

Good point on sensitive data ... these could be kept local and deployed out on another service (though comes back to some of my original problems, since this is what I'm doing now).

I need to figure out what the 'official' position is on data sensitivity and then see what it looks like when integrated with the OSM cloud.; December 22, 2009 at 11:13 AM
Unknown said...: eghk ... I guess I probably shouldn't upload that data, since it is off-kilter

the campus layers are also off when looking at them with Navteq streets layer :-(; December 22, 2009 at 11:29 AM
Unknown said...: oh yeah, don't get me wrong, I definitely want to use google to do the actual routing, just need a national basemap on top of which to display the direction polyline; December 22, 2009 at 11:34 AM
Unknown said...: ... could use google for basemap, but that comes back to the original problems ;-); December 22, 2009 at 11:36 AM
John Callahan said...: I should probably list some campuses that are using the default OSM slippy map rendering, which is currently Mapnik and Osmarender styles. (Note: OSM is about the data model, not display/rendering. What they show on their home page is simply so you can see the data.)

University of Maryland
http://www.openstreetmap.org/?lat=38.98774&lon=-76.94211&zoom=16&layers=B000FTF

University of Wisconsin - Madison
http://www.openstreetmap.org/?lat=43.0762&lon=-89.41882&zoom=16&layers=B000FTF

Michigan State
http://www.openstreetmap.org/?lat=42.7273&lon=-84.48039&zoom=15&layers=B000FTF

University of Minnesota
http://www.openstreetmap.org/?lat=44.97582&lon=-93.23596&zoom=16&layers=B000FTF

University of Melbourne
http://www.openstreetmap.org/?lat=-37.79646&lon=144.96166&zoom=16

University of Toronto, St. George Campus
http://www.openstreetmap.org/?lat=43.66392&lon=-79.3942&zoom=16&layers=B000FTF

University of Reading, Whiteknights Campus
http://www.openstreetmap.org/?lat=51.44023&lon=-0.94101&zoom=15&layers=B000FTF

University of Calgary
http://www.openstreetmap.org/?lat=51.07582875904257&lon=-114.13526264078328&zoom=15&layers=00000F0B0F

Clemson University
http://www.informationfreeway.org/?lat=34.67646336617766&lon=-82.83460769378071&zoom=15&layers=00000F0B0F

Click the "+" sign at the top right to change renderings. This is just what OSM runs. You can see other displays of the same data at http://www.informationfreeway.org/ (use the + sign), http://maps.cloudmade.com/ (use change style.); December 22, 2009 at 11:52 AM
John Callahan said...: Forgot one I had bookmarked.

Rowan University
http://users.rowan.edu/~reiser/osm/; December 22, 2009 at 12:59 PM
Unknown said...: good point ... some of those campus maps look downright decent in my opinion. I noticed that the OSM maps were not used as the 'official' campus map in (any?) of those cases. In some instances I think the OSM map looks better than the campus map they ended up using, though in others there is rough symbology that ruins the campus effect (see the Reading example, where a town boundary is probably the most prominent feature on the map) or wierd tiling effects (http://www.openstreetmap.org/?lat=38.98188&lon=-76.94151&zoom=17&layers=B000FTFT ... at least right now)

Anyway, this is where I'm at now:
-how do we adjust existing data so that it matches up with OSM (trying this right now with JOSM)
-figuring out 'official' position for data sharing of campus data, layer by layer (needed to get the word on this anyway); December 22, 2009 at 2:24 PM