openhatch

Issue134

Title: Map is too slow to be useful
Milestone: 0.11.05
Priority: bug
Waiting On:
Status: resolved
Superseder:
Nosy List: nelson, palhmbs, paulproteus, zathras
Assigned To: paulproteus
Keywords: performance

Created on 2010-09-07.17:27:54 by paulproteus, last changed 2011-06-01.02:02:34 by paulproteus.

Files
File name                    Uploaded                      Type
geocode_openhatch.tar.bz2    palhmbs, 2010-10-09.18:01:28  application/bzip2
geocoder.php                 zathras, 2010-11-15.21:01:27  application/x-php
open_python_geodecoder.tgz   zathras, 2010-11-15.21:00:39  application/x-compressed-tar
open_python_geodecoders.tgz  zathras, 2010-11-24.22:34:31  application/x-compressed-tar
Messages
msg1978 (view) Author: paulproteus Date: 2011-06-01.02:02:34
The OpenHatch map is:

* Now a LOT simpler, in terms of implementation -- look at map-v2.js in the
OpenHatch source code

* Uses OpenLayers, not Google Maps

* A BAJILLION times faster, thanks basically entirely to Christopher Schmidt who
showed me the way.

I am marking this resolved. The new map comes with a lot of bugs, and I haven't
filed them all, but oh my god this is so exciting.
msg1909 (view) Author: paulproteus Date: 2011-05-28.01:08:39
Advice from IRC:

<crschmidt> paulproteus: The permance sucking happens because you're trying to 
put 5000 DOM elements into a single container.

Also, crschmidt put http://openhatchbrowsertest.appspot.com/ together as a demo 
of how it can work.

https://github.com/crschmidt/openhatchbrowser has source.

I can try to implement this shortly.
msg1475 (view) Author: palhmbs Date: 2011-04-11.23:21:20
Moving this to 11.05 milestone since it's unlikely to get into the 'very short' 
April milestone.
msg1410 (view) Author: palhmbs Date: 2011-04-03.09:57:36
Marking this as deferred, as agreed in today's meeting --
Ref: https://openhatch.org/meeting-irc-logs/weekly/2011-04-02.log.html

I think we should wait on zathras's database patch
( https://openhatch.org/bugs/issue175 ) and land both together.
msg1302 (view) Author: paulproteus Date: 2011-03-19.21:01:34
I will work on this during the week.
msg1128 (view) Author: palhmbs Date: 2011-02-19.22:08:08
Need help with modifying the database to add fields for each person's longitude &
latitude. These fields will need to be updated at user creation and whenever the
user changes their location. We will use them in a MySQL query whose results can
then be fed to the Python cluster code.
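
A minimal sketch of the idea above, not the actual OpenHatch schema. The table and column names (`person`, `latitude`, `longitude`) are hypothetical, Python's built-in sqlite3 stands in for MySQL so the example runs anywhere, and `geocode` is a placeholder for whichever geocoder ends up being used. Coordinates are looked up only when the stored location text changes.

```python
import sqlite3


def make_db():
    """Create an in-memory table with cached coordinates (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person ("
        " user_id INTEGER PRIMARY KEY,"
        " location_display_name TEXT,"
        " latitude REAL,"
        " longitude REAL)"
    )
    return conn


def set_location(conn, user_id, location, geocode):
    """Store a location, calling geocode() only if the text changed."""
    row = conn.execute(
        "SELECT location_display_name FROM person WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    if row is not None and row[0] == location:
        return False  # unchanged, keep the cached coordinates
    lat, lng = geocode(location)
    conn.execute(
        "INSERT OR REPLACE INTO person VALUES (?, ?, ?, ?)",
        (user_id, location, lat, lng),
    )
    return True


def people_with_coordinates(conn):
    """Rows the map can use directly, with no geocoding at render time."""
    return conn.execute(
        "SELECT user_id, latitude, longitude FROM person"
        " WHERE latitude IS NOT NULL AND longitude IS NOT NULL"
        " ORDER BY user_id"
    ).fetchall()
```

The map view would then read from `people_with_coordinates()` and never call the geocoder itself.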
msg1122 (view) Author: palhmbs Date: 2011-02-19.20:09:59
Found a Python-implemented clusterer here:
http://forum.mapaplace.com/discussion/3/server-side-marker-clustering-python-source-code/
msg604 (view) Author: zathras Date: 2010-11-24.22:34:31
Library of functions in Python.
Configurable to use a geocoder API and service of choice.
Various services and APIs are supported.
Not all data in the locationDisplayname field is compatible with all
geocoders/APIs.
A test is included in the script. The script is independent of the OH code, but
the functions should be easy to link into it.
msg554 (view) Author: zathras Date: 2010-11-15.21:22:40
PHP code attached and PHP code as pasted in msg are identical.
Would have preferred to remove the pasted code, but I have no idea how to do that. Sorry.
msg553 (view) Author: zathras Date: 2010-11-15.21:01:27
PHP based XML open geodecoder, used as initial proof of concept
msg552 (view) Author: zathras Date: 2010-11-15.21:00:39
added open (free) non-Google geodecoder
needs further testing
msg539 (view) Author: zathras Date: 2010-11-13.21:47:52
Text export of piratepad discussion on the subject between Zathras, PaulProteus
and Pythonian4000 ;


 
Issue 134: Map is too slow to be useful 
https://openhatch.org/bugs/issue134 
 
Additional remark: current implementation uses non-foss data (google maps) while
excellent foss sw and data is available: openstreetmap + openlayers 
 
Constraints: 
* mapping sw usually uses javascript with some code around it. As Openhatch is
implemented fully in Python it is desirable to let the additional code be in
python. 
Note: PaulProteus is willing to let the additional code be in another language
under certain conditions 
* the techniques/sw used must be available on the openhatch server and should be
easy to deploy for people wanting to run their own openhatch server, without
generating a lot of extra support/problems 
 
There can be numerous reasons why the current implementation is slow. On IRC
there has been some discussion on this some time ago. 
Re-implementing the map might speed things up but the main problems are likely
the way the map is created and presented. 
* a location of a person (if available and public) must be transformed from an
address/place/state/country etc to coordinates (longitude/latitude). This
process is called geocoding. Currently this is done for every entry (person) in
the database during mapcreation. Storing these coordinates in the DB would
improve speed. 
* markers. All markers are positioned on the map in one go. This takes time when
generating the map, but mainly displaying and updating (moving/changing view) is
very slow. Better would be to display only relevant markers and those that
can actually be seen (not located behind another marker etc). Relevant would
be: display only those people interested or skilled in X within a distance of Y
around me. 
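
The "skilled in X within a distance of Y around me" filter could be sketched like this in Python. The record layout (dicts with `skills`, `lat` and `lng` keys) and the function names are made up for illustration; the distance uses the standard haversine formula with a mean Earth radius of 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def relevant_people(people, skill, me, max_km):
    """Keep people who list `skill` and are within `max_km` of `me`.

    `people` is a list of dicts with hypothetical keys
    "skills" (a set), "lat" and "lng"; `me` is a (lat, lng) pair.
    """
    my_lat, my_lng = me
    return [
        p for p in people
        if skill in p["skills"]
        and haversine_km(my_lat, my_lng, p["lat"], p["lng"]) <= max_km
    ]
```

With coordinates already stored per person, a filter like this only needs arithmetic and no geocoding calls.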
 
Possible implementations 
- ODOPOI : http://wiki.openstreetmap.org/wiki/Odopoi 
  Pros: 
  * seems to do exactly what we need at first glance 
  * uses openstreetmap data 
  Cons: 
  * project is run by a single person (student) 
  * project is in beta stage 
  * has a dependencies list which might be troublesome (check with paulproteus):
https://github.com/tcort/odopoi/blob/master/INSTALL 
  * requires an extra database 
  * is written in php, not python 
- .... ?
  open for alternatives with dynamic marker management

If we stick to using Google Maps, it should be "easy" to use a marker manager.

Looking for such things for OpenLayers...
http://drupal.org/node/622720

"There is clustering, but this will not improve performance per-say as it  still
needs to load all the features to cluster.  There is also the  BBox strategy as
well."

If OpenLayers really has marker clustering built in, then that's enough I think!

OL is a universal display layer. Using Gmaps or OSM is not an issue at that point

http://dev.openlayers.org/releases/OpenLayers-2.7/doc/apidocs/files/OpenLayers/Strategy/Cluster-js.html
http://trac.osgeo.org/openlayers/browser/trunk/openlayers/examples/strategy-bbox.html
http://trac.osgeo.org/openlayers/browser/trunk/openlayers/examples/strategy-cluster-threshold.html

I think strategy-cluster is what we need.
Check the ODOPOI sample site ( map of canada). It's quite fast
    Neat; /me reads
    Wow, that is awesome. Super fast.
    Can we maybe look at how they're using OpenStreetMap, javascript wise? If we
just use ODOPOI, we'd lose some features of the people browsing interface
(unless we re-wrote them in PHP, I guess?)
        /me reads the code
        This is smart:
        // Remove markers that aren't within the bounds of the visible part of
the map at the current zoom level
        // Keep markers that are within the bounds of the visible part of the
map at the current zoom level
        Add to that: some code so that overlapping markers are handled....
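
A rough sketch of that keep/remove rule in Python, assuming markers are plain (id, lat, lng) tuples and the client sends its visible bounds as (south, west, north, east). These names are made up for illustration and are not taken from ODOPOI's code. Overlap handling is left out here.

```python
def in_view(lat, lng, south, west, north, east):
    """True if a point lies inside the visible bounding box.

    Also handles a box that crosses the 180th meridian (west > east).
    """
    if not (south <= lat <= north):
        return False
    if west <= east:
        return west <= lng <= east
    return lng >= west or lng <= east


def visible_markers(markers, bounds):
    """Keep only the (id, lat, lng) markers inside `bounds`."""
    south, west, north, east = bounds
    return [
        m for m in markers
        if in_view(m[1], m[2], south, west, north, east)
    ]
```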
        

If we have a rewrite of his code it would "map" our needs. Would require however
quite some code.
Advantages:
better "mapping" of needs
can be done in Python
fixes the need for an extra DB
Cons:
it's in Python (I am not really good at that)
needs a lot of coding and testing
will need more maintenance as it's our own code



I think the next steps are to just try reimplementing the map we have using
OpenStreetMap.

I think I'm learning a lot just by reading this JavaScript of ODOPOI.

It would be pretty easy to write some Python that generated the same sort of
output as their PHP code. And then we could basically use the same JavaScript,
which somehow is really pretty simple.

Zathras: How's your Javascript? Decent enough?
I am ok in Java and PHP. JS is so-so
but I do not mind writing it
python would be cool too, but not really fast as I will need to learn another
language
not that that is so difficult but things are quicker if you are more accustomed
to it already
Nah, that's not a problem. About 2/3 of the languages I know are self-taught on
the job ^_^ (Perl was... interesting...)
same here....

I see, okay!

I think that there are two major paths forward.

1. Try adding a MarkerManager to Google Maps version.
    Pro: Release soon.
    Con: the code will still suck.

1.5. Rewrite *and* use a marker manager, like Fluster
<http://blog.fusonic.net/2009/12/clustering-for-google-maps-v3-with-fluster2/>

2. Switch to OpenStreetMap, and follow the same strategies from JavaScript as
ODOPOI uses.

    I like this option a *lot*. I like the idea of rewriting 
    I think sooner or later you will want to rewrite anyway. Better do it earlier
in the process
    +1
    Plus then we can ditch using Google. Google Maps is hip but non-free, and
let's be free(dom) where we can be. (-:
    
    BTW, we could ask the ODOPOI person to write down the important tips he/she
used to make that map fast. Maybe send that person an email?
        It seems that the big thing is that the ODOPOI map doesn't show all
markers at once. If you're zoomed-out, it shows just one item, even though if
you were to zoom in you would see more than one.
            The downside of this strategy is you don't get a sense of the
density of people.
                With e.g.
http://blog.fusonic.net/2009/12/clustering-for-google-maps-v3-with-fluster2/ you
do get a nice visual sense.
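
To illustrate the density point: a minimal grid-clustering sketch in Python that returns one marker per cell together with a count. This is not Fluster2's algorithm (that library runs in the browser); the function name and the fixed, degree-sized cells are assumptions chosen to keep the example short.

```python
from collections import defaultdict


def cluster_grid(points, cell_deg):
    """Group (lat, lng) points into square cells `cell_deg` degrees wide.

    Returns one (lat, lng, count) tuple per non-empty cell, placed at the
    mean position of the points in that cell, largest clusters first.
    """
    cells = defaultdict(list)
    for lat, lng in points:
        key = (int(lat // cell_deg), int(lng // cell_deg))
        cells[key].append((lat, lng))
    clusters = []
    for members in cells.values():
        n = len(members)
        clusters.append((
            sum(p[0] for p in members) / n,
            sum(p[1] for p in members) / n,
            n,
        ))
    return sorted(clusters, key=lambda c: -c[2])
```

The count is what a "circle with a number inside" marker would display; a larger `cell_deg` would be used when zoomed out.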

So if we go down route (2), what are the next steps or milestones? For one, I
have to fix the data import bug that Zathras ran into.
BTW. I have a huge lag on this page and also the chatbox is gone

10 weeks ago I had some email exchange with him. I can copy/paste here?

Sure



Re: q: odopoi
From: Thomas Cort linuxgeek __A-T--- gmail.com
To: Me
> 1. Your project is written in PHP. The project with a GoogleMap
> (www.openhatch.org) is written in Python. The main developers would like as
> little "alien" code as possible. Am I right in the assumption that most PHP code
> is related to getting data (download) and extracting data?

No. A shell script gets the data, a C program extracts the data, and 2
short PHP scripts do some database cleanups.

Most of the PHP code is used to serve the map and provide an API for
the Javascript to query the database for marker information.

> 2. Am I right in thinking this is not needed for our project as we do not need
> OSM markers and have our own.

For a pure python solution you'd need to rewrite some of the PHP code,
but there isn't much code and it doesn't do anything fancy.

> 3. If we want to use out own markers we would need a script to geocode location,
> insert longitude/latitude in odopoi DB. Can you give any details on what should
> be into taken consideration when doing this?

The only thing that you need to worry about is the minimum zoom level
that you want each marker to be visible at. In utl/zoom-calc.php,
there is code that precomputes the minimum zoom level for each marker
in the database such that no markers overlap.

> 4. I should write a script to keep odopoi DB and our DB in sync. Any hints on
> what the constraints would be?

I don't know your DB so I don't know how to answer that.

> 5. POI functionality: marker should display info or have an other effect. How is
> this taken care of? I see some XML with descriptions. How is this handled? Can
> it be adapted and extended? Can external code be used for this?

My demo at http://opendatamap.ca has this functionality. It's pretty
easy to change the appearance/functionality with some Javascript.

-Thomas



Right-o. So I guess the most important thing is the util/zoom-calc.php, which is
what chooses *which* markers to show at the various zoom levels. That means that
if we rewrite that, we can choose what to show. For example, a circle with a
number inside! (For now, we should stick to the original behavior, in my
opinion, so that we can ship it sooner rather than later.)
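
A sketch of what such a precomputation could look like in Python. This is not a port of ODOPOI's zoom-calc script, which is not reproduced here. It assumes standard 256-pixel Web Mercator tiles, a hypothetical 20-pixel minimum spacing, and that markers earlier in the list take priority.

```python
import math

TILE_SIZE = 256  # pixels per tile in the usual Web Mercator tiling


def to_pixels(lat, lng, zoom):
    """Project to Web Mercator pixel coordinates (needs |lat| < 90)."""
    scale = TILE_SIZE * 2 ** zoom
    x = (lng + 180.0) / 360.0 * scale
    s = math.sin(math.radians(lat))
    y = (0.5 - math.log((1 + s) / (1 - s)) / (4 * math.pi)) * scale
    return x, y


def clearing_zoom(a, b, min_px, max_zoom):
    """Lowest zoom at which markers a and b are at least min_px apart."""
    ax, ay = to_pixels(a[0], a[1], 0)
    bx, by = to_pixels(b[0], b[1], 0)
    d0 = math.hypot(ax - bx, ay - by)  # pixel distance doubles per zoom level
    for z in range(max_zoom + 1):
        if d0 * 2 ** z >= min_px:
            return z
    return max_zoom  # never clears, e.g. identical coordinates


def min_zoom_levels(markers, min_px=20, max_zoom=18):
    """Minimum zoom per (lat, lng) marker so shown markers do not overlap.

    Ignores wrap-around at the 180th meridian. Markers that never clear
    each other still end up together at max_zoom.
    """
    zooms = []
    for i, marker in enumerate(markers):
        zoom = 0
        for j in range(i):
            needed = clearing_zoom(marker, markers[j], min_px, max_zoom)
            if zooms[j] < needed:
                # markers[j] shows up before the pair clears, so wait
                zoom = max(zoom, needed)
        zooms.append(zoom)
    return zooms
```

The server would then return only markers whose stored minimum zoom is at or below the zoom level the client asks for.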

Good to hear that Thomas thinks it would be pretty easy to rewrite in Python.

Back to your milestone sum up:
- fix db importer: would be nice, but at present I have a db table with like
3000+ people in it
probably a lot with no public data, but I hope some useful stuff too. So a DB
fix is probably not the highest prio
    okay, good to know
- analyse ODOPOI and see how it works, maybe do a little experiment with it
- look at current implementation in OH
- start coding. Could be tricky due to lack of serious python skills but there
is only one way to find out :P (-:

    yeah! from reading the JavaScript, it seems that their *biggest* difference
is that they do *way* more work on the server.

Hope this was/is a bit informative and a way forward

Yeah! (-:

I've also definitely learned something about ODOPOI and how they make the web
app fast (simple -- don't do much in JS).
 JS is slow 
there is even such a thing as server sided JS. Quite kinky
Never dared to do anything with it

The less you do on the client the less you run into browser compatibility issues
too.....

BTW I have done some coding for geocoding:
Should be easy to adapt to OH

#!/usr/bin/php5
<?php
  $nl="\n\r";

  // connect to database
  $db_host = "localhost";
  $db_name = "oh_milestone_a";
  $db_user = "oh_milestone_a";
  $db_pwd  = "ahmaC0Th" ;

  $dbConnection = mysql_pconnect($db_host, $db_user, $db_pwd);
  if ( !mysql_ping($dbConnection) ) {
    // when timed out reconnect
    $dbConnection = mysql_pconnect($db_host, $db_user, $db_pwd);
  }
  
  $dbStatus = mysql_select_db($db_name, $dbConnection);
  if (!$dbStatus) {
    die ('Unable to select requested database: '.mysql_error().$nl);
  }
  
  // table  : profile_person
  // fields : id, interested_in_working_on, gotten_name_from_ohloh, user_id,
  //          last_polled, show_email, photo, photo_thumbnail,
  //          location_display_name, dont_guess_my_location, location_confirmed,
  //          bio, homepage_url, contact_blurb, photo_thumbnail_30px_wide,
  //          expand_next_steps, photo_thumbnail_20px_wide,
  //          email_me_weekly_re_projects
  $query=sprintf("SELECT * FROM profile_person");
  $resultset = mysql_query($query);
  if (!$resultset) {
    die ('Invalid query: '.mysql_error().$nl);
  }

  // create KML/XML document
  while ($row = mysql_fetch_assoc($resultset)) {
    $userID=$row['user_id'];
    printf($nl."user_id               : ".$userID.$nl);
    $locationDisplayname = $row['location_display_name'];

//for testing purposes
$locationDisplayname = "Philadelphia, PA, United States";

    printf("location_display_name : ".$locationDisplayname.$nl);
    $locationConfirmed = $row['location_confirmed'];
/*
    printf("location_confirmed: ".$locationConfirmed.
"(".strlen($locationConfirmed).")".$nl);
    if ( strlen(trim($locationConfirmed)) > 0 ) {
      $normalizedLocation=str_replace(" ", "+", $locationConfirmed);
    }
    else {
      $normalizedLocation=str_replace(" ", "+", $locationDisplayname);
    }
*/
    $normalizedLocation=str_replace(" ", "+", $locationDisplayname);
    
    printf("Normalized location   : ".$normalizedLocation.$nl);
    if (strlen($normalizedLocation) == 0 ){
      die ('No location specified for user_id='.$userID.$nl);
    }

    // provider : http://www.gisgraphy.com/
    $doc = new DOMDocument();
    $doc->formatOutput = true;
   
$geoURL="http://services.gisgraphy.com/fulltext/fulltextsearch?q=".$normalizedLocation."&placetype=&country=&lang=&format=XML&style=FULL&indent=true&from=1&to=2";
    printf("Geo URL               : ".$geoURL.$nl);
    $doc->load($geoURL);

    $xpath = new DOMXpath($doc);
    $elements = $xpath->query("/response/result/doc/double");
/*    
    foreach ($elements as $elem) {
      printf("Node: ".$elem->nodeName);
      $tagname = $elem->localName;
      printf(" ".$tagname." ");
      printf(" = ");
      printf(($elem->nodeValue).$nl);
    }
*/
    $lat=$elements->item(2)->nodeValue;
    $lng=$elements->item(3)->nodeValue;

    printf("Latitude              : ".$lat.$nl);
    printf("Longitude             : ".$lng.$nl);

  }
  
  mysql_free_result($resultset);
  
  // http://www.zipcodeworld.com/samples/distance.php.html
  // http://snipplr.com/view/40956/use-google-maps-api-to-get-latitude-and-longitude/
?>

Advantage: does use a free service
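
For comparison, a rough Python sketch of the same lookup, split so it can be tried without network access. The endpoint and query parameters are copied from the PHP above. Reading coordinates from `<double name="lat">` and `<double name="lng">` elements is an assumption about the response layout (the PHP picks them by position instead), so check it against a real response before relying on it.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Endpoint taken from the PHP script above.
GEOCODER_URL = "http://services.gisgraphy.com/fulltext/fulltextsearch"


def build_geocode_url(location):
    """Build the query URL; urlencode also escapes commas, accents, etc."""
    params = {
        "q": location,
        "format": "XML",
        "style": "FULL",
        "from": 1,
        "to": 2,
    }
    return GEOCODER_URL + "?" + urlencode(params)


def parse_lat_lng(xml_text):
    """Return (lat, lng) from the first result, or None if there is none.

    Assumes /response/result/doc holds <double name="lat"> and
    <double name="lng"> children (an unverified guess at the format).
    """
    root = ET.fromstring(xml_text)
    doc = root.find("./result/doc")
    if doc is None:
        return None
    values = {d.get("name"): d.text for d in doc.findall("double")}
    if "lat" not in values or "lng" not in values:
        return None
    return float(values["lat"]), float(values["lng"])
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the body to `parse_lat_lng` would complete the round trip. Returning None instead of dying lets the caller skip people whose location cannot be geocoded.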
msg515 (view) Author: palhmbs Date: 2010-10-14.04:11:32
BTW - there are 1033 users in the test database - only 998 actually display on the 
map - usernames start from testuser1 --- password is the same for all: testuser1
msg508 (view) Author: palhmbs Date: 2010-10-12.06:15:15
Talked to crschmidt - He recommends sticking with the Fluster2 implementation.
Still happy to try to hack this solution out.
msg507 (view) Author: paulproteus Date: 2010-10-12.05:44:27
We could see if OpenStreetMap + OpenLayers has a decently fast solution for this
sort of thing.

Next step on that: join #openlayers and ping crschmidt (-:
msg504 (view) Author: paulproteus Date: 2010-10-10.04:16:34
Nice, Paul!

Some collected thoughts:

* If you have more than one file to attach, but not a whole lot of files, it's
easier for me if you upload them separately. Your tar.bz2 is okay, though, just
saying for the future.

* http://sourceforge.net/projects/fluster/ is the map clustering thing I'd want
to try deploying.
http://blog.fusonic.net/2009/12/clustering-for-google-maps-v3-with-fluster2/ is
an article about it.

* A KML file alone, I think, won't solve the problem we're having, which is too
many markers on the map at once.
http://blog.fusonic.net/2009/12/fluster2-011-with-significant-performance-improvements/
discusses that problem.
msg503 (view) Author: palhmbs Date: 2010-10-09.18:01:28
I got around to getting a test database working.

It highlights locally why the current openhatch.org/people/ is so slow...

Including all files attached for your perusal,
1. a couple are base scripts that I used to generate the 
database.
2. the other couple are the actual databases that I generated.

I'll start work on figuring out how to get your site to 
generate a KML file every time somebody is
added.

IMO - This is the best solution.
msg498 (view) Author: paulproteus Date: 2010-10-08.22:10:46
I think this would be a quick fix, really.

http://www.mail-archive.com/google-maps-js-api-v3@googlegroups.com/msg05553.html
is a discussion started by Raffi (== dithyramble) to figure out what we can do.

One difficulty is that if you're not Asheesh, you don't have access to all the
Person objects in our database. (Locations are stored inside the Person.) That
means that your local install won't have a lot of markers, so it'd be hard to
reproduce the performance problem.

You could help with the data export bug if you want to address that -- see
http://openhatch.org/bugs/issue156
msg420 (view) Author: paulproteus Date: 2010-09-07.17:27:54
Try going to https://openhatch.org/people/?center=India and browsing around,
zooming in and out. It's "too slow to be effective", as rindolf put it on IRC.

Come to #openhatch on IRC and talk about it with us if you want to help fix it.
We could use the help!
History
Date                 User         Action  Args
2011-06-01 02:02:34  paulproteus  set     status: chatting -> resolved; assignedto: paulproteus; messages: + msg1978
2011-05-28 01:15:20  paulproteus  set     assignedto: paulproteus -> (no value)
2011-05-28 01:08:40  paulproteus  set     status: deferred -> chatting; messages: + msg1909
2011-04-11 23:21:26  palhmbs      set     milestone: 0.11.04 -> 0.11.05
2011-04-11 23:21:20  palhmbs      set     messages: + msg1475; milestone: 0.11.03 -> 0.11.04
2011-04-03 09:57:36  palhmbs      set     status: in-progress -> deferred; messages: + msg1410
2011-03-19 21:01:36  paulproteus  set     assignedto: palhmbs -> paulproteus; messages: + msg1302
2011-02-19 22:11:38  palhmbs      set     milestone: 0.11.03
2011-02-19 22:08:08  palhmbs      set     assignedto: zathras -> palhmbs; messages: + msg1128
2011-02-19 20:09:59  palhmbs      set     messages: + msg1122
2011-01-08 21:23:20  zathras      set     assignedto: palhmbs -> zathras; nosy: + zathras
2010-11-24 22:34:31  zathras      set     files: + open_python_geodecoders.tgz; messages: + msg604
2010-11-15 21:22:40  zathras      set     messages: + msg554
2010-11-15 21:01:27  zathras      set     files: + geocoder.php; messages: + msg553
2010-11-15 21:00:39  zathras      set     files: + open_python_geodecoder.tgz; messages: + msg552
2010-11-13 21:47:53  zathras      set     messages: + msg539
2010-10-14 04:11:32  palhmbs      set     messages: + msg515
2010-10-12 06:15:15  palhmbs      set     messages: + msg508
2010-10-12 05:44:28  paulproteus  set     messages: + msg507
2010-10-12 04:07:13  nelson       set     nosy: + nelson
2010-10-10 04:16:36  paulproteus  set     messages: + msg504
2010-10-09 18:01:29  palhmbs      set     status: chatting -> in-progress; assignedto: palhmbs; messages: + msg503; files: + geocode_openhatch.tar.bz2; nosy: + palhmbs
2010-10-08 22:10:48  paulproteus  set     messages: + msg498
2010-09-07 17:27:54  paulproteus  create