openhatch

Issue368

Title Bug objects should know how to refresh themselves
Milestone 0.11.03 Priority urgent
Waiting On Status resolved
Superseder Nosy List paulproteus, pythonian4000
Assigned To paulproteus Keywords

Created on 2011-04-05.22:34:02 by pythonian4000, last changed 2011-04-09.06:49:57 by paulproteus.

Files
File name Uploaded Type Edit Remove
fix_for_patch_0007.patch pythonian4000, 2011-04-08.22:57:26 text/x-patch
patches.tar.gz paulproteus, 2011-04-08.20:17:16 application/x-gzip
Messages
msg1442 (view) Author: paulproteus Date: 2011-04-09.06:49:56
This backend change is pushed now! Thanks Jack for the fix; that was a total
n00b mistake on my part.
msg1441 (view) Author: pythonian4000 Date: 2011-04-08.22:57:26
I'm liking this. Aside from the fact that nearly all of this code will be moved
around, renamed etc. with the async stuff, I think it's great.

Your patch series is all fine, except for (after applying all patches) line 554
in search/models.py:

return cls(pk=self.bug_tracker_model_pk).create_class_that_can_actually_crawl_bugs()

cls(pk=self.bug_tracker_model_pk) results in an instance of the model class that
happens to have the same pk as an existing instance, rather than the actual
existing instance as you require. Of course, this isn't a problem because you
never save this instance, but there is absolutely no other data in there. Since
bot existing database tracker models have a manager named "all_trackers", the
modification to line 554 in the extra patch I uploaded fixes everything.

(On a related note, it's a very weird feeling to be on the other side of a patch
review! =D )
msg1439 (view) Author: paulproteus Date: 2011-04-08.20:17:16
Hey Jack,

I am running into some issues implementing this, and instead of banging my head
against it for any longer, I thought I'd ask if you could help.

The final commit in the series is the broken one. Can you give it a look? The
commit log message explains what I was trying to do.
msg1438 (view) Author: paulproteus Date: 2011-04-08.15:09:28
I'm working on this now! And posting to [Devel] about it.
msg1430 (view) Author: pythonian4000 Date: 2011-04-05.22:34:02
As part of keeping the Bugs in the database all fresh, we need to be able to get
all Bug objects that at the current point in time are stale. This was done
previously in segments via the tracker class that created and updated those
Bugs, but due to differences in ordering there would always be bugs that were
fresh by a few seconds and got missed, which now causes Nagios to spam us with
messages about stale Bugs.

As part of the move to asynchronous bug importing, we want to move this query to
the Bug model. The issue currently though is that, while we can easily obtain a
list of stale Bugs with some simple QuerySet manipulation, there is no way of
finding the code that updates them.

A solution to this is to have a ForeignKey in the Bug model that points to the
database model for the tracker that created it. This would mean that Bug objects
could be easily refreshed when stale, and then Nagios would stop complaining all
the time (until we move over to the asynchronous system and get a heap of bugs
that is!). In the case of trackers that have to be hard-coded (read: NEARLY
EVERY Bugzilla tracker ~_~ ), once the asynchronous code is set up they will be
represented by a database model as well, which just has a string pointing to the
hard-coded class. But for now, they can be a small minority not covered in this
issue.

Below is the relevant discussion from PiratePad between pythonian4000 and
paulproteus:

I see this as being in two parts:
 * The code looks for stale tracker queries to update.
 * The code looks for stale bugs to update. <-- but Bug objects do not typically
know how to update themselves, iirc <-- and this needs changing, possibly by
adding a ForeignKey to the tracaker object that created them (since this is the
only way Bugs are introduced to the system). but some bugs are created through
hard-coded classes. I'd be okay with storing those class names as strings, or
something (and there will probably always be some of those for e.g. Bugzilla
instances) I do agree there, but I struggle to work out a better way to manage
this asynchronously (otherwise we need  to keep the async code individual to
each specific tracker (subclassing, where possible? (not by my understanding of
Twisted -- take a look at "class ProfileImporter" in
mysite.customs.profile_importers-- we only override methods as needed, and the
Twisted callback is usually a bound instance method)
  * I'm trying to remember my reasoning: If I want to update all stale URLs,
this information is stored along with the tracker required to update them. But
if I want to update all currently stale bugs, I can (without knowing anything
else) get a list of all such bugs. But it is tying them to the required subclass
of a subclass of a general async tracker instance that is the problem,
especially if such instances are auto-generated on-the-fly.
   * make the class_name a string, and try to __import__() and instantiate that
   * and that doesn't work, try to decode the class string as some sort of hint
as to how to create it from the database
   * So some class_name attributes look like "__TracTracker__,id=13" (those are
the ones you get from the database) and others look like
'mysite.customs.bugtrackers.bugzillas.MozillaDotOrg' (that's one to instantiate
from code)
   * downside, it's not then a true foreign key
    * Can't we just drop Bugzilla support, since this is all its fault anyway? ;P
    * We could also create a special BugzillaTrackerThatIsHardcoded model, and
that could have an attribute for the string form of the class name. Then
Bugzilla trackers that are hard-coded appear in the DB, too, which is good for
the UI. But when we *use* them, we know to grab special methods from the code.
     * I like the above!
     * So do I!
      * yay!
      * There is also a single Google model hard-coded but I believe that is
because it has three tags meaning 'bite-sized'. But if we can work out how to
store this in the database then it fixes that.
       * And we can always do the same thing as Bugzilla for other tracker
types, but Bugzilla is the one that NEEDS it ^_^
        * precisely. Want to remove this conversation and turn it into a
decision, then sprinkle that into the thread above?
        * Aww, reached indent limit =D
        * I say do the decisioning and sprinkling first, then remove this later.
        * kay
History
Date User Action Args
2011-04-09 06:49:57paulproteusunlinkissue363 blockers
2011-04-09 06:49:57paulproteusunlinkissue260 blockers
2011-04-09 06:49:57paulproteussetstatus: in-progress -> resolved
messages: + msg1442
2011-04-08 22:57:26pythonian4000setfiles: + fix_for_patch_0007.patch
messages: + msg1441
2011-04-08 20:17:17paulproteussetfiles: + patches.tar.gz
messages: + msg1439
2011-04-08 15:09:28paulproteussetstatus: unread -> in-progress
messages: + msg1438
2011-04-05 22:41:27pythonian4000linkissue260 blockers
2011-04-05 22:39:37pythonian4000linkissue363 blockers
2011-04-05 22:34:02pythonian4000create