openhatch

Issue97

Title: Trac bug tracker issues: Some trackers don't have a timeline
Milestone: 0.11.01
Priority: feature
Waiting On:
Status: resolved
Superseder:
Nosy List: paulproteus, pythonian4000
Assigned To: pythonian4000
Keywords:

Created on 2010-06-16.08:41:01 by pythonian4000, last changed 2011-01-26.14:25:36 by paulproteus.

Messages
msg803 (view) Author: paulproteus Date: 2011-01-26.14:25:35
That looks great.

The code makes lots of sense. It's well-commented, which is great, and the
commit log message is clear.

We should really get in the habit of writing tests for these things, but I'm
okay with delaying that until the async-ification of bug tracker import.

Pushed to master!
msg798 (view) Author: pythonian4000 Date: 2011-01-26.07:33:38
Here is my first go at this. It successfully grabs datetimes for old Trac
trackers via RSS feeds. The only one I haven't enabled is Django; it has so many
bugs that the RSS feed is huge, and for some reason Django generates the entire
thing before serving it, unlike the other ones I tested, meaning that the
feedparser times out and says that there are no updates. Once it has been
updated once then Django would be fine, using RSS like the others, but I don't
know if I can justify writing an HTML scraper just for the initial import. Maybe
if other trackers have the same issue...

Anyway, link!

http://git.jackgrigg.com/openhatch/patch/?id=eb44ba3c200cab26e894a094d019b5ce580d89d7

Just to record it here, an improvement I would like to make is to grab all the
info from the timeline rather than the individual bugs' RSS feeds as well. On
the Django tracker at least, you can show all bug updates on the timeline, not
just "created", "closed" and "reopened". This would speed up things greatly and
reduce network load; however there is no way to determine this programmatically.
I guess it would end up being an optional flag like old_trac is in this patch.
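
For anyone reading along later, here is a rough sketch of the idea of pulling datetimes out of a ticket's RSS feed. It is not the code from the patch: the patch uses feedparser, while this sketch swaps in the standard library so it runs on its own. The sample feed, the function name opened_and_modified, and the assumption that the oldest item corresponds to the ticket being opened are all illustrative guesses, and real Trac feeds may differ.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical sample of a per-ticket RSS 2.0 feed; real Trac output may differ.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Ticket #97</title>
    <item>
      <title>status changed</title>
      <pubDate>Wed, 26 Jan 2011 14:25:36 GMT</pubDate>
    </item>
    <item>
      <title>ticket created</title>
      <pubDate>Wed, 16 Jun 2010 08:41:01 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def opened_and_modified(feed_xml):
    """Return (opened, last_modified) taken from the item dates in a feed.

    Assumption: the oldest item approximates "Opened" and the newest
    item approximates "Last modified". Returns (None, None) if the feed
    has no dated items.
    """
    root = ET.fromstring(feed_xml)
    dates = [
        parsedate_to_datetime(item.findtext("pubDate"))
        for item in root.iter("item")
        if item.findtext("pubDate")
    ]
    if not dates:
        return None, None
    return min(dates), max(dates)


opened, modified = opened_and_modified(SAMPLE_FEED)
print(opened.isoformat(), modified.isoformat())
```

An empty result here is also roughly what the Django case looks like from the importer's side: if the feed never arrives, there are simply no dates to use.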
msg715 (view) Author: pythonian4000 Date: 2011-01-08.22:33:51
I'll chuck myself on this. I'll try and separate out the code written and
databases created so that we only need to invoke it for old Trac trackers (e.g.
with a boolean flag in setup).

I'm away until the 17th of January, so will start after then.
msg710 (view) Author: paulproteus Date: 2011-01-08.22:13:13
At this point, I'm willing to go with "Grab a lot more data from the bug tracker".

I owe it to mizmo to import her issues from the Fedora Design Team bug tracker,
and fedorahosted.org shows no sign of upgrading Trac.

Leaving un-assigned for now, in case someone else wants to give it a shot.
msg415 (view) Author: pythonian4000 Date: 2010-08-28.04:29:17
This would require one hell of a hack, and we would either only get ballpark
figures or grab a lot more data from the tracker.

The problem is that the only two sources of data we have for Trac bugs are its
webpage and its CSV file, and there are no CSV entries for the Opened or Last
Modified datetimes. In the newer Trac versions there are links to the timeline
on the bug's webpage, and these require the exact datetime in the link URL,
which is what we grab. Without that URL information, we only have the text to go
on i.e. "7 months ago", which is useless.
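
To make the point about the link URL concrete, here is a minimal sketch of reading the datetime out of a timeline link. The exact shape of the URL (a "from" query parameter holding an ISO-style timestamp) is an assumption for illustration, as is the function name datetime_from_timeline_link; check a real tracker's markup before relying on it.

```python
from datetime import datetime
from urllib.parse import urlparse, parse_qs


def datetime_from_timeline_link(href):
    """Extract a datetime from a timeline link, or return None.

    Assumed link shape (hypothetical):
        /timeline?from=2010-06-16T08%3A41%3A01Z%2B0000&precision=second
    """
    query = parse_qs(urlparse(href).query)
    values = query.get("from")
    if not values:
        # No timeline link data at all, as on the older trackers.
        return None
    try:
        # Keep only the "YYYY-MM-DDTHH:MM:SS" part; the timezone suffix
        # is dropped, so the result is a naive datetime.
        return datetime.strptime(values[0][:19], "%Y-%m-%dT%H:%M:%S")
    except ValueError:
        return None


print(datetime_from_timeline_link(
    "/timeline?from=2010-06-16T08%3A41%3A01Z%2B0000&precision=second"))
```

When the link is missing the function returns None, which is exactly the situation where only the "7 months ago" text is left.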

I can think of two hacks. First one would be to pull datetime data directly from
the project's timeline. This would give complete and correct data for the Opened
and Last Modified datetimes, but I have no idea how this would work and it would
likely be quite network-intensive. The other hack would be to have the bug
default to the current time when saving the bug. This would only work for the
Opened datetime, and it would sometimes be wrong e.g. when pulling all the bugs
for the first time - but new bugs would have Opened datetimes within the
frequency of importer runs i.e. within 24 hours.
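
The second hack could look something like the sketch below. The function name resolve_opened and its parameters are hypothetical, and keeping a previously stored value ahead of the current time is my own assumption about how to avoid overwriting a good guess on later importer runs.

```python
from datetime import datetime, timezone


def resolve_opened(parsed_opened, previously_stored=None, now=None):
    """Pick an Opened datetime for a bug.

    Order of preference: a datetime parsed from the tracker, then a
    value stored by an earlier import, then the current time as a
    ballpark fallback.
    """
    if parsed_opened is not None:
        return parsed_opened
    if previously_stored is not None:
        return previously_stored
    # Fallback: wrong for the very first import of old bugs, but within
    # one importer run of the truth for bugs filed since the last run.
    return now if now is not None else datetime.now(timezone.utc)
```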
msg414 (view) Author: paulproteus Date: 2010-08-28.02:49:44
The next step is to test our Trac importing code against a 0.10.x bug tracker
and come up with enough hacks to be satisfied with how well it works.
msg408 (view) Author: paulproteus Date: 2010-08-17.21:34:21
I'll email the SSSD Trac admin listed on the wiki.

If upgrading is easy, great! If not, we'll just write some more code. :D
msg361 (view) Author: paulproteus Date: 2010-07-12.18:38:37
Deployed!

For the timeline/bug data thing, do you want to write those emails to Trac 
admins, or do you want me to? (Is there a way we can use e.g. CSV export to work 
around the missing data? If so, then we can avoid sending emails.)

Once we resolve that, we can close this once and for all.
msg352 (view) Author: pythonian4000 Date: 2010-07-07.13:03:47
I think this is where we are at:

Bug timeline: As far as I can see we just have to wait for the trackers in
question to upgrade their version of Trac. Maybe fire off an email to them?
Otherwise done.

as_appears_in_distribution: Now that Trac and Bugzilla both have
extract_tracker_specific_information methods, we just set this parameter in
there. The supplied patch has a commit which does this. So done.

Project name vs. component: This has been sorted in the code, either by the bug
project format string or in some cases overloading the generate_bug_project_name
method. So done.

OLPC naming: I've looked through the component names again for OLPC, and with
as_appears_in_distribution now set for OLPC I think these are okay. The supplied
patch has a commit which sets the bug project name format string and enables
OLPC bug importing. So done.

Note that this patch needs to be applied AFTER the patch that moves all the bug
import code to customs.
msg344 (view) Author: paulproteus Date: 2010-07-06.14:39:51
Jack, I'm kind of confused. What's the next step for this bug? (Is there one?)
msg302 (view) Author: paulproteus Date: 2010-06-29.14:00:46
re: bug timeline: Yeah, I think you're right now.

re: as_appears_in_distribution: That makes sense!

re: project name vs. component: I like this, good thinking!

For the OLPC/Sugar things, "*-activity" is really the project name, so I think
that it's okay to keep those as the project name in OpenHatch.
msg266 (view) Author: pythonian4000 Date: 2010-06-19.12:53:35
Re: Lack of bug timeline:
That's not the case. Take https://fedorahosted.org/sssd/ for instance - it has a
timeline, but it doesn't have hyperlinks. And the .csv format doesn't include
"Opened" and "Last modified" information. I did notice though that the above
fedorahosted tracker is using Trac version 0.10.5, whereas most of the trackers
from memory were using a 0.11 variant. There are only three trackers at present
with this problem - two are on the fedorahosted tracker and the other is Django
which doesn't display version information for Trac. My guess is it IS related to
Trac version, but without more examples of this issue I can't be certain.

Re: Project name vs. Component:
Following the discussion on IRC, I am adding an extra method to generate the
bug's project name. Rather than overloading it for every Class though, I am
passing in a format string in the Class __init__ which the method then uses to
create the name. The format string is of the form "{component} in {project}",
with "{component}" and "{project}" being replaced as expected. I figured this
allowed the most flexibility in how the end name is given.
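
As a rough illustration of the format string idea (a sketch, not the actual importer code): the class name TracBugTracker, the constructor arguments, and the default format are made up for the example; only generate_bug_project_name and the "{component} in {project}" form come from this thread.

```python
class TracBugTracker:
    """Hypothetical stand-in for a Trac bug importing class."""

    def __init__(self, project_name, bug_project_name_format="{project}"):
        self.project_name = project_name
        self.bug_project_name_format = bug_project_name_format

    def generate_bug_project_name(self, component):
        # Fill in the placeholders; a format string may use either,
        # both, or neither of them.
        return self.bug_project_name_format.format(
            project=self.project_name, component=component)


tahoe = TracBugTracker("Tahoe-LAFS")
olpc = TracBugTracker("OLPC", "{component} in {project}")
print(tahoe.generate_bug_project_name("code-network"))
print(olpc.generate_bug_project_name("kernel"))
```

A subclass can still override generate_bug_project_name for the odd tracker that a format string cannot describe.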

Re: as_appears_in_distribution:
Should there be an option to set this for the Trac bugs? For example, it would
make sense to me for this to be the case with Sugar Labs and OLPC. I was
thinking maybe a boolean flag, and if it was True then set
as_appears_in_distribution to self.project_name (not the generated
bug_project_name).
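
Something like the following is what I have in mind for the flag. The function name build_bug_fields, the dictionary keys other than as_appears_in_distribution, and the empty-string default are assumptions for the sake of the sketch.

```python
def build_bug_fields(project_name, bug_project_name, is_distribution=False):
    """Return the naming-related fields for one imported bug.

    When is_distribution is True, as_appears_in_distribution is set to
    the tracker's own project name, not the generated bug project name.
    """
    return {
        "project": bug_project_name,
        # Assumed default: empty string when the flag is off.
        "as_appears_in_distribution": project_name if is_distribution else "",
    }


print(build_bug_fields("OLPC", "glibc", is_distribution=True))
```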

The main issue I have with naming is the untidiness of just using the component.
Case in point - for OLPC, it would make sense to use the component name; but
while some have names which should really be used alone ("glibc", "cerebro",
"olpc-games"), there are some ones which don't make as much sense out of context
(all of the "*-activity" components), and then some which should not be used on
their own as a project name EVER ("display", "kernel", "library"). Thoughts? Ideas?
msg263 (view) Author: paulproteus Date: 2010-06-19.05:38:19
re: lack of bug timeline: Whoa, weird. Maybe it's because those projects have 
the timeline disabled. We can probably get that data by looking at the .csv 
export Trac offers for each bug -- have you tried that?

re: projects not having the same fields: I think it's user-selected.

re: project name vs. component: Some bug trackers (like Sugar Labs's bug 
tracker) really track one project per component. But others (like Tahoe-LAFS) 
make this configurable. So really it should be an option to the Trac bug 
importing class.

re: as_appears_in_distribution: We set this if we import a bug from Fedora or 
Ubuntu where the *project* is really e.g. xserver-xorg. So it's a bug in the X 
server, as the X server appears in Fedora. In that case, we put "Fedora" in that 
column. Feel free to write some documentation in the search/models.py so that 
this is clearer...
msg253 (view) Author: pythonian4000 Date: 2010-06-16.11:58:47
Unless anyone else has suggestions for the third point below, I'm going with this:

<paulproteus> [14:12] It seems the Trac importer for Tahoe-LAFS isn't importing
things right.
<paulproteus> [14:12] In particular, the project name should be "Tahoe-LAFS" but
instead it's pulling the project name from something like the Trac "component"

So I'm just going to change the Trac bug importer to use the project name rather
than the component name.
msg252 (view) Author: pythonian4000 Date: 2010-06-16.08:41:00
This is a list of issues that I encountered with the Trac bug trackers. I have
tried a workaround for some, and have disabled the bug tracker for others.

* Some trackers don't have a hyperlink to the bug timeline for their 'Opened'
and 'Last modified' fields (e.g. Django). The hyperlink normally present has the
exact date needed as part of the <a> tag, and without it we cannot determine
these dates (since the user-displayed text is only accurate to within months or
years for longer bugs). This means that those trackers break the Trac bug data
importer, and have to be disabled. I am unsure if this is a user-selected or
Trac version-related issue. If it is user-related, possibly drop the tracker
admin an email politely asking why they choose to not have a bug timeline. If
related to the Trac version, possibly drop the tracker admin an email politely
asking if they could update their version of Trac.

* Some trackers don't have as much information on the bug tickets as others e.g.
Angband doesn't have the 'priority' field (though it does have a 'ticket' field
which seems similar-ish). Again, not sure if this is user-selected or related to
the version of Trac they are using. This doesn't seem too major to me at
present, since if this is resolved later on the bug data will be updated anyway.

* I noticed that the project name for the bugs is set to the name of the
component it is part of. While this is true, I have found several projects that
don't have very "useful" component names. I also saw that there is a bug data
field called 'as_appears_in_distribution' - what is the current purpose of this?
Would it be useful to set this to the name of the overriding project the
component belongs to?

I will add more things as I find them.
History
Date                 User           Action  Args
2011-01-26 14:25:36  paulproteus    set     status: need-review -> resolved; messages: + msg803
2011-01-26 07:33:38  pythonian4000  set     status: chatting -> need-review; messages: + msg798
2011-01-08 22:33:51  pythonian4000  set     assignedto: pythonian4000; messages: + msg715
2011-01-08 22:13:13  paulproteus    set     messages: + msg710; milestone: 0.11.01
2010-08-28 04:29:18  pythonian4000  set     messages: + msg415
2010-08-28 02:49:45  paulproteus    set     messages: + msg414
2010-08-28 02:49:20  paulproteus    set     files: - set_as_appears_in_distribution_and_fix_OLPC_naming.patch
2010-08-17 21:34:21  paulproteus    set     messages: + msg408
2010-07-12 18:38:37  paulproteus    set     messages: + msg361; title: Trac bug tracker issues -> Trac bug tracker issues: Some trackers don't have a timeline
2010-07-07 13:03:47  pythonian4000  set     files: + set_as_appears_in_distribution_and_fix_OLPC_naming.patch; messages: + msg352
2010-07-06 14:39:51  paulproteus    set     messages: + msg344
2010-06-29 14:00:47  paulproteus    set     messages: + msg302
2010-06-19 12:53:35  pythonian4000  set     messages: + msg266
2010-06-19 05:38:19  paulproteus    set     messages: + msg263
2010-06-16 11:58:47  pythonian4000  set     status: unread -> chatting; messages: + msg253
2010-06-16 08:41:01  pythonian4000  create