CPS-2055: Improve WoWs around Bug tickets
References
Mozilla guide to bug writing: https://bugzilla.mozilla.org/page.cgi?id=bug-writing.html
Link to new CPS Bug Reporting page: Bug Reporting Guidelines for CPS
Assumptions
# | Assumption | Notes |
---|---|---|
1 | Decisions will be agreed with stakeholders | |
Issues & Decisions
# | Issue | Notes | Decision |
---|---|---|---|
1 | Acceptance criteria for bug tickets | What should we do if a bug is not reproducible? > Reject | The CPS team will decide whether to accept a bug ticket based on whether enough information has been provided to reproduce or resolve the issue. |
2 | Standard Bug Reporting Template | Minimal information useful for all bug tickets | Create a wiki page containing a bug report template, as a guideline for writing bug reports. Fields are optional, depending on context. @Daniel Hanrahan will create this page. |
3 | Handling of sensitive information such as logs and heap dumps | Need to protect IPR of commercial contributors | Guidelines will be provided for what information can and cannot be shared in open source. A client-specific internal Jira ticket can be used to share artifacts, where applicable. |
4 | Bug reporting guidelines | A wiki page showing how to write a good bug report. Note there are many existing ones such as: | Can be done as part of #2 (bug report template) |
Background
It has been observed that some bugs (particularly performance-related ones) can take weeks to resolve, and often cannot be reproduced by the CPS dev team.
This proposal aims to improve ways of working around bugs, to reduce the time spent resolving them.
Bug Reporting Template
Many bug tickets lack basic information such as what version was tested.
It is proposed that, in addition to the CPS FSA Template, a Bug Reporting Template be created. At a minimum, the following information should be provided:
Full description of the issue
Affected version(s)
Expected behaviour (what is the requirement?)
Actual behaviour
Impact - this is important for setting priority
Steps to reproduce - ideally a Minimal reproducible example
(Optional) Attached artifacts: Screenshots, Logs, Test data, etc.
Environment
Corresponding IDUN Jira ticket (attached or linked)
There has been much confusion about the number of CM-handles that bug reporters are testing with. In many tickets, phrases such as "80k deployment" are used, but in some cases this meant 6k CM-handles, and in others 20k!
The bug report should state how many CM-handles are registered.
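As an illustration only, the minimum fields listed above could be captured as a small data structure with a completeness check. This is a sketch, not an agreed template: the class name `BugReport`, the field names, the function `missing_fields`, and the choice to treat only attached artifacts as optional are all assumptions made for this example.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class BugReport:
    """Hypothetical container mirroring the proposed minimum fields."""
    description: str = ""
    affected_versions: List[str] = field(default_factory=list)
    expected_behaviour: str = ""
    actual_behaviour: str = ""
    impact: str = ""
    steps_to_reproduce: List[str] = field(default_factory=list)
    environment: str = ""
    cm_handle_count: Optional[int] = None  # number of registered CM-handles
    idun_jira: str = ""
    attachments: List[str] = field(default_factory=list)  # marked optional above


# Assumption: only attached artifacts are optional, as in the list above.
OPTIONAL_FIELDS = {"attachments"}


def missing_fields(report: BugReport) -> List[str]:
    """Return the names of non-optional fields that were left empty."""
    missing = []
    for f in fields(report):
        if f.name in OPTIONAL_FIELDS:
            continue
        value = getattr(report, f.name)
        # A count of 0 is a valid answer, so only None, "" and [] count as empty.
        if value is None or value == "" or value == []:
            missing.append(f.name)
    return missing
```

A reviewer could call `missing_fields(report)` when triaging a ticket and ask the reporter for whatever is listed, rather than discovering the gaps during investigation.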
Performance Bugs
Performance issues are significantly more difficult to reproduce, being very sensitive to the user's deployment and environment.
Environment, including available resources in the deployment (memory and CPU cores, number of application instances)
What is the load on the system (how many concurrent operations, etc.)?
Measured CPU and memory consumption
For Out Of Memory Errors (OOMEs): Attached heap dump - I think this should be a requirement moving forward
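The extra information for performance bugs could be checked in a similar way. The sketch below is hypothetical: the names `PerformanceDetails` and `performance_gaps`, the field names, and the units are assumptions for illustration, and the heap dump rule reflects the suggestion above rather than an agreed requirement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PerformanceDetails:
    """Hypothetical record of the extra details for a performance bug."""
    memory_limit_mb: Optional[int] = None
    cpu_cores: Optional[float] = None
    application_instances: Optional[int] = None
    concurrent_operations: Optional[int] = None   # load on the system
    measured_cpu_percent: Optional[float] = None
    measured_memory_mb: Optional[int] = None
    out_of_memory_error: bool = False
    heap_dump_reference: Optional[str] = None     # e.g. a link to an internal ticket


# Fields that are checked for every performance bug in this sketch.
ALWAYS_CHECKED = (
    "memory_limit_mb",
    "cpu_cores",
    "application_instances",
    "concurrent_operations",
    "measured_cpu_percent",
    "measured_memory_mb",
)


def performance_gaps(details: PerformanceDetails) -> List[str]:
    """List the details that are still missing from a performance bug."""
    gaps = [name for name in ALWAYS_CHECKED if getattr(details, name) is None]
    # Assumption: a heap dump is expected only when an OOME was reported.
    if details.out_of_memory_error and not details.heap_dump_reference:
        gaps.append("heap_dump_reference")
    return gaps
```

Storing a reference to the heap dump rather than the dump itself fits the idea of sharing such artifacts through an internal ticket instead of the open source tracker.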
Client-reported bugs
Links to internal Jira tickets and Jenkins pipelines should be provided where applicable. Logs and similar artifacts could be shared securely there.
Acceptance Criteria for Bugs
We need to agree on reasonable criteria, e.g.
That CPS team has enough information to reproduce the bug
Sharing artifacts such as logs in open source
Reported bugs often require sharing information such as logs and heap dumps. This needs to be done in a way that protects IPR, since CPS is developed as open source.