CPS-2055: Improve WoWs around Bug tickets

References

Assumptions


1. Decisions will be agreed with stakeholders

 

Issues & Decisions

1. Acceptance criteria for bug tickets

Notes: What should we do if a bug is not reproducible? > Reject

Decision: The CPS team will decide whether to accept a bug ticket based on whether enough information has been provided to reproduce/resolve the issue.

2. Standard Bug Reporting Template

Notes: Minimal information useful for all bug tickets

Decision: Create a wiki page containing a bug report template, as a guideline for writing bug reports. Fields are optional, depending on context. @Daniel Hanrahan will create this page.

3. Handling of sensitive information such as logs and heap dumps

Notes: Need to protect IPR of commercial contributors

Decision: Guidelines will be provided on what information can and cannot be shared in open source. A client-specific internal Jira ticket can be used to share artifacts, where applicable.

4. Bug reporting guidelines

Notes: A wiki page showing how to write a good bug report. Note there are many existing ones, such as:
https://bugzilla.mozilla.org/page.cgi?id=bug-writing.html

Decision: Can be done as part of #2 (bug report template)


Background

It has been observed that some bugs (particularly performance-related ones) can take weeks to resolve, and often cannot be reproduced by the CPS dev team.

This proposal aims to improve the ways of working around bug tickets, so that less time is spent resolving them.

Bug Reporting Template

Many bug tickets lack basic information, such as which version was tested.

It is proposed that, in addition to the CPS FSA Template, a Bug Reporting Template be created. At a minimum, the following information should be provided:

  • Full description of the issue

  • Affected version(s)

  • Expected behaviour (what is the requirement?)

  • Actual behaviour

  • Impact - this is important for setting priority

  • Steps to reproduce - ideally a minimal reproducible example

  • (Optional) Attached artifacts: Screenshots, Logs, Test data, etc.

  • Environment

  • The corresponding IDUN Jira ticket, attached or linked

There has been much confusion about the number of CM-handles the bug reporters are testing with. In many tickets, phrases such as "80k deployment" are used, but in some cases this meant 6k CM-handles, and in others 20k.

  • The bug report should state how many CM-handles are registered
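
As a rough illustration of how the information listed above could be checked before a ticket is accepted, the sketch below models the template in Python. The class name, the field names, and the choice of which fields count as mandatory are assumptions made for this example; they are not an agreed CPS format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BugReport:
    """Hypothetical model of the proposed template (illustrative field names)."""
    description: str = ""
    affected_versions: List[str] = field(default_factory=list)
    expected_behaviour: str = ""
    actual_behaviour: str = ""
    impact: str = ""
    steps_to_reproduce: str = ""
    environment: str = ""
    idun_jira: str = ""
    # Explicit count of registered CM-handles, instead of "80k deployment".
    cm_handle_count: Optional[int] = None
    # Screenshots, logs, test data, etc. Treated as optional here.
    artifacts: List[str] = field(default_factory=list)


def missing_fields(report: BugReport) -> List[str]:
    """Return the names of non-optional fields that were left empty."""
    missing = []
    for name, value in vars(report).items():
        if name == "artifacts":
            continue  # optional in this sketch
        if value is None or value == "" or value == []:
            missing.append(name)
    return missing


# Example: a report that only fills in three fields ("x.y.z" is a placeholder).
report = BugReport(
    description="Query times out",
    affected_versions=["x.y.z"],
    cm_handle_count=20000,
)
print(missing_fields(report))
```

A check like this would only flag gaps; whether an incomplete ticket is actually rejected remains a decision for the CPS team.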

Performance Bugs

Performance issues are significantly more difficult to reproduce, as they are very sensitive to the user's deployment/environment. Reports of performance bugs should therefore also include:

  • Environment, including available resources in the deployment (memory and CPU cores, number of application instances)

  • What is the load on the system (how many concurrent operations, etc.)?

  • Measured CPU and memory consumption

  • For Out Of Memory Errors (OOMEs): an attached heap dump (it is proposed that this be a requirement moving forward)
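
The additional items for performance bugs, including the conditional heap dump requirement for OOMEs, could be checked along the following lines. This is a minimal Python sketch; the class name, field names, and units are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PerformanceDetails:
    """Hypothetical container for the extra details of a performance bug."""
    cpu_cores: Optional[float] = None        # CPU cores available to the deployment
    memory_mb: Optional[int] = None          # memory available to the deployment
    instances: Optional[int] = None          # number of application instances
    concurrent_operations: Optional[int] = None  # load on the system
    measured_cpu: str = ""                   # measured CPU consumption
    measured_memory: str = ""                # measured memory consumption
    is_oome: bool = False                    # bug is an Out Of Memory Error
    heap_dump_link: str = ""                 # where the heap dump can be found


def performance_gaps(details: PerformanceDetails) -> List[str]:
    """List which of the performance-specific items are still missing."""
    gaps = []
    if None in (details.cpu_cores, details.memory_mb, details.instances):
        gaps.append("environment resources")
    if details.concurrent_operations is None:
        gaps.append("system load")
    if not details.measured_cpu or not details.measured_memory:
        gaps.append("measured CPU and memory consumption")
    # A heap dump is only asked for when the bug is an OOME.
    if details.is_oome and not details.heap_dump_link:
        gaps.append("heap dump")
    return gaps
```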

Client-reported bugs

Links to internal Jira tickets and Jenkins pipelines should be provided where applicable. Logs and similar artifacts could be shared securely there.

Acceptance Criteria for Bugs

We need to agree on reasonable criteria, e.g.

  • That the CPS team has enough information to reproduce the bug

Sharing artifacts such as Logs in the Open Source

Reported bugs often require sharing information such as logs and heap dumps. Since CPS is developed in the open source, this needs to be done in a way that protects IPR.
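
Until guidelines are agreed, one possible precaution is to redact obviously sensitive values from log lines before attaching them to a public ticket. The Python sketch below shows the idea. The patterns (IPv4 addresses, email addresses, and key=value style credentials) are assumptions chosen for illustration; they are neither complete nor an agreed CPS guideline, and redaction of this kind does not make a heap dump safe to share.

```python
import re

# Illustrative patterns only; a real guideline would need to define
# what counts as sensitive for each contributor.
REDACTIONS = [
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<ip>"),
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "<email>"),
    (
        re.compile(r"\b(password|token|secret)\s*[=:]\s*\S+", re.IGNORECASE),
        r"\1=<redacted>",
    ),
]


def redact(line: str) -> str:
    """Apply every redaction pattern to a single log line."""
    for pattern, replacement in REDACTIONS:
        line = pattern.sub(replacement, line)
    return line


# Example with a made-up log line.
print(redact("login from 10.0.0.12 by admin@example.com password=hunter2"))
```

Simple patterns like these can also match harmless text (for example, a four-part version number looks like an IP address), so redacted logs would still need a manual review before being shared.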