I like to hone my testing skills by trying different techniques. Sometimes the project I happen to be working on serves well as a sandbox for this, but not always. I also like to write about testing techniques using examples that other people can try. So it’s convenient to have an easily accessible application that I can write about.

I’ve been working on generating test data like long strings and large numbers with the venerable perlclip tool and a partial perlclip port to Ruby that I call testclip. I’m curious what you think about the ethics of testing in each of the real situations below.

1) Sorry, Wikipedia

I was having a discussion with a contact at Wikipedia, and I wanted to illustrate how I use bisection with long strings to isolate a bug. I wanted to find a bug on Wikipedia itself, so I tested its search feature. I considered the risks of testing on their production system – though long strings are fairly likely to find a bug, I couldn’t remember ever seeing them cause a catastrophic failure. So I judged that it was appropriate to continue. I think my contact was aware that I was testing it, but I didn’t explain the risks and he didn’t grant explicit permission.
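For readers who haven't used bisection on string length, here is a minimal sketch of the idea in Python. It is not how perlclip or testclip do it; the function names are hypothetical, and `fake_search` is a made-up local stand-in with an arbitrary 255-character limit, so nothing here touches a real server. The sketch also assumes there is a single threshold: every length below it passes and every length at or above it fails.

```python
from typing import Callable


def bisect_failure_length(passes: Callable[[str], bool], low: int, high: int) -> int:
    """Return the smallest length in (low, high] at which `passes` is False.

    Assumes the input of length `low` passes, the input of length `high`
    fails, and there is exactly one pass/fail threshold in between.
    """
    if not passes("x" * low):
        raise ValueError("expected the short input to pass")
    if passes("x" * high):
        raise ValueError("expected the long input to fail")

    # Invariant: length `low` passes and length `high` fails.
    while high - low > 1:
        mid = (low + high) // 2
        if passes("x" * mid):
            low = mid
        else:
            high = mid
    return high


def fake_search(query: str) -> bool:
    """Hypothetical system under test: breaks on queries over 255 characters."""
    return len(query) <= 255


if __name__ == "__main__":
    print(bisect_failure_length(fake_search, 1, 10_000))
```

Each step halves the remaining range, so narrowing a failure between 1 and 10,000 characters takes about 14 requests rather than thousands, which matters when every request lands on someone else's server.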

Wikipedia gave me an ideal example, with a minor failure on a moderately long search string, and a more severe error with a much longer string (I went up to about 10,000 characters). I started writing up my analysis. As I went back to reproduce a few of the failures, I noticed a new failure mode I hadn’t seen before. Rather than isolate this new failure, I decided to stop testing. It seemed unlikely that my testing was related to this, but I wanted to make sure.

When I got in touch with my contact at Wikipedia, I found out that I had caused a major worldwide outage in their search feature. I did a lot of reflection after that – I really regretted causing this damage to a production system.

Was it ethical for me to run these tests?

2) Please test my site

I listened in to the virtual STAR East 2016 conference, which had a Test Labs activity that was accessible for virtual participants. I didn’t really understand what the activity was, but I did see that we were invited to test a particular open source application, CrisisCheckin, and report bugs on GitHub. An instance of the server was set up for testing. I used this as motivation to add a feature to testclip to bisect on an integer value in addition to the length of a counterstring.

It was nice to have a test instance of the system. I still considered the possibility that my testing could cause an outage that would affect the other people who were using the test instance. I decided to take the risk. The long strings I tested with made all similar types of data slightly more difficult for all users to read on the page, and in some cases the user interface didn’t provide a way to delete the data, so I did have a small impact on the shared system. I didn’t cause any outages that I was aware of.

There were instructions on GitHub for setting up a local instance of the software, which would be ideal in terms of not interfering with anyone else’s use of the site, but I chose not to take the time to do that.

Would you agree that my testing in this case was ethical?

3) It’s popular, so I’m picking on it

I’m working on writing an example usage of perlclip now, where I chose to pick on the main Google search field. I tested with a search string up to 1000 characters long, which finds a minor bug, but doesn’t seem to affect the availability of the system.

Is it ethical for me to do this testing, and publish something that encourages others to do the same?

A common reaction to these questions I’ve heard is that it’s the responsibility of the owners of the web site to make the site robust, so it’s not my fault if I’m able to do something through the user interface that breaks it. I don’t think it’s that simple.

I perused the Code of Ethics for the Association for Software Testing, and I didn’t see anything that directly addresses this question, though it’s clear on what to do when we do cause harm. At least for examples 1 and 3 here, I’m not using these services for the purposes they were intended for. The Terms of Service for Google don’t actually say that I have to use it for the intended purpose. The Wikipedia Terms of Use, though, do talk about testing directly, which is expressly allowed in some situations. This testing is not allowed if it would “…unduly abuse or disrupt our technical systems or networks.” The terms also don’t allow disrupting the site by “placing an undue burden on a Project website.” So clearly it’s bad to cause an outage, but it’s difficult to assess the risk before an outage happens.

It’s much clearer that it’s not okay to conduct security testing without explicit permission. Security testing includes looking for denial of service vulnerabilities. But my intentions for doing long string testing generally aren’t to find vectors for a denial of service attack, even if that’s what happened in one case.

So how much caution is warranted to mitigate the risks of long string testing on production servers?

If the conclusion is that we should never test with long strings in production (at least without permission), then we have to look for safe places to practice our testing skills. Running a personal instance of an application server is one option, but that isn’t easy for everyone to do. Another option is having a public sandbox that we can access, as we have with CrisisCheckin. There are several cases of servers set up for educational purposes, either associated with exercises in a book or with a training class. Many of those, though, are only intended for customers who bought the book or the class. I think I’ll shift my focus to native applications that run locally and are easy to install. My head is in the web so much, I forget that there is such a thing as a local application. 🙂