What rules are we going to play by? They must be well defined before the game starts, so that the effectiveness of the test can be determined afterwards. Recently a customer remarked that a legal hack had been successful, but that an intrusion detection system had detected the attack and, in a real situation, would probably have been able to stop it.
The customer may be right, but what matters is that the test plan states in advance whether detection is part of the test. In most legal hacks, only the control measures used for prevention are tested. In these tests no effort is made to remain undetected, and a number of key people in the IT department are informed beforehand so that they can halt any response actions triggered by detection. If detection and response are to be tested as well as prevention, the test team will try to evade the intrusion detection systems, and far fewer people will be informed about the test. Only a single person in the response escalation chain should be informed, so that he or she can stop the escalation when it becomes too expensive.
Such a test is sometimes called a "Covert Test". Because of the elaborate test approach, the number of employees involved in the response, and the possibility of undesired response actions (such as switching off a production system), this type of test can be very expensive and is used less frequently. As a cheaper addition to a test of the preventive control measures alone, the log files of host systems and intrusion detection systems can be inspected afterwards to verify that the attack was indeed detected.
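
The log inspection described above can be partly automated by comparing the test team's own activity log with the alerts recorded by the detection systems. The following is only a minimal sketch in Python under stated assumptions: the record formats `AttackStep` and `Alert`, the function `find_undetected`, the five-minute matching window, and the sample addresses and times are all hypothetical and made up for illustration. Real host and intrusion detection logs would first have to be parsed into this shape.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AttackStep:
    """One action from the test team's own activity log (hypothetical format)."""
    time: datetime
    source_ip: str
    target_ip: str
    description: str


@dataclass(frozen=True)
class Alert:
    """One alert parsed from a host or IDS log (hypothetical format)."""
    time: datetime
    source_ip: str
    target_ip: str


def find_undetected(steps, alerts, window=timedelta(minutes=5)):
    """Return the attack steps that have no alert for the same
    source/target pair within `window` of the step's timestamp."""
    undetected = []
    for step in steps:
        matched = any(
            alert.source_ip == step.source_ip
            and alert.target_ip == step.target_ip
            and abs(alert.time - step.time) <= window
            for alert in alerts
        )
        if not matched:
            undetected.append(step)
    return undetected


# Made-up sample data using documentation-only IP ranges.
steps = [
    AttackStep(datetime(2024, 1, 1, 10, 0), "192.0.2.10", "198.51.100.5", "port scan"),
    AttackStep(datetime(2024, 1, 1, 10, 30), "192.0.2.10", "198.51.100.5", "password guessing"),
]
alerts = [
    Alert(datetime(2024, 1, 1, 10, 2), "192.0.2.10", "198.51.100.5"),
]

for step in find_undetected(steps, alerts):
    print(f"No alert found for: {step.description} at {step.time:%H:%M}")
```

Matching on address pair and time window is a deliberately crude choice. It shows which attack steps left no trace at all, but it cannot tell whether an alert would actually have led to a response, which is what a covert test measures.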