I have significant experience working with our in-house, sandboxed AI. For example, I have had our AI produce a series of Automated Unit Tests (AUTs, to be differentiated from standard 'unit tests'). AUTs are required to follow the Arrange-Act-Assert template.
What I got back looked pretty good and, at first glance, seemed like it might have saved our development team hundreds of hours of development and testing...
....except....
... it only provided the bare minimum of AUT code. The tests were not sufficient to cover our business use cases. Sure, it provided the 80% code coverage I asked for... but poorly.
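A minimal sketch of that gap, in Python with the standard `unittest` module. The discount rule and every name here (`order_total_cents`, the two test classes) are hypothetical, invented only for illustration. Both tests follow Arrange-Act-Assert and both execute every line of the function, but only the second would catch a broken business rule.

```python
import unittest


def order_total_cents(subtotal_cents: int, is_member: bool) -> int:
    """Hypothetical rule: members get 10% off orders of 100.00 or more."""
    if is_member and subtotal_cents >= 10_000:
        return subtotal_cents * 90 // 100
    return subtotal_cents


class CoverageOnlyTests(unittest.TestCase):
    """Executes both branches, so line coverage is 100%, but verifies almost nothing."""

    def test_function_returns_something(self):
        # Arrange
        big_order, small_order = 20_000, 500
        # Act
        discounted = order_total_cents(big_order, True)
        plain = order_total_cents(small_order, False)
        # Assert: passes for nearly any implementation that returns an int
        self.assertIsInstance(discounted, int)
        self.assertIsInstance(plain, int)


class BusinessRuleTests(unittest.TestCase):
    """Same coverage figure, but pins down the rule and its boundary."""

    def test_member_at_threshold_gets_discount(self):
        # Arrange
        subtotal = 10_000
        # Act
        total = order_total_cents(subtotal, is_member=True)
        # Assert
        self.assertEqual(total, 9_000)

    def test_member_just_below_threshold_pays_full_price(self):
        # Arrange
        subtotal = 9_999
        # Act
        total = order_total_cents(subtotal, is_member=True)
        # Assert
        self.assertEqual(total, 9_999)

    def test_non_member_never_gets_discount(self):
        # Arrange
        subtotal = 10_000
        # Act
        total = order_total_cents(subtotal, is_member=False)
        # Assert
        self.assertEqual(total, 10_000)
```

Saved as a module, this can be run with `python -m unittest`. A coverage tool would report the same number for either test class, which is why the coverage figure alone says little about whether the business cases are actually tested.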
The amount of prompt-engineering it would take to produce good AUTs would require a book about the thickness of 'War and Peace'. And you cannot verbally prompt-engineer, since invariably you will either leave something out or you will need to revise the prompt. This means it really has to be written.
So, to your point, training it to catch all the quirks is an extremely daunting and time-intensive task, and it may prove impractical. Furthermore, that's even if you are AWARE of every quirk. Your development team may not be.
I know you pretty well; you like to be right. I ask, however, that you accord me subject-matter-expert deference on this particular topic, as I would accord you the same deference in, say, the operation of a nuclear plant... in which I believe you have SME experience.
“The amount of prompt-engineering it would take to produce good AUTs would require a book about the thickness of ‘War and Peace’.”
I am fully aware that most of the processing at that “AI Islands” is for trading the models.
My ex also made good money working on old COBOL systems full of quirks.
I am fully aware that most of the processing at that “AI Islands” is for trading the models.
TRAINING!
I corrected that in the preview once and it changed it back again!
I recently finished a contract at Lockheed. We insisted on 100% code coverage. It was not that hard to do, and it improved the code quite a lot.
At our first integration meeting, everything worked on the first try.