Warning: Don't let your AI training data slip through Slack like we did
My team at a startup in Austin lost 3 months of work because someone pasted proprietary code into a public Slack channel, and the bot we trained on that channel started leaking it in responses. Has anyone else had a close call with sensitive data sneaking into their models?
2 comments
harper_smith 2d ago
Honestly, that's rough. I feel for you. Something similar happened at my old place, where someone dumped a whole client contract into a shared training folder by accident. The model started quoting billing rates and discount structures to random users. We only caught it when a customer said "thanks for the insider tip," and we had to retrain from scratch. The worst part is how fast it spreads: one wrong paste and you're scrubbing weeks of data.
matthew_reed503d ago
...and then the bot started suggesting our competitors' product names because someone pasted a comparison doc in the wrong channel. That's brutal though, three months of work gone because of one copy-paste. I've been paranoid about this since a buddy of mine accidentally trained a customer support bot on a spreadsheet full of internal pricing data. The bot kept quoting "the real cost is probably around..." and then listing our profit margins. We had to take it down and rebuild from scratch. At least your team caught it early. Mine didn't realize until a customer asked why we were "so transparent" about our markup.