I agree that user privacy is paramount, and people have thought of various whitelist rules and other automatic approaches to filter out personally identifiable information (PII), but those approaches tend not to work once you dig into the data.
One caveat on the link Chris provided: I was only looking at "unsuccessful" queries. Felix seems to be after all queries, and there are plenty of successful queries with good results that I didn't consider. All the queries that match titles and redirects would dilute (though by no means eliminate) the queries that raise privacy concerns.
I second Erik's suggestion of the Discernatron data. It's not perfect and there's not a lot of it, but it's available.
A moderate-effort way to mine for queries would be to get volunteers to share their Wikipedia search history with you. In Chrome, for example, you can install an extension that lets you view all of your browser history at once (rather than one page at a time). I searched my history for "wikipedia special:search", clicked "All History", selected everything, and pasted it into a text file. I was able to gather almost 1,200 queries in less than a minute. My home computer yielded 130 or so (probably more typical; I search a lot at work, for work). 20 volunteers would get you an admittedly biased sample of ~2,000 queries. It's not a great source, but it's something.
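If it helps, here's a rough sketch of how you could pull just the query strings out of such a dump. It assumes each line of the paste contains a Special:Search URL carrying the query in a "search" parameter; the file names ("history.txt", "queries.txt") are just placeholders, not anything standard.

import re
from urllib.parse import urlparse, parse_qs

queries = []
with open("history.txt", encoding="utf-8") as f:
    for line in f:
        # History pastes usually mix timestamps and page titles in with the URLs,
        # so grab anything that looks like a URL and inspect its query string.
        for url in re.findall(r"https?://\S+", line):
            params = parse_qs(urlparse(url).query)
            # Special:Search URLs put the user's query text in the "search" parameter.
            for q in params.get("search", []):
                q = q.strip()
                if q:
                    queries.append(q)

# De-duplicate while preserving order, then write one query per line.
seen = set()
with open("queries.txt", "w", encoding="utf-8") as out:
    for q in queries:
        if q not in seen:
            seen.add(q)
            out.write(q + "\n")

That's just to save the volunteers (or whoever collects the files) from hand-cleaning the paste; the manual steps above are still the easy part.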
Such a manually mined corpus would have the advantage of being actual human queries. We get a lot of bots, and a lot of queries that you wouldn't necessarily want to optimize for when trying to improve human users' experience.
—Trey