There appear to be multiple WordPress-powered sites that are performing a DB->XML dump of their articles and subsequent pages. The comments section of each dump includes the originating IP address, datetime, E-Mail address, homepage, etc. These fields are traditionally not exposed to the anonymous Internet via WordPress. Since the XML dump is structured, it’s quite easy to harvest this data.
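For a site owner who wants to see how much commenter data one of their own dumps carries, a minimal sketch along these lines may help. The field names are an assumption modelled on the comment elements of the WordPress export (WXR) format; the dumps described here may name them differently, and `count_sensitive_fields` is a hypothetical helper, not part of WordPress.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed field names, modelled on WordPress export (WXR) comment elements.
# The dumps discussed in this post may use different names.
SENSITIVE_FIELDS = {
    "comment_author_email",
    "comment_author_IP",
    "comment_author_url",
    "comment_date",
}


def count_sensitive_fields(xml_text):
    """Count non-empty occurrences of sensitive comment fields in an XML dump."""
    counts = Counter()
    for element in ET.fromstring(xml_text).iter():
        # Strip any "{namespace}" prefix so the check is namespace-agnostic.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name in SENSITIVE_FIELDS and (element.text or "").strip():
            counts[local_name] += 1
    return counts
```

Passing the text of a local dump file to `count_sensitive_fields` returns a per-field tally, which gives a rough idea of how many commenters are affected without printing any of the addresses themselves.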
More alarming is the volume of sites freely exposing this. I’m not certain of the root cause, but perhaps it’s related to an upgrade procedure. Google is happily indexing and caching these dumps, as they appear to be created in the attachment system (URI ?attachment_id=\d+) with an HREF to the actual dump.
The simple Google search below returns a multitude of sites. Perhaps someone on the WordPress side can comment on this behavior?
Google Query – inurl:uploads ".xml_.txt" wordpress
Another tasty query harvests MySQL database backups:
Google Query – inurl:uploads ".sql.txt" wordpress
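Site owners can run the same check locally rather than through Google. The following is a minimal sketch that walks an uploads directory and lists files whose names end in the two patterns from the queries above. Treating the patterns as filename endings is an assumption, and `find_exposed_dumps` is a hypothetical helper name.

```python
from pathlib import Path

# Filename endings taken from the two search queries above.
DUMP_SUFFIXES = (".xml_.txt", ".sql.txt")


def find_exposed_dumps(uploads_dir):
    """Return files under uploads_dir whose names end in a known dump suffix."""
    root = Path(uploads_dir)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.name.lower().endswith(DUMP_SUFFIXES)
    )
```

Calling `find_exposed_dumps("wp-content/uploads")` from the site root (the path is an assumption; adjust it to the actual install) returns any matching files, which can then be moved out of the web-accessible tree.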
Finally, I don’t use WordPress, so I really can’t comment on severity. At a minimum, I believe this violates an implied level of privacy when commenting on articles powered by WordPress: the E-Mail address and IP information are exposed in these DB dumps.
John "Be Nice" Jacobs