Monday, January 28, 2008

Metadata: Some practical advice

Baseball

"I ain't ever had a job. I just always played baseball." - Leroy Robert "Satchel" Paige

Hangman's Noose

As I mentioned on Friday, it seems like hardly a day goes by without some idiot insulting African-Americans with a display of a hangman's noose. The latest incident involves a "stupid little prank" that occurred at the construction site for the new baseball park in Washington, DC. See http://www.washingtonpost.com/wp-dyn/content/article/2008/01/25/AR2008012503068_pf.html
- and -
http://www.washingtonpost.com/wp-dyn/content/article/2008/01/24/AR2008012403173.html

DOL to Propose FMLA Regulatory Changes

Late last week, DOL officials reported that they have forwarded proposed new regulations to OMB; the regulations eventually will be published for public comment. Apparently, the proposals address, among other issues, the notice that employees generally would be required to give employers before actually taking leave. Current regulations allow employees to take off for two days before even requesting FMLA leave. In addition, the proposed regulations apparently would permit employers to require health care providers to recertify annually that an employee has a serious health condition. Current regulations allow health care providers to submit a multiyear certification of a serious health condition. As of the weekend, no one yet had an actual copy of the proposed regulations. Once we have them, we will comment further.

New Leave Law in Defense Authorization Act


The Defense Authorization Act provides that family members would be allowed to take up to six months of unpaid leave to care for wounded military personnel. The Act also would allow for employees to take up to twelve weeks of unpaid leave "for any qualifying exigency" related to a family member's call-up to active duty or deployment. See http://www.govtrack.us/congress/bill.xpd?bill=h110-4986.

The Power of One


In the last few weeks, there has been a hue and cry about the power of words alone to inspire change. Over the past few days, I have been reading a wonderful book published in 1941 by A.J. Cronin entitled The Keys of the Kingdom. Apparently within four months of the original publication, the book sold more copies than the publisher, a major publishing house, had sold since its foundation shortly after the turn of the century. I got interested in the author, and started to do a little research. One of his books, The Citadel, resulted in the establishment in the United Kingdom of the National Health Service. Wikipedia reports that "the popularity of his novels played a substantial role in the Labour Party's landslide 1945 victory." Having said that, I cannot recommend more highly the book that I am reading, The Keys of the Kingdom. Great read.

Corporate Social Responsibility


In the January / February 2008 issue of Foreign Affairs, Klaus Schwab has an interesting article on CSR entitled "Global Corporate Citizenship: Working with Governments and Civil Society." Mr. Schwab is the Executive Chair of the World Economic Forum, which just met in Davos, Switzerland. See http://www.foreignaffairs.org/20080101faessay87108/klaus-schwab/global-corporate-citizenship.html.

Metadata

I thought that I would share with our readers a very thorough analysis of the metadata problem that a colleague recently sent to me.

Starting with the facts in your email, it appears that you are interested in understanding the obligations of the producer and also the recipient of a document containing metadata that is exchanged during the negotiations (including grant requests and proposals) of a business transaction. With this in mind, my initial observation of your use of the term "metadata" is that you are concerned primarily with the feature in Microsoft's Word that is commonly known as "track changes," but should also include the data within the "properties" tab and "comments" of a document.
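For readers who want to see what these three kinds of metadata look like inside a file, here is a minimal sketch in Python. It assumes the document is saved in the .docx format introduced with Word 2007 (a zip archive of XML parts); older .doc files are stored differently and are not covered. The function name summarize_docx_metadata is made up for illustration, and the sketch only reads the file. It reports the "properties" fields, counts tracked insertion and deletion markers, and counts comments.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml and word/comments.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def summarize_docx_metadata(path):
    """Summarize properties, tracked changes, and comments in a .docx file.

    `path` may be a filename or a file-like object. The name and the shape
    of the returned dict are illustrative assumptions, not a standard API.
    """
    summary = {"properties": {}, "insertions": 0, "deletions": 0, "comments": 0}
    with zipfile.ZipFile(path) as z:
        names = set(z.namelist())

        # The "properties" tab: author, last-modified-by, title, and so on.
        if "docProps/core.xml" in names:
            root = ET.fromstring(z.read("docProps/core.xml"))
            for child in root:
                name = child.tag.split("}")[-1]  # drop the {namespace} part
                if child.text:
                    summary["properties"][name] = child.text

        # "Track changes": count the insertion and deletion markers.
        if "word/document.xml" in names:
            body = ET.fromstring(z.read("word/document.xml"))
            summary["insertions"] = len(list(body.iter(f"{{{W}}}ins")))
            summary["deletions"] = len(list(body.iter(f"{{{W}}}del")))

        # Comments live in their own part of the archive.
        if "word/comments.xml" in names:
            comments = ET.fromstring(z.read("word/comments.xml"))
            summary["comments"] = len(list(comments.iter(f"{{{W}}}comment")))
    return summary
```

Calling summarize_docx_metadata("proposal.docx") (a hypothetical filename) before sending a draft gives a quick indication of whether anything is riding along with the visible text. It is not a substitute for the review described in the rest of this analysis.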

Reflecting first upon the recipient's obligations with respect to receiving a document that contains metadata available for review, I am not currently aware of any law that prohibits review of the metadata by the recipient. I am aware of a few state bars and an initial position paper from the ABA (that was later revised) that would suggest it is unethical for an attorney to review and/or "mine" for such metadata in a business document, but the majority of the state bar associations would suggest that no ethical violation has occurred in such cases.

Since it is too difficult for a recipient to determine whether the "track changes" metadata was intentionally available for review, my opinion is that the recipient does not have either a legal or ethical obligation to refrain from reviewing the metadata available in these documents, nor does he or she have the responsibility of informing the producer of the availability of such data. I would add, however, that it has been a courtesy practice of mine to inform the producer in obvious cases that such metadata exists in their files.

From a producer's perspective, while distributing documents with metadata may be devastating from a practical and strategic standpoint, I do not know of any law (other than the foregoing bar association opinions) that directly prohibits it. There may be, however, some ethical rules that apply here, such as Rule 1.6 concerning the confidentiality of client information, in those cases where the metadata contains client information. This rule, along with the varying levels of knowledge among attorneys regarding metadata and the tools used to prevent its "leakage," suggests that malpractice law will be the vehicle used in the near future to establish minimum standards for the steps attorneys should take to mitigate the likelihood of distributing a document with metadata, which frankly is how these issues should be analyzed. I am certain that this standard will evolve in a manner similar to the standard attorneys used in adopting the internet, which was once feared for possibly disclosing confidential client information during the transmission of data and is now used by nearly all attorneys in their practice.

Having said that, on the practical side, the easiest way to clean any of these files is to run them through one of the third-party cleaners to remove any unwanted metadata, to "accept all" changes if track changes is not intended to be included, and to check the document for comments. These "scrubbers" are software programs that may be initiated either manually by a user or automatically by a system, and they remove known metadata such as the author, hyperlinks, track change information, comments, etc. Most organizations typically use these scrubbers to remove the metadata found in the "properties" tab of a document (and rely on the decision of the attorneys with respect to other forms of metadata, such as comments and track changes) before transmitting a document to the other side.
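To make the "properties" part of that process concrete, here is a rough sketch in Python of what a scrubber does to those fields. It is not how any particular commercial product works. It assumes a .docx file (the Word 2007 format), and the function name scrub_core_properties and the default list of fields to blank are hypothetical choices for illustration. It deliberately leaves track changes and comments alone, on the assumption that those are handled in Word itself, and it does not touch the separate docProps/app.xml part, which can hold fields such as the company name.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by docProps/core.xml. Registering them keeps the original
# prefixes (dc:, cp:, ...) when ElementTree writes the part back out.
NAMESPACES = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "dcmitype": "http://purl.org/dc/dcmitype/",
    "xsi": "http://www.w3.org/2001/XMLSchema-instance",
}
for prefix, uri in NAMESPACES.items():
    ET.register_namespace(prefix, uri)

# Hypothetical default list of properties to blank; adjust as needed.
FIELDS_TO_BLANK = frozenset(
    {"creator", "lastModifiedBy", "title", "subject",
     "keywords", "description", "category"}
)


def scrub_core_properties(src, dst, fields=FIELDS_TO_BLANK):
    """Copy the .docx at `src` to `dst`, blanking the chosen core properties.

    `src` and `dst` may be filenames or file-like objects. Every other part
    of the archive is copied byte for byte. Returns the names of the
    properties that were actually blanked.
    """
    blanked = []
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "docProps/core.xml":
                root = ET.fromstring(data)
                for prop in root:
                    name = prop.tag.split("}")[-1]  # drop the {namespace} part
                    if name in fields and prop.text:
                        prop.text = ""
                        blanked.append(name)
                data = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
            zout.writestr(item, data)
    return blanked
```

For example, scrub_core_properties("draft.docx", "draft_clean.docx") (hypothetical filenames) writes a cleaned copy and leaves the original untouched. Open the cleaned copy in Word and check its properties before sending it anywhere.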

While I do not endorse any scrubber product over any other, you may find that some work better than others in your environment, and they have different features that you may or may not find helpful. It is important to note that while stripping a document down to the bare text will leave it free of metadata, the most extensive forms of scrubbing leave only raw text, so the document loses most, if not all, of its formatting. Since we all need documents formatted in a coherent, human-readable form, there will always be a small amount of metadata that someone could retrieve. The changes to a document that you refer to, however, can be stripped out with one of these scrubbers, and the combination of "accepting" all changes and using the software scrubber (to eliminate metadata stored in the properties field and, if selected, comments) can leave the formatting of your document untouched.

While the process you refer to of converting a document into a pdf file will remove most forms of metadata (except for track changes that have been left visible), the pdf produced by your method will create its own metadata viewable in the properties menu of the file. So in a situation that is highly sensitive, I might use this process and then run the pdf file through one of the scrubbers to eliminate the file property information. I agree that this process (as it does with discovery in litigation) results in a less efficient process when making revisions between parties during negotiations, and often frustrates business participants who want to make changes to the documents.

In the context of redaction, do not use any of the more advanced features of Adobe Acrobat like the redaction feature, because the redaction can be stripped out and then everything you redacted becomes visible. If trying to redact a document, I would use the NSA document and follow the directions on it for the best and wisest approach.

I hope I have clarified the issues for you. While your colleagues had some valid suggestions, the answer to the metadata question is that the best solution, or combination of solutions, depends on the situation, the content of the metadata, and the desired result.

One note that might be of interest to you is that the newest version of Microsoft Word (Word 2007) includes features that help deal with the ever-growing concern over metadata, including a stripping function that gets rid of the most common metadata attached to documents.