Metadata: Definition and Practical Examples

September 13, 2013. An Internet search for the keyword “metadata” often leads to a definition of the concept: metadata is data about data. But that is only the tip of the iceberg. Read on to learn why metadata, as part of the Data Governance process, is valuable to a company, and why and how metadata should be saved and exported (preserved). Recommended! (The article was published in English.)
What is Metadata?

Search the internet for “metadata” and the most common definition you will find is “Metadata: The data of data.” This is true, but what makes up this “data” of data? Think of the title of your car. From the title you can find out the make, model, owner and year. This is your car’s “metadata.”

Now, let’s look at the metadata of a Microsoft Word document. The metadata in a Microsoft Word document is more commonly known as the document’s “properties.” Document properties should not be confused with the program’s properties, where a user can change the settings of the application. A document’s properties list its metadata. Some examples of metadata in a Microsoft Word document are your name or initials, your organization’s name, file type, document versions, file location, creation date, last modified date, editing time, number of pages and total size of the document. This information stays with the document and is used by your computer and other software as a reference guide.
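To make the idea of document properties concrete, here is a minimal Python sketch that reads a few of them from a Word file. It assumes the document is saved in the .docx format (Word 2007 or later), which is a ZIP package, and that the core properties sit at the conventional location docProps/core.xml. The function name read_core_properties and the choice of fields are illustrative assumptions, not part of any Microsoft tool.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of a .docx package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical selection of properties to report.
FIELDS = {
    "creator": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
    "revision": "cp:revision",
}


def read_core_properties(docx_path):
    """Return selected core properties of a .docx file as a dict.

    Assumes the properties are stored in docProps/core.xml; a property
    that is absent from the file is reported as None.
    """
    with zipfile.ZipFile(docx_path) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    props = {}
    for name, tag in FIELDS.items():
        element = root.find(tag, NS)
        props[name] = element.text if element is not None else None
    return props
```

Calling read_core_properties("report.docx") on a file of your own (the file name here is a placeholder) returns a dictionary such as the author, the last editor and the creation and modification timestamps, without opening the document in Word.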

For another perspective, let’s look at the metadata from an e-mail message on Microsoft Exchange 2010. I’m sure by now you could point out some of the obvious types of e-mail metadata: who sent the message, who it was originally sent to, the date, subject and body. But there’s more! The Exchange server, the originating IP address and the message ID are just a few more examples of the important metadata attached to an e-mail. Altogether there are 26 different types of metadata attached to one e-mail message. To see them all visit this website.
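Some of the e-mail metadata named above can be inspected with a few lines of code. The following is a minimal Python sketch, not an Exchange-specific tool: it uses the standard library’s email package to read ordinary Internet message headers from a made-up sample. The addresses, server names and IP address are placeholders invented for illustration, and Exchange keeps further properties that this sketch does not cover.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Invented sample message; every address and host name is a placeholder.
RAW_MESSAGE = """\
Received: from mail.example.com (mail.example.com [192.0.2.10])
\tby mx.example.org with ESMTP; Tue, 13 Aug 2013 09:15:02 -0400
Message-ID: <1234@mail.example.com>
Date: Tue, 13 Aug 2013 09:15:00 -0400
From: Alice <alice@example.com>
To: Bob <bob@example.org>
Subject: Quarterly report

Body text.
"""


def extract_header_metadata(raw):
    """Parse a raw message and return a dict of common header metadata."""
    msg = message_from_string(raw)
    return {
        "from": msg["From"],
        "to": msg["To"],
        "subject": msg["Subject"],
        "message_id": msg["Message-ID"],
        # Convert the Date header into a timezone-aware datetime.
        "date": parsedate_to_datetime(msg["Date"]),
        # Each relaying server adds a Received line; collect them all.
        "received": msg.get_all("Received", []),
    }


metadata = extract_header_metadata(RAW_MESSAGE)
```

The Received headers are where the relaying servers and, in this sample, the originating IP address appear, which is why they matter when the history of a message has to be established.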

Why do we preserve Metadata?

First, what is metadata preservation? Metadata preservation is the ability to save and export the contents and metadata of a document or piece of data. This task has drawn more attention in recent years because of the sheer volume of electronic communication. According to a report by The Radicati Group, the number of worldwide e-mail accounts is expected to hit 4.1 billion by 2015 (Radicati & Hoang, 2011). Corporate accounts make up 25 percent of all the world’s e-mail accounts. That’s over 1 billion accounts by 2015!

Just like the information on the title of your car, the metadata of electronic data can be altered. E-mails and other data can serve as evidence in legal cases. And, with so much e-mail and corresponding metadata to collect, legal teams and IT departments scramble to gather it without disturbing or altering the crucial metadata needed to prove the legitimacy of the data.

How do we preserve Metadata?

If Legal came to you for a copy of a Word document, how would you go about saving it so that the metadata would be properly preserved? Simply copying and saving the file in Windows is not a sufficient way to preserve the metadata. When a file is copied in Windows, the creation date or “Time Stamp” of the copy changes to the current date, thus altering the file’s metadata. There are software tools that will safely collect this type of data without disturbing the metadata.
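The concern about altered timestamps can be illustrated with a short Python sketch. It is not a substitute for the dedicated collection tools mentioned above: it only records what was observed about the source file, copies it with shutil.copy2 (which attempts to carry timestamps over but does not by itself guarantee that the creation time survives on every platform), and checks by hash that the content is unchanged. The function names and the layout of the returned record are assumptions made for this example.

```python
import hashlib
import os
import shutil


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_file(source, destination):
    """Copy source to destination and return a record of what was observed.

    The record lets a reviewer compare the modification time of the
    original and the copy, and confirm that the content is identical.
    """
    before = os.stat(source)          # metadata captured before copying
    source_hash = sha256_of(source)
    shutil.copy2(source, destination) # copies content, tries to keep timestamps
    after = os.stat(destination)
    return {
        "source": os.path.abspath(source),
        "size_bytes": before.st_size,
        "source_modified_ns": before.st_mtime_ns,
        "copy_modified_ns": after.st_mtime_ns,
        "sha256": source_hash,
        "content_matches": sha256_of(destination) == source_hash,
    }
```

Keeping the hash and the original timestamps in a separate record means that, even if a later copy operation changes a date, there is still a note of what the file looked like when it was collected. Note that merely reading the source file may update its last-accessed time on some systems, which is one reason purpose-built tools are preferred for legal collections.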

What about the e-mail messages that Legal is asking for? There are tools available that maintain data integrity by performing read-only operations on the source files throughout the collection process – no editing functionality, just the ability to browse, search, preview, and export.

According to Michele Lange, Director of Thought Leadership, Kroll Ontrack, “While the term metadata is not explicitly included in the Federal Rules of Civil Procedure, metadata is clearly included within the definition of ‘electronically stored information,’ and therefore must be preserved and produced in the context of civil litigation. This information is critical to searching, organizing, and authenticating volumes of digital information during review and production. Based on an established body of ediscovery caselaw, failure to handle metadata in the same manner of the text of a document or email will result in sanctions.”

References

Radicati, D. S., & Hoang, Q. (2011). Email Statistics report. Retrieved August 1st, 2013, from Radicati: http://www.radicati.com/wp/wp-content/uploads/2011/05/Email-Statistics-Report-2011-2015-Executive-Summary.pdf


Source:  Dataversity