If, like most of you out there, you work with SharePoint in some form, you will no doubt have come across a wonderful word: "Taxonomy". If you have worked with any document or records management application, regardless of vendor, then you will definitely have heard of it. So what does it mean? What is Taxonomy? How does it relate to SharePoint?
What does it mean?
If you search for the word Taxonomy on Wikipedia, you are given the following explanation:
Taxonomy is the practice and science of classification. Taxonomies, or taxonomic schemes, are composed of taxonomic units known as taxa (singular taxon), or kinds of things that are arranged frequently in a hierarchical structure. Typically they are related by subtype-supertype relationships, also called parent-child relationships. In such a subtype-supertype relationship the subtype kind of thing has by definition the same constraints as the supertype kind of thing plus one or more additional constraints.
That all sounds good, but what does it mean in plain terms we can understand?
What is Taxonomy?
Well, the answer is very simple: a Taxonomy is a structure of data that is classified by various tags called Metadata. A taxonomy allows for what I call "slicing and dicing" of the data into any view based on the tagging process. Take the list of documents shown below as an example; notice how we can create two different hierarchies of the same content based on the Metadata we have associated with it.
| Filename | File type | Status | Template Type | Department | Region |
|---|---|---|---|---|---|
| SampleDoc1 | Word | Pending | Word | Human Resources | North |
| SampleDoc2 | Excel | Completed | Excel Spreadsheet | Finance | South |
| SampleDoc3 | InfoPath | Pending | InfoPath Form | Sales | West |

Hierarchy 1

Departments
---- IT
---- Human Resources
-------- SampleDoc1
---- Finance
-------- SampleDoc2
---- Sales
-------- SampleDoc3

Hierarchy 2

Regions
---- North
-------- SampleDoc1
---- South
-------- SampleDoc2
---- East
---- West
-------- SampleDoc3

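To make the "slicing and dicing" idea concrete, here is a minimal Python sketch that builds both hierarchies from the same tagged records. It is only an illustration: the dictionary keys and the `build_hierarchy` helper are names I have made up for the example, not anything SharePoint provides.

```python
# Sample records taken from the document list above. The keys
# ("filename", "department", "region") are illustrative names only.
documents = [
    {"filename": "SampleDoc1", "department": "Human Resources", "region": "North"},
    {"filename": "SampleDoc2", "department": "Finance", "region": "South"},
    {"filename": "SampleDoc3", "department": "Sales", "region": "West"},
]


def build_hierarchy(docs, tag, all_values=()):
    """Group filenames under each value of the given metadata tag.

    all_values lets us list branches that currently hold no documents
    (for example "IT" or "East").
    """
    hierarchy = {value: [] for value in all_values}
    for doc in docs:
        hierarchy.setdefault(doc[tag], []).append(doc["filename"])
    return hierarchy


# Same content, two different views, driven purely by the metadata.
by_department = build_hierarchy(
    documents, "department", ["IT", "Human Resources", "Finance", "Sales"]
)
by_region = build_hierarchy(
    documents, "region", ["North", "South", "East", "West"]
)

for branch, files in by_region.items():
    print(branch, files)
```

Adding a third tag, such as Status, would give a third view of the same three documents without touching the documents themselves.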
The concept is pretty simple really: the more you tag content, the more taxonomies you can create and the easier it is to have multiple hierarchies and views of the data. Of course, Taxonomy is not just about structuring data into hierarchical lists; it can also be used to aggregate data into groups to allow for easier searching and navigation. There is a common misconception that deploying a taxonomy means building a new system first and then adding the data. Actually, one of the best ways to use the principles is to apply them to existing content and create an enhanced view across that data. I will explain this further later on.
How does it relate to SharePoint?
Well, this is the magic question. SharePoint itself is now a document management system and as such is capable of storing our content in some very nicely structured ways. However, SharePoint as a product is not the answer to everything. Too many times I have seen SharePoint implemented as a replacement for the shared drives and nothing more, and I have also seen the opposite end of the scale, where the solution to document management actually becomes the problem.
So how do you get the balance right in a SharePoint environment? Well, Microsoft in its wisdom has created a very good solution for building Taxonomy structures. These are called Content Types, and they allow the example shown earlier to be created. A content type allows a piece of content to be tagged in various ways and then consumed throughout the system. In the demonstrations I give of SharePoint I always say "Content Types are the key to making SharePoint work". That might be a bold statement, but think about it logically: if you have the capability to tag any type of content using them and then use this information across the system, whether out of the box or via code, to group, view, structure or organise, why would that statement be wrong? For me, Content Types and their associated components make the whole concept of Taxonomy a little easier to explain to customers and ultimately to build within SharePoint.
Is it easy to implement?
Now that we agree that SharePoint and Taxonomy do have a life together, the ease with which it can be implemented is really down to you and the client. I have seen projects where users are required to complete endless amounts of Metadata before they can save any document; let's just say they did it, but they didn't like it. How many of those items do you think were tagged incorrectly because of it?
I am a great believer in using Content Types with the right amount of metadata fields. The way I work this out is very simple. From a high level the first question I ask is:
If you wanted to perform a search for some content now, what parameters would you use?
This question alone often leads users to name the standard list of fields that are captured by SharePoint anyway. Then, to work out what custom fields are needed, we can simply take an existing document type and break it down. For example, project documents generally need the following information:
- Project Name
- Project Type
- Project Start Date
- Project End Date
- Project Owner
In most project documents, these fields will be completed. If this is how it works now, then the choice of metadata fields is very easy: the simplest approach is to use the fields above as the metadata. This does not mean that the fields are no longer in the document; we can surface them via the Word interface later on. So for this example our metadata list now looks as follows:
- Name
- Title
- Created
- Created By
- Modified
- Modified By
- Project Name
- Project Type
- Project Start Date
- Project End Date
- Project Owner
The above list is only a subset of what is captured. The next step is to find out how we want these fields to be exposed via the user interface. For this I use the following form approach:
| ID | Content Type | Type | Associated Template | Information Management Policy | Workflow |
|---|---|---|---|---|---|
| PROJ001 | Project Initiation Document | Document | Yes, PID.docx | Yes | Yes, Approval |
| PROJ002 | Project Risks | List | No | No | No |

| Content Type | Fields | Field Type | Field Values |
|---|---|---|---|
| PROJ001 | Name | Textbox | System Generated |
| | Title | Textbox | System Generated |
| | Created | System | System Generated |
| | Created By | System | System Generated |
| | Modified | System | System Generated |
| | Modified By | System | System Generated |
| | Project Name | Textbox | Blank |
| | Project Type | Choice Menu | Internal, External, Group, Major |
| | Project Start Date | Date Time | Today |
| | Project End Date | Date Time | Today |
| | Project Owner | Person or Group | Users/Groups in SharePoint |
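The form above can also be written down as data, which makes it easy to check a set of tags before anything is built. The following is a rough Python sketch of the PROJ001 definition, not SharePoint code: `FieldSpec`, `PROJ001_FIELDS` and `validate` are hypothetical names, and treating every custom field as required is my assumption for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class FieldSpec:
    """One row of the content type form (hypothetical structure)."""
    name: str
    field_type: str                 # e.g. "Textbox", "Choice Menu", "Date Time"
    choices: tuple = ()             # only used by "Choice Menu" fields
    system_generated: bool = False  # filled in by the system, not the user


# The PROJ001 rows from the table above.
PROJ001_FIELDS = [
    FieldSpec("Name", "Textbox", system_generated=True),
    FieldSpec("Title", "Textbox", system_generated=True),
    FieldSpec("Created", "System", system_generated=True),
    FieldSpec("Created By", "System", system_generated=True),
    FieldSpec("Modified", "System", system_generated=True),
    FieldSpec("Modified By", "System", system_generated=True),
    FieldSpec("Project Name", "Textbox"),
    FieldSpec("Project Type", "Choice Menu",
              choices=("Internal", "External", "Group", "Major")),
    FieldSpec("Project Start Date", "Date Time"),
    FieldSpec("Project End Date", "Date Time"),
    FieldSpec("Project Owner", "Person or Group"),
]


def validate(metadata, specs):
    """Return a list of problems found in user-supplied metadata.

    Assumption for this sketch: every field that is not system
    generated must be filled in.
    """
    problems = []
    for spec in specs:
        if spec.system_generated:
            continue
        value = metadata.get(spec.name)
        if value is None or value == "":
            problems.append(f"{spec.name}: missing")
        elif spec.field_type == "Choice Menu" and value not in spec.choices:
            problems.append(f"{spec.name}: '{value}' is not an allowed choice")
        elif spec.field_type == "Date Time" and not isinstance(value, date):
            problems.append(f"{spec.name}: expected a date")
    return problems


# Example values are invented for illustration.
tags = {
    "Project Name": "Intranet Refresh",
    "Project Type": "Internal",
    "Project Start Date": date(2024, 1, 8),
    "Project End Date": date(2024, 6, 28),
    "Project Owner": "jsmith",
}
print(validate(tags, PROJ001_FIELDS))
```

Counting the user-entered rows in a definition like this is also a quick way to see whether a content type is asking for more metadata than people will realistically complete.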
As you can see, the above approach works well and allows me to define what the content types and fields will be. Once we have these defined we can then build our Taxonomy within SharePoint. Now remember that Taxonomy, as we said earlier, is not just about a formal structure. In the SharePoint world we have the Content Query Web Part (CQWP), the Data View Web Part (DVWP) and Search, as well as a few other little bits and pieces, that allow us to create the best user experience. Using search, for example, allows us to span multiple sites and aggregate the data into a single place. The benefits of this are the following:
- Exposed as XML Data
- Adheres to Access Control List (ACL)
- Crosses Site Boundaries
- Always up to date via Indexing Process
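As a rough illustration of the first three points, here is a toy Python model of aggregating tagged items from several sites, trimming the results to what a user is allowed to see, and exposing them as XML. It does not call SharePoint's search API; the site paths, user names and the `search` and `to_xml` helpers are all invented for the sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for an index that spans several sites.
# "readers" plays the role of the access control list for each item.
indexed_items = [
    {"site": "/sites/hr", "title": "SampleDoc1",
     "content_type": "Project Initiation Document", "readers": {"alice", "bob"}},
    {"site": "/sites/finance", "title": "SampleDoc2",
     "content_type": "Project Initiation Document", "readers": {"bob"}},
    {"site": "/sites/sales", "title": "SampleDoc3",
     "content_type": "Project Risks", "readers": {"alice"}},
]


def search(items, user, content_type):
    """Return items of one content type, from any site, that the user may read."""
    return [
        item for item in items
        if item["content_type"] == content_type and user in item["readers"]
    ]


def to_xml(results):
    """Serialise the results as a simple XML document."""
    root = ET.Element("results")
    for item in results:
        ET.SubElement(root, "item", site=item["site"], title=item["title"])
    return ET.tostring(root, encoding="unicode")


# bob can read PIDs on two different sites; alice only sees one of them.
bob_results = search(indexed_items, "bob", "Project Initiation Document")
alice_results = search(indexed_items, "alice", "Project Initiation Document")
print(to_xml(bob_results))
```

The point of the sketch is that the grouping key is the content type rather than the site, which is why the same query can cross site boundaries.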
Taxonomy in SharePoint really is all about storing and then finding the correct information with as little hassle as possible. Using the in-built tools we are able to build multiple taxonomies, which allow for "slicing and dicing" of the same data to suit the needs of any business user. Hopefully this will help when building your own SharePoint solutions.